Operating system level virtualization
Virtualization technologies such as KVM and Xen give each virtual machine its own independent operating system. Operating system level virtualization, also known as containerization, is different: it is a feature of the operating system itself that allows multiple isolated user space instances to exist side by side. These user space instances are called containers. A normal process can see all the resources of the computer, whereas a process in a container can see only the resources assigned to that container. Broadly speaking, operating system level virtualization groups the resources managed by the operating system (processes, files, devices, networks, and so on) and hands each group to a different container, which is how isolation and virtualization are achieved.
Implementing operating system level virtualization on Linux relies on two technologies: namespaces and cgroups.
Namespace
In programming languages, namespaces exist so that variable names or routine names can be reused: the same name can appear in different namespaces without conflict. Namespaces in Linux serve a similar purpose. For example, on a Linux system without operating system level virtualization, every user space process is identified by a system-wide unique number (PID). With operating system level virtualization, each container has its own PID namespace, so the processes in every container can be numbered starting from 1 without conflict.
Linux provides the following six types of namespaces, each corresponding to a kind of resource managed by the operating system and each selected by its own clone flag:
Mount point (mnt) CLONE_NEWNS
Process (pid) CLONE_NEWPID
Network (net) CLONE_NEWNET
Interprocess communication (ipc) CLONE_NEWIPC
Hostname (uts) CLONE_NEWUTS
User (user) CLONE_NEWUSER
The original plan was to add namespaces for further resources such as time and devices; newer kernels already include a cgroup namespace (it appears in the /proc listing later in this section) and a time namespace.
Linux 2.4.19 introduced the first namespace, the mount namespace. Because no other namespace type existed at that time, the flag added to the clone system call was simply named CLONE_NEWNS.
Three namespace-related system calls
The following three system calls are used to manipulate namespaces:
clone() - creates a new process and new namespaces; the new process is placed in the new namespaces
unshare() - creates a new namespace without creating a new child process; child processes created afterwards are placed in the newly created namespace
setns() - joins the process to an existing namespace
Note: for PID namespaces, unshare() and setns() do not change the PID namespace of the calling process itself; the new or joined PID namespace applies only to child processes created afterwards. With clone(), it is likewise the new child, not the caller, that is placed in the new PID namespace.
Namespaces are not identified by names. Each namespace is identified by an inode number, which is consistent with how Linux handles files. You can see which namespaces a process belongs to in the proc file system. For example, to check the namespaces of the process with PID 4123:
kelvin@desktop:~$ ls -l /proc/4123/ns/
total 0
lrwxrwxrwx 1 kelvin kelvin 0 Dec 26 16:28 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 kelvin kelvin 0 Dec 26 16:28 ipc -> ipc:[4026531839]
lrwxrwxrwx 1 kelvin kelvin 0 Dec 26 16:28 mnt -> mnt:[4026531840]
lrwxrwxrwx 1 kelvin kelvin 0 Dec 26 16:28 net -> net:[4026531963]
lrwxrwxrwx 1 kelvin kelvin 0 Dec 26 16:28 pid -> pid:[4026531836]
lrwxrwxrwx 1 kelvin kelvin 0 Dec 26 16:28 user -> user:[4026531837]
lrwxrwxrwx 1 kelvin kelvin 0 Dec 26 16:28 uts -> uts:[4026531838]
The following code demonstrates how to use the above three system calls to manipulate the process's namespace:
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (10 * 1024 * 1024)

static char child_stack[STACK_SIZE];

static int child_main(void *args)
{
    (void)args;
    pid_t child_pid = getpid();
    /* clone() placed this child in the new PID namespace, so its PID should be 1. */
    printf("I'm child process and my pid is %d\n", child_pid);
    fflush(stdout);
    /* A namespace is destroyed once all of its processes have exited,
       so keep this one alive for the setns() call below. */
    sleep(300);
    return 0;
}

int main(void)
{
    /* clone */
    pid_t child_pid = clone(child_main, child_stack + STACK_SIZE,
                            CLONE_NEWPID | SIGCHLD, NULL);
    if (child_pid < 0) {
        perror("clone failed");
        exit(EXIT_FAILURE);
    }

    /* unshare: the parent creates a new PID namespace but no child process.
       Children created afterwards are placed in the new namespace. */
    int ret = unshare(CLONE_NEWPID);
    if (ret < 0) {
        perror("unshare failed");
        exit(EXIT_FAILURE);
    }
    pid_t fpid = fork();
    if (fpid < 0) {
        perror("fork error");
        exit(EXIT_FAILURE);
    } else if (fpid == 0) {
        /* First child after unshare(): its PID should be 1. */
        printf("I am child process. My pid is %d\n", getpid());
        exit(0);
    }
    waitpid(fpid, NULL, 0);

    /* setns */
    char path[80] = "";
    snprintf(path, sizeof(path), "/proc/%d/ns/pid", child_pid);
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        perror("open error");
        exit(EXIT_FAILURE);
    }
    /* setns() does not move the calling process into the PID namespace;
       it only selects the namespace for children created afterwards. */
    if (setns(fd, 0) == -1) {
        perror("setns error");
        exit(EXIT_FAILURE);
    }
    close(fd);
    pid_t npid = fork();
    if (npid < 0) {
        perror("fork error");
        exit(EXIT_FAILURE);
    } else if (npid == 0) {
        /* This child joins the PID namespace of the first (cloned) child,
           so its PID should be 2. */
        printf("I am child process. My pid is %d\n", getpid());
        exit(0);
    }
    waitpid(npid, NULL, 0);
    return 0;
}
Output:
$ sudo ./ns
I'm child process and my pid is 1
I am child process. My pid is 1
I am child process. My pid is 2
Control groups (cgroups)
If namespaces isolate resources from the perspective of naming and numbering, control groups group processes and actually limit and isolate the computing resources each group can use. A control group is a kernel mechanism that groups processes and tracks the computing resources they use. Each type of computing resource is managed by a so-called subsystem. The subsystems available at the time of writing include:
cpuset: assigns a set of CPUs to the specified cgroup; the processes in the cgroup are scheduled only on those CPUs
blkio: controls and accounts for the block I/O of the cgroup
cpuacct: accounts for CPU usage in the cgroup
devices: controls, through allow and deny lists, which device nodes the cgroup can create and use
freezer: suspends the specified cgroup or resumes a suspended one
hugetlb: limits the use of huge pages (hugetlb) in the cgroup
memory: tracks and limits the use of memory and swap
net_cls: tags packets according to the cgroup of the sender; the traffic controller can assign priorities based on these tags
net_prio: sets the network traffic priority of the cgroup
cpu: sets the CPU scheduling parameters of the cgroup
perf_event: enables performance monitoring per cgroup
Unlike namespaces, control groups do not add system calls. Instead, they are exposed as a file system, and control groups are managed through file and directory operations. The following example uses the cpuset subsystem to bind a process to a specified CPU.
1. Create a shell script, run.sh, that loops forever
#!/bin/bash
x=0
while true; do
    :
done
2. Execute this script in the background
# bash run.sh &
[1] 20553
3. See which CPU the script is running on
# ps -eLo ruser,lwp,psr,args | grep 20553 | grep -v grep
root     20553   3 bash run.sh
You can see that the process with PID 20553 is running on CPU 3. The following steps use cgroups to bind it to CPU 2.
4. Mount a file system of type cgroup on a newly created directory named cgroups
# mkdir cgroups
# mount -t cgroup -o cpuset cgroups ./cgroups/
# ls cgroups/
cgroup.clone_children   cpuset.memory_pressure_enabled
cgroup.procs            cpuset.memory_spread_page
cgroup.sane_behavior    cpuset.memory_spread_slab
cpuset.cpu_exclusive    cpuset.mems
cpuset.cpus             cpuset.sched_load_balance
cpuset.effective_cpus   cpuset.sched_relax_domain_level
cpuset.effective_mems   docker
cpuset.mem_exclusive    tasks
cpuset.mem_hardwall     notify_on_release
cpuset.memory_migrate   release_agent
cpuset.memory_pressure
5. Create a new group, group0, inside the mounted directory
# cd cgroups
# mkdir group0
# ls group0/
cgroup.clone_children   cpuset.mem_exclusive        cpuset.mems
cgroup.procs            cpuset.mem_hardwall         cpuset.sched_load_balance
cpuset.cpu_exclusive    cpuset.memory_migrate       cpuset.sched_relax_domain_level
cpuset.cpus             cpuset.memory_pressure      notify_on_release
cpuset.effective_cpus   cpuset.memory_spread_page   tasks
cpuset.effective_mems   cpuset.memory_spread_slab
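6. Add process 20553 to the newly created control group. If the write fails with "No space left on device", the group's cpuset.cpus and cpuset.mems files are still empty; in that case perform step 7 first and also write a memory node to cpuset.mems (for example, echo 0 > group0/cpuset.mems) before adding the process: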
# echo 20553 >> group0/tasks
# cat group0/tasks
20553
7. Restrict the processes in this group to run only on CPU 2
# echo 2 > group0/cpuset.cpus
# cat group0/cpuset.cpus
2
8. Check again which CPU the process with PID 20553 is running on
# ps -eLo ruser,lwp,psr,args | grep 20553 | grep -v grep
root     20553   2 bash run.sh
The example above shows only the basics of using control groups. Because control groups are manipulated through files and directories, and a file system is a tree, the configuration can become extremely complicated and confusing if no constraints are placed on how cgroups are used. The newer version of cgroups therefore imposes some restrictions.
Summary
This article briefly introduced the concept of operating system level virtualization and the two technologies used to implement it: namespaces and control groups. The use of each was demonstrated with a simple example.