[Linux] Process Program Replacement

Learning Objectives

Understand how process program replacement works and learn how to use several of the exec* functions.

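1. How Process Program Replacement Works

After fork creates a child process, the child runs the same program as the parent (although it may take a different branch of the code). To make the child run a different program, it usually calls one of the exec functions.

When a process calls an exec function, its user-space code and data are completely replaced by the new program, and execution starts from the new program's startup routine. When a child performs the replacement, the effect is similar to copy-on-write on data modification: only the child's own address space is rebuilt, so the parent's code is not overwritten and the parent keeps running unaffected.

Process = kernel data structures + code + data. The code and data are replaced, while the kernel data structures stay largely the same: no structure is released and no new process is created. We can check with code whether exec creates a new process: the child prints its PID before calling exec, so it can be compared with the PID seen by the new program.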

#define _GNU_SOURCE /* execvpe is a GNU extension */
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>

int main()
{
    printf("taskexec.....begin\n");
    pid_t id = fork();
    if (id == 0)
    {
        /* Argument vector for the new program, terminated by NULL. */
        char *const argv[] = {"pragma", NULL};
        printf("child pid : %d\n", getpid());
        sleep(2);
        execvpe("./pragma", argv, environ);
        exit(1); /* reached only if execvpe fails */
    }
    int status = 0;
    pid_t rid = waitpid(id, &status, 0);
    if (rid > 0)
    {
        if (WIFEXITED(status))
        {
            printf("child quit success, child exit code: %d\n", WEXITSTATUS(status));
        }
        else
        {
            printf("child quit failed\n");
        }
    }
    printf("taskexec.....end\n");
    return 0;
}
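The claim that exec does not create a new process can also be checked without a second program of our own. The sketch below is only an illustration under stated assumptions: it assumes /bin/sh exists, and the function name pid_reported_after_exec is hypothetical. The child replaces itself with a shell that prints its own PID ($$), and the parent reads that value through a pipe so it can be compared with the PID returned by fork.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, let it exec `sh -c 'echo $$'`, and return the PID that the
 * new program reports about itself. The PID returned by fork() is stored in
 * *forked_pid so the caller can compare the two. Returns -1 on error. */
pid_t pid_reported_after_exec(pid_t *forked_pid)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t id = fork();
    if (id == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (id == 0) {
        /* Child: send stdout into the pipe, then replace the program. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127); /* reached only if execl failed */
    }

    /* Parent: read what the replaced program printed, then reap the child. */
    close(fds[1]);
    char buf[32] = {0};
    ssize_t n = read(fds[0], buf, sizeof(buf) - 1);
    close(fds[0]);
    waitpid(id, NULL, 0);

    *forked_pid = id;
    return n > 0 ? (pid_t)strtol(buf, NULL, 10) : -1;
}
```

If the two values are equal, the shell is running inside the very process that fork created: the program changed, the process did not.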

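From the point of view of the process being replaced, this is essentially loading a program into memory. How does a program get loaded into memory? On Linux, the exec* functions act as the loader.

Why must the program be loaded into memory first? Because the von Neumann architecture requires it: the CPU fetches code and data only from memory.

2. Replacement Functions

2.1 The exec* Family of Functions

Loading has to be carried out by the operating system, so these functions are built on a system call that loads the program into memory. It is normal that the code after an exec* call no longer runs once the call succeeds, because that code has been replaced. If we do not want the following code to be replaced, we can use multiple processes and let a child process perform the replacement. On success an exec* function never returns, so there is no return value to check; if the replacement fails, the function returns -1 and execution continues with the code that follows.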
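A short sketch of the failure path may help here. It assumes the path /no/such/dir/no-such-program does not exist (it is a made-up path chosen so that exec fails), and the function name errno_after_failed_exec is hypothetical. The child reaches the line after execl only because the call failed, and it passes the errno value back to the parent through its exit code.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to exec a program that does not exist and report the errno value the
 * child observed. Returns -1 if fork/wait failed or the child did not exit
 * normally. */
int errno_after_failed_exec(void)
{
    pid_t id = fork();
    if (id == -1)
        return -1;
    if (id == 0) {
        int rc = execl("/no/such/dir/no-such-program", "no-such-program",
                       (char *)NULL);
        /* We only get here because exec failed: rc is -1 and errno says why.
         * ENOENT is a small number, so it fits in an exit code. */
        _exit(rc == -1 ? errno : 0);
    }
    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In real code the usual pattern is simply to treat any return from exec* as an error, report it (for example with perror), and exit the child with a non-zero code.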

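2.2 Program Replacement with Multiple Processes

Let a child process perform the replacement so that the parent's subsequent code is not overwritten; this is the multi-process version. The parent creates a child, the child replaces itself, and the parent waits. The child can either run part of the parent's code or run a completely new program. A child is created to do work on the parent's behalf, and because the replacement happens in the child, it changes only the child's address space, in the same spirit as copy-on-write.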

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    printf("myprocess begin.....\n");
    pid_t id = fork();
    if (id == 0)
    {
        sleep(2);
        execl("/usr/bin/ls", "ls", "-a", "-l", NULL);
        exit(1);
    }
    int status = 0;
    pid_t rid = waitpid(id, &status, 0);
    if (rid > 0)
    {
        printf("father wait success, child exit code: %d\n", WEXITSTATUS(status));
    }
    printf("myprocess end.....\n");
    return 0;
}
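The parent in this pattern learns how the child ended from the status filled in by waitpid. WEXITSTATUS is meaningful only when the child exited normally, so a more careful parent checks WIFEXITED and WIFSIGNALED first. The following is a minimal sketch, assuming /bin/sh is available; the helper name run_shell_command and its return convention (exit code, negated signal number, or -1000 on error) are made up for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `sh -c cmd` in a child and wait for it.
 * Returns the exit code if the child exited normally, the negated signal
 * number if it was killed by a signal, and -1000 on any other error. */
int run_shell_command(const char *cmd)
{
    pid_t id = fork();
    if (id == -1)
        return -1000;
    if (id == 0) {
        /* Child: replace this process with the shell. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); /* reached only if execl failed */
    }
    /* Parent: its own code is untouched; it just waits for the child. */
    int status = 0;
    if (waitpid(id, &status, 0) != id)
        return -1000;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return -1000;
}
```

For example, run_shell_command("exit 7") reports a normal exit with code 7, while a command that kills its own shell with SIGKILL is reported as a negative value.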

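2.3 The exec* Functions (the return value is not important)

2.3.1 execl

int execl(const char* path, const char* arg,...);

l (list): the command is passed as a list of arguments, exactly as it would be typed on the command line.

Parameters of execl: the first parameter, path, is the program to execute and must include its path (the caller has to tell the function how to find the program). The remaining parameters are variadic: pass the arguments just as you would type them on the command line, and end the list with NULL.

Summary: the first parameter says how to find the program; the remaining parameters say how to run it.

Code: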

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    printf("myprocess begin.....\n");
    execl("/usr/bin/ls", "ls", "-a", "-l", NULL);
    printf("myprocess end.....\n");
    return 0;
}

These functions can replace the process not only with Linux commands but also with programs we write ourselves:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    printf("myprocess begin.....\n");
    pid_t id = fork();
    if (id == 0)
    {
        sleep(2);
        execl("./test", "test", NULL);
        exit(1);
    }
    int status = 0;
    pid_t rid = waitpid(id, &status, 0);
    if (rid > 0)
    {
        printf("father wait success, child exit code: %d\n", WEXITSTATUS(status));
    }
    printf("myprocess end.....\n");
    return 0;
}

#include <iostream>
#include <algorithm>
#include <unistd.h>
#include <sys/types.h>
using namespace std;

int main()
{
    // cout does not interpret printf-style format specifiers such as %d.
    cout << "C++: pid: " << getpid() << endl;
    cout << "C++: pid: " << getpid() << endl;
    cout << "C++: pid: " << getpid() << endl;
    return 0;
}

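2.3.2 execv

int execv(const char* path, char* const argv[]);

v (vector): an array of pointers. Store the whole command in an array terminated by NULL, then pass the array to execv.

Parameters of execv: the variadic list is replaced by an array of pointers.

Code: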

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    printf("myprocess begin.....\n");
    char* argv[] = {"ls", "-a", "-l", NULL};
    pid_t id = fork();
    if (id == 0)
    {
        sleep(2);
        //execl("/usr/bin/ls", "ls", "-a", "-l", NULL);
        execv("/usr/bin/ls", argv);
        exit(1);
    }
    int status = 0;
    pid_t rid = waitpid(id, &status, 0);
    if (rid > 0)
    {
        printf("father wait success, child exit code: %d\n", WEXITSTATUS(status));
    }
    printf("myprocess end.....\n");
    return 0;
}
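One practical reason to prefer the v form is that the number of arguments does not have to be known at compile time. The sketch below builds the argument vector at run time and hands it to execv. It assumes /bin/sh exists, and the function name run_with_extra_args is hypothetical. The shell command `exit $#` exits with the number of positional parameters it received, which lets the parent confirm that every array entry arrived.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Build an argument vector with `extra` trailing arguments at run time and
 * pass it to execv. The child's exit code equals `extra`.
 * Returns -1 on error. */
int run_with_extra_args(int extra)
{
    if (extra < 0)
        return -1;

    /* Fixed part: sh -c 'exit $#' sh   (the last "sh" becomes $0). */
    const int fixed = 4;
    char **argv = malloc((size_t)(fixed + extra + 1) * sizeof(*argv));
    if (argv == NULL)
        return -1;
    argv[0] = "sh";
    argv[1] = "-c";
    argv[2] = "exit $#";
    argv[3] = "sh";
    for (int i = 0; i < extra; i++)
        argv[fixed + i] = "x";
    argv[fixed + extra] = NULL; /* the vector must end with NULL */

    pid_t id = fork();
    if (id == 0) {
        execv("/bin/sh", argv);
        _exit(127); /* reached only if execv failed */
    }
    free(argv); /* the parent no longer needs its copy */
    if (id == -1)
        return -1;

    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With execl the same call would need one source-level argument per entry, which is why the list form suits fixed commands and the vector form suits commands assembled by the program.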

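2.3.3 execvp

v (vector): an array of pointers. Store the whole command in an array, then pass the array to execvp. p (path): the caller does not need to give the path of the file to execute (the file name is still required); just tell exec* which program to run, and the system searches for it in the directories listed in the PATH environment variable. If the name contains a slash, it is treated as a path and PATH is not searched.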

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    printf("myprocess begin.....\n");
    char* argv[] = {"ls", "-a", "-l", NULL};
    pid_t id = fork();
    if (id == 0)
    {
        sleep(2);
        execvp("ls", argv);
        exit(1);
    }
    int status = 0;
    pid_t rid = waitpid(id, &status, 0);
    if (rid > 0)
    {
        printf("father wait success, child exit code: %d\n", WEXITSTATUS(status));
    }
    printf("myprocess end.....\n");
    return 0;
}
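The PATH lookup can be observed directly with a small helper. This is a sketch under assumptions: it expects a program named sh to be reachable through PATH, the helper name run_via_path is hypothetical, and the exit code 127 for a failed exec is a convention borrowed from shells rather than something execvp itself dictates.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `file` with the given argument vector using execvp, which searches
 * PATH when `file` contains no slash. Returns the child's exit code,
 * 127 if the program could not be executed, or -1 on fork/wait errors. */
int run_via_path(const char *file, char *const argv[])
{
    pid_t id = fork();
    if (id == -1)
        return -1;
    if (id == 0) {
        execvp(file, argv);
        _exit(127); /* reached only if execvp failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling it with "sh" and with "/bin/sh" should behave the same on a typical system, because the first is found through PATH and the second is used as given; a name that exists in no PATH directory makes execvp fail, and the helper reports 127.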

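2.3.4 execvpe

v (vector): an array of pointers. Store the whole command in an array, then pass the array to execvpe. p (path): the caller does not need to give the path of the file to execute (the file name is still required); just tell exec* which program to run, and the system searches for it in the directories listed in the PATH environment variable. e (environment): the caller passes the environment variables for the new program as an array of "NAME=value" strings terminated by NULL.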

#define _GNU_SOURCE /* execvpe is a GNU extension */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    printf("myprocess begin.....\n");
    char* argv[] = {"test", NULL};
    /* The environment array must also be terminated by NULL. */
    char* engv[] = {"haha=1111111", "hehe=2222222", NULL};
    pid_t id = fork();
    if (id == 0)
    {
        sleep(2);
        execvpe("./test", argv, engv);
        exit(1);
    }
    int status = 0;
    pid_t rid = waitpid(id, &status, 0);
    if (rid > 0)
    {
        printf("father wait success, child exit code: %d\n", WEXITSTATUS(status));
    }
    printf("myprocess end.....\n");
    return 0;
}

#include <iostream>
#include <algorithm>
#include <unistd.h>
#include <sys/types.h>
using namespace std;

int main(int argc, char* argv[], char* engv[])
{
    int i = 0;
    // Print every command-line argument received from the caller.
    for (i = 0; argv[i]; i++)
    {
        cout << argv[i] << endl;
    }
    // Print every environment variable received from the caller.
    for (i = 0; engv[i]; i++)
    {
        cout << engv[i] << endl;
    }
    cout << "C++: pid: " << getpid() << endl;
    cout << "C++: pid: " << getpid() << endl;
    cout << "C++: pid: " << getpid() << endl;
    return 0;
}
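When an environment array is passed explicitly, the new program sees only that array; nothing from the caller's environment leaks through. The sketch below checks this. To stay within POSIX it uses execve (which also takes an explicit environment but does not search PATH) instead of execvpe, and it assumes /bin/sh exists. The function name child_env_is_replaced and the variable PARENT_ONLY are made up for the illustration.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Give the child a brand-new environment and let a shell check it.
 * Returns 0 if the child saw haha=1111111 and did NOT see PARENT_ONLY,
 * a non-zero exit code otherwise, or -1 on error. */
int child_env_is_replaced(void)
{
    /* A variable that exists only in the parent's environment. */
    if (setenv("PARENT_ONLY", "1", 1) == -1)
        return -1;

    char *const argv[] = {
        "sh", "-c",
        "test \"$haha\" = 1111111 && test -z \"$PARENT_ONLY\"",
        NULL
    };
    /* The complete environment of the new program, terminated by NULL. */
    char *const envp[] = {"haha=1111111", "hehe=2222222", NULL};

    pid_t id = fork();
    if (id == -1)
        return -1;
    if (id == 0) {
        execve("/bin/sh", argv, envp);
        _exit(127); /* reached only if execve failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```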

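So the environment variables and command-line arguments of a program are handed to it by its parent process. With the extern declaration below we can inspect the environment variables that the bash process passed to us.

extern char** environ; // the environment inherited from the parent process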
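As a rough illustration of what environ points to, the sketch below walks the array by hand. Each entry is a "NAME=value" string and the array ends with a NULL pointer. The function names lookup_in_environ and agrees_with_getenv are hypothetical; in real code getenv already does this job.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Provided by the C library: the environment of the current process. */
extern char **environ;

/* Return the value of `name` by scanning environ, or NULL if it is unset. */
const char *lookup_in_environ(const char *name)
{
    size_t len = strlen(name);
    for (char **p = environ; p != NULL && *p != NULL; p++) {
        /* An entry matches if it starts with "name=". */
        if (strncmp(*p, name, len) == 0 && (*p)[len] == '=')
            return *p + len + 1;
    }
    return NULL;
}

/* Sanity check: the hand-written lookup should agree with getenv(). */
int agrees_with_getenv(const char *name)
{
    const char *mine = lookup_in_environ(name);
    const char *libc = getenv(name);
    if (mine == NULL || libc == NULL)
        return mine == NULL && libc == NULL;
    return strcmp(mine, libc) == 0;
}
```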

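When passing the environment argument there are three cases:

Replace the environment entirely with a new set of variables
Use the existing environment
Add just one environment variable: the putenv function

The putenv function: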

#include <stdlib.h>
int putenv(char* string);

#define _GNU_SOURCE /* execvpe is a GNU extension */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    printf("myprocess begin.....\n");
    char* argv[] = {"test", NULL};
    /* Add one variable to the current environment before forking. */
    putenv("papa=333333");
    pid_t id = fork();
    if (id == 0)
    {
        extern char** environ;
        sleep(2);
        /* Pass the existing environment, which now includes papa. */
        execvpe("./test", argv, environ);
        exit(1);
    }
    int status = 0;
    pid_t rid = waitpid(id, &status, 0);
    if (rid > 0)
    {
        printf("father wait success, child exit code: %d\n", WEXITSTATUS(status));
    }
    printf("myprocess end.....\n");
    return 0;
}
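The third case can be checked without a separate test program. In the sketch below the parent adds one variable with putenv, and the child execs a shell that tests whether the variable arrived. It assumes /bin/sh exists, and the function name child_sees_putenv_variable is hypothetical. One detail worth knowing: putenv stores the pointer it is given rather than a copy, so the string must stay valid for as long as it is part of the environment; a static buffer is used here for that reason.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Add papa=333333 to the environment, then let a child shell check it.
 * Returns 0 if the child saw the variable, a non-zero exit code if it did
 * not, or -1 on error. */
int child_sees_putenv_variable(void)
{
    /* Static storage: the environment keeps pointing at this buffer. */
    static char entry[] = "papa=333333";
    if (putenv(entry) != 0)
        return -1;

    pid_t id = fork();
    if (id == -1)
        return -1;
    if (id == 0) {
        /* execl passes the caller's current environment to the new program. */
        execl("/bin/sh", "sh", "-c", "test \"$papa\" = 333333", (char *)NULL);
        _exit(127); /* reached only if execl failed */
    }
    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```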