下图是一个可执行文件编译,链接的过程.
本篇将通过一个完整的小工程来阐述ELF编译,链接过程,并分析.o和bin文件中各区,符号表之间的关系.从一个崭新的视角去看中间过程.
准备工作
先得有个小工程,麻雀虽小,但五脏俱全,标准的文件夹和Makefile结构,如下:
目录结构
root@5e3abe332c5a:/home/docker/test4harmony/54# tree
.
├── bin
│ └── weharmony
├── include
│ └── part.h
├── Makefile
├── obj
│ ├── main.o
│ └── part.o
└── src├── main.c└── part.c 4 directories, 7 files
看到 .c .h .o 就感觉特别的亲切 : ),项目很简单,但具有代表性,有全局变量/函数,extern
,多文件链接,和动态链接库的printf
,用cat命令看看三个文件内容。
cat .c .h
root@5e3abe332c5a:/home/docker/test4harmony/54# cat ./src/main.c
#include <stdio.h>
#include "part.h"
extern int g_int;
extern char *g_str;int main() {int loc_int = 53;char *loc_str = "harmony os";printf("main 开始 - 全局 g_int = %d, 全局 g_str = %s.\n", g_int, g_str);func_int(loc_int);func_str(loc_str);printf("main 结束 - 全局 g_int = %d, 全局 g_str = %s.\n", g_int, g_str);return 0;
}
root@5e3abe332c5a:/home/docker/test4harmony/54# cat ./src/part.c
#include <stdio.h>
#include "part.h"int g_int = 51;
char *g_str = "hello world";void func_int(int i) {int tmp = i;g_int = 2 * tmp ;printf("func_int g_int = %d,tmp = %d.\n", g_int,tmp);
}
void func_str(char *str) {g_str = str;printf("func_str g_str = %s.\n", g_str);
}
root@5e3abe332c5a:/home/docker/test4harmony/54# cat ./include/part.h
#ifndef _PART_H_
#define _PART_H_
void func_int(int i);
void func_str(char *str);
#endif
cat Makefile
Makefile
采用标准写法,关于makefile系列篇会在编译过程篇中详细说明,此处先看点简单的。
root@5e3abe332c5a:/home/docker/test4harmony/54# cat Makefile
DIR_INC = ./include
DIR_SRC = ./src
DIR_OBJ = ./obj
DIR_BIN = ./binSRC = $(wildcard ${DIR_SRC}/*.c)
OBJ = $(patsubst %.c,${DIR_OBJ}/%.o,$(notdir ${SRC}))TARGET = weharmonyBIN_TARGET = ${DIR_BIN}/${TARGET}CC = gcc
CFLAGS = -g -Wall -I${DIR_INC}${BIN_TARGET}:${OBJ}$(CC) $(OBJ) -o $@${DIR_OBJ}/%.o:${DIR_SRC}/%.c$(CC) $(CFLAGS) -c $< -o $@
.PHONY:clean
clean:find ${DIR_OBJ} -name *.o -exec rm -rf {}
编译.链接.运行.看结果
root@5e3abe332c5a:/home/docker/test4harmony/54# make
gcc -g -Wall -I./include -c src/part.c -o obj/part.o
gcc -g -Wall -I./include -c src/main.c -o obj/main.o
gcc ./obj/part.o ./obj/main.o -o bin/weharmony
root@5e3abe332c5a:/home/docker/test4harmony/54# ./bin/weharmony
main 开始 - 全局 g_int = 51, 全局 g_str = hello world.
func_int g_int = 106,tmp = 53.
func_str g_str = harmony os.
main 结束 - 全局 g_int = 106, 全局 g_str = harmony os.
结果很简单,没什么好说的.
开始分析
准备工作完成,开始了真正的分析. 因为命令输出内容太多,本篇做了精简,去除了干扰项.对这些命令还不行清楚的请翻看系列篇其他文章,此处不做介绍,阅读本篇需要一定的基础.
readelf 大S小s ./obj/main.o
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -S ./obj/main.o
There are 22 section headers, starting at offset 0x1498:Section Headers:[Nr] Name Type Address OffsetSize EntSize Flags Link Info Align[ 0] NULL 0000000000000000 000000000000000000000000 0000000000000000 0 0 0[ 1] .text PROGBITS 0000000000000000 00000040000000000000007b 0000000000000000 AX 0 0 1[ 2] .rela.text RELA 0000000000000000 00000c800000000000000108 0000000000000018 I 19 1 8[ 3] .data PROGBITS 0000000000000000 000000bb0000000000000000 0000000000000000 WA 0 0 1[ 4] .bss NOBITS 0000000000000000 000000bb0000000000000000 0000000000000000 WA 0 0 1[ 5] .rodata PROGBITS 0000000000000000 000000c0000000000000007d 0000000000000000 A 0 0 8......
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -s ./obj/main.oSymbol table '.symtab' contains 22 entries:Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS main.c2: 0000000000000000 0 SECTION LOCAL DEFAULT 1... 15: 0000000000000000 123 FUNC GLOBAL DEFAULT 1 main 16: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND g_str 17: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND g_int 18: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_19: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf20: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_int21: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_str
解读
编译 main.c 后 main.o 告诉了链接器以下信息
- 有一个文件 叫 main.c
(Type=FILE)
- 文件中有个函数叫 main
(Type=FUNC)
,并且这是一个全局函数,(Bind = GLOBAL , Vis = DEFAULT
,全局的意思就是可以被外部文件所引用. - 剩下的
g_str
,printf
,func_int
,…,都是需要外部提供,并未在本文件中定义的符号(Ndx = UND , Type = NOTYPE)
,至于怎么顺藤摸瓜找到这些符号那我不管,.o文件是独立存在,它只是告诉你我用了哪些东西,但我也不知道在哪里. printf
和func_int
对它来说一视同仁,都是外部链接符号,没有特殊对待.
readelf 大S小s ./obj/part.o
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -S ./obj/part.o[ 1] .text PROGBITS 0000000000000000 000000400000000000000078 0000000000000000 AX 0 0 1[ 2] .rela.text RELA 0000000000000000 00000cf000000000000000c0 0000000000000018 I 21 1 8[ 3] .data PROGBITS 0000000000000000 000000b80000000000000004 0000000000000000 WA 0 0 4[ 4] .bss NOBITS 0000000000000000 000000bc0000000000000000 0000000000000000 WA 0 0 1[ 5] .rodata PROGBITS 0000000000000000 000000c00000000000000045 0000000000000000 A 0 0 8[ 6] .data.rel.local PROGBITS 0000000000000000 000001080000000000000008 0000000000000000 WA 0 0 8......
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -s ./obj/part.oSymbol table '.symtab' contains 22 entries:Num: Value Size Type Bind Vis Ndx Name0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND1: 0000000000000000 0 FILE LOCAL DEFAULT ABS part.c2: 0000000000000000 0 SECTION LOCAL DEFAULT 1...16: 0000000000000000 4 OBJECT GLOBAL DEFAULT 3 g_int17: 0000000000000000 8 OBJECT GLOBAL DEFAULT 6 g_str18: 0000000000000000 52 FUNC GLOBAL DEFAULT 1 func_int19: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_20: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf21: 0000000000000034 57 FUNC GLOBAL DEFAULT 1 func_str
解读
编译 part.c 后part.o告诉了链接器以下信息
- 有一个文件 叫 part.c
(Type=FILE)
- 文件中有两个函数叫
func_int
,func_str
(Type=FUNC)
,并且都是全局函数,(Bind = GLOBAL , Vis = DEFAULT
,全局的意思就是可以被外部文件所引用. - 文件中有两个对象叫
g_int
,g_str
(Type=OBJECT)
,并且都是全局对象,同样可以被外部使用. - 剩下的
printf
,_GLOBAL_OFFSET_TABLE_
,都是需要外部提供,并未在本文件中定义的符号(Ndx = UND , Type = NOTYPE)
- 另外 part.c的局部变量
tmp
并没有出现在符号表中.因为符号表相当于外交部,只有对外的内容. func_int
,func_str
在1区代码区.text
.g_int
在3区.data
数据区, 打开3区,发现了 0x33 就是源码中 int g_int = 51;的值
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -x 3 ./obj/part.oHex dump of section '.data':0x00000000 33000000 3...
g_str
在6区,.data.rel.local
数据区,打开6区看结果
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -x 6 ./obj/part.oHex dump of section '.data.rel.local':NOTE: This section has relocations against it, but these have NOT been applied to this dump.0x00000000 00000000 00000000 ........
并未发现 char *g_str = “hello world”;的身影,反而抛下一句话 NOTE: This section has relocations against it, but these have NOT been applied to this dump.翻译过来是 注意:此部分已针对它进行重定位,但是尚未将其应用于此转储. 最后在5区 '.rodata’找到了 hello world
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -x 5 ./obj/part.oHex dump of section '.rodata':0x00000000 68656c6c 6f20776f 726c6400 00000000 hello world.....0x00000010 66756e63 5f696e74 20675f69 6e74203d func_int g_int =0x00000020 2025642c 746d7020 3d202564 2e0a0066 %d,tmp = %d...f0x00000030 756e635f 73747220 675f7374 72203d20 unc_str g_str =0x00000040 25732e0a 00 %s..
至于重定向是如何实现的,在系列篇 重定向篇中已有详细说明,不再此展开说.
- 看完两个符号表总结下来就是三句话
- 我是谁,我在哪
- 我能提供什么给别人用
- 我需要别人提供什么给我用.
readelf 大S小s ./bin/weharmony
weharmony
是将 main.o
,part.o
和库文件链接完成后的可执行文件.
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -S ./bin/weharmony
There are 36 section headers, starting at offset 0x4908:Section Headers:[Nr] Name Type Address OffsetSize EntSize Flags Link Info Align......[16] .text PROGBITS 0000000000001060 000010600000000000000255 0000000000000000 AX 0 0 16[17] .fini PROGBITS 00000000000012b8 000012b8000000000000000d 0000000000000000 AX 0 0 4[18] .rodata PROGBITS 0000000000002000 0000200000000000000000cd 0000000000000000 A 0 0 8 ......[25] .data PROGBITS 0000000000004000 000030000000000000000020 0000000000000000 WA 0 0 8[26] .bss NOBITS 0000000000004020 000030200000000000000008 0000000000000000 WA 0 0 1
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -s ./bin/weharmony Symbol table '.dynsym' contains 7 entries:Num: Value Size Type Bind Vis Ndx Name0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND1: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab 2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2)3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)4: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__5: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable 6: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@GLIBC_2.2.5 (2) Symbol table '.symtab' contains 75 entries:Num: Value Size Type Bind Vis Ndx Name0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND1: 0000000000000318 0 SECTION LOCAL DEFAULT 12: 0000000000000338 0 SECTION LOCAL DEFAULT 23: 0000000000000358 0 SECTION LOCAL DEFAULT 3....33: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c34: 0000000000001090 0 FUNC LOCAL DEFAULT 16 deregister_tm_clones35: 00000000000010c0 0 FUNC LOCAL DEFAULT 16 register_tm_clones36: 0000000000001100 0 FUNC LOCAL DEFAULT 16 __do_global_dtors_aux37: 0000000000004020 1 OBJECT LOCAL DEFAULT 26 completed.806038: 0000000000003dc0 0 OBJECT LOCAL DEFAULT 22 __do_global_dtors_aux_fin39: 0000000000001140 0 FUNC LOCAL DEFAULT 16 frame_dummy40: 0000000000003db8 0 OBJECT LOCAL DEFAULT 21 __frame_dummy_init_array_41: 0000000000000000 0 FILE LOCAL DEFAULT ABS part.c42: 0000000000000000 0 FILE LOCAL DEFAULT ABS main.c43: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c44: 000000000000225c 0 OBJECT LOCAL DEFAULT 20 __FRAME_END__45: 0000000000000000 0 FILE LOCAL DEFAULT ABS46: 0000000000003dc0 0 NOTYPE LOCAL DEFAULT 21 __init_array_end47: 0000000000003dc8 0 OBJECT LOCAL DEFAULT 23 _DYNAMIC48: 0000000000003db8 0 NOTYPE LOCAL DEFAULT 21 __init_array_start49: 00000000000020c0 0 NOTYPE LOCAL DEFAULT 19 __GNU_EH_FRAME_HDR50: 0000000000003fb8 0 OBJECT LOCAL DEFAULT 24 _GLOBAL_OFFSET_TABLE_51: 0000000000001000 0 FUNC LOCAL DEFAULT 12 _init52: 00000000000012b0 5 FUNC GLOBAL DEFAULT 16 __libc_csu_fini53: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab54: 0000000000004000 0 NOTYPE WEAK DEFAULT 25 data_start55: 0000000000004020 0 NOTYPE GLOBAL DEFAULT 25 _edata56: 00000000000012b8 0 FUNC GLOBAL HIDDEN 17 _fini57: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@@GLIBC_2.2.558: 0000000000004010 4 OBJECT GLOBAL DEFAULT 25 g_int59: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@@GLIBC_60: 0000000000004000 0 NOTYPE GLOBAL DEFAULT 25 __data_start61: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__62: 0000000000004008 0 OBJECT GLOBAL HIDDEN 25 __dso_handle63: 0000000000004018 8 OBJECT GLOBAL DEFAULT 25 g_str64: 0000000000002000 4 OBJECT GLOBAL DEFAULT 18 _IO_stdin_used65: 0000000000001240 101 FUNC GLOBAL DEFAULT 16 __libc_csu_init66: 0000000000001149 52 FUNC GLOBAL DEFAULT 16 func_int67: 0000000000004028 0 NOTYPE GLOBAL DEFAULT 26 _end68: 0000000000001060 47 FUNC GLOBAL DEFAULT 16 _start69: 000000000000117d 57 FUNC GLOBAL DEFAULT 16 func_str70: 0000000000004020 0 NOTYPE GLOBAL DEFAULT 26 __bss_start71: 00000000000011b6 123 FUNC GLOBAL DEFAULT 16 main72: 0000000000004020 0 OBJECT GLOBAL HIDDEN 25 __TMC_END__73: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable74: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@@GLIBC_2.2
解读
链接后的可执行文件 weharmony
将告诉加载器以下信息
- 涉及文件有哪些
Type = FILE
- 涉及函数有哪些
Type = FUNC
func_str,func_int,_start,main - 涉及对象有哪些
Type = OBJECT
g_int,g_str,…它将这些数据统一归到了25区.
前往25区查看下数据,同样只发现了 int g_int = 51; 的数据.
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -x 25 ./bin/weharmony Hex dump of section '.data':0x00004000 00000000 00000000 08400000 00000000 .........@......0x00004010 33000000 00000000 08200000 00000000 3........ ......
是不是和part.o一样也被放在了.rodata
区,再反查 18区,果然发了 main.c和part.c的数据都放在了这里.
root@5e3abe332c5a:/home/docker/test4harmony/54# readelf -x 18 ./bin/weharmonyHex dump of section '.rodata':0x00002000 01000200 00000000 68656c6c 6f20776f ........hello wo0x00002010 726c6400 00000000 66756e63 5f696e74 rld.....func_int0x00002020 20675f69 6e74203d 2025642c 746d7020 g_int = %d,tmp 0x00002030 3d202564 2e0a0066 756e635f 73747220 = %d...func_str 0x00002040 675f7374 72203d20 25732e0a 00000000 g_str = %s......0x00002050 6861726d 6f6e7920 6f730000 00000000 harmony os......0x00002060 6d61696e 20e5bc80 e5a78b20 2d20e585 main ...... - ..0x00002070 a8e5b180 20675f69 6e74203d 2025642c .... g_int = %d,0x00002080 20e585a8 e5b18020 675f7374 72203d20 ...... g_str = 0x00002090 25732e0a 00000000 6d61696e 20e7bb93 %s......main ...0x000020a0 e69d9f20 2d20e585 a8e5b180 20675f69 ... - ...... g_i0x000020b0 6e74203d 2025642c 20e585a8 e5b18020 nt = %d, ...... 0x000020c0 675f7374 72203d20 25732e0a 00 g_str = %s...
- 另外还有注意
printf
的变化,从Type = NOTYPE
变成了Type = FUNC
,告诉了后续的动态链接这是个函数
57: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@@GLIBC_2.2.5
但是内容依然是Ndx=UND
,weharmony也提供不了,内容需要运行时环境提供.并在需要动态链接表中也已经注明了内容清单,运行环境必须提供以下内容才能真正跑起来weharmony.
Symbol table '.dynsym' contains 7 entries:Num: Value Size Type Bind Vis Ndx Name0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND1: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab 2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2)3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)4: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__5: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable 6: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@GLIBC_2.2.5 (2)