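The crash tool is used frequently when debugging the kernel; it helps with problems such as deadlocks and system hangs. An earlier article, 《RK平台上使用crash进行live debug》 (using crash for live debugging on the RK platform), covered installing crash and its basic usage. This article builds on that one: it reads the tasks field of task_struct and uses it to obtain the task_struct of every process in the system.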
The ps command shows the processes currently in the system. The first entries are:
    PID   PPID CPU        TASK        ST  %MEM     VSZ    RSS  COMM
>     0      0   0  ffffffc00a6d23c0  RU   0.0       0      0  [swapper/0]
>     0      0   1  ffffff81f0856580  RU   0.0       0      0  [swapper/1]
>     0      0   2  ffffff81f0898000  RU   0.0       0      0  [swapper/2]
      0      0   3  ffffff81f0898e80  RU   0.0       0      0  [swapper/3]
      0      0   4  ffffff81f0899d00  RU   0.0       0      0  [swapper/4]
>     0      0   5  ffffff81f089ab80  RU   0.0       0      0  [swapper/5]
>     0      0   6  ffffff81f089ba00  RU   0.0       0      0  [swapper/6]
>     0      0   7  ffffff81f089c880  RU   0.0       0      0  [swapper/7]
      1      0   2  ffffff81f0808000  IN   0.1  245056   6372  systemd
      2      0   6  ffffff81f0808e80  IN   0.0       0      0  [kthreadd]
      3      2   0  ffffff81f0809d00  ID   0.0       0      0  [rcu_gp]
      4      2   0  ffffff81f080ab80  ID   0.0       0      0  [rcu_par_gp]
      8      2   0  ffffff81f080e580  ID   0.0       0      0  [mm_percpu_wq]
      9      2   0  ffffff81f0850000  IN   0.0       0      0  [rcu_tasks_rude_]
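The > marker flags the task that is currently active on each CPU. CPUs 0, 1, 2, 5, 6 and 7 are running their swapper (idle) task, so they are idle; only CPUs 3 and 4 are running something else.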
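The way the > marker is read can be sketched in a few lines of Python. This is only an illustration: it assumes the swapper rows of the ps output have been copied by hand into a string, and the names PS_ROWS and idle_cpus are made up for this example. It treats a CPU as idle when its swapper task carries the marker.

```python
# Rows copied from the ps output above; '>' marks the task active on a CPU.
PS_ROWS = """\
> 0 0 0 ffffffc00a6d23c0 RU 0.0 0 0 [swapper/0]
> 0 0 1 ffffff81f0856580 RU 0.0 0 0 [swapper/1]
> 0 0 2 ffffff81f0898000 RU 0.0 0 0 [swapper/2]
  0 0 3 ffffff81f0898e80 RU 0.0 0 0 [swapper/3]
  0 0 4 ffffff81f0899d00 RU 0.0 0 0 [swapper/4]
> 0 0 5 ffffff81f089ab80 RU 0.0 0 0 [swapper/5]
> 0 0 6 ffffff81f089ba00 RU 0.0 0 0 [swapper/6]
> 0 0 7 ffffff81f089c880 RU 0.0 0 0 [swapper/7]
"""


def idle_cpus(ps_text):
    """Return the CPUs whose active ('>') task is the idle task swapper/N."""
    idle = []
    for line in ps_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != ">":
            continue  # this task is not the one active on its CPU
        # After the marker the columns are PID, PPID, CPU, TASK, ...
        cpu, comm = int(fields[3]), fields[-1]
        if comm.startswith("[swapper/"):
            idle.append(cpu)
    return idle


print(idle_cpus(PS_ROWS))
```

CPUs 3 and 4 are missing from the result because their swapper rows have no marker, which means some other task is active there.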
The TASK column gives the address of each struct task_struct, which is the starting point for the crash exercises below.
struct task_struct init_task
start_kernel
  sched_init
    init_idle
      sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
#define INIT_TASK_COMM "swapper"
In this structure we care about the tasks list, so we print pid, comm and tasks. Taking swapper/0 as an example:
crash> struct task_struct.pid,comm,tasks ffffffc00a6d23c0
  pid = 0
  comm = "swapper/0\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f0808438,
    prev = 0xffffff80b2a82fb8
  }
As the init_idle code above shows, every CPU has its own idle task. The same fields for the idle tasks of CPUs 1-7 are:
crash> struct task_struct.pid,comm,tasks ffffff81f0856580
  pid = 0
  comm = "swapper/1\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
crash> struct task_struct.pid,comm,tasks ffffff81f0898000
  pid = 0
  comm = "swapper/2\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
crash> struct task_struct.pid,comm,tasks ffffff81f0898e80
  pid = 0
  comm = "swapper/3\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
crash> struct task_struct.pid,comm,tasks ffffff81f0899d00
  pid = 0
  comm = "swapper/4\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
crash> struct task_struct.pid,comm,tasks ffffff81f089ab80
  pid = 0
  comm = "swapper/5\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
crash> struct task_struct.pid,comm,tasks ffffff81f089ba00
  pid = 0
  comm = "swapper/6\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
crash> struct task_struct.pid,comm,tasks ffffff81f089c880
  pid = 0
  comm = "swapper/7\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
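The comm values in these dumps can be reproduced with a small Python sketch. It mirrors the sprintf call in init_idle and models comm as a fixed array of TASK_COMM_LEN (16) bytes padded with NUL, which is what the runs of \000 in the dumps correspond to. The helper names idle_comm and comm_to_str are made up for illustration, and the NUL padding is a simplification that matches what the dumps show rather than a statement about how the kernel fills the array.

```python
TASK_COMM_LEN = 16  # size of task_struct.comm in the kernel


def idle_comm(cpu):
    """Mimic sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu)."""
    name = "%s/%d" % ("swapper", cpu)
    # Model the fixed-size array by padding with NUL bytes up to 16 bytes.
    return name.encode().ljust(TASK_COMM_LEN, b"\0")


def comm_to_str(raw):
    """Return the name up to the first NUL, the way a C string is read."""
    return raw.split(b"\0", 1)[0].decode()


raw = idle_comm(1)
print(len(raw), raw)
print(comm_to_str(raw))
```

Reading only up to the first NUL also explains why leftover bytes after the terminator, if any appear in a dump, are not part of the process name.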
All tasks are chained together through tasks, so we first find the offset of tasks within task_struct:
crash> struct task_struct.tasks -o -x
struct task_struct {
  [0x438] struct list_head tasks;
}
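So tasks sits 0x438 (1080) bytes into task_struct. Next we follow the tasks list, starting again from swapper/0. This is a live system, so prev, which points at the tail of the list, can differ from the earlier dump: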
crash> struct task_struct.pid,comm,tasks ffffffc00a6d23c0
  pid = 0
  comm = "swapper/0\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f0808438,
    prev = 0xffffff81f09e8438
  }
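The next pointer is 0xffffff81f0808438. It points at the tasks member embedded in another process's task_struct, not at the start of that structure, so subtracting the offset of tasks gives the task_struct address: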
>>> hex(0xffffff81f0808438-0x438)
'0xffffff81f0808000'
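This subtraction is the same operation the kernel performs with its container_of macro: given a pointer to an embedded member, step back by the member's offset to reach the enclosing structure. A minimal Python analogue, with the hypothetical helper name task_from_tasks and the offset taken from the crash output above:

```python
TASKS_OFFSET = 0x438  # offset of task_struct.tasks reported by crash


def task_from_tasks(list_head_addr, offset=TASKS_OFFSET):
    """Python analogue of container_of(ptr, struct task_struct, tasks)."""
    return list_head_addr - offset


print(hex(task_from_tasks(0xffffff81f0808438)))
```

The offset depends on the kernel version and configuration, so it should be read from crash each time and not hard-coded across systems.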
That is the address of the next task_struct, which we print:
crash> struct task_struct.pid,comm,tasks 0xffffff81f0808000
  pid = 1
  comm = "systemd\000\000\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
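The <init_task+1080> annotation on prev gives a quick way to cross-check the offset: 1080 is 0x438 in decimal, and adding it to the TASK address of swapper/0 from ps should give exactly the prev value of systemd. A short sketch of that check, with constant and function names chosen only for this example:

```python
TASKS_OFFSET = 0x438
INIT_TASK = 0xffffffc00a6d23c0     # TASK address of swapper/0 from ps
SYSTEMD_PREV = 0xffffffc00a6d27f8  # tasks.prev of systemd (pid 1)


def tasks_addr(task_struct_addr, offset=TASKS_OFFSET):
    """Address of the embedded tasks list_head inside a task_struct."""
    return task_struct_addr + offset


print(TASKS_OFFSET)                # decimal form of the offset
print(hex(tasks_addr(INIT_TASK)))  # address of init_task.tasks
print(tasks_addr(INIT_TASK) == SYSTEMD_PREV)
```

So the task before systemd on the list is swapper/0, that is, init_task itself, which serves as the head of the list.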
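Likewise, for the idle tasks other than swapper/0 we can look up the process that next points to. Their tasks fields hold the same values as systemd's, so next leads to kthreadd: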
crash> struct task_struct.pid,comm,tasks ffffff81f0856580
  pid = 0
  comm = "swapper/1\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
Compute the address:
>>> hex(0xffffff81f08092b8-0x438)
'0xffffff81f0808e80'
Print the task_struct:
crash> struct task_struct.pid,comm,tasks 0xffffff81f0808e80
  pid = 2
  comm = "kthreadd\000\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f080a138,
    prev = 0xffffff81f0808438
  }
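The systemd and kthreadd dumps together show the doubly linked property of the list: systemd's next is the address of kthreadd's tasks member, and kthreadd's prev is the address of systemd's. A minimal check using the values from those two dumps (the tuple layout and the name linked are assumptions made for this sketch):

```python
TASKS_OFFSET = 0x438

# (task_struct address, tasks.next, tasks.prev) copied from the two dumps
systemd = (0xffffff81f0808000, 0xffffff81f08092b8, 0xffffffc00a6d27f8)
kthreadd = (0xffffff81f0808e80, 0xffffff81f080a138, 0xffffff81f0808438)


def linked(a, b):
    """True if b directly follows a on the tasks list.

    That requires a.next to be the address of b's tasks member
    and b.prev to be the address of a's tasks member.
    """
    a_task, a_next, _ = a
    b_task, _, b_prev = b
    return (a_next == b_task + TASKS_OFFSET
            and b_prev == a_task + TASKS_OFFSET)


print(linked(systemd, kthreadd))
print(linked(kthreadd, systemd))
```

On a live system this kind of check is only meaningful for dumps taken close together, since the list changes as tasks are created and exit.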
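The steps above follow tasks by hand to reach the next task_struct. The list command automates this. Starting from the next pointer of swapper/0, it walks the list and prints every member (only the first entries are shown here):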
crash> list -h 0xffffff81f0808438
ffffff81f0808438
ffffff81f08092b8
ffffff81f080a138
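Each address is that of a tasks member, so subtracting 0x438 gives every task_struct. A few lines of Python generate the crash commands: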
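# list_head addresses copied from the list output above
addresses = ["ffffff81f0808438", "ffffff81f08092b8", "ffffff81f080a138"]
for addr in addresses:
    print("struct task_struct.pid,comm,tasks", hex(int(addr, 16) - 0x438))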
The first five results are pasted here:
crash> struct task_struct.pid,comm,tasks 0xffffff81f0808000
  pid = 1
  comm = "systemd\000\000\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f08092b8,
    prev = 0xffffffc00a6d27f8 <init_task+1080>
  }
crash> struct task_struct.pid,comm,tasks 0xffffff81f0808e80
  pid = 2
  comm = "kthreadd\000\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f080a138,
    prev = 0xffffff81f0808438
  }
crash> struct task_struct.pid,comm,tasks 0xffffff81f0809d00
  pid = 3
  comm = "rcu_gp\000d\000\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f080afb8,
    prev = 0xffffff81f08092b8
  }
crash> struct task_struct.pid,comm,tasks 0xffffff81f080ab80
  pid = 4
  comm = "rcu_par_gp\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f080e9b8,
    prev = 0xffffff81f080a138
  }
crash> struct task_struct.pid,comm,tasks 0xffffff81f080e580
  pid = 8
  comm = "mm_percpu_wq\000\000\000"
  tasks = {
    next = 0xffffff81f0850438,
    prev = 0xffffff81f080afb8
  }
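The next pointers in these dumps chain together, and following them is essentially what a list walk does. The sketch below replays that walk in Python over a hand-built snapshot: the NEXT dictionary holds only the next values that appear in the dumps in this article, and walk is a hypothetical helper, not part of crash. Because the snapshot is partial, the walk stops when it reaches an address it has no data for; on the real circular list it would stop on returning to the head.

```python
TASKS_OFFSET = 0x438

# tasks.next values copied from the dumps, keyed by the address of the
# list_head they live in (task_struct address + TASKS_OFFSET).
NEXT = {
    0xffffffc00a6d27f8: 0xffffff81f0808438,  # swapper/0 (init_task)
    0xffffff81f0808438: 0xffffff81f08092b8,  # systemd
    0xffffff81f08092b8: 0xffffff81f080a138,  # kthreadd
    0xffffff81f080a138: 0xffffff81f080afb8,  # rcu_gp
    0xffffff81f080afb8: 0xffffff81f080e9b8,  # rcu_par_gp
    0xffffff81f080e9b8: 0xffffff81f0850438,  # mm_percpu_wq
}


def walk(head, next_of):
    """Yield task_struct addresses by following tasks.next from head.

    Stops when the list wraps back to head or leaves the snapshot.
    """
    node = next_of[head]
    while node != head:
        yield node - TASKS_OFFSET
        if node not in next_of:
            break  # no data captured for this node
        node = next_of[node]


for task in walk(0xffffffc00a6d27f8, NEXT):
    print("struct task_struct.pid,comm,tasks", hex(task))
```

The addresses it prints can be compared against the TASK column of the ps output for PIDs 1, 2, 3, 4, 8 and 9.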
For swapper/1-7, three entries are shown as an example:
crash> struct task_struct.pid,comm,tasks 0xffffff81f0809d00
  pid = 3
  comm = "rcu_gp\000d\000\000\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f080afb8,
    prev = 0xffffff81f08092b8
  }
crash> struct task_struct.pid,comm,tasks 0xffffff81f080ab80
  pid = 4
  comm = "rcu_par_gp\000\000\000\000\000"
  tasks = {
    next = 0xffffff81f080e9b8,
    prev = 0xffffff81f080a138
  }
crash> struct task_struct.pid,comm,tasks 0xffffff81f080e580
  pid = 8
  comm = "mm_percpu_wq\000\000\000"
  tasks = {
    next = 0xffffff81f0850438,
    prev = 0xffffff81f080afb8
  }
The pid, comm and task_struct addresses here match the ps output at the top exactly.
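That concludes a simple experiment with crash: walking the tasks list of task_struct to find every pid and comm. crash makes it easy to inspect kernel data structures on a live system, which helps both in learning the kernel and in tracking down kernel problems.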