2025-09-18
Knowledge notes

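Contents

Configuring the kernel
Testing and verification
Reading the logs
Out-of-bounds detection
Use-after-free
RCU-style UAF
Workqueue-style UAF
Summary
References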

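I gave a brief introduction to kasan in "Debugging Memory Leaks in the Kernel", but at that time my understanding of how kasan is implemented was still shallow. I now plan to study kasan thoroughly, and I am writing this series of articles to consolidate what I learn.

Before looking at kasan it is necessary to understand how asan works: asan detects memory errors by poisoning, that is, by marking memory that must not be accessed in a shadow region. I covered the principles and practical use of asan in the article "Debugging Memory Problems with ASAN".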

Configuring the kernel

The main kernel options related to kasan are the following:

# zcat /proc/config.gz | grep KASAN
CONFIG_KASAN_SHADOW_OFFSET=0xdfffffd000000000
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y
CONFIG_KASAN_STACK=y
CONFIG_KASAN_VMALLOC=y
CONFIG_KASAN_MODULE_TEST=m

A brief explanation of each option:

  1. CONFIG_KASAN_SHADOW_OFFSET
    The offset of the shadow memory region. Shadow memory is generated at a ratio of 8 to 1 (one shadow byte per 8 bytes of memory), and the shadow address is computed the same way as in asan:
Shadow = (Mem >> 3) + offset;
  2. CONFIG_HAVE_ARCH_KASAN
    Indicates that the platform supports kasan. On arm64 it is selected as follows:
select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)

In other words, on arm64 KASAN is unsupported only when 16K pages are combined with 48-bit virtual addresses; every other configuration supports it.

  3. CONFIG_HAVE_ARCH_KASAN_SW_TAGS
    Only marks whether the current architecture (arm64) supports software tag-based kasan
  4. CONFIG_HAVE_ARCH_KASAN_VMALLOC
    Only marks whether the current architecture (arm64) supports kasan checks on the vmalloc region
  5. CONFIG_CC_HAS_KASAN_GENERIC
    Set when the compiler accepts the option -fsanitize=kernel-address. The gcc documentation for the option and the Kconfig entry are:
https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Instrumentation-Options.html#Instrumentation-Options
config CC_HAS_KASAN_GENERIC
    def_bool $(cc-option, -fsanitize=kernel-address)
  6. CONFIG_KASAN
    The main kernel option for kasan
  7. CONFIG_KASAN_GENERIC
    Selects the generic mode, which works on the same principle as user-space asan. The software tag-based and hardware tag-based modes can be selected instead. The Kconfig entries are:
config KASAN_GENERIC
    bool "Generic mode"
    depends on HAVE_ARCH_KASAN && CC_HAS_KASAN_GENERIC
    depends on CC_HAS_WORKING_NOSANITIZE_ADDRESS
    select SLUB_DEBUG if SLUB
    select CONSTRUCTORS
    help
      Enables generic KASAN mode.

      This mode is supported in both GCC and Clang. With GCC it requires
      version 8.3.0 or later. Any supported Clang version is compatible,
      but detection of out-of-bounds accesses for global variables is
      supported only since Clang 11.

      This mode consumes about 1/8th of available memory at kernel start
      and introduces an overhead of ~x1.5 for the rest of the allocations.
      The performance slowdown is ~x3.

      Currently CONFIG_KASAN_GENERIC doesn't work with CONFIG_DEBUG_SLAB
      (the resulting kernel does not boot).

config KASAN_SW_TAGS
    bool "Software tag-based mode"
    depends on HAVE_ARCH_KASAN_SW_TAGS && CC_HAS_KASAN_SW_TAGS
    depends on CC_HAS_WORKING_NOSANITIZE_ADDRESS
    select SLUB_DEBUG if SLUB
    select CONSTRUCTORS
    help
      Enables software tag-based KASAN mode.

      This mode require software memory tagging support in the form of
      HWASan-like compiler instrumentation.

      Currently this mode is only implemented for arm64 CPUs and relies on
      Top Byte Ignore. This mode requires Clang.

      This mode consumes about 1/16th of available memory at kernel start
      and introduces an overhead of ~20% for the rest of the allocations.
      This mode may potentially introduce problems relating to pointer
      casting and comparison, as it embeds tags into the top byte of each
      pointer.

      Currently CONFIG_KASAN_SW_TAGS doesn't work with CONFIG_DEBUG_SLAB
      (the resulting kernel does not boot).

config KASAN_HW_TAGS
    bool "Hardware tag-based mode"
    depends on HAVE_ARCH_KASAN_HW_TAGS
    depends on SLUB
    help
      Enables hardware tag-based KASAN mode.

      This mode requires hardware memory tagging support, and can be used
      by any architecture that provides it.

      Currently this mode is only implemented for arm64 CPUs starting from
      ARMv8.5 and relies on Memory Tagging Extension and Top Byte Ignore.

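To summarize:

  • Generic mode matches user-space asan and sets up shadow memory at a 1/8 ratio
  • Software tag-based mode is implemented on top of Top Byte Ignore (TBI) on arm64
  • Hardware tag-based mode is implemented on top of the Memory Tagging Extension (MTE), available from ARMv8.5, together with TBI

My current environment supports neither of the last two modes, so I use generic kasan for now.

  8. CONFIG_KASAN_INLINE
    The kernel used here is built in INLINE mode (CONFIG_KASAN_INLINE=y in the config above); OUTLINE mode can be selected instead. The two modes differ in how the compiler instruments the code, which will be explained in detail when the code is analyzed later.

  • INLINE mode inserts the shadow memory check directly at each memory access
  • OUTLINE mode inserts a function call before each memory access, and the called function performs the check

As a result, INLINE mode makes the kernel text segment considerably larger but runs faster, while OUTLINE mode keeps the binary smaller at the cost of speed. INLINE is the usual choice when the image size is acceptable.

  9. CONFIG_KASAN_STACK
    Enables detection of stack buffer overflows

  10. CONFIG_KASAN_VMALLOC
    Enables shadow memory for the vmalloc region. vmalloc maps physically scattered pages into one virtual address range, and that range is usually very large. In the example from the kernel documentation it is about 93TB:
ffffa00010000000 fffffdffbffeffff ~93TB vmalloc

At the 1/8 ratio, shadowing this whole range would take about 11TB of shadow space, so a separate kernel option controls whether memory problems in vmalloc memory are checked.

  11. CONFIG_KASAN_MODULE_TEST
    The sample module that the kernel provides for testing kasan.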
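
The difference between the two instrumentation modes of item 8 can be sketched in user-space C. This is only a toy model under stated assumptions, not kernel code: toy_shadow, toy_report, toy_check_load1, load_outline and load_inline are names made up for this illustration. In the real kernel the outline check is a compiler-emitted call such as __asan_load1(), and in inline mode only the failure path is a call, such as the __asan_report_load8_noabort() frame that shows up in the workqueue log later in this article.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model: 64 bytes of memory described by 8 shadow bytes (1 per 8 bytes). */
#define TOY_MEM_SIZE 64
static int8_t toy_shadow[TOY_MEM_SIZE / 8];
static int toy_reports;                 /* number of bad accesses reported */

/* Hypothetical stand-in for the kasan report path. */
static void toy_report(size_t off)
{
    (void)off;
    toy_reports++;
}

/*
 * OUTLINE style: the compiler emits a call before the access and the
 * called function performs the shadow check.
 */
static void toy_check_load1(size_t off)
{
    int8_t s = toy_shadow[off >> 3];

    /* 0: whole granule valid; 1..7: first s bytes valid; negative: poisoned */
    if (s != 0 && (int8_t)(off & 7) >= s)
        toy_report(off);
}

static uint8_t load_outline(const uint8_t *mem, size_t off)
{
    toy_check_load1(off);               /* one call per memory access */
    return mem[off];
}

/*
 * INLINE style: the same shadow test is expanded at the access site,
 * so only the (rare) report is a function call.
 */
static uint8_t load_inline(const uint8_t *mem, size_t off)
{
    int8_t s = toy_shadow[off >> 3];

    if (s != 0 && (int8_t)(off & 7) >= s)
        toy_report(off);
    return mem[off];
}
```

Both helpers detect the same accesses. The trade-off is that the inline variant repeats the test at every access site (larger text, no call overhead), while the outline variant shares one copy of the test behind a call.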

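Testing and verification

Testing is simple: running make modules produces test_kasan_module.ko, which can then be loaded directly with insmod. The resulting log is as follows: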

[ 28.681187] kasan test: copy_user_test out-of-bounds in copy_from_user() [ 28.681203] ================================================================== [ 28.681212] BUG: KASAN: slab-out-of-bounds in copy_user_test+0xc0/0x340 [test_kasan_module] [ 28.681217] Write of size 11 at addr ffffff80833f8500 by task insmod/1953 insmod: ERROR: could not insert module test_kasan_module.ko: Resource temporarily unavailable [ 28.681221] root@kylin:~# [ 28.681227] CPU: 4 PID: 1953 Comm: insmod Tainted: G B 5.10.198 #92 [ 28.681231] Hardware name: Firefly ROC-RK3588S-PC V13 MIPI(Linux) (DT) [ 28.681235] Call trace: [ 28.681243] dump_backtrace+0x0/0x3bc [ 28.681248] show_stack+0x1c/0x24 [ 28.681254] dump_stack_lvl+0x130/0x168 [ 28.681260] print_address_description.constprop.0+0x74/0x2b8 [ 28.681265] kasan_report+0x1e8/0x200 [ 28.681270] kasan_check_range+0xf4/0x1a0 [ 28.681273] __kasan_check_write+0x30/0x50 [ 28.681279] copy_user_test+0xc0/0x340 [test_kasan_module] [ 28.681284] test_kasan_module_init+0x18/0xa78 [test_kasan_module] [ 28.681289] do_one_initcall+0xb0/0x4e0 [ 28.681294] do_init_module+0x14c/0x600 [ 28.681298] load_module+0x5714/0x71fc [ 28.681303] __do_sys_finit_module+0x110/0x1a0 [ 28.681307] __arm64_sys_finit_module+0x70/0xa0 [ 28.681312] el0_svc_common.constprop.0+0xf0/0x464 [ 28.681317] do_el0_svc+0x44/0x5c [ 28.681321] el0_svc+0x1c/0x30 [ 28.681325] el0_sync_handler+0xa8/0xac [ 28.681328] el0_sync+0x158/0x180 [ 28.681331] [ 28.681335] Allocated by task 1953: [ 28.681340] kasan_save_stack+0x24/0x50 [ 28.681343] __kasan_kmalloc+0x88/0xb0 [ 28.681347] kmem_cache_alloc_trace+0x1d0/0x3c0 [ 28.681352] copy_user_test+0x48/0x340 [test_kasan_module] [ 28.681357] test_kasan_module_init+0x18/0xa78 [test_kasan_module] [ 28.681361] do_one_initcall+0xb0/0x4e0 [ 28.681365] do_init_module+0x14c/0x600 [ 28.681369] load_module+0x5714/0x71fc [ 28.681373] __do_sys_finit_module+0x110/0x1a0 [ 28.681377] __arm64_sys_finit_module+0x70/0xa0 [ 28.681381] 
el0_svc_common.constprop.0+0xf0/0x464 [ 28.681385] do_el0_svc+0x44/0x5c [ 28.681388] el0_svc+0x1c/0x30 [ 28.681392] el0_sync_handler+0xa8/0xac [ 28.681396] el0_sync+0x158/0x180 [ 28.681398] [ 28.681402] The buggy address belongs to the object at ffffff80833f8500 [ 28.681402] which belongs to the cache kmalloc-128 of size 128 [ 28.681407] The buggy address is located 0 bytes inside of [ 28.681407] 128-byte region [ffffff80833f8500, ffffff80833f8580) [ 28.681411] The buggy address belongs to the page: [ 28.681417] page:00000000064eb9ca refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x833f8 [ 28.681421] head:00000000064eb9ca order:1 compound_mapcount:0 [ 28.681426] flags: 0x10200(slab|head) [ 28.681431] raw: 0000000000010200 dead000000000100 dead000000000122 ffffff8007003c80 [ 28.681436] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 [ 28.681439] page dumped because: kasan: bad access detected [ 28.681442] [ 28.681445] Memory state around the buggy address: [ 28.681449] ffffff80833f8400: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681452] ffffff80833f8480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681456] >ffffff80833f8500: 00 02 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681459] ^ [ 28.681463] ffffff80833f8580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681466] ffffff80833f8600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681470] ================================================================== ...... 
[ 28.683111] kasan test: kasan_rcu_uaf use-after-free in kasan_rcu_reclaim [ 28.683416] kasan test: kasan_workqueue_uaf use-after-free on workqueue [ 28.683433] ================================================================== [ 28.683454] BUG: KASAN: use-after-free in kasan_workqueue_uaf+0x140/0x158 [test_kasan_module] [ 28.683473] Read of size 8 at addr ffffff80833f8800 by task insmod/1953 [ 28.683484] [ 28.683501] CPU: 4 PID: 1953 Comm: insmod Tainted: G B 5.10.198 #92 [ 28.683504] Hardware name: Firefly ROC-RK3588S-PC V13 MIPI(Linux) (DT) [ 28.683508] Call trace: [ 28.683513] dump_backtrace+0x0/0x3bc [ 28.683532] show_stack+0x1c/0x24 [ 28.683552] dump_stack_lvl+0x130/0x168 [ 28.683566] print_address_description.constprop.0+0x74/0x2b8 [ 28.683579] kasan_report+0x1e8/0x200 [ 28.683592] __asan_report_load8_noabort+0x30/0x5c [ 28.683608] kasan_workqueue_uaf+0x140/0x158 [test_kasan_module] [ 28.683633] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 28.683646] do_one_initcall+0xb0/0x4e0 [ 28.683661] do_init_module+0x14c/0x600 [ 28.683674] load_module+0x5714/0x71fc [ 28.683689] __do_sys_finit_module+0x110/0x1a0 [ 28.683710] __arm64_sys_finit_module+0x70/0xa0 [ 28.683724] el0_svc_common.constprop.0+0xf0/0x464 [ 28.683737] do_el0_svc+0x44/0x5c [ 28.683750] el0_svc+0x1c/0x30 [ 28.683763] el0_sync_handler+0xa8/0xac [ 28.683776] el0_sync+0x158/0x180 [ 28.683781] [ 28.683789] Allocated by task 1953: [ 28.683793] kasan_save_stack+0x24/0x50 [ 28.683797] __kasan_kmalloc+0x88/0xb0 [ 28.683801] kmem_cache_alloc_trace+0x1d0/0x3c0 [ 28.683805] kasan_workqueue_uaf+0x80/0x158 [test_kasan_module] [ 28.683810] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 28.683814] do_one_initcall+0xb0/0x4e0 [ 28.683818] do_init_module+0x14c/0x600 [ 28.683822] load_module+0x5714/0x71fc [ 28.683825] __do_sys_finit_module+0x110/0x1a0 [ 28.683830] __arm64_sys_finit_module+0x70/0xa0 [ 28.683834] el0_svc_common.constprop.0+0xf0/0x464 [ 28.683838] do_el0_svc+0x44/0x5c [ 28.683841] 
el0_svc+0x1c/0x30 [ 28.683845] el0_sync_handler+0xa8/0xac [ 28.683848] el0_sync+0x158/0x180 [ 28.683851] [ 28.683855] Freed by task 676: [ 28.683859] kasan_save_stack+0x24/0x50 [ 28.683868] kasan_set_track+0x24/0x34 [ 28.683872] kasan_set_free_info+0x24/0x44 [ 28.683876] __kasan_slab_free+0xd8/0x134 [ 28.683879] kfree+0xe0/0x500 [ 28.683884] kasan_workqueue_work+0xc/0x14 [test_kasan_module] [ 28.683889] process_one_work+0x624/0x1240 [ 28.683893] worker_thread+0x3b8/0xe90 [ 28.683897] kthread+0x2c0/0x344 [ 28.683901] ret_from_fork+0x10/0x18 [ 28.683904] [ 28.683907] Last potentially related work creation: [ 28.683910] kasan_save_stack+0x24/0x50 [ 28.683914] kasan_record_aux_stack+0xbc/0xd0 [ 28.683919] insert_work+0x54/0x2e0 [ 28.683923] __queue_work+0x3a8/0xca0 [ 28.683926] queue_work_on+0x9c/0xd0 [ 28.683931] kasan_workqueue_uaf+0x114/0x158 [test_kasan_module] [ 28.683936] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 28.683939] do_one_initcall+0xb0/0x4e0 [ 28.683944] do_init_module+0x14c/0x600 [ 28.683949] load_module+0x5714/0x71fc [ 28.683953] __do_sys_finit_module+0x110/0x1a0 [ 28.683962] __arm64_sys_finit_module+0x70/0xa0 [ 28.683967] el0_svc_common.constprop.0+0xf0/0x464 [ 28.683970] do_el0_svc+0x44/0x5c [ 28.683974] el0_svc+0x1c/0x30 [ 28.683978] el0_sync_handler+0xa8/0xac [ 28.683981] el0_sync+0x158/0x180 [ 28.683984] [ 28.683988] The buggy address belongs to the object at ffffff80833f8800 [ 28.683988] which belongs to the cache kmalloc-128 of size 128 [ 28.683992] The buggy address is located 0 bytes inside of [ 28.683992] 128-byte region [ffffff80833f8800, ffffff80833f8880) [ 28.683995] The buggy address belongs to the page: [ 28.684001] page:00000000064eb9ca refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x833f8 [ 28.684005] head:00000000064eb9ca order:1 compound_mapcount:0 [ 28.684009] flags: 0x10200(slab|head) [ 28.684014] raw: 0000000000010200 dead000000000100 dead000000000122 ffffff8007003c80 [ 28.684018] raw: 0000000000000000 
0000000080200020 00000001ffffffff 0000000000000000 [ 28.684021] page dumped because: kasan: bad access detected [ 28.684026] [ 28.684028] Memory state around the buggy address: [ 28.684036] ffffff80833f8700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 28.684040] ffffff80833f8780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.684044] >ffffff80833f8800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 28.684047] ^ [ 28.684050] ffffff80833f8880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.684054] ffffff80833f8900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.684056] ==================================================================

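Reading the logs

The kasan test module mainly exercises two kinds of errors:

  • out of bounds
  • use after free

The log is explained below in these two parts.

Out-of-bounds detection

We start with the first out-of-bounds report: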

[ 28.681187] kasan test: copy_user_test out-of-bounds in copy_from_user() [ 28.681203] ================================================================== [ 28.681212] BUG: KASAN: slab-out-of-bounds in copy_user_test+0xc0/0x340 [test_kasan_module] [ 28.681217] Write of size 11 at addr ffffff80833f8500 by task insmod/1953 insmod: ERROR: could not insert module test_kasan_module.ko: Resource temporarily unavailable [ 28.681221] root@kylin:~# [ 28.681227] CPU: 4 PID: 1953 Comm: insmod Tainted: G B 5.10.198 #92 [ 28.681231] Hardware name: Firefly ROC-RK3588S-PC V13 MIPI(Linux) (DT) [ 28.681235] Call trace: [ 28.681243] dump_backtrace+0x0/0x3bc [ 28.681248] show_stack+0x1c/0x24 [ 28.681254] dump_stack_lvl+0x130/0x168 [ 28.681260] print_address_description.constprop.0+0x74/0x2b8 [ 28.681265] kasan_report+0x1e8/0x200 [ 28.681270] kasan_check_range+0xf4/0x1a0 [ 28.681273] __kasan_check_write+0x30/0x50 [ 28.681279] copy_user_test+0xc0/0x340 [test_kasan_module] [ 28.681284] test_kasan_module_init+0x18/0xa78 [test_kasan_module] [ 28.681289] do_one_initcall+0xb0/0x4e0 [ 28.681294] do_init_module+0x14c/0x600 [ 28.681298] load_module+0x5714/0x71fc [ 28.681303] __do_sys_finit_module+0x110/0x1a0 [ 28.681307] __arm64_sys_finit_module+0x70/0xa0 [ 28.681312] el0_svc_common.constprop.0+0xf0/0x464 [ 28.681317] do_el0_svc+0x44/0x5c [ 28.681321] el0_svc+0x1c/0x30 [ 28.681325] el0_sync_handler+0xa8/0xac [ 28.681328] el0_sync+0x158/0x180 [ 28.681331] [ 28.681335] Allocated by task 1953: [ 28.681340] kasan_save_stack+0x24/0x50 [ 28.681343] __kasan_kmalloc+0x88/0xb0 [ 28.681347] kmem_cache_alloc_trace+0x1d0/0x3c0 [ 28.681352] copy_user_test+0x48/0x340 [test_kasan_module] [ 28.681357] test_kasan_module_init+0x18/0xa78 [test_kasan_module] [ 28.681361] do_one_initcall+0xb0/0x4e0 [ 28.681365] do_init_module+0x14c/0x600 [ 28.681369] load_module+0x5714/0x71fc [ 28.681373] __do_sys_finit_module+0x110/0x1a0 [ 28.681377] __arm64_sys_finit_module+0x70/0xa0 [ 28.681381] 
el0_svc_common.constprop.0+0xf0/0x464 [ 28.681385] do_el0_svc+0x44/0x5c [ 28.681388] el0_svc+0x1c/0x30 [ 28.681392] el0_sync_handler+0xa8/0xac [ 28.681396] el0_sync+0x158/0x180 [ 28.681398] [ 28.681402] The buggy address belongs to the object at ffffff80833f8500 [ 28.681402] which belongs to the cache kmalloc-128 of size 128 [ 28.681407] The buggy address is located 0 bytes inside of [ 28.681407] 128-byte region [ffffff80833f8500, ffffff80833f8580) [ 28.681411] The buggy address belongs to the page: [ 28.681417] page:00000000064eb9ca refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x833f8 [ 28.681421] head:00000000064eb9ca order:1 compound_mapcount:0 [ 28.681426] flags: 0x10200(slab|head) [ 28.681431] raw: 0000000000010200 dead000000000100 dead000000000122 ffffff8007003c80 [ 28.681436] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 [ 28.681439] page dumped because: kasan: bad access detected [ 28.681442] [ 28.681445] Memory state around the buggy address: [ 28.681449] ffffff80833f8400: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681452] ffffff80833f8480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681456] >ffffff80833f8500: 00 02 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681459] ^ [ 28.681463] ffffff80833f8580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681466] ffffff80833f8600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 28.681470] ==================================================================

First, kasan states the error type directly: an out-of-bounds access on a slab object, at copy_user_test+0xc0/0x340. This offset can be resolved to a source line with gdb.

BUG: KASAN: slab-out-of-bounds in copy_user_test+0xc0/0x340 [test_kasan_module]

Second, it reports that 11 bytes were written starting at address ffffff80833f8500:

Write of size 11 at addr ffffff80833f8500 by task insmod/1953

Third, it prints the call stack of the faulting access. kasan raises the report itself (kasan_report), so the path is very clear:

[ 28.681235] Call trace:
[ 28.681243] dump_backtrace+0x0/0x3bc
[ 28.681248] show_stack+0x1c/0x24
[ 28.681254] dump_stack_lvl+0x130/0x168
[ 28.681260] print_address_description.constprop.0+0x74/0x2b8
[ 28.681265] kasan_report+0x1e8/0x200
[ 28.681270] kasan_check_range+0xf4/0x1a0
[ 28.681273] __kasan_check_write+0x30/0x50
[ 28.681279] copy_user_test+0xc0/0x340 [test_kasan_module]
[ 28.681284] test_kasan_module_init+0x18/0xa78 [test_kasan_module]
[ 28.681289] do_one_initcall+0xb0/0x4e0
[ 28.681294] do_init_module+0x14c/0x600
[ 28.681298] load_module+0x5714/0x71fc
[ 28.681303] __do_sys_finit_module+0x110/0x1a0
[ 28.681307] __arm64_sys_finit_module+0x70/0xa0
[ 28.681312] el0_svc_common.constprop.0+0xf0/0x464
[ 28.681317] do_el0_svc+0x44/0x5c
[ 28.681321] el0_svc+0x1c/0x30
[ 28.681325] el0_sync_handler+0xa8/0xac
[ 28.681328] el0_sync+0x158/0x180

Fourth, it prints the allocation stack of the affected memory. The kmem_cache_alloc_trace frame shows that the object comes from the slab allocator:

[ 28.681335] Allocated by task 1953:
[ 28.681340] kasan_save_stack+0x24/0x50
[ 28.681343] __kasan_kmalloc+0x88/0xb0
[ 28.681347] kmem_cache_alloc_trace+0x1d0/0x3c0
[ 28.681352] copy_user_test+0x48/0x340 [test_kasan_module]
[ 28.681357] test_kasan_module_init+0x18/0xa78 [test_kasan_module]
[ 28.681361] do_one_initcall+0xb0/0x4e0
[ 28.681365] do_init_module+0x14c/0x600
[ 28.681369] load_module+0x5714/0x71fc
[ 28.681373] __do_sys_finit_module+0x110/0x1a0
[ 28.681377] __arm64_sys_finit_module+0x70/0xa0
[ 28.681381] el0_svc_common.constprop.0+0xf0/0x464
[ 28.681385] do_el0_svc+0x44/0x5c
[ 28.681388] el0_svc+0x1c/0x30
[ 28.681392] el0_sync_handler+0xa8/0xac
[ 28.681396] el0_sync+0x158/0x180

Fifth, it resolves the slab information: the address belongs to an object in the cache named kmalloc-128:

[ 28.681402] The buggy address belongs to the object at ffffff80833f8500
[ 28.681402] which belongs to the cache kmalloc-128 of size 128

Sixth, it prints the memory range in the same style as user-space asan:

[ 28.681407] The buggy address is located 0 bytes inside of
[ 28.681407] 128-byte region [ffffff80833f8500, ffffff80833f8580)

Seventh, it shows the page information: the struct page pointer, reference count, map count, mapping, pfn, head page, order, compound map count, page flags and raw metadata, followed by the reason the page was dumped:

[ 28.681411] The buggy address belongs to the page:
[ 28.681417] page:00000000064eb9ca refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x833f8
[ 28.681421] head:00000000064eb9ca order:1 compound_mapcount:0
[ 28.681426] flags: 0x10200(slab|head)
[ 28.681431] raw: 0000000000010200 dead000000000100 dead000000000122 ffffff8007003c80
[ 28.681436] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[ 28.681439] page dumped because: kasan: bad access detected

Eighth, as with asan, it prints the shadow memory around the faulting address. Each shadow byte covers 8 bytes of memory, so the shadow can only place the error in the second granule of the object, bytes 9 to 16. The first shadow byte, 00, means all of the first 8 bytes are accessible. The second, 02, means only the first 2 bytes of that granule are accessible. The object therefore has 10 accessible bytes in total, and the invalid access falls somewhere between the 11th and the 16th byte; with an 11-byte write it is the 11th byte.

[ 28.681445] Memory state around the buggy address:
[ 28.681449] ffffff80833f8400: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 28.681452] ffffff80833f8480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 28.681456] >ffffff80833f8500: 00 02 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 28.681459] ^
[ 28.681463] ffffff80833f8580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 28.681466] ffffff80833f8600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
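
The reading of the row "00 02 fc fc ..." can be reproduced with a small user-space sketch. It is an illustration under assumptions, not the kernel implementation: shadow_for_object and first_bad_offset are hypothetical helpers, and the only values taken from the log are the 128-byte slot, the 10-byte object, the 11-byte write and the 0xfc redzone.

```c
#include <stddef.h>
#include <stdint.h>

#define GRANULE 8       /* bytes of memory described by one shadow byte */
#define REDZONE 0xFC    /* redzone value seen in the log above */

/* Fill the shadow bytes for one slab slot holding an object of obj_size bytes. */
static void shadow_for_object(uint8_t *shadow, size_t slot_size, size_t obj_size)
{
    size_t n = slot_size / GRANULE;

    for (size_t i = 0; i < n; i++) {
        size_t start = i * GRANULE;

        if (start + GRANULE <= obj_size)
            shadow[i] = 0x00;                          /* fully accessible */
        else if (start < obj_size)
            shadow[i] = (uint8_t)(obj_size - start);   /* first N bytes accessible */
        else
            shadow[i] = REDZONE;                       /* nothing accessible */
    }
}

/* Return the offset of the first invalid byte in [off, off + len), or -1. */
static long first_bad_offset(const uint8_t *shadow, size_t off, size_t len)
{
    for (size_t p = off; p < off + len; p++) {
        int8_t s = (int8_t)shadow[p / GRANULE];

        /* 0 means valid, 1..7 means only the first s bytes, negative means poisoned */
        if (s != 0 && (int8_t)(p % GRANULE) >= s)
            return (long)p;
    }
    return -1;
}
```

With a 128-byte slot and a 10-byte object, shadow_for_object produces the same 16 shadow bytes as the marked row of the log, and first_bad_offset reports offset 10 (the 11th byte) for an 11-byte write starting at offset 0.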

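Look first at the addresses in the left column, such as ffffff80833f8500. These are clearly kernel memory addresses, not shadow addresses, even though we know the shadow region is placed using CONFIG_KASAN_SHADOW_OFFSET=0xdfffffd000000000. This is one place where asan and kasan differ: kasan prints the memory address that each row of shadow bytes describes, whereas asan prints the address inside the shadow region. So when debugging with kasan, do not assume that the poison values are the contents of memory at the printed address. They are the values stored in the shadow region, and the kernel has simply converted the shadow address back to the corresponding memory address for us.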
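
The conversion the kernel performs can be written out with the formula from the configuration section. The sketch below is plain user-space arithmetic; mem_to_shadow and shadow_to_mem are illustrative names chosen here, and the offset is the CONFIG_KASAN_SHADOW_OFFSET of this particular build, so other builds will give different results.

```c
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3                     /* 8 bytes of memory per shadow byte */
#define SHADOW_OFFSET 0xdfffffd000000000ULL      /* CONFIG_KASAN_SHADOW_OFFSET of this build */

/* Shadow = (Mem >> 3) + offset */
static uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Inverse mapping: the first memory address described by a shadow byte. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
    return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}
```

For the object at ffffff80833f8500 this gives the shadow address ffffffc01067f0a0, which is where the 00 byte of the marked row is stored; the 02 byte sits one shadow byte later and describes memory starting at ffffff80833f8508.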

Also note that kasan's poison values are not the same set as asan's, so 0xfc cannot be interpreted with the asan legend. The definitions have to be looked up in the kernel source:

#ifdef CONFIG_KASAN_GENERIC
#define KASAN_FREE_PAGE         0xFF  /* page was freed */
#define KASAN_PAGE_REDZONE      0xFE  /* redzone for kmalloc_large allocations */
#define KASAN_KMALLOC_REDZONE   0xFC  /* redzone inside slub object */
#define KASAN_KMALLOC_FREE      0xFB  /* object was freed (kmem_cache_free/kfree) */
#define KASAN_KMALLOC_FREETRACK 0xFA  /* object was freed and has free track set */
#else
#define KASAN_FREE_PAGE         KASAN_TAG_INVALID
#define KASAN_PAGE_REDZONE      KASAN_TAG_INVALID
#define KASAN_KMALLOC_REDZONE   KASAN_TAG_INVALID
#define KASAN_KMALLOC_FREE      KASAN_TAG_INVALID
#define KASAN_KMALLOC_FREETRACK KASAN_TAG_INVALID
#endif

#define KASAN_GLOBAL_REDZONE    0xF9  /* redzone for global variable */
#define KASAN_VMALLOC_INVALID   0xF8  /* unallocated space in vmapped page */

/*
 * Stack redzone shadow values
 * (Those are compiler's ABI, don't change them)
 */
#define KASAN_STACK_LEFT        0xF1
#define KASAN_STACK_MID         0xF2
#define KASAN_STACK_RIGHT       0xF3
#define KASAN_STACK_PARTIAL     0xF4

/*
 * alloca redzone shadow values
 */
#define KASAN_ALLOCA_LEFT       0xCA
#define KASAN_ALLOCA_RIGHT      0xCB

#define KASAN_ALLOCA_REDZONE_SIZE 32

/*
 * Stack frame marker (compiler ABI).
 */
#define KASAN_CURRENT_STACK_FRAME_MAGIC 0x41B58AB3

/* Don't break randconfig/all*config builds */
#ifndef KASAN_ABI_VERSION
#define KASAN_ABI_VERSION 1
#endif

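As shown, the poison value 0xfc is KASAN_KMALLOC_REDZONE which, as the name suggests, is the redzone of a kmalloc object. A quick look through the code shows how it maps to the reported bug type:

case KASAN_KMALLOC_REDZONE:
    bug_type = "slab-out-of-bounds";
    break;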
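
When reading a "Memory state" dump it helps to have the legend at hand. The function below is a hypothetical user-space helper, not a kernel API: shadow_byte_meaning is a name chosen here, and its strings are taken from the comments and macro names in the header quoted above for generic mode.

```c
#include <stdint.h>

/* Describe one shadow byte of a generic-mode kasan dump. */
static const char *shadow_byte_meaning(uint8_t v)
{
    if (v == 0x00)
        return "all 8 bytes accessible";
    if (v < 8)
        return "only the first N bytes accessible (N = value)";

    switch (v) {
    case 0xFF: return "page was freed";
    case 0xFE: return "redzone for kmalloc_large allocations";
    case 0xFC: return "redzone inside slub object";
    case 0xFB: return "object was freed (kmem_cache_free/kfree)";
    case 0xFA: return "object was freed and has free track set";
    case 0xF9: return "redzone for global variable";
    case 0xF8: return "unallocated space in vmapped page";
    case 0xF1: return "stack redzone (left)";
    case 0xF2: return "stack redzone (mid)";
    case 0xF3: return "stack redzone (right)";
    case 0xF4: return "stack redzone (partial)";
    case 0xCA: return "alloca redzone (left)";
    case 0xCB: return "alloca redzone (right)";
    default:   return "unknown shadow value";
    }
}
```

Applied to the marked row of the out-of-bounds dump, the bytes 00, 02 and fc decode to a fully accessible granule, a granule with 2 accessible bytes, and the slab redzone.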

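Putting the information above together:

  • kasan detected an out-of-bounds access to a slab object through the 0xfc redzone poison. The object belongs to the cache kmalloc-128; 10 bytes were requested but 11 bytes were written. The report also includes the details of the slab page, the stack of the out-of-bounds access, and the stack that allocated the object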

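Use-after-free

The test module provides two use-after-free cases, one involving RCU and one involving a workqueue, mainly because uaf bugs in these two situations are hard to spot:

  1. RCU access comes with a grace period. Accesses within the grace period should not be reported, but an access after the grace period has ended and the object has been freed should be (this avoids false positives)
  2. A workqueue runs its work asynchronously (the case more often met in practice)

RCU-style UAF

First, the test code for the RCU case: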

static struct kasan_rcu_info {
	int i;
	struct rcu_head rcu;
} *global_rcu_ptr;

static noinline void __init kasan_rcu_reclaim(struct rcu_head *rp)
{
	struct kasan_rcu_info *fp = container_of(rp,
						struct kasan_rcu_info, rcu);

	kfree(fp);
	fp->i = 1;
}

static noinline void __init kasan_rcu_uaf(void)
{
	struct kasan_rcu_info *ptr;

	pr_info("use-after-free in kasan_rcu_reclaim\n");
	ptr = kmalloc(sizeof(struct kasan_rcu_info), GFP_KERNEL);
	if (!ptr) {
		pr_err("Allocation failed\n");
		return;
	}

	global_rcu_ptr = rcu_dereference_protected(ptr, NULL);
	call_rcu(&global_rcu_ptr->rcu, kasan_rcu_reclaim);
}

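As shown, kmalloc allocates ptr, rcu_dereference_protected is used to load the pointer, and call_rcu registers a callback that runs once the grace period has elapsed (all pre-existing readers have finished). The uaf itself happens inside the callback, which writes fp->i after kfree(fp).

Unfortunately, kasan did not report the uaf after the RCU grace period. Instead, the rcuos thread hit an instruction abort (IABT). I do not fully understand this; if you do, please let me know. One possibility worth checking is that the callback's code was no longer mapped when it ran: kasan_rcu_reclaim is marked __init, the insmod above failed so the module was not kept (the Oops shows an empty "Modules linked in:"), and the faulting address has an empty pte. The IABT log from rcuos is clear and easy to read, so I will not explain it further:

[ 205.038853] Unable to handle kernel paging request at virtual address ffffffd003815000
[ 205.039568] Mem abort info:
[ 205.039826] ESR = 0x86000007
[ 205.040107] EC = 0x21: IABT (current EL), IL = 32 bits
[ 205.040581] SET = 0, FnV = 0
[ 205.040861] EA = 0, S1PTW = 0
[ 205.041154] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000004bfc000
[ 205.041748] [ffffffd003815000] pgd=00000001ff7ff003, p4d=00000001ff7ff003, pud=00000001ff7ff003, pmd=000000000fa27003, pte=0000000000000000
[ 205.042945] Internal error: Oops: 86000007 [#1] SMP
[ 205.043380] Modules linked in:
[ 205.043673] CPU: 1 PID: 22 Comm: rcuos/1 Tainted: G B 5.10.198 #92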

Workqueue-style UAF

I could not work out the RCU case, but the workqueue test case shows a very common kind of uaf. The relevant code:

static noinline void __init kasan_workqueue_work(struct work_struct *work)
{
	kfree(work);
}

static noinline void __init kasan_workqueue_uaf(void)
{
	struct workqueue_struct *workqueue;
	struct work_struct *work;

	workqueue = create_workqueue("kasan_wq_test");
	if (!workqueue) {
		pr_err("Allocation failed\n");
		return;
	}
	work = kmalloc(sizeof(struct work_struct), GFP_KERNEL);
	if (!work) {
		pr_err("Allocation failed\n");
		return;
	}

	INIT_WORK(work, kasan_workqueue_work);
	queue_work(workqueue, work);
	destroy_workqueue(workqueue);

	pr_info("use-after-free on workqueue\n");
	((volatile struct work_struct *)work)->data;
}

This is a typical uaf in a workqueue scenario: the work function frees the work item, and the caller reads it afterwards. The log is shown below and explained piece by piece.

[ 37.721631] ================================================================== [ 37.721639] BUG: KASAN: use-after-free in kasan_workqueue_uaf+0x140/0x158 [test_kasan_module] [ 37.721644] Read of size 8 at addr ffffff80775af000 by task insmod/2047 [ 37.721646] [ 37.721651] CPU: 5 PID: 2047 Comm: insmod Tainted: G B 5.10.198 #92 [ 37.721655] Hardware name: Firefly ROC-RK3588S-PC V13 MIPI(Linux) (DT) [ 37.721658] Call trace: [ 37.721664] dump_backtrace+0x0/0x3bc [ 37.721668] show_stack+0x1c/0x24 [ 37.721672] dump_stack_lvl+0x130/0x168 [ 37.721678] print_address_description.constprop.0+0x74/0x2b8 [ 37.721689] kasan_report+0x1e8/0x200 [ 37.721694] __asan_report_load8_noabort+0x30/0x5c [ 37.721699] kasan_workqueue_uaf+0x140/0x158 [test_kasan_module] [ 37.721704] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 37.721708] do_one_initcall+0xb0/0x4e0 [ 37.721713] do_init_module+0x14c/0x600 [ 37.721717] load_module+0x5714/0x71fc [ 37.721721] __do_sys_finit_module+0x110/0x1a0 [ 37.721725] __arm64_sys_finit_module+0x70/0xa0 [ 37.721730] el0_svc_common.constprop.0+0xf0/0x464 [ 37.721734] do_el0_svc+0x44/0x5c [ 37.721737] el0_svc+0x1c/0x30 [ 37.721741] el0_sync_handler+0xa8/0xac [ 37.721745] el0_sync+0x158/0x180 [ 37.721747] [ 37.721752] Allocated by task 2047: [ 37.721756] kasan_save_stack+0x24/0x50 [ 37.721765] __kasan_kmalloc+0x88/0xb0 [ 37.721769] kmem_cache_alloc_trace+0x1d0/0x3c0 [ 37.721774] kasan_workqueue_uaf+0x80/0x158 [test_kasan_module] [ 37.721779] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 37.721783] do_one_initcall+0xb0/0x4e0 [ 37.721787] do_init_module+0x14c/0x600 [ 37.721790] load_module+0x5714/0x71fc [ 37.721794] __do_sys_finit_module+0x110/0x1a0 [ 37.721799] __arm64_sys_finit_module+0x70/0xa0 [ 37.721803] el0_svc_common.constprop.0+0xf0/0x464 [ 37.721807] do_el0_svc+0x44/0x5c [ 37.721810] el0_svc+0x1c/0x30 [ 37.721814] el0_sync_handler+0xa8/0xac [ 37.721817] el0_sync+0x158/0x180 [ 37.721820] [ 37.721823] Freed by task 227: [ 37.721828] 
kasan_save_stack+0x24/0x50 [ 37.721832] kasan_set_track+0x24/0x34 [ 37.721841] kasan_set_free_info+0x24/0x44 [ 37.721844] __kasan_slab_free+0xd8/0x134 [ 37.721848] kfree+0xe0/0x500 [ 37.721853] kasan_workqueue_work+0xc/0x14 [test_kasan_module] [ 37.721858] process_one_work+0x624/0x1240 [ 37.721861] worker_thread+0x3b8/0xe90 [ 37.721865] kthread+0x2c0/0x344 [ 37.721869] ret_from_fork+0x10/0x18 [ 37.721872] [ 37.721874] Last potentially related work creation: [ 37.721878] kasan_save_stack+0x24/0x50 [ 37.721882] kasan_record_aux_stack+0xbc/0xd0 [ 37.721886] insert_work+0x54/0x2e0 [ 37.721890] __queue_work+0x3a8/0xca0 [ 37.721893] queue_work_on+0x9c/0xd0 [ 37.721898] kasan_workqueue_uaf+0x114/0x158 [test_kasan_module] [ 37.721903] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 37.721908] do_one_initcall+0xb0/0x4e0 [ 37.721917] do_init_module+0x14c/0x600 [ 37.721921] load_module+0x5714/0x71fc [ 37.721925] __do_sys_finit_module+0x110/0x1a0 [ 37.721929] __arm64_sys_finit_module+0x70/0xa0 [ 37.721933] el0_svc_common.constprop.0+0xf0/0x464 [ 37.721937] do_el0_svc+0x44/0x5c [ 37.721940] el0_svc+0x1c/0x30 [ 37.721944] el0_sync_handler+0xa8/0xac [ 37.721947] el0_sync+0x158/0x180 [ 37.721950] [ 37.721954] The buggy address belongs to the object at ffffff80775af000 [ 37.721954] which belongs to the cache kmalloc-128 of size 128 [ 37.721963] The buggy address is located 0 bytes inside of [ 37.721963] 128-byte region [ffffff80775af000, ffffff80775af080) [ 37.721971] The buggy address belongs to the page: [ 37.721979] page:00000000e417a6e1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x775ae [ 37.721986] head:00000000e417a6e1 order:1 compound_mapcount:0 [ 37.721996] flags: 0x10200(slab|head) [ 37.722011] raw: 0000000000010200 ffffffff013e2500 0000000300000003 ffffff8007003c80 [ 37.722018] raw: 0000000000000000 0000000080200020 00000001ffffffff ffffff803619b601 [ 37.722025] page dumped because: kasan: bad access detected [ 37.722031] 
page->mem_cgroup:ffffff803619b601 [ 37.722038] [ 37.722043] Memory state around the buggy address: [ 37.722050] ffffff80775aef00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 37.722056] ffffff80775aef80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 37.722063] >ffffff80775af000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 37.722069] ^ [ 37.722078] ffffff80775af080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 37.722090] ffffff80775af100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 37.722096] ==================================================================

First, kasan reports that it detected a uaf: 8 bytes were read at address ffffff80775af000.

[ 37.721639] BUG: KASAN: use-after-free in kasan_workqueue_uaf+0x140/0x158 [test_kasan_module]
[ 37.721644] Read of size 8 at addr ffffff80775af000 by task insmod/2047

The read is 8 bytes because it accesses data, the first member of work_struct:

((volatile struct work_struct *)work)->data;

(gdb) ptype /o struct work_struct
/* offset    |  size */  type = struct work_struct {
/*    0      |     8 */    atomic_long_t data;
/*    8      |    16 */    struct list_head {
/*    8      |     8 */        struct list_head *next;
/*   16      |     8 */        struct list_head *prev;
                               /* total size (bytes):   16 */
                           } entry;
/*   24      |     8 */    work_func_t func;
                           /* total size (bytes):   32 */
                         }
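
The layout printed by gdb can be cross-checked without a kernel tree. The sketch below assumes a 64-bit LP64 target (as on arm64 Linux) and uses a simplified user-space replica of the structure; every type name ending in _model is made up for this illustration and only mirrors the members shown in the gdb output.

```c
#include <stddef.h>

/* Simplified replicas of the kernel types shown in the gdb output. */
typedef struct { long counter; } atomic_long_model;

struct list_head_model {
    struct list_head_model *next;
    struct list_head_model *prev;
};

struct work_struct_model;
typedef void (*work_func_model)(struct work_struct_model *work);

struct work_struct_model {
    atomic_long_model data;          /* first member: read by the test */
    struct list_head_model entry;
    work_func_model func;
};

/* Offset and size of the member the test reads. */
static size_t work_data_offset(void)
{
    return offsetof(struct work_struct_model, data);
}

static size_t work_data_size(void)
{
    return sizeof(((struct work_struct_model *)0)->data);
}
```

On an LP64 target work_data_offset() is 0 and work_data_size() is 8, which matches both the "Read of size 8" line and the "located 0 bytes inside of" line in the report.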

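Second, it provides the stack of the faulting access, the allocation stack of the affected memory, the free stack, and the most likely related workqueue stack. (The excerpts from here on are taken from the first run shown in the test section, so their timestamps, PIDs and addresses differ from the log above.)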

[ 28.683508] Call trace: [ 28.683513] dump_backtrace+0x0/0x3bc [ 28.683532] show_stack+0x1c/0x24 [ 28.683552] dump_stack_lvl+0x130/0x168 [ 28.683566] print_address_description.constprop.0+0x74/0x2b8 [ 28.683579] kasan_report+0x1e8/0x200 [ 28.683592] __asan_report_load8_noabort+0x30/0x5c [ 28.683608] kasan_workqueue_uaf+0x140/0x158 [test_kasan_module] [ 28.683633] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 28.683646] do_one_initcall+0xb0/0x4e0 [ 28.683661] do_init_module+0x14c/0x600 [ 28.683674] load_module+0x5714/0x71fc [ 28.683689] __do_sys_finit_module+0x110/0x1a0 [ 28.683710] __arm64_sys_finit_module+0x70/0xa0 [ 28.683724] el0_svc_common.constprop.0+0xf0/0x464 [ 28.683737] do_el0_svc+0x44/0x5c [ 28.683750] el0_svc+0x1c/0x30 [ 28.683763] el0_sync_handler+0xa8/0xac [ 28.683776] el0_sync+0x158/0x180 [ 28.683781] [ 28.683789] Allocated by task 1953: [ 28.683793] kasan_save_stack+0x24/0x50 [ 28.683797] __kasan_kmalloc+0x88/0xb0 [ 28.683801] kmem_cache_alloc_trace+0x1d0/0x3c0 [ 28.683805] kasan_workqueue_uaf+0x80/0x158 [test_kasan_module] [ 28.683810] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 28.683814] do_one_initcall+0xb0/0x4e0 [ 28.683818] do_init_module+0x14c/0x600 [ 28.683822] load_module+0x5714/0x71fc [ 28.683825] __do_sys_finit_module+0x110/0x1a0 [ 28.683830] __arm64_sys_finit_module+0x70/0xa0 [ 28.683834] el0_svc_common.constprop.0+0xf0/0x464 [ 28.683838] do_el0_svc+0x44/0x5c [ 28.683841] el0_svc+0x1c/0x30 [ 28.683845] el0_sync_handler+0xa8/0xac [ 28.683848] el0_sync+0x158/0x180 [ 28.683851] [ 28.683855] Freed by task 676: [ 28.683859] kasan_save_stack+0x24/0x50 [ 28.683868] kasan_set_track+0x24/0x34 [ 28.683872] kasan_set_free_info+0x24/0x44 [ 28.683876] __kasan_slab_free+0xd8/0x134 [ 28.683879] kfree+0xe0/0x500 [ 28.683884] kasan_workqueue_work+0xc/0x14 [test_kasan_module] [ 28.683889] process_one_work+0x624/0x1240 [ 28.683893] worker_thread+0x3b8/0xe90 [ 28.683897] kthread+0x2c0/0x344 [ 28.683901] ret_from_fork+0x10/0x18 [ 
28.683904] [ 28.683907] Last potentially related work creation: [ 28.683910] kasan_save_stack+0x24/0x50 [ 28.683914] kasan_record_aux_stack+0xbc/0xd0 [ 28.683919] insert_work+0x54/0x2e0 [ 28.683923] __queue_work+0x3a8/0xca0 [ 28.683926] queue_work_on+0x9c/0xd0 [ 28.683931] kasan_workqueue_uaf+0x114/0x158 [test_kasan_module] [ 28.683936] test_kasan_module_init+0x20/0xa78 [test_kasan_module] [ 28.683939] do_one_initcall+0xb0/0x4e0 [ 28.683944] do_init_module+0x14c/0x600 [ 28.683949] load_module+0x5714/0x71fc [ 28.683953] __do_sys_finit_module+0x110/0x1a0 [ 28.683962] __arm64_sys_finit_module+0x70/0xa0 [ 28.683967] el0_svc_common.constprop.0+0xf0/0x464 [ 28.683970] do_el0_svc+0x44/0x5c [ 28.683974] el0_svc+0x1c/0x30 [ 28.683978] el0_sync_handler+0xa8/0xac [ 28.683981] el0_sync+0x158/0x180

Why only the most likely workqueue stack? kasan prints the stack of the last work creation it recorded for the object. It cannot know for certain that this is the work that caused the uaf: several work items may have accessed the object, and the one actually responsible could be an earlier one.

Third, it shows the slab object and page information: the struct page pointer, reference count, map count, mapping, pfn, head page, order, compound map count, page flags and raw metadata, followed by the reason the page was dumped:

[ 28.683988] The buggy address belongs to the object at ffffff80833f8800
[ 28.683988] which belongs to the cache kmalloc-128 of size 128
[ 28.683992] The buggy address is located 0 bytes inside of
[ 28.683992] 128-byte region [ffffff80833f8800, ffffff80833f8880)
[ 28.683995] The buggy address belongs to the page:
[ 28.684001] page:00000000064eb9ca refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x833f8
[ 28.684005] head:00000000064eb9ca order:1 compound_mapcount:0
[ 28.684009] flags: 0x10200(slab|head)
[ 28.684014] raw: 0000000000010200 dead000000000100 dead000000000122 ffffff8007003c80
[ 28.684018] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[ 28.684021] page dumped because: kasan: bad access detected

Fourth, it prints the poison values in the shadow region. The value at the faulting address is fa, which corresponds to KASAN_KMALLOC_FREETRACK (an object that was freed and has its free track recorded) and is what uaf detection relies on. To support uaf detection, kasan poisons all 128 bytes of the slab object, that is 16 shadow bytes, as fa/fb:

[ 28.684028] Memory state around the buggy address:
[ 28.684036] ffffff80833f8700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 28.684040] ffffff80833f8780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 28.684044] >ffffff80833f8800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 28.684047] ^
[ 28.684050] ffffff80833f8880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 28.684054] ffffff80833f8900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

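Summary

This article walked through a simple hands-on test of kasan. kasan and asan share many principles, but they also differ in places that need attention.
In practice I found that the other kasan test cases have been moved into the KUnit test framework. KUnit can be run as described in "Linux UML on x86", but for convenience I still test on my existing hardware, since modifying the test code takes less time than rebuilding a UML kernel. Next I will implement kernel code based on the other test cases for further practice.

References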

https://www.kernel.org/doc/html/latest/translations/zh_CN/dev-tools/kasan.html