2025-03-03

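Under very heavy network load, for applications such as file servers and high-traffic web servers, spreading the IRQs of different NICs evenly across different CPU cores relieves the load on any single CPU and improves the overall interrupt-handling capacity of a multi-CPU, multi-core system. For applications such as database servers, binding the disk controller to one CPU core and the NIC to another can shorten database response time and so improve performance. Balancing IRQs sensibly according to your own production environment and application characteristics helps improve overall system throughput and performance. This post walks through binding a network device's interrupts to a CPU core.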

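1. Interrupt affinity

The proc filesystem exposes two interfaces per interrupt, smp_affinity and smp_affinity_list, which bind a given IRQ source to target CPUs. /proc/irq/default_smp_affinity holds the default affinity for IRQs, expressed as a bitmask. In the example below it is ff, meaning all CPUs (0-7):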


# cat /proc/irq/default_smp_affinity
ff
# cat /proc/irq/1/smp_affinity
ff
# cat /proc/irq/1/smp_affinity_list
0-7
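The relationship between the two views (hex mask and CPU list) can be checked with a small sketch. The helper name mask_to_cpus is made up for illustration, and the code assumes the kernel's usual output format, where masks wider than 32 bits are printed as comma-separated, zero-padded 32-bit groups.

```python
def mask_to_cpus(mask: str) -> list[int]:
    """Convert a hex affinity mask (as shown by smp_affinity) to CPU indices."""
    # Assumption: wide masks are comma-separated, zero-padded 32-bit groups,
    # so removing the commas yields one long hex number.
    value = int(mask.strip().replace(",", ""), 16)
    # Bit n set means CPU n is allowed to handle the interrupt.
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


print(mask_to_cpus("ff"))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

The result for ff matches the 0-7 shown by smp_affinity_list above.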

2. Binding in practice

2.1 Find the hardware IRQ numbers


# cat /proc/interrupts | grep eth1
110: 69925 0 0 0 0 0 0 14520 GICv3 259 Level eth1
111: 0 0 0 0 0 0 0 0 GICv3 258 Level eth1

2.2 Setting smp_affinity_list


echo 7 > /proc/irq/110/smp_affinity_list
echo 7 > /proc/irq/111/smp_affinity_list

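This binds IRQs 110 and 111 to CPU 7.

2.3 Setting smp_affinity

If smp_affinity_list has already been set, smp_affinity does not need to be set as well.

smp_affinity takes a hexadecimal CPU bitmask, computed as follows: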


CPU    binary mask   CPU index   hex mask
cpu0   00000001      0           1
cpu1   00000010      1           2
cpu2   00000100      2           4
cpu3   00001000      3           8
cpu4   00010000      4           10
cpu5   00100000      5           20
cpu6   01000000      6           40
cpu7   10000000      7           80

To bind to the eighth CPU (CPU 7), write:


echo 80 > /proc/irq/110/smp_affinity
echo 80 > /proc/irq/111/smp_affinity
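As a cross-check on the mask arithmetic, the mask for CPU n is simply 1 << n written in hexadecimal. The sketch below uses a hypothetical helper, cpus_to_mask, and prints the value for each of the eight CPUs; it only computes strings and does not touch /proc.

```python
def cpus_to_mask(cpus) -> str:
    """Build the hex string expected by smp_affinity from CPU indices."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # set bit n for CPU n
    return format(mask, "x")


for cpu in range(8):
    # CPU index, binary mask, hex mask
    print(cpu, format(1 << cpu, "08b"), cpus_to_mask([cpu]))
```

For CPU 7 (the eighth CPU) this yields 80; the value 100 would select CPU 8, the ninth CPU.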

2.4 Testing

ping test


$ ping -I eth1 0.0.0.0
PING 0.0.0.0 (172.25.80.124) from 172.25.80.124 eth1: 56(84) bytes of data.
64 bytes from 172.25.80.124: icmp_seq=1 ttl=64 time=0.064 ms
64 bytes from 172.25.80.124: icmp_seq=2 ttl=64 time=0.035 ms

Check the interrupt counts


# cat /proc/interrupts | grep eth1
110: 69925 0 0 0 0 0 0 17112 GICv3 259 Level eth1
111: 0 0 0 0 0 0 0 0 GICv3 258 Level eth1

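Compared with the output in 2.1, the count for IRQ 110 grew only on CPU 7 (from 14520 to 17112), which confirms that the binding took effect.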
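The before/after comparison can be automated. The sketch below parses two /proc/interrupts lines for the same IRQ and prints the per-CPU difference. The helper names are illustrative, and the number of CPU columns (8 here) is passed in as an assumption, since it depends on the machine.

```python
def parse_irq_line(line: str, ncpus: int):
    """Split one /proc/interrupts line into (irq, per-CPU counts)."""
    fields = line.split()
    irq = fields[0].rstrip(":")
    counts = [int(x) for x in fields[1:1 + ncpus]]
    return irq, counts


def per_cpu_delta(before: str, after: str, ncpus: int):
    """Return how many interrupts each CPU handled between two samples."""
    irq_before, counts_before = parse_irq_line(before, ncpus)
    irq_after, counts_after = parse_irq_line(after, ncpus)
    if irq_before != irq_after:
        raise ValueError("samples belong to different IRQs")
    return [a - b for a, b in zip(counts_after, counts_before)]


# The two samples of IRQ 110 shown in this post.
before = "110: 69925 0 0 0 0 0 0 14520 GICv3 259 Level eth1"
after = "110: 69925 0 0 0 0 0 0 17112 GICv3 259 Level eth1"
print(per_cpu_delta(before, after, ncpus=8))  # [0, 0, 0, 0, 0, 0, 0, 2592]
```

A non-zero delta only in the last column means all new interrupts landed on CPU 7.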

Developing the Linux kernel and system software with VS Code

2025-03-03

Work notes

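If your workstation runs Windows and you need to develop for Linux from it, the traditional approach is to share the working directory over Samba and edit locally with Source Insight. This has two drawbacks:

1. Source Insight is paid software.
2. Samba needs a fixed working directory configured in advance, and each project directory has to be set up and loaded by hand.

In other words, every setup is tedious. VS Code + Remote SSH + clangd gives a smoother workflow. The steps to configure VS Code are described below.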

1. Installing the tools

1.1 Server side


apt install bear 
apt install clangd

Note that clangd must be a binary from LLVM 11 or later.

1.2 Client side

Download VS Code


https://code.visualstudio.com/Download

VS Code extensions

Required extensions


Clangd
Remote SSH

Optional extensions


C/C++ 
C/C++ Extension Pack 
C/C++ Snippets
DeviceTree 
Rainbow Highlighter
Arm Assembly 
Hex Editor 
Markdown All in One
Markdown Preview Enhanced

1.3 SSH key setup

On Windows, take the public key id_rsa.pub from the default directory C:\Users\XXX\.ssh

Copy it to the server and run the following:


cat id_rsa.pub >> ~/.ssh/authorized_keys

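To verify, run ssh xxxxxxx from cmd; if you are logged in automatically without a password prompt, the key is working.

2. Usage

2.1 SSH connection

In VS Code, click as follows:

To the right of SSH, click to add a host and enter the connection command to create a remote connection. The screenshot shows connections to 172.25.130.130 and 172.25.130.31.

2.2 Opening a directory

Opening a project directory in VS Code works the same way as usual, except that the directory and files are on the remote host.

2.3 Installing extensions on the remote host

Click Settings -> Extensions -> Install in remote

Selecting the clangd extension is enough.

2.4 Generating compile_commands.json

For kernel source, run the following script from the kernel tree after a build: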


scripts/clang-tools/gen_compile_commands.py

For your own make-based project:


bear make

With Bear 3.x the syntax is bear -- make.

For a ninja project:


ninja -t compdb > compile_commands.json
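Whichever tool produces it, clangd expects compile_commands.json to be a JSON array in which every entry has a directory, a file, and either a command or an arguments field. The following sketch is a quick sanity check of that shape. The function name check_compdb and the sample entry are made up for illustration.

```python
import json


def check_compdb(text: str) -> int:
    """Validate the basic shape of a compilation database; return entry count."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("top level must be a JSON array")
    for i, entry in enumerate(entries):
        missing = {"directory", "file"} - entry.keys()
        if missing:
            raise ValueError(f"entry {i} is missing {sorted(missing)}")
        if "command" not in entry and "arguments" not in entry:
            raise ValueError(f"entry {i} has neither command nor arguments")
    return len(entries)


# Hypothetical single-entry database, only to exercise the check.
sample = '[{"directory": "/src", "file": "main.c", "command": "gcc -c main.c"}]'
print(check_compdb(sample))  # 1
```

On a real project, pass the contents of the generated file instead of the sample string. An empty array usually means the build did not actually compile anything while being recorded.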

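2.5 Checking that clangd is active

In the project window, clangd is working once the status bar shows that the clangd extension has loaded.

From then on, code navigation works normally.

2.6 Keyboard shortcuts


Open a file by name: Ctrl + P
Go to line: Ctrl + G, then the line number
Open a file at a given line: Ctrl + P, then filename:line
List the functions in a file: Ctrl + Shift + O (type a function name to jump to it)
Jump to a function or variable: hold Ctrl and left-click, or press F12
Forward: Ctrl + Shift + -
Back: Ctrl + Alt + -
List references: Shift + F12
Find all references: Alt + Shift + F12
Show/hide the sidebar: Ctrl + B
Open the command palette: Ctrl + Shift + P
Trigger suggestions manually: Ctrl + Space
Trigger parameter hints manually: Ctrl + Shift + Space
Show/hide the terminal: Ctrl + ` (the key above Tab)
Rename symbol: F2
Debug with the current configuration: F5
Scroll the editor up/down: Ctrl + ↑/↓
Find/replace: Ctrl + F / Ctrl + H
Highlight text: Shift + Alt + Z
Remove highlight: Shift + Alt + A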

Choosing between TTBR0_ELx and TTBR1_ELx

2025-02-18

Work notes

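Address ranges

From the figure we can see:

If the virtual address is in the range 0x0000000000000000 to 0x0000FFFFFFFFFFFF, ttbr0_elx is used.
If the virtual address is in the range 0xFFFF000000000000 to 0xFFFFFFFFFFFFFFFF, ttbr1_elx is used.

From the Linux kernel's memory layout (with 48-bit virtual addresses) we know:
The lower 256 TB is given to user space.
The upper 256 TB is given to kernel space.

Significant bits

The upper bits of the VA determine which ttbrX is used:

If bit 63 of the VA is 0, ttbr0 is used.
If bit 63 of the VA is 1, ttbr1 is used.

For a valid 48-bit address, bits 63:48 are all equal, so checking bit 63 is sufficient. Architecturally the selection is made on bit 55, which can differ from bit 63 only when top-byte-ignore (tagged addresses) is enabled.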
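The range rule can be written down as a short sketch. It assumes a 48-bit virtual address space with no address tagging, and the function name select_ttbr is made up for illustration; it models the rule described above, not any kernel or hardware interface.

```python
def select_ttbr(va: int, va_bits: int = 48) -> str:
    """Return which translation table base register covers a 64-bit VA."""
    if not 0 <= va < 1 << 64:
        raise ValueError("not a 64-bit address")
    top = va >> va_bits                    # bits above the translated range
    all_ones = (1 << (64 - va_bits)) - 1
    if top == 0:
        return "TTBR0_ELx"                 # lower range: user space
    if top == all_ones:
        return "TTBR1_ELx"                 # upper range: kernel space
    raise ValueError("address falls in the unmapped gap between the ranges")


print(select_ttbr(0x0000FFFFFFFFFFFF))  # TTBR0_ELx
print(select_ttbr(0xFFFF000000000000))  # TTBR1_ELx
print((1 << 48) // 2**40)               # 256, the size of each range in TB
```

Addresses whose upper 16 bits are mixed belong to neither range, which is why the sketch rejects them instead of looking at bit 63 alone.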

2025-02-13

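Common memory errors in the Linux kernel

Memory errors in the kernel are usually hard to track down. The common ones are:

Out-of-bounds access
Use after free
Double free
Memory leak
Stack overflow

The kernel offers three common tools for detecting memory errors:

slub_debug
kmemleak
kasan

The rest of this post looks at the errors above in terms of these three tools.

slub_debug

Small allocations in the kernel are served by the slab/slub allocator. With slub_debug enabled, SLUB can detect the following errors:

Use after free
Out-of-bounds access
Double free

First, enable the SLUB configuration options: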

CONFIG_SLUB=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
CONFIG_SLUB_STATS=y

Next, add the slub_debug parameter to the boot arguments (bootargs). The kernel documentation describes it as follows:


Parameters may be given to ``slub_debug``. If none is specified then full
debugging is enabled. Format:

slub_debug=<Debug-Options>
        Enable options for all slabs

slub_debug=<Debug-Options>,<slab name1>,<slab name2>,...
        Enable options only for select slabs (no spaces
        after a comma)

Multiple blocks of options for all slabs or selected slabs can be given, with
blocks of options delimited by ';'. The last of "all slabs" blocks is applied
to all slabs except those that match one of the "select slabs" block. Options
of the first "select slabs" blocks that matches the slab's name are applied.

Possible debug options are::

        F               Sanity checks on (enables SLAB_DEBUG_CONSISTENCY_CHECKS
                        Sorry SLAB legacy issues)
        Z               Red zoning
        P               Poisoning (object and padding)
        U               User tracking (free and alloc)
        T               Trace (please only use on single slabs)
        A               Enable failslab filter mark for the cache
        O               Switch debugging off for caches that would have
                        caused higher minimum slab orders
        -               Switch all debugging off (useful if the kernel is
                        configured with CONFIG_SLUB_DEBUG_ON)

F.e. in order to boot just with sanity checks and red zoning one would specify::

        slub_debug=FZ

Trying to find an issue in the dentry cache? Try::

        slub_debug=,dentry

to only enable debugging on the dentry cache.  You may use an asterisk at the
end of the slab name, in order to cover all slabs with the same prefix.  For
example, here's how you can poison the dentry cache as well as all kmalloc
slabs::

        slub_debug=P,kmalloc-*,dentry

Red zoning and tracking may realign the slab.  We can just apply sanity checks
to the dentry cache with::

        slub_debug=F,dentry

Debugging options may require the minimum possible slab order to increase as
a result of storing the metadata (for example, caches with PAGE_SIZE object
sizes).  This has a higher likelihood of resulting in slab allocation errors
in low memory situations or if there's high fragmentation of memory.  To
switch off debugging for such caches by default, use::

        slub_debug=O

You can apply different options to different list of slab names, using blocks
of options. This will enable red zoning for dentry and user tracking for
kmalloc. All other slabs will not get any debugging enabled::

        slub_debug=Z,dentry;U,kmalloc-*

You can also enable options (e.g. sanity checks and poisoning) for all caches
except some that are deemed too performance critical and don't need to be
debugged by specifying global debug options followed by a list of slab names
with "-" as options::

        slub_debug=FZ;-,zs_handle,zspage

The state of each debug option for a slab can be found in the respective files
under::

        /sys/kernel/slab/<slab name>/

If the file contains 1, the option is enabled, 0 means disabled. The debug
options from the ``slub_debug`` parameter translate to the following files::

        F       sanity_checks
        Z       red_zone
        P       poison
        U       store_user
        T       trace
        A       failslab

Careful with tracing: It may spew out lots of information and never stop if
used on the wrong slab.
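The letter-to-file mapping quoted above makes it easy to check on the target which debug options are active for a given cache. The sketch below reads those sysfs files and returns the enabled letters. The function name slab_debug_flags and its root parameter are illustrative assumptions; root exists only so the function can be pointed at a different directory.

```python
from pathlib import Path

# Mapping taken from the documentation excerpt above.
OPTION_FILES = {
    "F": "sanity_checks",
    "Z": "red_zone",
    "P": "poison",
    "U": "store_user",
    "T": "trace",
    "A": "failslab",
}


def slab_debug_flags(cache: str, root: str = "/sys/kernel/slab") -> str:
    """Return the slub_debug letters whose sysfs file contains 1."""
    enabled = ""
    for letter, name in OPTION_FILES.items():
        path = Path(root) / cache / name
        # A missing file is treated as "option not enabled".
        if path.exists() and path.read_text().strip() == "1":
            enabled += letter
    return enabled
```

On a system booted with slub_debug=FZ, calling slab_debug_flags("dentry") would be expected to return "FZ". Reading the files may require root, depending on the kernel's sysfs permissions.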

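Then build the slabinfo tool and copy it to the target (on newer kernels the directory is tools/mm/):


# cd tools/vm/
# make slabinfo
# scp slabinfo xxx@xxx:destination/

With this in place, an out-of-bounds access is reported as Redzone overwritten: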


 BUG kmalloc-32 (Tainted: G           O     ): Redzone overwritten

A double free is reported as Object already free:


 BUG kmalloc-128 (Tainted: G B O ): Object already free

An access to already freed memory (more precisely, a write to it) is reported as Poison overwritten:


 BUG kmalloc-128 (Tainted: G B O ): Poison overwritten
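When scanning a long dmesg log, the three messages above can be mapped back to the error class mechanically. The sketch below does this with a regular expression written against the sample lines in this post. The pattern and the helper name classify_slub are assumptions for illustration, not a format guaranteed by the kernel.

```python
import re

# Matches lines such as: BUG kmalloc-32 (Tainted: G O ): Redzone overwritten
SLUB_BUG = re.compile(r"BUG (\S+) \(.*?\): (.+)")

MEANING = {
    "Redzone overwritten": "out-of-bounds access",
    "Object already free": "double free",
    "Poison overwritten": "use after free",
}


def classify_slub(line: str):
    """Return (cache name, error class) for a SLUB report line, else None."""
    match = SLUB_BUG.search(line)
    if match is None:
        return None
    cache, message = match.group(1), match.group(2).strip()
    return cache, MEANING.get(message, "unknown")


print(classify_slub(" BUG kmalloc-32 (Tainted: G           O     ): Redzone overwritten"))
# ('kmalloc-32', 'out-of-bounds access')
```

The cache name in the report (kmalloc-32, kmalloc-128) tells you the allocation size class, which narrows down the allocation site.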

kmemleak

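kmemleak starts a separate kernel thread that scans memory and reports the number of newly found unreferenced objects. Because it only reports objects that appear to be unreferenced, kmemleak can produce false positives, so treat its output as a hint rather than proof.
kmemleak requires the following configuration: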


CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=4096

Then add the following parameter to bootargs:


kmemleak=on

After boot, trigger a scan manually before reproducing the problem:


echo scan > /sys/kernel/debug/kmemleak

Once the problem has occurred, read the node to see the results:


cat /sys/kernel/debug/kmemleak

If a leak was found, output like the following appears:


unreferenced object 0xede22dc0 (size 128):
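With many reported objects it helps to summarize the report instead of reading it line by line. The sketch below extracts the address and size from each "unreferenced object" header. The regular expression is written against the sample line above, and the helper name parse_kmemleak is made up for illustration.

```python
import re

UNREF = re.compile(r"unreferenced object (0x[0-9a-f]+) \(size (\d+)\):")


def parse_kmemleak(text: str):
    """Return a list of (address, size) for each reported object."""
    return [(int(addr, 16), int(size)) for addr, size in UNREF.findall(text)]


report = "unreferenced object 0xede22dc0 (size 128):"
objects = parse_kmemleak(report)
print([(hex(addr), size) for addr, size in objects])  # [('0xede22dc0', 128)]
print(sum(size for _, size in objects))               # 128
```

Grouping the results by size is a cheap way to spot a single leaking call site, since repeated leaks from one place tend to have the same size.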

kasan

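kasan is a dynamic memory error detector. It can detect the following problems:

Out-of-bounds access
Use after free
Double free

To enable kasan in the kernel, use the following configuration: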


CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN=y
CONFIG_KASAN_OUTLINE=y
# or, instead of OUTLINE (the two are mutually exclusive): CONFIG_KASAN_INLINE=y

The kernel ships a test module for kasan, located at:


mm/kasan/kasan_test.c

We can use it to detect the following errors.

Heap (slab) out-of-bounds access

When it occurs, the following log appears:


BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa4/0xe0 [kasan] at addr ffff800066539c7b

Use after free

When it occurs, the following log appears:


BUG: KASAN: use-after-free in kmalloc_uaf+0xac/0xe0 [kasan] at addr ffff800066539e08

Stack out-of-bounds access

When it occurs, the following log appears:


BUG: KASAN: stack-out-of-bounds in kasan_stack_oob+0xa8/0xf0 [kasan] at addr ffff800066acb95a

Global variable out-of-bounds access

When it occurs, the following log appears:


BUG: KASAN: global-out-of-bounds in kasan_global_oob+0x9c/0xe8 [kasan] at addr ffff7ffffc001c8d
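All four reports share the same header shape: the bug type, then the function in which it was detected. A minimal sketch for pulling those two fields out of a log is given below. The pattern is derived only from the sample lines in this post, and the helper name parse_kasan is hypothetical.

```python
import re

# Matches: BUG: KASAN: <type> in <function>+<offset>/<size> ...
KASAN = re.compile(r"BUG: KASAN: (\S+) in ([^\s+]+)\+")


def parse_kasan(line: str):
    """Return (bug type, function name) from a KASAN report header, else None."""
    match = KASAN.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


line = ("BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa4/0xe0 "
        "[kasan] at addr ffff800066539c7b")
print(parse_kasan(line))  # ('slab-out-of-bounds', 'kmalloc_oob_right')
```

The bug type prefix (slab, stack, global) says which kind of memory was overrun, which matches the three out-of-bounds cases listed above.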

Summary

Overall, kasan is more efficient than slub_debug. Where possible, errors that kasan can detect do not need slub_debug.

2025-02-11

Introduction to cache structure

From the figure above, the terms to keep in mind are:

offset
line
index
way
set
tag

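offset

The offset is the byte offset within a cache line. If the offset field is 4 bits wide, the cache line size is 16 bytes.

line

This corresponds to the offset: with the offset taking the low 4 bits of the PA, one cache line holds 16 bytes in total.

index

The index selects a cache line within a way, so the index width determines how many cache lines one way holds. If the index is 8 bits wide, a way has 256 cache lines, so one way is 16 * 256 = 4096 bytes.

way

The line size times the number of index values gives 4096 bytes, the size of one way. The number of ways tells how many such 4096-byte blocks make up the whole cache. If the total cache size is 16 KB, then 16 / 4 = 4, so this is a 4-way cache.
Putting it together: with a 16 KB cache and 4 ways, each way is 4 KB, and with 16-byte cache lines there are 256 index values.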
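The arithmetic in this example can be reproduced with a few lines. The sketch below derives the per-way size, the number of index values, and the field widths from the total size, the number of ways, and the line size. It assumes all three are powers of two, and the function name cache_geometry is made up for illustration.

```python
def cache_geometry(total_bytes: int, ways: int, line_bytes: int) -> dict:
    """Derive way size, set count and field widths of a set-associative cache."""
    way_bytes = total_bytes // ways          # size of one way
    sets = way_bytes // line_bytes           # cache lines per way = index values
    # For powers of two, bit_length() - 1 is log2.
    offset_bits = line_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    return {
        "way_bytes": way_bytes,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
    }


print(cache_geometry(16 * 1024, ways=4, line_bytes=16))
# {'way_bytes': 4096, 'sets': 256, 'offset_bits': 4, 'index_bits': 8}
```

The output matches the numbers used in the text: 4 KB per way, 256 lines per way, a 4-bit offset and an 8-bit index.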

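set

Following the calculation above, the cache lines that share the same index across all ways form a set. In the example, each set contains 4 cache lines because there are 4 ways, and there are 256 sets in total.

tag

The bits of the PA that remain after removing the offset and index are used as the tag, which is compared to decide whether the data held in a cache line is the data the processor is asking for.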
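To make the three fields concrete, the sketch below splits a physical address into tag, index, and offset using the widths from the example (4 offset bits, 8 index bits). The function name split_address and the sample address 0x12345 are arbitrary choices for illustration.

```python
def split_address(pa: int, offset_bits: int = 4, index_bits: int = 8):
    """Split a physical address into (tag, index, offset)."""
    offset = pa & ((1 << offset_bits) - 1)                  # low bits: byte in line
    index = (pa >> offset_bits) & ((1 << index_bits) - 1)   # middle bits: which set
    tag = pa >> (offset_bits + index_bits)                  # remaining high bits
    return tag, index, offset


tag, index, offset = split_address(0x12345)
print(hex(tag), hex(index), hex(offset))  # 0x12 0x34 0x5
```

On a lookup, the index picks the set, the tag is compared against the tag stored in each way of that set, and the offset selects the byte within the matching line.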
