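ukui-window-switch is the background task management tool on Kylin OS. In practice it is an effect plugin, a shared library loaded by kwin. Recently all of our git repositories were hooked up to CI, which now performs the builds. We then found that the ukui-window-switch package works when built locally but malfunctions when built by CI. This article walks through how the problem was tracked down and fixed.
To understand the problem, we first need to know how ukui-window-switch works. The package contains the following files: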
    root@kylin:~/1# dpkg -L ukui-window-switch
    /.
    /usr
    /usr/bin
    /usr/bin/ukui-window-switch
    /usr/lib
    /usr/lib/aarch64-linux-gnu
    /usr/lib/aarch64-linux-gnu/qt5
    /usr/lib/aarch64-linux-gnu/qt5/plugins
    /usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin
    /usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin/effects
    /usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin/effects/plugins
    /usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin/effects/plugins/libwindowsview.so
    /usr/share
    /usr/share/doc
    /usr/share/doc/ukui-window-switch
    /usr/share/doc/ukui-window-switch/changelog.Debian.gz
    /usr/share/doc/ukui-window-switch/copyright
    /usr/share/kservices5
    /usr/share/kservices5/ukui-kwin
    /usr/share/kservices5/ukui-kwin/kwin4_window_switcher_thumbnail_grid.desktop
    /usr/share/ukui-kwin
    /usr/share/ukui-kwin/tabbox
    /usr/share/ukui-kwin/tabbox/thumbnail_grid
    /usr/share/ukui-kwin/tabbox/thumbnail_grid/contents
    /usr/share/ukui-kwin/tabbox/thumbnail_grid/contents/ui
    /usr/share/ukui-kwin/tabbox/thumbnail_grid/contents/ui/main.qml
    /usr/share/ukui-kwin/tabbox/thumbnail_grid/metadata.desktop
The key file is the shared library libwindowsview.so. The program that loads it is kwin, so we look at the relevant kwin code:
    QList<KPluginMetaData> ScriptedEffectLoader::findAllEffects() const
    {
    #if defined(QT_NO_DEBUG)
        QString packageRoot = QStringLiteral("ukui-kwin/effects");
    #else
        QString packageRoot = kwinApp()->applicationDirPath() + QLatin1String("/../../effects");
        if (access(packageRoot.toStdString().c_str(), F_OK) == -1)
            packageRoot = QStringLiteral("ukui-kwin/effects");
        qDebug() << "Load effects from:" << packageRoot;
    #endif
        return KPackage::PackageLoader::self()->listPackages(s_serviceType, packageRoot);
    }
Effects are loaded in the loadEffect function:
    bool PluginEffectLoader::loadEffect(const QString &name)
    {
        const auto info = findEffect(name);
        if (!info.isValid()) {
            return false;
        }
        return loadEffect(info, LoadEffectFlag::Load);
    }
Now look at the factory function:
    EffectPluginFactory *PluginEffectLoader::factory(const KPluginMetaData &info) const
    {
        if (!info.isValid()) {
            return nullptr;
        }
        QString fileName = info.fileName();
        if (0 == info.pluginId().compare("UKUI-KWin-Windows-View")) {
            QString tmpFile = qEnvironmentVariableIsSet("UKUI-KWin-Windows-View_LIBRARY") ? qgetenv("UKUI-KWin-Windows-View_LIBRARY") : info.fileName();
            if (QFile::exists(tmpFile))
                fileName = tmpFile;
        }
        KPluginLoader loader(fileName);
        if (loader.pluginVersion() != KWIN_EFFECT_API_VERSION) {
            qDebug() << info.pluginId() << " has not matching plugin version, expected " << KWIN_EFFECT_API_VERSION << "got " << loader.pluginVersion();
            return nullptr;
        }
        KPluginFactory *factory = loader.factory();
        if (!factory) {
            qDebug() << "Did not get KPluginFactory for " << info.pluginId();
            return nullptr;
        }
        return dynamic_cast< EffectPluginFactory* >(factory);
    }
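The path override inside factory() is easy to miss, so here is a minimal sketch of the same decision in plain C++, without Qt. The helper name resolvePluginPath is hypothetical and is not part of kwin; only the environment variable name comes from the code above.

```cpp
#include <cstdlib>
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper mirroring the override in PluginEffectLoader::factory():
// prefer the path named by the environment variable when that file exists,
// otherwise keep the path taken from the plugin metadata.
std::string resolvePluginPath(const std::string &metadataPath,
                              const char *envName = "UKUI-KWin-Windows-View_LIBRARY")
{
    const char *overridePath = std::getenv(envName);
    if (overridePath != nullptr) {
        std::error_code ec;  // use the non-throwing overload of exists()
        if (std::filesystem::exists(overridePath, ec)) {
            return overridePath;
        }
    }
    return metadataPath;
}
```

Because the variable name contains hyphens, a shell `export` will not accept it; passing it through `env 'UKUI-KWin-Windows-View_LIBRARY=...' <command>` is one way to set it when testing a locally built library.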
The code above shows how the .so is loaded. Next we look at the error message.
The ukui_kwin error log is ukui_kwin_0.log, where we find the following error:
"UKUI-KWin-Windows-View" has not matching plugin version, expected 229 got 4294967295
The expected plugin version is 229, but the value obtained is 4294967295, which is 0xffffffff in hexadecimal.
In the code, note the following macro:
KWIN_EFFECT_API_VERSION
and the following function:
loader.pluginVersion()
On the kwin side, KWIN_EFFECT_API_VERSION is 229. On the ukui-window-switch side, we need to find where the version of the .so is set in the code:
windowsview/multitaskviewmanagerpluginfactory.cpp
    class MultitaskViewManagerPluginFactory : public KWin::EffectPluginFactory
    {
        Q_OBJECT
        Q_INTERFACES(KPluginFactory)
        Q_PLUGIN_METADATA(IID KPluginFactory_iid FILE "windowsview.json")

    public:
        MultitaskViewManagerPluginFactory() {}
        ~MultitaskViewManagerPluginFactory() override {}

        KWin::Effect* createEffect() const override {
            return new MultitaskView::MultitaskViewManager();
        }
    };

    K_EXPORT_PLUGIN_VERSION(KWIN_EFFECT_API_VERSION)
Here K_EXPORT_PLUGIN_VERSION exports KWIN_EFFECT_API_VERSION.
Back in kwin, we look at the corresponding definitions and the matching K_EXPORT_PLUGIN_VERSION call:
    #define KWIN_EFFECT_API_MAKE_VERSION( major, minor ) (( major ) << 8 | ( minor ))
    #define KWIN_EFFECT_API_VERSION_MAJOR 0
    #define KWIN_EFFECT_API_VERSION_MINOR 229
    #define KWIN_EFFECT_API_VERSION KWIN_EFFECT_API_MAKE_VERSION( \
        KWIN_EFFECT_API_VERSION_MAJOR, KWIN_EFFECT_API_VERSION_MINOR )

    K_EXPORT_PLUGIN_VERSION(quint32(KWIN_EFFECT_API_VERSION))
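To see why these macros produce exactly 229, the arithmetic can be replayed with a small constexpr function. This is only an illustrative sketch: makeEffectApiVersion and kEffectApiVersion are names made up here, and the inputs 0 and 229 are the major and minor values from the definitions above.

```cpp
#include <cstdint>

// Same arithmetic as KWIN_EFFECT_API_MAKE_VERSION(major, minor):
// the major number goes into bits 8 and up, the minor number into the low byte.
constexpr std::uint32_t makeEffectApiVersion(std::uint32_t major, std::uint32_t minor)
{
    return (major << 8) | minor;
}

// Major 0 and minor 229, as in the kwin definitions.
constexpr std::uint32_t kEffectApiVersion = makeEffectApiVersion(0, 229);

static_assert(kEffectApiVersion == 229, "major 0 leaves only the minor number");
static_assert(kEffectApiVersion == 0xe5, "229 is 0xe5 in hexadecimal");
```

The hexadecimal form 0xe5 is the byte to look for when the symbol is inspected in the binary.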
Note how the K_EXPORT_PLUGIN_VERSION macro is implemented:
    /**
     * \relates KPluginLoader
     * Use this macro if you want to give your plugin a version number.
     * You can later access the version number with KPluginLoader::pluginVersion()
     */
    #define K_EXPORT_PLUGIN_VERSION(version) \
        Q_EXTERN_C Q_DECL_EXPORT const quint32 kde_plugin_version = version;
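So K_EXPORT_PLUGIN_VERSION expands to an exported const quint32 variable named kde_plugin_version.
The puzzle is that ukui-kwin and ukui-window-switch use exactly the same definition, and both evaluate to 229, so why does the check fail?
Although the code on both sides says 229, the error message clearly shows 229 on one side and 0xffffffff on the other. Since kde_plugin_version is a const value, we can read it directly from the binary with ELF tools.
For the locally built .so, we look up the actual value of kde_plugin_version: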
    # readelf -s libwindowsview.so | grep kde_plugin_version
       491: 0000000000039e6c     4 OBJECT  GLOBAL DEFAULT   13 kde_plugin_version
This gives the offset 0x0000000000039e6c. We then dump the library with objdump:
    # objdump -s libwindowsview.so | grep "^ 39e"
     39e0 00000000 00000000 0b9d0000 12000000  ................
     39e08 4d756c74 69746173 6b566965 774d616e  MultitaskViewMan
     39e18 61676572 506c7567 696e4661 63746f72  agerPluginFactor
     39e28 79000000 00000000 08000000 00000000  y...............
     39e38 00000000 00000000 00000000 00000000  ................
     39e48 00000000 00000000 00000000 00000000  ................
     39e58 00000000 00000000 00000000 00000000  ................
     39e68 00000000 e5000000 001b00ec 37fd0075  ............7..u
     39e78 006b0075 0069002d 00770069 006e0064  .k.u.i.-.w.i.n.d
     39e88 006f0077 002d0073 00770069 00740063  .o.w.-.s.w.i.t.c
     39e98 0068005f 007a0068 005f0043 004e002e  .h._.z.h._.C.N..
     39ea8 0071006d 001b07ec 323d0075 006b0075  .q.m....2=.u.k.u
     39eb8 0069002d 00770069 006e0064 006f0077  .i.-.w.i.n.d.o.w
     39ec8 002d0073 00770069 00740063 0068005f  .-.s.w.i.t.c.h._
     39ed8 0062006f 005f0043 004e002e 0071006d  .b.o._.C.N...q.m
     39ee8 00060703 7dc30069 006d0061 00670065  ....}..i.m.a.g.e
     39ef8 00730003 0000783c 0071006d 006c000f  .s....x<.q.m.l..
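The four bytes at 0x0000000000039e6c (the second group in row 39e68) are e5 00 00 00. Read as a little-endian 32-bit value this is 0x000000e5, which is 229.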
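Reading the value out of the dump takes two small steps: find the byte offset inside the row, then assemble the four bytes with the least significant byte first. The sketch below does both for row 39e68 of the local build. It assumes the library is little-endian, which is the usual case on aarch64 Linux; the function and constant names are made up for this illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Byte offset of a symbol inside an `objdump -s` row that starts at rowAddress.
constexpr std::size_t offsetInRow(std::uint64_t symbolAddress, std::uint64_t rowAddress)
{
    return static_cast<std::size_t>(symbolAddress - rowAddress);
}

// Assemble a 32-bit value from four bytes stored least-significant byte first.
constexpr std::uint32_t readLittleEndian32(const std::array<std::uint8_t, 16> &row,
                                           std::size_t offset)
{
    return static_cast<std::uint32_t>(row[offset])
         | (static_cast<std::uint32_t>(row[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(row[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(row[offset + 3]) << 24);
}

// Row 39e68 from the local build: 00000000 e5000000 001b00ec 37fd0075
constexpr std::array<std::uint8_t, 16> kRow39e68 = {
    0x00, 0x00, 0x00, 0x00, 0xe5, 0x00, 0x00, 0x00,
    0x00, 0x1b, 0x00, 0xec, 0x37, 0xfd, 0x00, 0x75,
};
```

With these helpers, `readLittleEndian32(kRow39e68, offsetInRow(0x39e6c, 0x39e68))` evaluates to 229.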
For the CI-built .so, we look up kde_plugin_version in the same way:
    # readelf -s libwindowsview.so | grep kde_plugin_version
       660: 000000000003a8e4     4 OBJECT  GLOBAL DEFAULT   13 kde_plugin_version
Its offset is 0x000000000003a8e4, so we dump that region with objdump:
    # objdump -s libwindowsview.so | grep "^ 3a8"
     3a80 00000000 00000000 18280000 12000000  .........(......
     3a800 56003100 30000000 6f72672e 756b7569  V.1.0...org.ukui
     3a810 2e4b5769 6e000000 2f4d756c 74697461  .KWin.../Multita
     3a820 736b5669 65770000 6f72672e 6b64652e  skView..org.kde.
     3a830 4b506c75 67696e46 6163746f 72790000  KPluginFactory..
     3a840 33334d75 6c746974 61736b56 6965774d  33MultitaskViewM
     3a850 616e6167 6572506c 7567696e 46616374  anagerPluginFact
     3a860 6f727900 00000000 ffffffff 21000000  ory.........!...
     3a870 00000000 00000000 18000000 00000000  ................
     3a880 4d756c74 69746173 6b566965 774d616e  MultitaskViewMan
     3a890 61676572 506c7567 696e4661 63746f72  agerPluginFactor
     3a8a0 79000000 00000000 08000000 00000000  y...............
     3a8b0 00000000 00000000 00000000 00000000  ................
     3a8c0 00000000 00000000 00000000 00000000  ................
     3a8d0 00000000 00000000 00000000 00000000  ................
     3a8e0 00000000 e5000000 001b00ec 37fd0075  ............7..u
     3a8f0 006b0075 0069002d 00770069 006e0064  .k.u.i.-.w.i.n.d
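The value at 0x000000000003a8e4 is 0x000000e5, which is also 229 in decimal.
So the version constant itself is compiled correctly in both builds: the binary contains 229 either way. The mismatch must therefore come from the load-time path, where 0xffffffff is presumably being assigned explicitly.
To pin down what happens at run time, we go back to this code: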
    KPluginLoader loader(fileName);
    if (loader.pluginVersion() != KWIN_EFFECT_API_VERSION) {
We focus on the KPluginLoader class.
It is implemented in kcoreaddons, in src/lib/plugin/kpluginloader.cpp:
    quint32 KPluginLoader::pluginVersion()
    {
        Q_D(const KPluginLoader);
        if (!load()) {
            return qint32(-1);
        }
        return d->pluginVersion;
    }
So the value really is forced to -1, and -1 converted to quint32 is exactly 0xffffffff. This indicates that load() failed.
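The link between a failed load and the number in the log can be reproduced with a simplified stand-in for KPluginLoader. FakePluginLoader below is purely illustrative and is not a kcoreaddons class; it only mimics the return path of pluginVersion().

```cpp
#include <cstdint>

// Simplified stand-in for KPluginLoader; all names here are illustrative only.
struct FakePluginLoader {
    bool loadSucceeds;       // what load() would have returned
    std::uint32_t exported;  // value of kde_plugin_version inside the library

    std::uint32_t pluginVersion() const
    {
        if (!loadSucceeds) {
            // Same idea as `return qint32(-1);` in a function returning quint32:
            // -1 converted to an unsigned 32-bit value becomes 0xffffffff.
            return static_cast<std::uint32_t>(std::int32_t(-1));
        }
        return exported;
    }
};
```

A loader constructed as `FakePluginLoader{false, 229}` reports 4294967295 even though the library exports 229, which matches the "expected 229 got 4294967295" message.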
Now look at the load() function:
    bool KPluginLoader::load()
    {
        Q_D(KPluginLoader);

        if (!d->loader->load()) {
            return false;
        }

        if (d->pluginVersionResolved) {
            return true;
        }

        Q_ASSERT(!fileName().isEmpty());
        QLibrary lib(fileName());
        Q_ASSERT(lib.isLoaded()); // already loaded by QPluginLoader::load()

        // TODO: this messes up KPluginLoader::errorString(): it will change from unknown error to could not resolve kde_plugin_version
        quint32 *version = reinterpret_cast<quint32 *>(lib.resolve("kde_plugin_version"));
        if (version) {
            d->pluginVersion = *version;
        } else {
            d->pluginVersion = ~0U;
        }
        d->pluginVersionResolved = true;

        return true;
    }
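The readelf output already showed that kde_plugin_version is present in the library, so the ~0U fallback for an unresolved symbol does not apply. That leaves d->loader->load() returning false. Note the type of this loader: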
    class KPluginLoaderPrivate
    {
        Q_DECLARE_PUBLIC(KPluginLoader)
    protected:
        KPluginLoaderPrivate(const QString &libname)
            : name(libname),
              loader(nullptr),
              pluginVersion(~0U),
              pluginVersionResolved(false)
        {}
        ~KPluginLoaderPrivate()
        {}

        KPluginLoader *q_ptr;
        const QString name;
        QString errorString;
        QPluginLoader *loader;
        quint32 pluginVersion;
        bool pluginVersionResolved;
    };
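So loader is a QPluginLoader *.
At this point we know that Qt's plugin loading is what fails, and that is why 229 is replaced with -1.
To observe how ukui-kwin loads the library, that is, to trace what QPluginLoader does, Qt provides the QT_DEBUG_PLUGINS environment variable. We test as follows: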
    kill -9 $(pidof /usr/bin/ukui-kwin_x11)
    QT_DEBUG_PLUGINS=1 /usr/bin/ukui-kwin_x11
The log is then written to /home/kylin/.log/ukui_kwin_0.log.
It contains the following:
    250211 09:56:08.820 Debug[19478]: 无法加载库/usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin/effects/plugins/libwindowsview.so:(/usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin/effects/plugins/libwindowsview.so: undefined symbol: glXGetFBConfigAttrib)
    250211 09:56:08.821 Warning[19478]: QLibraryPrivate::loadPlugin failed on "/usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin/effects/plugins/libwindowsview.so" : "无法加载库/usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin/effects/plugins/libwindowsview.so:(/usr/lib/aarch64-linux-gnu/qt5/plugins/ukui-kwin/effects/plugins/libwindowsview.so: undefined symbol: glXGetFBConfigAttrib)"
    250211 09:56:08.821 Debug[19478]: "UKUI-KWin-Windows-View" has not matching plugin version, expected 229 got 4294967295
    250211 09:56:08.821 Debug[19478]: Couldn't get an EffectPluginFactory for: "UKUI-KWin-Windows-View"
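The key information is: undefined symbol: glXGetFBConfigAttrib (the Chinese text 无法加载库 in the log is Qt's localized "Cannot load library" message).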
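The same kind of failure can be checked outside of kwin and Qt with the POSIX dlopen API. The sketch below is a standalone helper, not code from kwin or Qt; tryLoadLibrary is a name invented here. It uses RTLD_NOW so that every undefined symbol is reported at load time, which is an assumption about how one would want to test and not necessarily the flag Qt itself passes. On older glibc versions it may need to be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <string>

// Try to load a shared library with all symbols resolved up front.
// Returns an empty string on success, otherwise the dynamic linker's message
// (for example an "undefined symbol: ..." text).
std::string tryLoadLibrary(const std::string &path)
{
    dlerror();  // clear any stale error state
    void *handle = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char *message = dlerror();
        return message != nullptr ? std::string(message)
                                  : std::string("unknown dlopen error");
    }
    dlclose(handle);
    return std::string();
}
```

Calling this helper with the installed path of libwindowsview.so from a small test program would be expected to return a message naming the missing symbol, if the library has the same problem as in the log above.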
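This is a GLX function, but our system uses GLESv2, so the GLX code path can be excluded.
We locate the call sites of glXGetFBConfigAttrib (匹配到二进制文件 in the output is grep's localized "binary file matches" notice):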
    grep -nr glXGetFBConfigAttrib
    匹配到二进制文件 obj-aarch64-linux-gnu/windowsview/CMakeFiles/windowsview.dir/glxtexturehandler.cpp.o
    windowsview/glxtexturehandler.cpp:319: glXGetFBConfigAttrib(dpy, configs[i], GLX_RED_SIZE, &red);
    windowsview/glxtexturehandler.cpp:320: glXGetFBConfigAttrib(dpy, configs[i], GLX_GREEN_SIZE, &green);
    windowsview/glxtexturehandler.cpp:321: glXGetFBConfigAttrib(dpy, configs[i], GLX_BLUE_SIZE, &blue);
    windowsview/glxtexturehandler.cpp:327: glXGetFBConfigAttrib(dpy, configs[i], GLX_VISUAL_ID, (int *) &visual);
    windowsview/glxtexturehandler.cpp:333: glXGetFBConfigAttrib(dpy, configs[i], GLX_BIND_TO_TEXTURE_RGBA_EXT, &bind_rgba);
    windowsview/glxtexturehandler.cpp:334: glXGetFBConfigAttrib(dpy, configs[i], GLX_BIND_TO_TEXTURE_RGB_EXT, &bind_rgb);
    windowsview/glxtexturehandler.cpp:340: glXGetFBConfigAttrib(dpy, configs[i], GLX_BIND_TO_TEXTURE_TARGETS_EXT, &texture_targets);
    windowsview/glxtexturehandler.cpp:346: glXGetFBConfigAttrib(dpy, configs[i], GLX_DEPTH_SIZE, &depth);
    windowsview/glxtexturehandler.cpp:347: glXGetFBConfigAttrib(dpy, configs[i], GLX_STENCIL_SIZE, &stencil);
So windowsview/glxtexturehandler.cpp calls the libGL.so API. The symbol is pulled in because glxtexturehandler.cpp.o gets linked into libwindowsview.so, even though we do not actually need it. We check the build log:
https://dev.kylinos.cn/+librarian/14396573/buildlog_kylin-desktop-v101-arm64.ukui-window-switch_3.1.0.1-0k0.1tablet8rk1.egf0.1build1_BUILDING.txt.gz
Note the ld step:
The relocatable object file (.o) is indeed linked into the .so.
According to CMakeLists.txt, HAVE_GLX can be used to decide whether glxtexturehandler.cpp is part of the build, so the file should be added to the source list only when HAVE_GLX is set:
    set(SRCS
        abstracthandler.cpp
        concretetexturehandler.cpp
        egltexturehandler.cpp
        windowthumbnail.cpp
        desktopbackground.cpp
        multitaskviewmanagerpluginfactory.cpp
    )

    # glxtexturehandler.cpp is discarded when HAVE_GLX is not set
    if (${HAVE_GLX})
        list(APPEND SRCS glxtexturehandler.cpp)
    endif()

    # translation
    find_package(QT NAMES Qt6 Qt5 COMPONENTS LinguistTools REQUIRED)
    find_package(Qt${QT_VERSION_MAJOR} COMPONENTS LinguistTools REQUIRED)
After the change, we build again:
When ld links libwindowsview.so, glxtexturehandler.cpp.o is no longer included. The problem is solved.