[dylink] Normalize library paths to prevent duplicate loading#26399
[dylink] Normalize library paths to prevent duplicate loading#26399pepijndevos wants to merge 3 commits intoemscripten-core:mainfrom
Conversation
When a shared library in a subdirectory references a dependency via $ORIGIN/.. rpath, findLibraryFS resolves it to a non-canonical path (e.g. "sub/../lib.so"). Since loadDynamicLibrary uses the raw path as the LDSO key, this causes the same library to be loaded twice. This complements patch 0003 which added findLibraryFS for nested dependency resolution but didn't normalize the resolved paths. Upstream: emscripten-core/emscripten#26399 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
6a15c59 to
0661a9e
Compare
|
Can you add a test @pepijndevos? |
When a shared library in a subdirectory references a dependency via $ORIGIN/.. rpath, findLibraryFS resolves it to a non-canonical path containing ".." (e.g. "sub/../lib.so"). Since loadDynamicLibrary uses the raw path as the LDSO key, this causes the same library to be loaded twice under different names, running constructors twice. Fix by normalizing libName with PATH.normalize() at the top of loadDynamicLibrary, matching what dlopenInternal already does. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
0661a9e to
894496e
Compare
|
I have made an attempt 🫣 |
test/test_other.py
Outdated
| printf("base: %d\n", base_func()); | ||
| printf("plugin: %d\n", plugin_func()); | ||
| printf("init_count: %d\n", get_init_count()); |
There was a problem hiding this comment.
Can we use assert(base_func() == 42) etc here? I think that's easier to understand than matching the output agaisnt base: 42\nplugin: 43\ninit_count: 1\n.
There was a problem hiding this comment.
You can print success at end and match on that.
|
Can we now remove from |
test/test_other.py
Outdated
| create_file('libbase.cpp', r''' | ||
| #include <stdio.h> | ||
| static int init_count = 0; | ||
| struct Init { Init() { init_count++; } }; |
There was a problem hiding this comment.
Can you make this a C file using __attribute__((constructor)) instead
test/test_other.py
Outdated
| self.run_process([EMCC, '-o', 'libbase.so', 'libbase.cpp', '-sSIDE_MODULE']) | ||
| self.run_process([EMCC, '-o', 'subdir/libplugin.so', 'libplugin.c', '-sSIDE_MODULE', './libbase.so', '-Wl,-rpath,$ORIGIN/..']) | ||
| self.do_runf('main.c', 'base: 42\nplugin: 43\ninit_count: 1\n', | ||
| cflags=['--profiling-funcs', '-sMAIN_MODULE=2', '-sINITIAL_MEMORY=32Mb', |
There was a problem hiding this comment.
Is -sINITIAL_MEMORY=32Mb needed here?
sbc100
left a comment
There was a problem hiding this comment.
lgtm % comments.
Thanks for working on this!
- Convert test library from C++ to C using __attribute__((constructor)) - Use assert() checks instead of printf output matching - Remove unnecessary -sINITIAL_MEMORY=32Mb flag - Remove redundant PATH.normalize() from dlopenInternal Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Did the code size check prevent auto merge? |
|
Yup. |
|
Apologies. Code size test are often in flux. I just landed an auto-update to the code size test this minute, for example. Can you run |
|
If you are having trouble getting the codesize test results to match I can make a commit to this PR, but please try to it yourself first. |
This is an automatic change generated by tools/maint/rebaseline_tests.py. The following (9) test expectation files were updated by running the tests with `--rebaseline`: ``` codesize/test_codesize_cxx_ctors1.json: 152368 => 152420 [+52 bytes / +0.03%] codesize/test_codesize_cxx_ctors2.json: 151765 => 151817 [+52 bytes / +0.03%] codesize/test_codesize_cxx_lto.json: 120968 => 121028 [+60 bytes / +0.05%] codesize/test_codesize_cxx_noexcept.json: 154375 => 154427 [+52 bytes / +0.03%] codesize/test_codesize_cxx_wasmfs.json: 179944 => 179996 [+52 bytes / +0.03%] codesize/test_codesize_hello_dylink.json: 44326 => 44334 [+8 bytes / +0.02%] codesize/test_codesize_hello_dylink_all.json: 822383 => 822379 [-4 bytes / -0.00%] codesize/test_minimal_runtime_code_size_hello_embind.json: 14891 => 14908 [+17 bytes / +0.11%] codesize/test_minimal_runtime_code_size_hello_embind_val.json: 11687 => 11703 [+16 bytes / +0.14%] Average change: +0.05% (-0.00% - +0.14%) ```
Head branch was pushed to by a user without write access
|
Well that was fun now everything conflicts 🙈 I could rebase and do it again but uh that seems like the kind of system best touched by a contributor who can merge it before it gets out of date. |
Sorry yes these file to tend to change often. I just revert + rebase + ./tools/maint/rebaseline_tests.py each time it happens. |
Summary
When a shared library in a subdirectory references a dependency via
$ORIGIN/..rpath,findLibraryFSresolves it to a non-canonical path containing..(e.g.sub/../lib.so). SinceloadDynamicLibraryuses the raw path as theLDSO.loadedLibsByNamekey, this causes the same library to be loaded twice under different names, running__wasm_call_ctorstwice.This was discovered while building KLayout for Pyodide. KLayout has plugins in a
db_plugins/subdirectory that depend on shared libraries in the parent directory via-Wl,-rpath,$ORIGIN/... The rpath gets embedded in the dylink.0RUNTIME_PATHsection. When Pyodide loads the plugins,findLibraryFSresolves the dependency to e.g./lib/.../db_plugins/../lib_gsi.so, which doesn't match the already-loaded/lib/.../lib_gsi.soin LDSO.Note that
dlopenInternalalready callsPATH.normalize(filename)before callingloadDynamicLibrary. However, nested dependencies loaded via theneededDynlibsreduce chain bypassdlopenInternaland callloadDynamicLibrarydirectly with the non-canonical path fromfindLibraryFS.Fix
Add
libName = PATH.normalize(libName)at the top ofloadDynamicLibrary, matching whatdlopenInternalalready does.Related
findLibraryFSfor nested dependency resolution (the path it returns is not normalized)🤖 Generated with Claude Code