Add -fno-plt optimisation to CFLAGS #135
Bypass the PLT for external function calls, reducing call overhead.
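To make the "call overhead" concrete, here is a toy model in plain C of the two call paths. It is only an analogy under stated assumptions: `lib_promote`, `plt_stub`, `got_lazy` and `got_eager` are hypothetical names, and real PLT stubs and GOT slots are generated by the compiler and linker rather than written by hand. The model assumes the usual behaviour that a PLT call hops through a stub (with lazy resolution on first use), while `-fno-plt` calls through the GOT slot directly and binds it at load time.

```c
#include <assert.h>

typedef int (*fn_t)(int);

/* Hypothetical stand-in for a function exported by a shared library. */
static int lib_promote(int x) { return x + 1; }

static int resolver_runs = 0;
static int lazy_resolver(int x);

/* Toy GOT slot for the PLT path: starts at the resolver (lazy binding). */
static fn_t got_lazy = lazy_resolver;

/* First call only: "look up" the symbol, patch the slot, forward the call. */
static int lazy_resolver(int x) {
    resolver_runs++;
    got_lazy = lib_promote;
    return got_lazy(x);
}

/* Toy PLT stub: an extra hop that jumps through the GOT slot. */
static int plt_stub(int x) { return got_lazy(x); }

/* Default codegen in this model: the caller calls the stub. */
int call_with_plt(int x) { return plt_stub(x); }

/* Toy GOT slot for the -fno-plt path: bound before the first call. */
static fn_t got_eager = lib_promote;

/* -fno-plt style in this model: call through the slot itself, no stub. */
int call_without_plt(int x) { return got_eager(x); }

/* Self-check: both paths agree, and the resolver runs exactly once. */
int plt_demo(void) {
    assert(call_with_plt(1) == 2);
    assert(call_with_plt(2) == 3);
    assert(call_without_plt(1) == 2);
    return resolver_runs;
}
```

Calling `plt_demo()` from a `main` exercises both paths. The only difference the model captures is the extra stub hop and the first-call resolution, which is the overhead this change aims to remove.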
|
Are there any semantic consequences of not using PLT stubs? |
From what I read (I might be missing something here): with the PLT, LD_PRELOAD and similar mechanisms can intercept function calls at runtime. Without the PLT, symbol interposition may not work for those functions.
What's affected: ltrace (relies on PLT interception to trace library calls).
What still works fine: gdb (breakpoints, stepping, stack traces; actually cleaner without PLT frames).
Note on malloc interception: memory debuggers that intercept malloc/free are unaffected because those calls go to libc, not libykcapi. The -fno-plt flag only affects calls from yklua to libykcapi functions (__yk_promote*, etc.), not calls to system libraries. |
|
We intercept a couple of functions like (search for |
|
@vext01 Do we have tests that cover these? If not, how might Pavel tell if things still work? |
|
These two tests might cover this:
I think we wrap thread creation so that we can create a shadow stack for new threads. Maybe for destruction too(?). I'm not sure though -- I didn't implement this part of the system. What I'd do is make the wrapper functions crash and then check they still crash on your branch. |
|
I was assuming that if yk tests are passing with this yklua change then this is a safe change :) |
|
(and if the wrapper functions don't crash before this branch, then we don't have good test coverage) |
|
Ah. yklua doesn't use threads, so you will probably get away with this for now. Sorry, I thought this was a |
|
Also, my understanding is that PLT elimination (-fno-plt) only affects how runtime dynamic linking resolves external symbols in shared libraries. Since the --wrap redirection is already baked into the binary at link time, it should be safe. |
Hang on, are we saying this will break in the future? |
|
I was worried that if the lua interpreter ever introduces calls to |
|
@Pavel-Durov Can you double check that this really works? @vext01 Would this be exercised by the thread tests in |
|
The --wrap mechanism is completely orthogonal to the PLT: even if Lua introduces pthread_create calls in the future, they would still be wrapped. |
Not currently, because C tests are not linked with this PLT optimisation. |
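As a rough illustration of why `--wrap` is a link-time matter, the sketch below mimics the renaming by hand so it builds without special linker flags. Assumptions: `make_thread` is a hypothetical function, not one that yk wraps, and the `#define` stands in for what GNU ld's `--wrap=make_thread` option does when it resolves undefined references to `make_thread` as `__wrap_make_thread` and references to `__real_make_thread` as the original definition.

```c
#include <assert.h>

static int wrapper_hits = 0;

/* Stand-in for the original function. In a real --wrap build you would not
 * define this yourself: the linker maps __real_make_thread to the original. */
int __real_make_thread(int id) { return id * 10; }

/* The wrapper: do extra work (for example, set up per-thread state),
 * then forward to the original. */
int __wrap_make_thread(int id) {
    wrapper_hits++;
    return __real_make_thread(id);
}

/* Models the link-time redirection: callers that name make_thread end up
 * bound to __wrap_make_thread. */
#define make_thread __wrap_make_thread

/* Self-check: the call reaches the original via the wrapper.
 * Returns how many times the wrapper has run so far. */
int wrap_demo(void) {
    int r = make_thread(4);
    assert(r == 40);
    return wrapper_hits;
}
```

In this model the caller is bound to the wrapper's name before any question of PLT stub versus GOT call arises, which matches the point made above that the two mechanisms are independent.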
Add the -fno-plt compiler flag to yklua's Makefile to eliminate PLT (Procedure Linkage Table) indirection for calls from yklua to libykcapi functions.
This change partially eliminates PLT overhead, with improvements for 3 benchmarks when running with the interpreter only (JIT disabled):
Summary
Key reductions:
__yk_promote_ptr@plt overhead: 1.2–1.5% → 0% (eliminated)
__yk_promote_usize@plt overhead: 0.8–1.5% → 0% (eliminated)
Note: the __yk_idempotent_promote_i32@plt and __ykrt_control_point@plt stubs persist. I think we'll need to update the LLVM pass to optimise them (I have a branch for that).
Perf stats
Note: perf data collected with YKD_JITC=none
LuLPeg
Symbols: __yk_promote_ptr@plt, __yk_promote_usize@plt, __yk_idempotent_promote_i32@plt, __ykrt_control_point@plt
Richards
Symbols: __yk_promote_ptr@plt, __yk_promote_usize@plt, __yk_idempotent_promote_i32@plt, __ykrt_control_point@plt
Baseline total: 5.4K samples, This Change total: 5.3K samples
Havlak
Symbols: __yk_promote_ptr@plt, __yk_promote_usize@plt, __yk_idempotent_promote_i32@plt, __ykrt_control_point@plt