@Pavel-Durov
Contributor

Add -fno-plt compiler flag to yklua's Makefile to eliminate PLT (Procedure Linkage Table) indirection for calls from yklua to libykcapi functions.

This change partially eliminates PLT overhead, with improvements for 3 benchmarks when running with the interpreter only (JIT disabled):

Summary

Key reductions:

  • __yk_promote_ptr@plt overhead: 1.2–1.5% → 0% (eliminated)
  • __yk_promote_usize@plt overhead: 0.8–1.5% → 0% (eliminated)

Note: the __yk_idempotent_promote_i32@plt and __ykrt_control_point@plt stubs persist. I think we'll need to update the LLVM pass to optimise them (I have a branch for that).
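To make the mechanism concrete, here is a rough source-level analogy of what changes at a call site. It is only a sketch: the slot variable and function names are hypothetical, strlen stands in for a libykcapi function, and a real GOT slot is filled in by the dynamic linker rather than by a C initialiser.

```c
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

/* Hypothetical stand-in for a GOT slot: one pointer-sized cell that holds
 * the resolved address of the external function. */
static strlen_fn got_slot = strlen;

/* With a PLT: the caller calls a small stub, and the stub jumps through the
 * slot. On x86-64 the call site typically looks like `call strlen@PLT`. */
static size_t plt_stub(const char *s) { return got_slot(s); }

size_t call_via_plt(const char *s) { return plt_stub(s); }

/* With -fno-plt: the caller loads the slot itself and calls through it,
 * typically `call *strlen@GOTPCREL(%rip)` on x86-64, so no stub is involved. */
size_t call_no_plt(const char *s) { return got_slot(s); }
```

Both functions return the same result; the difference is one fewer hop per call, which is the overhead the @plt rows in the perf data measure.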

Perf stats

Note: perf data collected with YKD_JITC=none

LuLPeg

| PLT Stub | Baseline % | Baseline Samples | This Change % | This Change Samples |
|---|---|---|---|---|
| __yk_promote_ptr@plt | 1.47% | 3,319 | 0% | 0 |
| __yk_promote_usize@plt | 1.46% | 3,301 | 0% | 0 |
| __yk_idempotent_promote_i32@plt | 1.53% | 3,443 | 1.43% | 3,324 |
| __ykrt_control_point@plt | 1.43% | 3,226 | 1.48% | 3,430 |

Richards

| PLT Stub | Baseline % | Baseline Samples | This Change % | This Change Samples |
|---|---|---|---|---|
| __yk_promote_ptr@plt | 1.24% | 67 | 0% | 0 |
| __yk_promote_usize@plt | 0.94% | 51 | 0% | 0 |
| __yk_idempotent_promote_i32@plt | 1.14% | 62 | 1.33% | 70 |
| __ykrt_control_point@plt | 1.27% | 69 | 1.00% | 53 |

Baseline total: 5.4K samples, This Change total: 5.3K samples
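The percentage columns appear to be each stub's share of all samples in the run. A minimal sketch of that arithmetic, with a hypothetical function name and assuming the rounded totals quoted above:

```c
/* Percentage of all samples attributed to one symbol. */
double sample_share_percent(unsigned long symbol_samples,
                            unsigned long total_samples) {
    if (total_samples == 0) {
        return 0.0; /* avoid dividing by zero for an empty profile */
    }
    return 100.0 * (double)symbol_samples / (double)total_samples;
}
```

For example, sample_share_percent(67, 5400) gives roughly 1.24, in line with the baseline row for __yk_promote_ptr@plt. The totals are rounded, so small differences from the reported figures are expected.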

Havlak

| PLT Stub | Baseline % | Baseline Samples | This Change % | This Change Samples |
|---|---|---|---|---|
| __yk_promote_ptr@plt | 0.82% | 2,983 | 0% | 0 |
| __yk_promote_usize@plt | 0.81% | 2,956 | 0% | 0 |
| __yk_idempotent_promote_i32@plt | 0.85% | 3,111 | 0.81% | 2,867 |
| __ykrt_control_point@plt | 0.80% | 2,934 | 0.81% | 2,888 |

Bypass the PLT for external function calls, reducing call overhead.
@vext01
Contributor

vext01 commented Jan 22, 2026

Are there any semantic consequences of not using PLT stubs?

@Pavel-Durov
Contributor Author

> Are there any semantic consequences of not using PLT stubs?

From what I've read (I might be missing something here):

With PLT, LD_PRELOAD and similar mechanisms can intercept function calls at runtime. Without PLT, symbol interposition may not work for those functions.

What's affected:

ltrace (relies on PLT interception to trace library calls)
Tools using LD_PRELOAD to intercept specific functions in libykcapi

What still works fine:

gdb (breakpoints, stepping, stack traces - actually cleaner without PLT frames)
valgrind (memory checking, profiling)
perf (profiling)

Note on malloc interception

Memory debuggers that intercept malloc/free are unaffected because those calls go to libc, not libykcapi. The -fno-plt flag only affects calls from yklua to libykcapi functions (__yk_promote_*, etc.), not calls to system libraries.

@vext01
Contributor

vext01 commented Jan 22, 2026

We intercept a couple of functions like pthread_create. Can we check that these are still going to work?

(search for wrap in yk-config)

@ltratt
Contributor

ltratt commented Jan 22, 2026

@vext01 Do we have tests that cover these? If not, how might Pavel tell if things still work?

@vext01
Contributor

vext01 commented Jan 22, 2026

These two tests might cover this:

  • tests/c/many_threads_many_locs.c
  • tests/c/many_threads_one_loc.c

I think we wrap thread creation so that we can create a shadow stack for new threads. Maybe for destruction too(?). I'm not sure though -- I didn't implement this part of the system.

What I'd do is make the wrapper functions crash and then check they still crash on your branch.

@Pavel-Durov
Contributor Author

I was assuming that if yk tests are passing with this yklua change then this is a safe change :)

@vext01
Contributor

vext01 commented Jan 22, 2026

(and if the wrapper functions don't crash before this branch, then we don't have good test coverage)

@vext01
Contributor

vext01 commented Jan 22, 2026

Ah. yklua doesn't use threads, so you will probably get away with this for now.

Sorry, I thought this was a yk change.

@vext01 added this pull request to the merge queue Jan 22, 2026
@Pavel-Durov
Contributor Author

Also, my understanding is that the --wrap transformation happens during linking: all references to pthread_create are redirected to __wrap_pthread_create.

PLT elimination (-fno-plt) only affects how runtime dynamic linking resolves external symbols in shared libraries. Since the --wrap redirection is already baked into the binary at link time, it should be safe.
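A minimal sketch of the --wrap pattern described here, for illustration only. It is not yk's actual wrapper: the counter and helper names are hypothetical, and the weak declaration is an assumption made so the file also links when the flag is absent.

```c
#include <pthread.h>
#include <stddef.h>

/* With -Wl,--wrap=pthread_create the linker resolves this name to the real
 * pthread_create. Declared weak (an assumption made for this demo) so the
 * file still links when the flag is not passed. */
extern int __real_pthread_create(pthread_t *, const pthread_attr_t *,
                                 void *(*)(void *), void *)
    __attribute__((weak));

/* Counts how many thread creations went through the wrapper. */
static int wrapped_calls = 0;

/* With --wrap, undefined references to pthread_create in the linked objects
 * are redirected here at link time. */
int __wrap_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                          void *(*start)(void *), void *arg) {
    wrapped_calls++;
    return __real_pthread_create(thread, attr, start, arg);
}

static void *noop(void *arg) { return arg; }

/* Creates and joins one thread; returns 0 on success. */
int spawn_noop_thread(void) {
    pthread_t t;
    int rc = pthread_create(&t, NULL, noop, NULL);
    if (rc != 0) {
        return rc;
    }
    return pthread_join(t, NULL);
}
```

Linked into a program with `-pthread -Wl,--wrap=pthread_create`, a call to spawn_noop_thread() should bump wrapped_calls; without the flag it stays at 0. The redirection is a symbol rename in the static link, so the compiler's choice between PLT and GOT call sequences should not come into it. Swapping the wrapper body for abort() would give the crash check suggested earlier in the thread.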

Merged via the queue into ykjit:main with commit f677a8c Jan 22, 2026
2 checks passed
@ltratt
Contributor

ltratt commented Jan 22, 2026

> Ah. yklua doesn't use threads, so you will probably get away with this for now.

Hang on, are we saying this will break in the future?

@vext01
Contributor

vext01 commented Jan 22, 2026

I was worried that if the Lua interpreter ever introduces calls to pthread_create in the future, then we might miss wrapping them, but Pavel's most recent comment suggests that it would still work.

@ltratt
Contributor

ltratt commented Jan 22, 2026

@Pavel-Durov Can you double check that this really works?

@vext01 Would this be exercised by the thread tests in tests/c?

@Pavel-Durov
Contributor Author

The --wrap mechanism is completely orthogonal to the PLT: even if Lua introduces pthread_create calls in the future, they would still be wrapped.

@vext01
Contributor

vext01 commented Jan 22, 2026

> Would this be exercised by the thread tests in tests/c?

Not currently, because C tests are not linked with this PLT optimisation.

3 participants