jitlayers: Use GlobalISel on AArch64 at -O0/-O1 #60339

Closed
IanButterworth wants to merge 12 commits into JuliaLang:mg/llvm-21 from IanButterworth:ib/GlobalISel

@IanButterworth
Member

@IanButterworth IanButterworth commented Dec 7, 2025

On top of #59950.
Requires llvm/llvm-project#171499, which is backported in #59950.

Alternative to #60338


GlobalISel is LLVM's modern instruction selector that is designed to replace
both FastISel and SelectionDAG. On AArch64, it is mature and enabled by default
at -O0 in upstream LLVM.

This enables GlobalISel with fallback mode on AArch64, which provides faster
instruction selection than SelectionDAG while maintaining correctness by
falling back to SelectionDAG for unsupported patterns.

Note: This requires RemoveJuliaAddrspacesPass to run before codegen, which is
already the case in the current pipeline (see pipeline.cpp comment about
GlobalISel not liking Julia's address spaces).

Co-authored-by: Claude
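
The selection policy described above can be sketched as a small pure function. This is a minimal model, not the code in jitlayers.cpp: `Arch`, `Selector` and `choose_selector` are hypothetical names introduced only for illustration. In LLVM itself the switch is usually made on the `TargetMachine` (for example via `setGlobalISel(true)` together with `setGlobalISelAbort(GlobalISelAbortMode::Disable)` to allow fallback); whether the PR uses exactly those calls is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these names come from jitlayers.cpp.
enum class Arch { X86_64, AArch64 };
enum class Selector { SelectionDAG, GlobalISelWithFallback };

// Policy from the PR description: GlobalISel (falling back to SelectionDAG
// for unsupported patterns) only on AArch64 and only at -O0/-O1.
// Every other combination keeps SelectionDAG.
Selector choose_selector(Arch arch, int opt_level) {
    if (arch == Arch::AArch64 && opt_level <= 1)
        return Selector::GlobalISelWithFallback;
    return Selector::SelectionDAG;
}

// Small usage example exercising both sides of the policy.
void example_usage() {
    assert(choose_selector(Arch::AArch64, 0) == Selector::GlobalISelWithFallback);
    assert(choose_selector(Arch::AArch64, 3) == Selector::SelectionDAG);
}
```

Keeping the decision in one function makes it easy to extend later, for instance if GlobalISel were also tried at -O2/-O3.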

@IanButterworth IanButterworth changed the title from "jitlayers: Enable FastISel on AArch64 at -O0/-O1" to "jitlayers: Use GlobalISel on AArch64 at -O0/-O1" Dec 7, 2025
@IanButterworth IanButterworth added the compiler:codegen Generation of LLVM IR and native code label Dec 7, 2025
@IanButterworth IanButterworth marked this pull request as ready for review December 7, 2025 14:49
@gbaraldi
Member

gbaraldi commented Dec 8, 2025

I'm more comfortable with this than with FastISel.

@giordano
Member

giordano commented Dec 8, 2025

Does this need a pkgeval?

@IanButterworth IanButterworth added the needs pkgeval Tests for all registered packages should be run with this change label Dec 8, 2025
@IanButterworth
Member Author

Yeah a pkgeval on aarch64 would be good. @maleadt is that still possible?

@xal-0
Member

xal-0 commented Dec 8, 2025

I believe this was enabled at one point for ahead-of-time compilation but disabled because of miscompiles? I should probably report/fix the easy ones upstream...
#54140 (comment)

The atomic Float16 miscompile still happens on this version:

$ ./usr/bin/julia --banner=short -O1
  o  | Version 1.14.0-DEV.1348 (2025-12-07)
 o o | HEAD/dc0539badaa* (fork: 26 commits, 7 days)
julia> code_native(setindex!, (Threads.Atomic{Float16}, Float16); debuginfo=:none)
[...]
"_julia_setindex!_600":                 ; @"julia_setindex!_600"
	stp	x29, x30, [sp, #-16]!           ; 16-byte Folded Spill
	mov	x29, sp
	mrs	w8, NZCV
	;DEBUG_VALUE: setindex!:v <- $w8
	stlrh	w8, [x0]
	ldp	x29, x30, [sp], #16             ; 16-byte Folded Reload
	ret

Compare with -O1 on master:

"_julia_setindex!_615":                 ; @"julia_setindex!_615"
	;DEBUG_VALUE: setindex!:x <- [$x0+0]
	;DEBUG_VALUE: setindex!:v <- $h0
	stp	x29, x30, [sp, #-16]!           ; 16-byte Folded Spill
	mov	x29, sp
	;DEBUG_VALUE: setindex!:x <- [$x0+0]
	;DEBUG_VALUE: setindex!:v <- $h0
	fmov	w8, s0
	stlrh	w8, [x0]
	ldp	x29, x30, [sp], #16             ; 16-byte Folded Reload
	ret

IIRC you can get this to trigger during bootstrapping with -O1.
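
The property violated by the `mrs w8, NZCV` listing is simple to state: the 16 bits written by the atomic store must be the Float16 payload itself. A minimal C++ analogue of that round-trip check is sketched below. It is not the test in the Julia repository; `HalfBits`, `store_half` and `round_trips` are hypothetical names, and the payload is modelled as a raw IEEE binary16 bit pattern (1.5 encodes as `0x3E00`).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumption: a Float16 value is represented by its raw binary16 bit pattern.
using HalfBits = std::uint16_t;

// Sequentially consistent store of the payload. On AArch64 this corresponds
// to the `fmov w8, s0` + `stlrh w8, [x0]` pair in the correct listing.
void store_half(std::atomic<HalfBits>& cell, HalfBits bits) {
    cell.store(bits, std::memory_order_seq_cst);
}

// Round-trip check: whatever was stored must be read back bit for bit.
bool round_trips(HalfBits bits) {
    std::atomic<HalfBits> cell{0};
    store_half(cell, bits);
    return cell.load(std::memory_order_seq_cst) == bits;
}

// Usage example with the bit pattern of Float16(1.5).
void example_usage() {
    assert(round_trips(0x3E00));
}
```

A miscompile like the one above would make such a check fail, because the stored halfword would come from the condition flags rather than from the floating-point register.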

@oscardssmith
Member

Yeah, we should definitely report any GlobalISel bugs we find. IIUC, GlobalISel is intended to be the future of instruction selection (completely replacing SelectionDAG), so we want it to be as solid as possible.

@maleadt
Member

maleadt commented Dec 9, 2025

> Yeah a pkgeval on aarch64 would be good. @maleadt is that still possible?

Not easily, no; it's a very manual process.
The ARM machine in question is now also used for Base CI, so I can't take it over for a full-scale PkgEval run without compromising CI resources.

@oscardssmith
Member

Should we also use it at -O2/-O3? IIUC, on AArch64 it's generally expected to produce equally good code.

@IanButterworth IanButterworth removed the needs pkgeval Tests for all registered packages should be run with this change label Dec 9, 2025
@IanButterworth
Member Author

IanButterworth commented Dec 9, 2025

@xal-0 I added a test based on your example. It fails locally.

@gbaraldi
Member

gbaraldi commented Dec 9, 2025

You can probably ask Claude to turn this into an llc test case for an LLVM issue.

@IanButterworth
Member Author

llvm/llvm-project#171494

@IanButterworth
Member Author

llvm/llvm-project#171499 has been merged.

Are there any other known bugs?

@oscardssmith
Member

Is this ready to go?

@IanButterworth
Member Author

This needs to wait for that fix to land in LLVM. If we want it backported, there is some guidance here: llvm/llvm-project#171499 (comment)

@oscardssmith
Member

Can we just add it to our LLVM branch?

@IanButterworth IanButterworth changed the base branch from master to mg/llvm-21 March 15, 2026 21:07
@IanButterworth
Member Author

Trying this on top of #59950. cc @giordano

@giordano
Member

I don't think your patch is included in the build currently present in #59950.

giordano and others added 8 commits March 23, 2026 18:24
LLVM 21 replaced the `Stmt*` parameter in `SValBuilder::conjureSymbolVal`
with `ConstCFGElementRef` (llvm/llvm-project#128251). Update all four
call sites in GCChecker.cpp to use `C.getCFGElementRef()` instead of
passing `Expr*` pointers directly.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

LLVM 21 (llvm/llvm-project#138092) aligns i128 to 16 bytes when
passing on the x86-32 stack. However, GCC does not support __int128
on 32-bit x86 at all, so C libraries use struct { int64_t, int64_t }
which only has 4-byte stack alignment. This mismatch caused segfaults
on Linux and incorrect results on Windows for ccalls involving Int128.

Fix by using [4 x i32] instead of i128 as the byval type for large
integer arguments, which naturally has 4-byte alignment matching the
C ABI. Also fix abi_win32.cpp to pass large primitive types like Int128
by reference (byval) instead of directly, avoiding the same alignment
issue.
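
The idea behind using `[4 x i32]` can be illustrated outside LLVM: four 32-bit limbs cover the same 128 bits while only requiring the alignment of a 32-bit integer. The sketch below is an illustration under stated assumptions, not the code in the ABI files; `U128`, `to_limbs` and `from_limbs` are hypothetical names, and the limbs are ordered least significant first.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value held as two 64-bit halves.
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// C++ analogue of LLVM's [4 x i32]: same size as i128, but the array only
// needs the alignment of its 32-bit element.
using Limbs = std::array<std::uint32_t, 4>;
static_assert(sizeof(std::uint32_t[4]) == 16, "four limbs cover 128 bits");
static_assert(alignof(std::uint32_t[4]) == alignof(std::uint32_t),
              "an array aligns like its element type");

// Split into limbs, least significant limb first.
Limbs to_limbs(U128 v) {
    Limbs out{};
    out[0] = static_cast<std::uint32_t>(v.lo);
    out[1] = static_cast<std::uint32_t>(v.lo >> 32);
    out[2] = static_cast<std::uint32_t>(v.hi);
    out[3] = static_cast<std::uint32_t>(v.hi >> 32);
    return out;
}

// Reassemble the two 64-bit halves from the limbs.
U128 from_limbs(const Limbs& l) {
    U128 v;
    v.lo = static_cast<std::uint64_t>(l[0]) | (static_cast<std::uint64_t>(l[1]) << 32);
    v.hi = static_cast<std::uint64_t>(l[2]) | (static_cast<std::uint64_t>(l[3]) << 32);
    return v;
}

// Usage example: splitting and rejoining loses no bits.
void example_usage() {
    U128 v{0x0123456789ABCDEFULL, 0xFEDCBA9876543210ULL};
    U128 r = from_limbs(to_limbs(v));
    assert(r.lo == v.lo && r.hi == v.hi);
}
```

The round trip shows that the representation change affects only layout requirements, not the value being passed.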

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

FastISel was disabled on AArch64 in 2015 (PR JuliaLang#13393) to fix issue JuliaLang#13321, but
that issue was specifically about 32-bit ARM (ARMv7) segfaults during bootstrap.
The AArch64 exclusion was added conservatively alongside the ARM fix.

AArch64 FastISel has been actively maintained upstream with recent bug fixes:
- llvm/llvm-project#75993 (Jan 2024)
- llvm/llvm-project#133987 (May 2025)

This enables faster instruction selection for JIT compilation on AArch64 at
lower optimization levels, reducing compilation latency.

GlobalISel is LLVM's modern instruction selector that is designed to replace
both FastISel and SelectionDAG. On AArch64, it is mature and enabled by default
at -O0 in upstream LLVM.

This enables GlobalISel with fallback mode on AArch64, which provides faster
instruction selection than SelectionDAG while maintaining correctness by
falling back to SelectionDAG for unsupported patterns.

Note: This requires RemoveJuliaAddrspacesPass to run before codegen, which is
already the case in the current pipeline (see pipeline.cpp comment about
GlobalISel not liking Julia's address spaces).

Co-Authored-By: Claude <noreply@anthropic.com>
@IanButterworth IanButterworth force-pushed the ib/GlobalISel branch 3 times, most recently from 360eafb to 4b9356c Compare April 3, 2026 03:20
Add JL_GC_PROMISE_ROOTED annotations in egal_set::insert() for val,
list, and keyset. These class members are rooted by the caller via
JL_GC_PUSH3, but the GC checker cannot track rooting through class
member indirection. The post-put_key annotation on list is needed
because the returned pointer is a new value assigned to the already-
pushed member address.

Suppress optin.core.FixedAddressDereference false positives in
llvm-alloc-opt.cpp (LLVM ilist_node_base.h internals) and gc-stock.c
(gc_read_stack bounds-checked address) — this checker is new in
LLVM 21.

Co-authored-by: Claude <noreply@anthropic.com>
@giordano
Member

$ julia +nightly -e 'using InteractiveUtils; code_native(setindex!, (Threads.Atomic{Float16}, Float16); debuginfo=:none)'
        .file   "setindex!"
        .text
        .globl  "julia_setindex!_0"             // -- Begin function julia_setindex!_0
        .p2align        4
        .type   "julia_setindex!_0",@function
"julia_setindex!_0":                    // @"julia_setindex!_0"
; Function Signature: setindex!(Base.Threads.Atomic{Float16}, Float16)
// %bb.0:                               // %top
        //DEBUG_VALUE: setindex!:x <- [$x0+0]
        //DEBUG_VALUE: setindex!:v <- $h0
        stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
                                        // kill: def $h0 killed $h0 def $s0
        //DEBUG_VALUE: setindex!:x <- [$x0+0]
        //DEBUG_VALUE: setindex!:v <- $h0
        fmov    w8, s0
        mov     x29, sp
        stlrh   w8, [x0]
        ldp     x29, x30, [sp], #16             // 16-byte Folded Reload
        ret
.Lfunc_end0:
        .size   "julia_setindex!_0", .Lfunc_end0-"julia_setindex!_0"
                                        // -- End function
        .section        ".note.GNU-stack","",@progbits
$ julia +pr60339 -O1 -e 'using InteractiveUtils; code_native(setindex!, (Threads.Atomic{Float16}, Float16); debuginfo=:none)'
        .file   "setindex!"
        .text
        .globl  "julia_setindex!_0"             // -- Begin function julia_setindex!_0
        .p2align        4
        .type   "julia_setindex!_0",@function
"julia_setindex!_0":                    // @"julia_setindex!_0"
; Function Signature: setindex!(Base.Threads.Atomic{Float16}, Float16)
// %bb.0:                               // %top
        stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
                                        // kill: def $h0 killed $h0 def $s0
        //DEBUG_VALUE: setindex!:x <- [$x0+0]
        //DEBUG_VALUE: setindex!:v <- $h0
        fmov    w8, s0
        mov     x29, sp
        stlrh   w8, [x0]
        ldp     x29, x30, [sp], #16             // 16-byte Folded Reload
        ret
.Lfunc_end0:
        .size   "julia_setindex!_0", .Lfunc_end0-"julia_setindex!_0"
                                        // -- End function
        .section        ".note.GNU-stack","",@progbits
$ julia +pr60339 -O0 -e 'using InteractiveUtils; code_native(setindex!, (Threads.Atomic{Float16}, Float16); debuginfo=:none)'
        .file   "setindex!"
        .text
        .globl  "julia_setindex!_0"             // -- Begin function julia_setindex!_0
        .p2align        4
        .type   "julia_setindex!_0",@function
"julia_setindex!_0":                    // @"julia_setindex!_0"
; Function Signature: setindex!(Base.Threads.Atomic{Float16}, Float16)
// %bb.0:                               // %top
        //DEBUG_VALUE: setindex!:x <- [$x0+0]
        //DEBUG_VALUE: setindex!:v <- $h0
        stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
        mov     x29, sp
                                        // kill: def $s0 killed $h0
        fmov    w8, s0
        stlrh   w8, [x0]
        ldp     x29, x30, [sp], #16             // 16-byte Folded Reload
        ret
.Lfunc_end0:
        .size   "julia_setindex!_0", .Lfunc_end0-"julia_setindex!_0"
                                        // -- End function
        .section        ".note.GNU-stack","",@progbits

@xal-0 looks good?
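
The comparison across the listings above comes down to which instruction last writes `w8` before the `stlrh`: `fmov w8, s0` in the correct output, `mrs w8, NZCV` in the miscompiled one. A small checker for that pattern could look like the sketch below. It is a hypothetical helper (`stores_fp_payload` is not part of Julia or LLVM) and assumes one instruction per line with the destination register as the first operand.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true when the last instruction that writes w8
// before the first `stlrh` is an `fmov`, i.e. the stored halfword comes from
// the floating-point register. Returns false if no `stlrh` is found.
bool stores_fp_payload(const std::string& listing) {
    std::istringstream in(listing);
    std::string line;
    std::string last_w8_writer;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string mnemonic, dest;
        if (!(fields >> mnemonic >> dest))
            continue;  // blank lines and single-token lines such as `ret`
        if (mnemonic == "stlrh")
            return last_w8_writer == "fmov";
        if (dest == "w8,")
            last_w8_writer = mnemonic;  // remember who wrote w8 last
    }
    return false;
}

// Usage example with the two instruction sequences quoted in this thread.
void example_usage() {
    assert(stores_fp_payload("\tfmov\tw8, s0\n\tmov\tx29, sp\n\tstlrh\tw8, [x0]\n"));
    assert(!stores_fp_payload("\tmrs\tw8, NZCV\n\tstlrh\tw8, [x0]\n"));
}
```

A textual check like this is brittle against register allocation changes, so a behavioural round-trip test is likely the more robust regression test.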

@IanButterworth IanButterworth requested a review from xal-0 April 17, 2026 01:23
@oscardssmith oscardssmith deleted the branch JuliaLang:mg/llvm-21 May 8, 2026 17:57
IanButterworth added a commit that referenced this pull request May 11, 2026
#60339 was auto-closed when #59950 merged and can't be reopened.

Co-authored-by: Claude <noreply@anthropic.com>
hardikxk pushed a commit to hardikxk/julia that referenced this pull request May 18, 2026
JuliaLang#60339 was auto-closed when JuliaLang#59950 merged and can't be reopened.

Co-authored-by: Claude <noreply@anthropic.com>
Labels

compiler:codegen Generation of LLVM IR and native code

7 participants