port(Macho): make simple executable running on a real system #1816

Merged
marxin merged 21 commits into wild-linker:main from marxin:macho5
Apr 7, 2026
@marxin
Collaborator

@marxin marxin commented Apr 6, 2026

EDITED:

The PR implements all the missing parts needed for a very first working example:

$ cat ~/Programming/testcases/main2.c
int foo = 42;

int main() {
  return 42;
}
$ clang -c ~/Programming/testcases/main2.c
$ ./target/debug/wild main2.o
$ codesign -s - -f a.out
$ ./a.out
$ echo $?
42
$ llvm-objdump -d a.out
a.out:  file format mach-o arm64

Disassembly of section __TEXT,__text:

0000000100000288 <__text>:
100000288: 52800540     mov     w0, #0x2a               ; =42
10000028c: d65f03c0     ret

Original description:
The PR is currently more of a discussion space for the recent observations I made, which should ideally be addressed as part of it.

I noticed that both FileHeader and all load commands are mapped into the __TEXT segment. That is reflected in the PR, where I ended up with a slightly more complicated section-to-segment mapping:

START(Text)
  `FILE_HEADER`
START(LoadCommands)
  `__PAGEZERO`
  `__TEXT`
  `__DATA`
  `LC_MAIN`
  `__LINKEDIT`
END(LoadCommands)
START(TextSections)
  `__text`
  `__cstring`
END(Text)
END(TextSections)

START(DataSections)
  `__data`
END(DataSections)

This works reasonably well, but it also made write_segment_commands more complex (segment_type, segment_sections_type). I also noticed that some commands can occur more than once, such as LC_LOAD_DYLIB and LC_RPATH, so the number of commands will not simply be the sum of sections in the LoadCommands segment. Do you see a simpler approach here?

The second limitation is that all segments (__TEXT, __DATA, __DATA_CONST, and, with some exceptions, __LINKEDIT) must be aligned to page boundaries (16 KiB) both in the file and in virtual memory. That means I would need a mechanism to pad the last section in each segment, so that the result looks most likely something along these lines:

Load command 1
      cmd LC_SEGMENT_64
  cmdsize 472
  segname __TEXT
   vmaddr 0x0000000100000000
   vmsize 0x0000000000004000
  fileoff 0
 filesize 16384
...
Load command 2
      cmd LC_SEGMENT_64
  cmdsize 152
  segname __DATA_CONST
   vmaddr 0x0000000100004000
   vmsize 0x0000000000004000
  fileoff 16384
 filesize 16384
...
Load command 3
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __LINKEDIT
   vmaddr 0x0000000100008000
   vmsize 0x0000000000004000
  fileoff 32768
 filesize 6600

Do you see a good place where we could enforce this?
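
To make the arithmetic behind the dump above concrete, here is a minimal sketch of page-aligned segment placement. `SegmentExtent`, `place_segments`, and the unpadded content sizes used in the usage note are hypothetical names and values chosen for illustration; they are not taken from the wild code base.

```rust
/// Page size used for segment alignment in the dump above (16 KiB).
const PAGE_SIZE: u64 = 0x4000;

/// Rounds `value` up to the next multiple of `align`, which must be a power of two.
fn align_up(value: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (value + align - 1) & !(align - 1)
}

/// Hypothetical record of where one segment ends up in the file and in memory.
#[derive(Debug, PartialEq)]
struct SegmentExtent {
    fileoff: u64,
    filesize: u64,
    vmaddr: u64,
    vmsize: u64,
}

/// Places segments back to back, starting at file offset 0 and `base_vmaddr`.
/// Each input pair is (unpadded content size, whether the file size is padded
/// to a full page). The memory size is always rounded up to a page.
fn place_segments(base_vmaddr: u64, segments: &[(u64, bool)]) -> Vec<SegmentExtent> {
    let mut fileoff = 0u64;
    let mut vmaddr = base_vmaddr;
    let mut placed = Vec::new();
    for &(content_size, pad_file) in segments {
        let vmsize = align_up(content_size, PAGE_SIZE);
        let filesize = if pad_file { vmsize } else { content_size };
        placed.push(SegmentExtent { fileoff, filesize, vmaddr, vmsize });
        // The next segment starts on the next page boundary in both spaces.
        fileoff = align_up(fileoff + filesize, PAGE_SIZE);
        vmaddr += vmsize;
    }
    placed
}
```

Calling `place_segments(0x1_0000_0000, &[(1000, true), (100, true), (6600, false)])` yields the same `fileoff`, `filesize`, `vmaddr`, and `vmsize` values as load commands 1 to 3 in the dump, with the last entry standing in for __LINKEDIT, whose file size is left unpadded. The content sizes 1000 and 100 are made up; any size up to one page gives the same result.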

Issue #757

@davidlattimore
Member

The commands table sounds a bit like the dynamic table in ELF land. At least LC_LOAD_DYLIB and LC_RPATH have direct equivalents. I guess the commands table does more than dynamic though, since it also has all the info about segments and sections. I'd imagine that we'd need code to allocate space in the command table for each thing that needs to go into it. Similar to how we allocate space for each DT_NEEDED entry in ELF's dynamic table, plus space for each of the other commands we put in there.
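
As a rough illustration of allocating space per command, the sketch below sums the sizes of a list of load commands, including ones that repeat. The `LoadCommand` enum and `command_table_size` are hypothetical and not wild's API. The byte sizes are those of the fixed 64-bit Mach-O structs (`segment_command_64`, `section_64`, `entry_point_command`, `dylib_command`, `rpath_command`), and path-carrying commands are padded to a multiple of 8 bytes.

```rust
// Fixed struct sizes in bytes for 64-bit Mach-O.
const SEGMENT_COMMAND_64_SIZE: u32 = 72;
const SECTION_64_SIZE: u32 = 80;
const ENTRY_POINT_COMMAND_SIZE: u32 = 24;
const DYLIB_COMMAND_SIZE: u32 = 24;
const RPATH_COMMAND_SIZE: u32 = 12;

/// Hypothetical description of one entry in the command table.
enum LoadCommand<'a> {
    Segment { num_sections: u32 },
    Main,
    LoadDylib { path: &'a str },
    Rpath { path: &'a str },
}

/// Size of a command that carries a NUL-terminated path after its fixed
/// header, rounded up to a multiple of 8 bytes.
fn size_with_path(header_size: u32, path: &str) -> u32 {
    let unpadded = header_size + path.len() as u32 + 1;
    (unpadded + 7) & !7
}

impl LoadCommand<'_> {
    /// The value that would go into the command's `cmdsize` field.
    fn cmdsize(&self) -> u32 {
        match self {
            LoadCommand::Segment { num_sections } => {
                SEGMENT_COMMAND_64_SIZE + *num_sections * SECTION_64_SIZE
            }
            LoadCommand::Main => ENTRY_POINT_COMMAND_SIZE,
            LoadCommand::LoadDylib { path } => size_with_path(DYLIB_COMMAND_SIZE, path),
            LoadCommand::Rpath { path } => size_with_path(RPATH_COMMAND_SIZE, path),
        }
    }
}

/// Returns (ncmds, sizeofcmds) for the header, counting every command,
/// including repeated kinds such as LC_LOAD_DYLIB and LC_RPATH.
fn command_table_size(commands: &[LoadCommand<'_>]) -> (u32, u32) {
    let total = commands.iter().map(|c| c.cmdsize()).sum();
    (commands.len() as u32, total)
}
```

With these sizes, a segment with no sections takes 72 bytes, one section gives 152, and five sections give 472, which matches the `cmdsize` values in the dump in the PR description (the section counts are inferred from those sizes, not stated there).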

Regarding 16k alignment for each segment... segment extents are determined by the extents of the sections they contain. Section extents are in turn determined by the extents of the section parts that they contain. So initial placements are decided in layout_section_parts, which has access to the OutputOrder, including segment start and end events. At the moment, when we get an OrderEvent::SegmentStart, we call segment_alignment.align_modulo, but that's quite possibly an ELFy thing. So you could extract that bit of code out and put it behind a method on the Platform trait.
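
One way to picture that suggestion is the sketch below. Everything in it is an assumption made for illustration: the trait here is a stand-in with a single hypothetical method, `align_segment_start`, and the ELF behaviour shown (keeping the file offset and making the address congruent to it modulo the alignment) is only a guess at what the existing `align_modulo` call achieves.

```rust
/// Rounds `value` up to the next multiple of `align`, which must be a power of two.
fn align_up(value: u64, align: u64) -> u64 {
    (value + align - 1) & !(align - 1)
}

/// Stand-in for the real trait; only the hook under discussion is shown.
trait Platform {
    /// Called when layout reaches a segment start. Returns the adjusted
    /// (file_offset, mem_address) at which the segment begins.
    fn align_segment_start(file_offset: u64, mem_address: u64, alignment: u64) -> (u64, u64);
}

struct Elf;
struct MachO;

impl Platform for Elf {
    // Assumed ELF-style rule: leave the file offset alone and bump the address
    // until address % alignment == file_offset % alignment.
    fn align_segment_start(file_offset: u64, mem_address: u64, alignment: u64) -> (u64, u64) {
        let want = file_offset % alignment;
        let have = mem_address % alignment;
        let bump = (want + alignment - have) % alignment;
        (file_offset, mem_address + bump)
    }
}

impl Platform for MachO {
    // Mach-O rule from the PR description: both the file offset and the
    // address start on a page boundary.
    fn align_segment_start(file_offset: u64, mem_address: u64, alignment: u64) -> (u64, u64) {
        (align_up(file_offset, alignment), align_up(mem_address, alignment))
    }
}

/// What a layout pass might do on a segment-start event, generic over the platform.
fn start_segment<P: Platform>(file_offset: u64, mem_address: u64, alignment: u64) -> (u64, u64) {
    P::align_segment_start(file_offset, mem_address, alignment)
}
```

For example, `start_segment::<MachO>(0x1234, 0x1_0000_1234, 0x4000)` moves both values to the next 16 KiB boundary, while `start_segment::<Elf>(0x1234, 0x401000, 0x1000)` keeps the file offset and only shifts the address.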

@marxin
Collaborator Author

marxin commented Apr 6, 2026

> ELF's dynamic table

Looking at the code, you are right: the two are very similar, and we should rework the command emission along the same lines. But it's something we can postpone a bit, at least until we have a working hello world binary.

All right, let me play with the layout logic.

@davidlattimore thanks for the suggestions

@marxin marxin changed the title from "port(Macho): include commands and file header into Text segment" to "port(Macho): make simple executable running on a real system" Apr 6, 2026
@marxin marxin marked this pull request as ready for review April 6, 2026 16:58
Member

@davidlattimore davidlattimore left a comment

Nice! A few random thoughts, but nothing to stop this merging as-is

@marxin marxin merged commit 6b54ef7 into wild-linker:main Apr 7, 2026
24 checks passed
@marxin marxin deleted the macho5 branch April 7, 2026 05:42