This LCC-derived compiler and C library targets the Gigatron VCPU. It keeps many of the ideas of the previous attempt to port LCC to the Gigatron (pgavlin's). For instance, it outputs assembly code that can be parsed by Python and it features a linker written in Python that can directly read these files. It also differs in important ways. For instance, the code generator is fundamentally different.
This project provides a complete toolchain and C library for ANSI C 1989.
Some useful things to know:
- Types short and int are 16 bits long. Type long is 32 bits long. Types float and double are 40 bits long, using the Microsoft Basic floating point format. Both long arithmetic and floating point arithmetic incur a substantial speed penalty, which is greatly reduced with the dev7 rom.
- Type char is unsigned by default. This is more efficient because the C language always promotes char values into int values to perform arithmetic. Promoting a signed byte involves a clumsy sign extension. Promoting an unsigned byte comes for free with most VCPU opcodes. If you really want signed chars, use signed char or maybe use the compiler option -Wf-unsigned_char=0. This is not recommended.
- The nonstandard type qualifier __near can be used to indicate that the data lives in page zero. This allows the compiler to generate more compact code and tells the linker to place such variables in page zero. The nonstandard type qualifier __far is also recognized but currently inoperative. It will eventually indicate that the data lives in banked memory.
- The traditional C standard library offers feature rich functions like printf() and scanf(). These functions are present but their default implementation requires a lot of space. Using option --option=PRINTF_SIMPLE selects code that only recognizes a common subset of the printf formatting strings. See the include file <gigatron/printf.h>. Alternatively one can call lower level functions that are either standard, like fputs(), or nonstandard, such as itoa() and dtoa(), whose prototypes are provided by <gigatron/libc.h>.
- Alternatively one can completely bypass stdio and use either the low-level console functions provided in <gigatron/console.h> or their more standard equivalents provided in <conio.h>. The function cprintf() has all the formatting abilities of printf() but saves memory by bypassing standard i/o and printing to the console. The functions midcprintf() and mincprintf() save further space by only recognizing a subset of the printf format strings.
- The include file <gigatron/sys.h> provides declarations to access the Gigatron hardware and wrappers to call native SYS routines.
- The include file <gigatron/pragma.h> defines Gigatron-specific extensions (pragmas and attributes). In particular several pragmas provide the means to control the placement of functions and variables in the Gigatron memory: explicit placement, placement constraints, and segment definitions.
- The include file <gigatron/idioms.h> provides useful macros to construct 16-bit words from 8-bit components and vice-versa. These macros expand into idioms recognized by the compiler to create more efficient code.
- Many parts of the main library can be overridden to provide special functionalities. For instance the console has the usual 15x26 characters, but this can be changed by linking with a library that redefines what is in cons_geom.c. This is what happens when one uses argument -map=conx to save memory with a 10x26 console. Another more important example is the standard i/o library. By default it is only connected to the console and fopen() always fails. But one just has to redefine a few functions to change that. This is what happens when one compiles with -map=sim, which forwards all the stdio calls to the emulator. Note that such binaries only run in the command line Gigatron emulator gtsim because they attempt to communicate with gtsim to forward the stdio calls.
- Over time the linker glink has accumulated a lot of capabilities. It supports common symbols, weak symbols, and conditional imports. It can synthesize magic lists such as a list of initialization functions that are called before main, a list of data segments to be cleared before running main, a list of memory segments that can be used as heap by malloc(), or a list of finalization functions called when the program exits. Using glink -h provides some help. Using glink --info provides help for specific memory maps. But the more advanced functions are documented by comments in the source or comments in the library source files that use them...
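The remark above about char promotion can be made concrete with a small portable C sketch. It does not show the VCPU instructions that GLCC emits; the function names are hypothetical and only illustrate why a signed byte needs extra work before it can be used as a 16-bit int, whereas an unsigned byte can be used as is.

```c
#include <stdint.h>

/* Promote an unsigned byte to a 16-bit int: the value is used as is. */
int16_t promote_unsigned_byte(uint8_t b)
{
    return (int16_t)b;
}

/* Promote a signed byte held in the same 8 bits: flipping bit 7 and then
   subtracting 0x80 copies the sign bit into the upper byte.
   (Hypothetical helper, not the sequence generated by the compiler.) */
int16_t promote_signed_byte(uint8_t b)
{
    return (int16_t)((b ^ 0x80) - 0x80);
}
```

The unsigned case is a plain byte load, while the signed case needs at least two extra arithmetic steps on every promotion, which is presumably why unsigned char is the cheaper default.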
You can build GLCC from source using two methods.
- The first method relies on the traditional make command using the usual Posix command line utilities. This is the method used for development and therefore is the recommended method for Linux machines or Macs. It can also be used on Windows when compiling with cygwin (https://cygwin.org) or mingw64/msys2 (https://www.mingw-w64.org/).
- The second method relies on cmake (https://cmake.org) and supports many different toolchains such as Microsoft Visual Studio, etc.
Because the primary GLCC development platform is Linux,
building gigatron-lcc with make on a Unix platform
should be very easy provided that a C compiler, bison,
gnu-make >= 4.0, and python >= 3.8 are installed.
Simply type:
$ git clone https://github.com/lb3361/gigatron-lcc.git
$ cd gigatron-lcc
$ make
Then you can either invoke the compiler from its build location ./build/glcc or
install it into your system with command
$ make PREFIX=/usr/local install
where variable PREFIX indicates where the compiler should be installed.
This command copies the compiler files into ${PREFIX}/lib/gigatron-lcc/
and symlinks the compiler driver glcc, the linker driver glink,
and the simulator gtsim into ${PREFIX}/bin. All the other files are located under
${PREFIX}/lib/gigatron-lcc. Note that this directory can be relocated
elsewhere in the system as long as its contents are preserved.
You then need to either invoke glcc using the full path of
the gigatron-lcc directory or place symbolic links
to these files somewhere in the executable search path.
There is also
$ make test
to run the current test suite.
The prerequisites are python >= 3.8 and cmake >= 3.16.
In order to compile with cmake, you must first create a build directory and invoke CMake from that build directory. For instance, on a Unix machine,
$ git clone https://github.com/lb3361/gigatron-lcc.git
$ cd gigatron-lcc
$ mkdir build
$ cd build
$ cmake ..
This operation creates a Makefile in the build directory.
You can then compile with make
$ make
Then you can either invoke the compiler from its build location ./glcc or
install it into your system with command
$ cmake --install . -DCMAKE_INSTALL_PREFIX=/usr/local
where variable CMAKE_INSTALL_PREFIX is the directory where glcc
should be installed. Note that this directory can be relocated
elsewhere as long as the relative position of the glcc files
is preserved.
There are several ways to use GLCC under Windows.
- WSL: The safest option is to use WSL, the Windows Subsystem for Linux. Just follow the make approach above.
- Cygwin: Thanks to the feedback of axelb, you can use the make method to compile gigatron-lcc under cygwin >= 3.2 from http://cygwin.org. Make sure to select packages gcc-core, make, bison, git, and python3. Then use make from the cygwin shell. The main drawback of building gigatron-lcc under cygwin is that you have to execute it from the cygwin shell as well since it depends on the cygwin infrastructure.
- Native GLCC with MINGW: You can create a native Windows glcc using the mingw compiler as well as a native version of Python (https://python.org). It is also highly recommended to install Git-for-Windows (https://gitforwindows.org/) and a version of GNU make for Windows, for instance using Chocolatey (https://community.chocolatey.org/packages/make). The forum post https://forum.gigatron.io/viewtopic.php?p=2484#p2484 provides more information.
- Native GLCC with CMake: You can also create a native Windows glcc using the cmake route. Please follow the instructions that come with CMake to select the proper toolchain, compile the project, and optionally set CMAKE_INSTALL_PREFIX for installation.
Besides the options listed in the lcc manual page,
the compiler driver glcc recognizes a few Gigatron-specific options.
Additional options recognized by the assembler/linker glink
are listed by typing glink -h.
- Option -rom=<romversion> is passed to the linker and helps select the vCPU version and the runtime code, which sometimes relies on SYS functions implemented by the indicated rom version. Rom names are described in file roms.json. The default is v6.
- Option -cpu=[4567] indicates which VCPU version should be targeted. Version 5 adds the instructions CALLI, CMPHS and CMPHU that came with ROMv5a. Version 6, which comes with ROMvX0, is not a strict superset of version 5 because it changes the encodings of CMPHS/CMPHU. Version 7, which comes with DEV7ROM, is a strict superset of version 5. Versions 6 and 7 are mutually incompatible. GLCC offers primary support for version 7 but can generate version 6 encodings for some of its instructions. The default CPU is the one implemented by the selected ROM.
- Option -map=<memorymap>{,<overlay>} is also passed to the linker and specifies a memory layout for the generated code. The default map, 32k, uses every bit of memory available on a 32KB Gigatron, starting with the video memory holes [0x?a0-0x?ff] and the low memory [0x200-0x6ff]. The conx map uses a reduced console to offer more memory than the 32k map. There is also a 64k map that takes advantage of memory above 0x8000, and a 128k map that offers close to 64k for program and data by relocating the framebuffer into a different bank. Additional information about each map can be displayed by using option -info as in
  $ glcc -map=32k -info
  Maps can also manipulate the linker arguments, insert libraries, and define the initialization function that checks the rom type and the ram configuration. For instance, map sim produces gt1 files that only run in the emulator gtsim, redirecting printf and all standard i/o functions to the emulator itself. This is my main debugging tool.
$ ./build/glcc -map=sim tst/8q.c
tst/8q.c:30: warning: missing return value
tst/8q.c:37: warning: implicit declaration of function `printf'
tst/8q.c:39: warning: missing return value
$ ./build/gtsim -rom gigatron/roms/dev.rom a.gt1
1 5 8 6 3 7 2 4
1 6 8 3 7 4 2 5
1 7 4 6 8 2 5 3
1 7 5 8 2 4 6 3
2 4 6 8 3 1 7 5
2 5 7 1 3 8 6 4
2 5 7 4 1 8 6 3
2 6 1 7 4 8 3 5
2 6 8 3 1 4 7 5
2 7 3 6 8 5 1 4
2 7 5 8 1 4 6 3
2 8 6 1 3 5 7 4
...
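As an aside on the memory maps described above, the notation [0x?a0-0x?ff] stands for the unused tail of each video memory page. The portable C sketch below spells this out. It assumes that the screen occupies pages 0x08 to 0x7f on a 32KB machine and that each scanline uses offsets 0x00 to 0x9f; these bounds and all names are assumptions made for illustration and are not taken from the linker source.

```c
#include <stdint.h>

enum {
    FIRST_SCREEN_PAGE = 0x08,  /* assumption: first page of video memory */
    LAST_SCREEN_PAGE  = 0x7f,  /* assumption: last page of video memory  */
    HOLE_FIRST_OFFSET = 0xa0,  /* first offset not used by a scanline    */
    HOLE_LAST_OFFSET  = 0xff   /* last offset of the page                */
};

/* Return nonzero when addr falls in one of the [0x?a0-0x?ff] holes. */
int is_video_hole(uint16_t addr)
{
    unsigned page = addr >> 8;
    unsigned offset = addr & 0xff;
    return page >= FIRST_SCREEN_PAGE && page <= LAST_SCREEN_PAGE
        && offset >= HOLE_FIRST_OFFSET;
}

/* Total number of bytes available in the holes under these assumptions. */
unsigned video_hole_bytes(void)
{
    return (LAST_SCREEN_PAGE - FIRST_SCREEN_PAGE + 1)
         * (HOLE_LAST_OFFSET - HOLE_FIRST_OFFSET + 1);
}
```

Under these assumptions the holes are 120 separate fragments of 96 bytes each, which helps explain why the linker works hard to fit short functions into single segments.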
I found MSCP (Marcel's Simple Chess Program) when studying the previous incarnation
of LCC for the Gigatron, in old forum posts where Marcel
mentioned it as a "stretch goal" for the compiler. The main
issue is that MSCP takes about 25KB of code and 25KB of data,
meaning that we need to use the video memory. My main change
was to reduce the size of the opening book,
but this is not enough. One could think about using
the 128KB memory extension but this would require a lot
of changes to the code. In the meantime, we can
run it with the gtsim emulator, which has no screen
but can forward stdio.
$ cp stuff/mscp/mscp0.c .
$ cp stuff/mscp/book.txt .
# Using map sim with overlay allout commits all the memory
$ ./build/glcc -map=sim,allout mscp0.c -o mscp0.gt1
Now we can run it. Option -f in gtsim allows mscp to
open and read the opening book file book.txt.
Be patient...
$ ./build/gtsim -f -rom gigatron/roms/v6.rom mscp0.gt1
This is MSCP 1.4 (Marcel's Simple Chess Program)
Copyright (C)1998-2003 Marcel van Kervinck
This program is distributed under the GNU General Public License.
(See file COPYING or http://combinational.com/mscp/ for details.)
Type 'help' for a list of commands
8 r n b q k b n r
7 p p p p p p p p
6 - - - - - - - -
5 - - - - - - - -
4 - - - - - - - -
3 - - - - - - - -
2 P P P P P P P P
1 R N B Q K B N R
a b c d e f g h
1. White to move. KQkq
mscp>
Now you can type both to make it play against itself...
mscp> both
book: (88)e4
1. ... e2e4
8 r n b q k b n r
7 p p p p p p p p
6 - - - - - - - -
5 - - - - - - - -
4 - - - - P - - -
3 - - - - - - - -
2 P P P P - P P P
1 R N B Q K B N R
a b c d e f g h
1. Black to move. KQkq
book: (88)c5
1. ... c7c5
8 r n b q k b n r
7 p p - p p p p p
This slows down a lot when we leave the opening book. But it plays!
A growing collection of examples is offered in directory
gigatron-lcc/stuff
The code generator uses several blocks of page zero variables. The linker knows the page zero usage of each rom and keeps track of all free and used page zero locations.
- The most important block of page zero variables contains 24 general purpose word registers named R0 to R23. This block can be manually displaced using the command line option --register-base=0x90 for instance. Register pairs named L0 to L22 can hold longs. Register triplets named F0 to F21 can hold floats. Registers R0 to R7 are callee-saved and are often used for local variables. Registers R8 to R15 are used to pass arguments to functions. Registers R15 to R22 are used for temporaries.
- The compiler makes use of additional locations. The word registers T2 and T3, the long accumulator LAC, the accumulator extension byte LAX, the floating point sign and exponent bytes FAS and FAE, and the stack pointer SP are allocated in the upper half of page zero. ROMs that provide suitable native support may dictate the location of some of these registers. The compiler uses the names T0 and T1 to refer to the first two words of the sysArgs array. The library also uses the names T4 and T5 for the remaining two words of the sysArgs array. Care is needed because these locations are also often used by SYS calls or by the new opcodes implemented in recent roms.
- Since the DEV7 rom offers a true 16-bit stack pointer, GLCC-2.0 makes SP equal to vSP, allowing the use of efficient opcodes to access non-register local variables.
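The register naming above can be sketched as address arithmetic. This C fragment assumes that the block starts at 0x90 (the value used in the --register-base example, not necessarily the default) and that Ln and Fn overlay the word registers starting at Rn, which would explain why the pairs stop at L22 and the triplets at F21. The function names are hypothetical.

```c
#include <assert.h>

enum {
    REGISTER_BASE = 0x90,  /* assumption: value from the option example */
    NUM_WORD_REGS = 24     /* R0 to R23, two bytes each                  */
};

/* Page zero address of word register Rn. */
unsigned reg_addr(unsigned n)
{
    assert(n < NUM_WORD_REGS);
    return REGISTER_BASE + 2u * n;
}

/* Assumed layout: Ln covers Rn and Rn+1, so the last pair is L22. */
unsigned long_reg_addr(unsigned n)
{
    assert(n + 1 < NUM_WORD_REGS);
    return reg_addr(n);
}

/* Assumed layout: Fn covers Rn to Rn+2, so the last triplet is F21.
   A 40-bit float needs five of these six bytes. */
unsigned float_reg_addr(unsigned n)
{
    assert(n + 2 < NUM_WORD_REGS);
    return reg_addr(n);
}
```

With this base the 48 bytes of the block would span 0x90 to 0xbf; the linker is the authority on where the block really lives for a given rom.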
The function prologue first saves vLR and constructs a stack frame
by adjusting SP. It then saves the callee-saved registers onto the
stack. Nonleaf functions save vLR in the stack frame and copy the
arguments passed in registers to their final location. In contrast,
leaf functions keep arguments passed in registers where they are
because these registers are no longer needed for further calls. In
the same vein, nonleaf functions allocate callee-saved registers for
local variables, whereas leaf functions use callee-saved registers as a
last resort, often avoiding the construction of a stack frame.
Leaf functions that do not need to allocate space on the
stack can use a register to save vLR and become entirely frameless.
Sometimes one can help this by using the register keyword when declaring local
variables. Saving vLR allows us to use CALLI as a long jump
without fear of erasing the function return address.
This is especially useful when one needs to hop over page boundaries.
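The distinction between leaf and nonleaf functions can be illustrated with two small C functions. This is only a sketch with hypothetical names; whether GLCC actually makes the first one frameless depends on the compiler version and options, and would have to be checked in the generated assembly.

```c
/* Leaf function: it calls nothing, so its arguments can stay in the
   registers they arrived in. The register keyword is only a hint. */
int sum_bytes(const unsigned char *p, int n)
{
    register int total = 0;
    while (n-- > 0)
        total += *p++;
    return total;
}

/* Nonleaf function: p, n, half and first must survive the calls to
   sum_bytes(), so they are candidates for callee-saved registers or
   for slots in the stack frame. */
int sum_two_halves(const unsigned char *p, int n)
{
    int half = n / 2;
    int first = sum_bytes(p, half);
    return first + sum_bytes(p + half, n - half);
}
```

Keeping hot inner loops in small leaf functions like the first one is therefore a plausible way to benefit from the cheaper prologue.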
The VCPU accumulator vAC is not treated by the compiler as a normal
register because there is essentially nothing the VCPU can do once the
accumulator is allocated to represent a particular variable or a temporary.
This would force the compiler to spill its content to a stack location
in ways that not only produce less efficient code, but often result
in an infinite loop because the spilling code must itself use vAC.
Instead, the GLCC code generator produces VCPU instructions in bursts
that are packed on a single line of the generated assembly code.
Each burst is in fact what LCC calls one instruction. Bursts are
produced by subverting the mechanisms defined by LCC to construct
various parts of a typical CPU instruction such as the mnemonic,
the address mode, etc. The VCPU accumulator vAC is treated as a scratch
register inside a burst. Meanwhile LCC allocates zero page registers
to pass data across bursts. This approach avoids the spilling problems
but sometimes needs improving because it does not keep track
of what data is left in the accumulator after each burst.
This has been improved by a preralloc pass that tries to eliminate
temporaries that can be passed through vAC, and by a state machine
in the instruction emitter which conservatively maintains assertions
about register or accumulator equality which can be used to
simplify the code.
The compiler produces a python file that first defines a function for each
code or data fragment. The file then constructs a module that
holds a list of all the fragments, as well as all the imported and
exported symbols. The linker/assembler glink can read such files
or can read a library file that is merely the concatenation
of individual modules. Each fragment is represented as a function
that calls predefined functions whose uppercase name mirrors the name
of the instruction they emit. Additional functions implement synthetic
opcodes that can be implemented differently by different VCPU versions.
More predefined functions are used to define labels or control
when to check for a page boundary. The source of truth for
all this is the file glink.py.
The linker collects all the code and data fragments generated by the compiler.
It then analyzes imports and exports to determine which fragments should be
kept. It tries hard to place short functions into single segments in order
to avoid costly hops. Then it iterates until all symbols are resolved and
all symbol value dependent code is stabilized. Finally it produces a
familiar GT1 file.