> Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig, ran it in Wine, using it to cross compile for Linux, and then ran the binary natively. Computers are fun!
Even though it probably doesn't qualify, this is pretty close to a Canadian Cross, which for some reason is one of my favorite pieces of CS trivia. It's when you cross compile a cross compiler.
And on that point, your correspondent is right. The two bear no real resemblance to each other. The cross compilation approach described in the article is not something to be held in high regard. It's the result of poor design. It's a lot of work involving esoteric implementation details to solve a problem that the person using the compiler should never have encountered in the first place. It's exactly the problem that the Zig project leader is highlighting in the sticker when he contrasts Zig with Clang, etc.
The way compilers like Go and Zig work is the only reasonable way to approach cross compilation: every compiler should already be able to cross compile.
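To make that concrete, here's roughly what it looks like with both toolchains; a sketch, assuming reasonably recent Zig and Go releases, with file and target names picked for illustration:

# Cross compile a C file for 64-bit ARM Linux with Zig's bundled clang and musl
$ zig cc -target aarch64-linux-musl hello.c -o hello

# Cross compile a Go program for the same target; no extra toolchain needed
$ GOOS=linux GOARCH=arm64 go build -o hello .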
Thanks for putting it this way. I always wondered why cross compilation was such a big deal. To me it sounds like saying "look, I can write a text in English without having to be in an English-speaking country!".
The problem with cross compilation isn't the compiler, it's the build system. See e.g. autoconf, which builds test binaries and then executes them to test for the availability of strcmp(3).
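For anyone who hasn't bumped into this: a generated configure script does something along these lines (heavily simplified, file names hypothetical), and the final step is exactly what can't work when the produced binary is for a different machine:

# Roughly what an autoconf "run test" check boils down to:
$ cat > conftest.c <<'EOF'
#include <string.h>
int main(void) { return strcmp("a", "a"); }
EOF
$ cc conftest.c -o conftest
$ ./conftest    # fine natively; impossible when conftest is an ARM binary on an x86 host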
Go doesn't cross compile: it only supports the Go operating system on a very limited number of processor variants.
If Zig were to truly cross compile for every combination of CPU variant and supported version of every operating system, it would require terabytes of storage and already be out of date.
It doesn't require nearly as much storage as you think. For Zig, there are only 3 C standard libraries (glibc, musl, mingw) and 7 unique architectures (ARM, x86, MIPS, PowerPC, SPARC, RISC-V, WASM). LLVM can support all of these with a pretty small footprint, and since the standard libraries can be recompiled by Zig, it really only needs to ship source code - no binaries necessary.
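As a rough sketch of what that looks like in practice (the glibc version suffix in the triple is just an example I'm assuming works with a current Zig):

# List every architecture/OS/ABI combination the compiler knows about
$ zig targets | less

# Build against a specific glibc version; Zig builds the needed glibc pieces from source
$ zig cc -target x86_64-linux-gnu.2.28 hello.c -o hello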
People really do appreciate such convenience. I am not familiar with Zig, but Go gives me a similar experience for cross-compilation.
Being able to build for FreeBSD/amd64, Linux/arm64, and other commonly used OS/ARCH combinations in a few minutes used to feel like a dream, but it is reality for users of modern languages.
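If you're curious which OS/ARCH pairs a given Go toolchain supports, it will happily list them itself; a quick sketch (the program name is made up):

# Enumerate every OS/ARCH pair this Go toolchain can target
$ go tool dist list

# Build a FreeBSD/amd64 binary from a Linux or macOS machine
$ GOOS=freebsd GOARCH=amd64 go build -o myprog .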
I'm all for cross compilation, but in reality you still need running copies of those other operating systems in order to be able to test what you've built.
Setting up 3 or 4 VM images for different OSes takes a few minutes. Configuring 3 or 4 different build environments across as many OSes on the other hand ...
Sure, but building typically takes more resources than executing, so it's not really feasible to use a Raspberry Pi to build, but it can be for testing.
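As a sketch of that workflow (host name, file names, and the test flag are all made up for illustration): build on a fast machine, run only the result on the Pi.

# On the build machine: cross compile for the Pi (64-bit ARM, static musl libc)
$ zig cc -target aarch64-linux-musl -O2 app.c -o app

# Copy it over and run the tests on the Pi itself
$ scp app pi@raspberrypi.local:
$ ssh pi@raspberrypi.local ./app --self-test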
You can do that in clang/gcc, but you need to pass -static and -static-plt (? I can't find what it's called). The second option is there to ensure the binary is loader-independent; otherwise you get problems when compiling on a glibc platform and running on a musl one, or vice versa.
In brief, most programs these days are position-independent, which means you need a runtime loader to load the sections(?) and symbols of the code into memory and tell the other parts of the code where they've been put. Because of differences between musl libc and GNU libc, in practice this means that a program compiled against GNU libc can be marked as executable, but when the user tries to run it they're told it is "not executable", because the binary is looking in the wrong place for the dynamic loader, which is named differently by the two libraries. There are also some archaic, non-standard symbols that GNU libc provides and musl libc doesn't, which can cause further problems for the end user.
e: I didn't realise it was 5am, so I'm sorry if it's not very coherent.
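For reference, the static half of that is straightforward; a quick sketch with gcc (clang accepts the same flag):

# Link everything statically so the result does not depend on a dynamic loader
$ gcc -static hello.c -o hello

# Verify: no interpreter requested, and file(1) reports it as statically linked
$ readelf -l hello | grep -i interp
$ file hello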
I would also appreciate it if you managed to be even more specific once more "coherency" is possible. I'm also interested in what more you can say about "The second option is to ensure it's loader-independent, otherwise you get problems when compiling and running across musl/glibc platforms".
Ok so, it's been a year or so since I was buggering around with ELF internals (I wrote a simpler header in assembly so I could make a ridiculously small binary...). Let's take a look at an ELF program. If you run `readelf -l $(which gcc)` you get a bunch of output, among which is:
alx@foo:~$ readelf -l $(which gcc)

Elf file type is EXEC (Executable file)
Entry point 0x467de0
There are 10 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x0000000000000230 0x0000000000000230  R      0x8
  INTERP         0x0000000000000270 0x0000000000400270 0x0000000000400270
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000fa8f4 0x00000000000fa8f4  R E    0x200000
You can see that among the program headers is an entry called INTERP that requests a particular loader. Any dynamically linked program has this, and these days most are also built with the -fPIE flag, which requests a "Position Independent Executable": each section of the code is compiled without assuming a fixed position in memory for the other sections. In other words, you can't just run the binary on any UNIX computer and expect it to work; it relies on the dynamic loader named in INTERP to load each section and tell the other parts of the code where everything has been put.
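If you want to see the difference yourself, here's a quick experiment (hello.c can be any trivial C file):

# A position-independent executable shows up as type DYN, a traditional one as EXEC
$ gcc -pie -fPIE hello.c -o hello-pie
$ gcc -no-pie hello.c -o hello-nopie
$ readelf -h hello-pie   | grep Type
$ readelf -h hello-nopie | grep Type

# Both dynamically linked binaries still request an interpreter
$ readelf -l hello-pie | grep interpreter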
The problem with this is that the musl loader (I don't have my x200 available right now to copy some output from it to illustrate the difference) lives at a different place in the filesystem, under a different name. What this means is that when the program is run, the kernel tries to find the program interpreter named in the binary in order to execute it; because musl libc's program interpreter is at a different place and under a different name in the filesystem hierarchy, it fails to execute the program and reports that it is not a valid executable.
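Concretely, the two loaders live at different paths; the paths below are the usual x86-64 ones and the binary names are hypothetical:

# glibc-linked binary
$ readelf -l ./hello-glibc | grep interpreter
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

# musl-linked binary (e.g. built on Alpine, or with zig cc -target x86_64-linux-musl)
$ readelf -l ./hello-musl | grep interpreter
      [Requesting program interpreter: /lib/ld-musl-x86_64.so.1]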
Now you would think a naive solution would be to symlink the musl libc loader to the expected position in the filesystem hierarchy. The problem with this is illustrated when you look at the other dependencies and the symbols the program imports. Let's have a look:
alx@foo:~$ readelf -s $(which gcc)

Symbol table '.dynsym' contains 153 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __strcat_chk@GLIBC_2.3.4 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __uflow@GLIBC_2.2.5 (3)
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND mkstemps@GLIBC_2.11 (4)
     4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getenv@GLIBC_2.2.5 (3)
     5: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND dl_iterate_phdr@GLIBC_2.2.5 (3)
     6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __snprintf_chk@GLIBC_2.3.4 (2)
     7: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __pthread_key_create
     8: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND putchar@GLIBC_2.2.5 (3)
     9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strcasecmp@GLIBC_2.2.5 (3)
As you can see, the program not only expects a GNU program interpreter, but the symbols it links against also carry GLIBC_2.2.5-style version tags (although I cannot recall whether this causes a problem on its own; memory says it does, but you'd be better off reading the ELF specification at this point, which you can find here: https://refspecs.linuxfoundation.org/LSB_2.1.0/LSB-Core-gene...). So the ultimate result of trying to run this program on a musl libc system is that it fails to run, because the versioned symbols are 'missing'. On top of this, you can see with `readelf -d` that it depends on the libc shared library:
alx@foo:~$ readelf -d $(which gcc)

Dynamic section at offset 0xfddd8 contains 25 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
 0x000000000000000c (INIT)               0x4026a8
Unfortunately for us, the libc.so.6 binary provided by the GNU system is also not symbol-compatible with the libc that musl provides, and GNU libc additionally defines some functions and symbols that are not in the C standard. The ultimate result of this is that you need to link statically against libc, and not depend on a dynamic loader at all, for this binary to have a chance of running on a musl system.
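Two quick illustrations of the points above, as a sketch (binary names are hypothetical, and the musl-targeting command assumes zig or a musl cross toolchain is installed): you can inspect the versioned symbols without running anything, and a fully static build sidesteps the loader and the versioning entirely.

# Inspect which glibc symbol versions a binary requires, without running it
$ objdump -T ./hello-glibc | grep GLIBC

# A fully static musl build avoids the loader and the versioned symbols entirely
$ zig cc -target x86_64-linux-musl hello.c -o hello-static
$ readelf -l hello-static | grep -i interp    # no INTERP program header
$ readelf -d hello-static                     # no dynamic section in this file
$ file hello-static                           # ... statically linked ...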
Dlang is a better C. DMD, the reference compiler for Dlang, can also compile and link with C programs. It can even compile and link with C++03 programs.
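A minimal sketch of that C interop, assuming dmd is installed and with made-up file names: compile the C side with any C compiler and hand the object file to dmd.

# Compile the C side with any C compiler...
$ cc -c cfuncs.c -o cfuncs.o
# ...then let dmd compile the D side and link both objects into one executable
$ dmd main.d cfuncs.o
$ ./main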
It has manual memory management as well as garbage collection. You could call it hybrid memory management. You can manually delete GC objects, as well as allocate GC objects into manually allocated memory.
The Zig website says "The reference implementation uses LLVM as a backend for state of the art optimizations." However, LLVM is consistently around 5% behind the GCC toolchain in performance across multiple benchmarks. In contrast, GCC 9 and 10 officially support Dlang.
Help us update the GCC D compiler frontend to the latest DMD.
D's whole premise of "being a better C++" has always made them look like argumentative jerks. Why build a language on top of a controversy? Their main argument from the early 2000s: C++ requires a stdlib, yet a compiler toolchain is not required to provide one. Wtf, D? I mean, I understand that C++ provides a lot of abstractions on top of C to call itself "more" than C, but what does D provide other than a few conveniences? If you even consider garbage collection, better-looking syntax, or a more consistent, less orthogonal syntax a convenience. It didn't even have most of its current features when it first came out back in the early 2000s. Trying to gain adoption by creating some sort of counterculture - what are they, 14? /oneparagraphrant
It is probably the case that D has a brilliant engineering team that doesn't really focus on the PR side of things. D definitely provides value over C/C++ beyond a bit of syntax sugar. It is just not communicated that well.
> Compare this to downloading Clang, which has 380 MiB Linux-distribution-specific tarballs. Zig's Linux tarballs are fully statically linked, and therefore work correctly on all Linux distributions. The size difference here comes because the Clang tarball ships with more utilities than a C compiler, as well as pre-compiled static libraries for both LLVM and Clang. Zig does not ship with any pre-compiled libraries; instead it ships with source code, and builds what it needs on-the-fly.
Hot damn! You had me at Hello, World!