> Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig, ran it in Wine, using it to cross compile for Linux, and then ran the binary natively. Computers are fun!
Even though it probably doesn't qualify, this is pretty close to a Canadian Cross, which for some reason is one of my favorite pieces of CS trivia. It's when you cross compile a cross compiler.
And on that point, your correspondent is right. The two bear no real resemblance to each other. The cross compilation approach described in the article is not something to be held in high regard. It's the result of poor design. It's a lot of work involving esoteric implementation details to solve a problem that the person using the compiler should never have encountered in the first place. It's exactly the problem that the Zig project leader is highlighting in the sticker when he contrasts Zig with Clang, etc.
The way compilers like Go and Zig work is the only reasonable way to approach cross compilation: every compiler should already be able to cross compile.
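To make that concrete, here's roughly what it looks like with both toolchains; a sketch, assuming reasonably recent Zig and Go releases, with file and target names picked for illustration:

# Cross compile a C file for 64-bit ARM Linux with Zig's bundled clang and musl
$ zig cc -target aarch64-linux-musl hello.c -o hello

# Cross compile a Go program for the same target; no extra toolchain needed
$ GOOS=linux GOARCH=arm64 go build -o hello .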
Thanks for putting it this way. I always wondered why cross compilation was such a big deal. To me it sounds like saying "look, I can write a text in English without having to be in an English-speaking country!".
The problem with cross compilation isn't the compiler, it's the build system. See e.g. autoconf, which builds test binaries and then executes them to test for the availability of strcmp(3).
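For anyone who hasn't bumped into this: a generated configure script does something along these lines (heavily simplified, file names hypothetical), and the final step is exactly what can't work when the produced binary is for a different machine:

# Roughly what an autoconf "run test" check boils down to:
$ cat > conftest.c <<'EOF'
#include <string.h>
int main(void) { return strcmp("a", "a"); }
EOF
$ cc conftest.c -o conftest
$ ./conftest    # fine natively; impossible when conftest is an ARM binary on an x86 host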
Go doesn't cross compile: it only supports the Go operating system on a very limited number of processor variants.
If Zig were to truly cross compile for every combination of CPU variant and supported version of every operating system, it would require terabytes of storage and already be out of date.
It doesn't require nearly as much storage as you think. For Zig, there are only 3 C standard libraries (glibc, musl, mingw) and 7 unique architectures (ARM, x86, MIPS, PowerPC, SPARC, RISC-V, WASM). LLVM can support all of these with a pretty small footprint, and since the standard libraries can be recompiled by Zig, it really only needs to ship source code - no binaries necessary.
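As a rough sketch of what that looks like in practice (the glibc version suffix in the triple is just an example I'm assuming works with a current Zig):

# List every architecture/OS/ABI combination the compiler knows about
$ zig targets | less

# Build against a specific glibc version; Zig builds the needed glibc pieces from source
$ zig cc -target x86_64-linux-gnu.2.28 hello.c -o hello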
People really do appreciate such convenience. I am not familiar with Zig, but Go gives me a similar experience for cross-compilation.
Being able to build for FreeBSD/amd64, Linux/arm64, and other commonly used OS/ARCH combinations in a few minutes used to feel like a dream, but it is reality for users of modern languages.
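If you're curious which OS/ARCH pairs a given Go toolchain supports, it will happily list them itself; a quick sketch (the program name is made up):

# Enumerate every OS/ARCH pair this Go toolchain can target
$ go tool dist list

# Build a FreeBSD/amd64 binary from a Linux or macOS machine
$ GOOS=freebsd GOARCH=amd64 go build -o myprog .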
I'm all for cross compilation, but in reality you still need running copies of those other operating systems in order to be able to test what you've built.
Setting up 3 or 4 VM images for different OSes takes a few minutes. Configuring 3 or 4 different build environments across as many OSes on the other hand ...
Sure, but building typically takes more resources than executing, so it's not really feasible to use a Raspberry Pi to build, but it can be for testing.
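As a sketch of that workflow (host name, file names, and the test flag are all made up for illustration): build on a fast machine, run only the result on the Pi.

# On the build machine: cross compile for the Pi (64-bit ARM, static musl libc)
$ zig cc -target aarch64-linux-musl -O2 app.c -o app

# Copy it over and run the tests on the Pi itself
$ scp app pi@raspberrypi.local:
$ ssh pi@raspberrypi.local ./app --self-test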
You can do that in clang/gcc, but you need to pass -static and -static-plt (? I can't find what it's called). The second option is there to ensure the binary is loader-independent; otherwise you get problems when compiling on a glibc platform and running on a musl one, or vice versa.
In brief, most programs these days are position-independent, which means you need a runtime loader to load the sections(?) and symbols of the code into memory and tell the other parts of the code where they've been put. Because of differences between musl libc and GNU libc, in practice this means that a program compiled against GNU libc can be marked as executable, but when the user tries to run it they're told it is "not executable", because the binary is looking in the wrong place for the dynamic loader, which is named differently by the two libraries. There are also some archaic, non-standard symbols that GNU libc provides and musl libc doesn't, which can cause further problems for the end user.
e: I didn't realise it was 5am, so I'm sorry if it's not very coherent.
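For reference, the static half of that is straightforward; a quick sketch with gcc (clang accepts the same flag):

# Link everything statically so the result does not depend on a dynamic loader
$ gcc -static hello.c -o hello

# Verify: no interpreter requested, and file(1) reports it as statically linked
$ readelf -l hello | grep -i interp
$ file hello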
I would also appreciate it if you managed to be even more specific once more "coherency" is possible. I'm also interested in what more you can say about "The second option is to ensure it's loader-independent, otherwise you get problems when compiling and running across musl/glibc platforms".
Ok so, it's been a year or so since I was buggering around with ELF internals (I wrote a simpler header in assembly so I could make a ridiculously small binary...). Let's take a look at an ELF program. If you run `readelf -l $(which gcc)` you get a bunch of output, among which is:
alx@foo:~$ readelf -l $(which gcc)

Elf file type is EXEC (Executable file)
Entry point 0x467de0
There are 10 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x0000000000000230 0x0000000000000230  R      0x8
  INTERP         0x0000000000000270 0x0000000000400270 0x0000000000400270
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000fa8f4 0x00000000000fa8f4  R E    0x200000
You can see that among the program headers is an entry called INTERP that requests a particular loader. Any dynamically linked program has this, and these days most are also built with the -fPIE flag, which requests a "Position Independent Executable": each section of the code is compiled without assuming a fixed position in memory for the other sections. In other words, you can't just run the binary on any UNIX computer and expect it to work; it relies on the dynamic loader named in INTERP to load each section and tell the other parts of the code where everything has been put.
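If you want to see the difference yourself, here's a quick experiment (hello.c can be any trivial C file):

# A position-independent executable shows up as type DYN, a traditional one as EXEC
$ gcc -pie -fPIE hello.c -o hello-pie
$ gcc -no-pie hello.c -o hello-nopie
$ readelf -h hello-pie   | grep Type
$ readelf -h hello-nopie | grep Type

# Both dynamically linked binaries still request an interpreter
$ readelf -l hello-pie | grep interpreter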
The problem with this is that the musl loader (I don't have my x200 available right now to copy some output from it to illustrate the difference) lives at a different place in the filesystem, under a different name. What this means is that when the program is run, the kernel tries to find the program interpreter named in the binary in order to execute it; because musl libc's program interpreter is at a different place and under a different name in the filesystem hierarchy, it fails to execute the program and reports that it is not a valid executable.
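Concretely, the two loaders live at different paths; the paths below are the usual x86-64 ones and the binary names are hypothetical:

# glibc-linked binary
$ readelf -l ./hello-glibc | grep interpreter
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

# musl-linked binary (e.g. built on Alpine, or with zig cc -target x86_64-linux-musl)
$ readelf -l ./hello-musl | grep interpreter
      [Requesting program interpreter: /lib/ld-musl-x86_64.so.1]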
Now you would think a naive solution would be to symlink the musl libc loader to the expected position in the filesystem hierarchy. The problem with this is illustrated when you look at the other dependencies and the symbols the program imports. Let's have a look:
alx@foo:~$ readelf -s $(which gcc)

Symbol table '.dynsym' contains 153 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __strcat_chk@GLIBC_2.3.4 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __uflow@GLIBC_2.2.5 (3)
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND mkstemps@GLIBC_2.11 (4)
     4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getenv@GLIBC_2.2.5 (3)
     5: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND dl_iterate_phdr@GLIBC_2.2.5 (3)
     6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __snprintf_chk@GLIBC_2.3.4 (2)
     7: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __pthread_key_create
     8: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND putchar@GLIBC_2.2.5 (3)
     9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strcasecmp@GLIBC_2.2.5 (3)
As you can see, the program not only expects a GNU program interpreter, but the symbols it links against also carry GLIBC_2.2.5-style version tags (although I cannot recall whether this causes a problem on its own; memory says it does, but you'd be better off reading the ELF specification at this point, which you can find here: https://refspecs.linuxfoundation.org/LSB_2.1.0/LSB-Core-gene...). So the ultimate result of trying to run this program on a musl libc system is that it fails to run, because the versioned symbols are 'missing'. On top of this, you can see with `readelf -d` that it depends on the libc shared library:
alx@foo:~$ readelf -d $(which gcc)

Dynamic section at offset 0xfddd8 contains 25 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
 0x000000000000000c (INIT)               0x4026a8
Unfortunately for us, the libc.so.6 binary provided by the GNU system is also not symbol-compatible with the libc that musl provides, and GNU libc additionally defines some functions and symbols that are not in the C standard. The ultimate result of this is that you need to link statically against libc, and not depend on a dynamic loader at all, for this binary to have a chance of running on a musl system.
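Two quick illustrations of the points above, as a sketch (binary names are hypothetical, and the musl-targeting command assumes zig or a musl cross toolchain is installed): you can inspect the versioned symbols without running anything, and a fully static build sidesteps the loader and the versioning entirely.

# Inspect which glibc symbol versions a binary requires, without running it
$ objdump -T ./hello-glibc | grep GLIBC

# A fully static musl build avoids the loader and the versioned symbols entirely
$ zig cc -target x86_64-linux-musl hello.c -o hello-static
$ readelf -l hello-static | grep -i interp    # no INTERP program header
$ readelf -d hello-static                     # no dynamic section in this file
$ file hello-static                           # ... statically linked ...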
Dlang is a better C. DMD, the reference compiler for Dlang, can also compile and link with C programs. It can even compile and link with C++03 programs.
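A minimal sketch of that C interop, assuming dmd is installed and with made-up file names: compile the C side with any C compiler and hand the object file to dmd.

# Compile the C side with any C compiler...
$ cc -c cfuncs.c -o cfuncs.o
# ...then let dmd compile the D side and link both objects into one executable
$ dmd main.d cfuncs.o
$ ./main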
It has manual memory management as well as garbage collection. You could call it hybrid memory management. You can manually delete GC objects, as well as allocate GC objects into manually allocated memory.
The Zig website says "The reference implementation uses LLVM as a backend for state of the art optimizations." However, LLVM is consistently around 5% behind the GCC toolchain in performance across multiple benchmarks. In contrast, GCC 9 and 10 officially support Dlang.
Help us update the GCC D compiler frontend to the latest DMD.
D's whole premise of "being a better C++" has always made them look like argumentative jerks. Why build a language on top of a controversy? Their main argument from the early 2000s: C++ requires a stdlib, yet a compiler toolchain is not required to provide one. Wtf, D? I mean, I understand that C++ provides a lot of abstractions on top of C to call itself "more" than C, but what does D provide other than a few conveniences? If you even consider garbage collection, better-looking syntax, or a more consistent, less orthogonal syntax a convenience. It didn't even have most of its current features when it first came out back in the early 2000s. Trying to gain adoption by creating some sort of counterculture - what are they, 14? /oneparagraphrant
It is probably the case that D has a brilliant engineering team that doesn't really focus on the PR side of things. D definitely provides value over C/C++ beyond a bit of syntax sugar. It is just not communicated that well.
> Compare this to downloading Clang, which has 380 MiB Linux-distribution-specific tarballs. Zig's Linux tarballs are fully statically linked, and therefore work correctly on all Linux distributions. The size difference here comes because the Clang tarball ships with more utilities than a C compiler, as well as pre-compiled static libraries for both LLVM and Clang. Zig does not ship with any pre-compiled libraries; instead it ships with source code, and builds what it needs on-the-fly.
Hot damn! You had me at Hello, World!