In the context of Python getting a copy-and-patch JIT compiler in the upcoming 3.13 release [1], this was a very interesting read for understanding the approach better.
The last (interpreter only) version mentioned that neither GC nor modules were implemented. Did that change?
The JIT work is exciting, but even more exciting would be a faster, fully featured interpreter for platforms with runtime code generation constraints (e.g. iOS), for integration into engines like Love.
There is already Luau if you need a sandbox. Neither Lua nor LuaJIT are sandboxes. There is also my libriscv project if you need a low-latency sandbox without JIT.
I haven't mentioned sandboxes and don't need them. As an example, Love integrates LuaJIT, but the JIT is disabled on iOS platforms. As the LuaJIT documentation notes:
> Note: the JIT compiler is disabled for iOS, because regular iOS Apps are not allowed to generate code at runtime. You'll only get the performance of the LuaJIT interpreter on iOS. This is still faster than plain Lua, but much slower than the JIT compiler. Please complain to Apple, not me. Or use Android. :-p
So to return to my original comment, the improvement that I'm seeing here is a faster _interpreter_, which is something advertised on the luajit-remake repo.
Looks like LuaJIT is still going to be faster, because Deegen requires runtime code generation, and thus executable + writable pages, which the iOS platform does not allow.
Maybe we have different definitions of “sandbox”, but I thought the Lua interpreter was one? That is, isn’t it safe (or can it be made safe) to embed the interpreter within an application and use it to run untrusted Lua code?
There is a lot of information there, but it doesn't seem to handle resource exhaustion or execution time limits, or give any guarantees. It does indicate that it's possible to use as a sandbox, and has a decent example of the most restrictive setup. But I would, for example, compare it with Luau's SECURITY.md.
> Luau provides a safe sandbox that scripts can not escape from, short of vulnerabilities in custom C functions exposed by the host. This includes the virtual machine and builtin libraries. Notably this currently does not include the work-in-progress native code generation facilities.
> Any source code can not result in memory safety errors or crashes during its compilation or execution. Violations of memory safety are considered vulnerabilities.
> Note that Luau does not provide termination guarantees - some code may exhaust CPU or RAM resources on the system during compilation or execution.
So, even Luau will have trouble with untrusted code, but it does give certain guarantees and is specific about what is not covered. I think that's fair. And then there's libriscv:
> libriscv provides a safe sandbox that guests can not escape from, short of vulnerabilities in custom system calls installed by the host. This includes the virtual machine and the native helper libraries. Do not use binary translation in production at this time. Do not use linux filesystem or socket system calls in production at this time.
> libriscv provides termination guarantees and default resource limits - code should not be able to exhaust CPU or RAM resources on the system during initialization or execution. If blocking calls are used during system calls, use socket timeouts or timers + signals to cancel.
So, it is possible to provide limits while still running fast. I imagine many WebAssembly emulators can give the same guarantees.
This is a beautiful piece of work. Connecting all the semantic levels is hard work, and this does it elegantly. It goes to show that old-fashioned technology like object files and linkers is still useful, and can still pay off in unexpected ways as part of new technology.
It's a template JIT with an unusual implementation.

Instead of writing the bytes directly, it uses LLVM to compile functions that refer to external symbols, then patches copies of those bytes at JIT time. That has the advantage of being loosely architecture-agnostic.

Template JITs can't allocate registers across bytecodes, which usually hurts performance. That can be partially mitigated by choosing the calling convention of the templates/stencils carefully; in particular, you don't want to flush everything to/from the stack on every jump for a register architecture.

It's not in the same league of engineering as LuaJIT, but then not much is.
This article is about "Copy-and-Patch", a technique for just-in-time (JIT) compilation that compiles quickly, produces reasonably efficient machine code, and is easier to maintain than hand-written assembly.
The section "Copy-and-Patch: the Art of Repurposing Existing Tools" describes the heart of the method: use an existing compiler to compile a chunk of C/C++ code corresponding to some bytecode instruction, then patch the resulting object file to tweak the result (e.g. to specify the instruction operands), much like the symbol relocation that happens during linking.
Given a stream of bytecode instructions, JIT compilation then reduces to copying code objects (called "stencils") corresponding to the bytecode instructions from a library of precompiled stencils, then patching them as needed. This is very fast compared to running a full-blown compiler like LLVM over the syntax tree.
Of course, the resulting code is slower than what full Ahead-of-Time (AOT) compilation produces, but the authors describe a few tricks to keep execution speed within a reasonable margin of AOT. For instance, they leverage tail calls to replace function calls with jumps, compile frequently associated sequences of bytecode instructions together, and so on.
My advice would be to read Piumarta's "Optimizing direct threaded code by selective inlining" paper [1] first, and then read the references from the wikipedia article [2].
If the Piumarta paper is still over your head, take a look at its references, though they refer to Java, Smalltalk and Forth, which might be a distraction.
I’m sure all the modern JavaScript JITs would beat LuaJIT for raw performance. JS JITs were already faster when I compared them several years ago and have only improved since, whereas LuaJIT has been almost standing still for a decade.
And v3.0 is underway.
https://github.com/LuaJIT/LuaJIT/issues/1092