Hacker News
Better JIT for Postgres (github.com/vladich)
139 points by vladich 17 hours ago | hide | past | favorite | 69 comments



> However, standard LLVM-based JIT is notoriously slow at compilation. When it takes tens to hundreds of milliseconds, it may be suitable only for very heavy, OLAP-style queries, in some cases.

I don't know anything here, but this seems like a good case for ahead-of-time compilation? Or at least caching your JIT results? I can imagine that much of the time you are getting more or less the same query again and again.


Yes.

Some years ago we ported some code from querying out the data and tallying it in Python (how many are in each bucket) to doing that in SQL. It didn't speed up the execution. I was surprised by that, but I guess the Postgres interpreter is roughly the same speed as Python, which, when you think about it, perhaps isn't that surprising.

But Python is truly general purpose, while the core query stuff in SQL is really specialized (we were not using stored procedures). So if PyPy can get a 5x speedup, it seems to me that it should be possible to get the same kind of speedup in Postgres. I guess it needs funding and someone as smart as the PyPy people.


That's curious. I regularly get speed ups when moving processing from Python to postgres. At least when using indices properly and when the shift reduces the amount of data carried back and forth.

At some level the application needs to participate in the performance conversation too.

https://www.postgresql.org/docs/current/sql-prepare.html


Postgres’s PREPARE is per-connection so it’s pretty limited, and then connection poolers enter the fray and often can’t track SQL-level prepares.

And then the issue is not dissimilar to Postgres’s planner issues.


Oracle’s wasn’t, but I haven’t used it in a very long time, so that may no longer be true.

The problem, though, was that it had a single shared pool for all queries, and it could only run a query if it was in the pool, which is how our DB machine would max out at 50% CPU and bandwidth. We had made some mistakes in our search code that I had told the engineer not to make.


If you cache query plans like other RDBMSes do, the client having to manually manage that goes away, and it's not limited to a single connection.

MS SQL still has prepared statements, but they really haven't been needed in the 20 years since it gained the ability to cache plans based on statement text.


I’m always surprised to learn LLVM is so slow, given that compile speed was one of the original motivations for developing it. I don’t know if that’s down to feature creep, or to its intrinsic complexity being higher than people presumed it was for GCC.

We have everything optimized, and yet somehow DB queries need to be "interpreted" at runtime. There's no reason for DB queries not to be precompiled.

The "byte-code" coming from the query planner typically only has a handful of steps in a linear sequence. Joins, filters, and such. But the individual steps can be very costly.

So there is not much to gain from JITing the query plan execution only.

JITing begins to make more sense when the individual query plan steps (join, filter, ...) can themselves be specialized/recompiled/improved/merged by knowing the context of the query plan.
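To make that concrete, here's a toy sketch in pure Python (nothing to do with Postgres internals; the plan shape and operator names are made up): instead of interpreting a generic step list for every row, "compile" the whole plan into one fused function specialized to this query.

```python
# Toy sketch: specialize a filter + projection plan into one fused
# closure, instead of dispatching over a generic step list per row.

def interpret(plan, rows):
    # Generic interpreter: walks the step list for every row.
    out = []
    for row in rows:
        if all(arg(row) for op, arg in plan if op == "filter"):
            for op, arg in plan:
                if op == "project":
                    row = arg(row)
            out.append(row)
    return out

def compile_plan(plan):
    # "Compilation": merge the steps into one specialized function,
    # so the per-row dispatch over the step list disappears.
    filters = [arg for op, arg in plan if op == "filter"]
    projects = [arg for op, arg in plan if op == "project"]
    def compiled(rows):
        out = []
        for row in rows:
            if all(f(row) for f in filters):
                for p in projects:
                    row = p(row)
                out.append(row)
        return out
    return compiled

plan = [("filter", lambda r: r["qty"] > 1),
        ("project", lambda r: {"total": r["qty"] * r["price"]})]
rows = [{"qty": 1, "price": 10}, {"qty": 3, "price": 5}]
assert interpret(plan, rows) == compile_plan(plan)(rows) == [{"total": 15}]
```

A real JIT would emit machine code rather than a closure, but the win is the same kind: removing the generic dispatch by baking the plan's context into the generated code.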


This is a neat idea. I want to take it further and precompile the entire DBMS binary for a specific schema.

Someone is already working on it: https://arxiv.org/pdf/2603.02081

How will you handle ALTER TABLE queries without downtime?

That would definitely present a bit of a challenge, but:

- not all databases need migrations (or migrations without downtime)

- alternatively, ship the migrations as part of the binary

Ad-hoc modifications would still be more difficult, but tbh that’s not necessarily a bug.


Many SQL engines have JIT compilers.

The problems related to PostgreSQL are pretty much all described here. It's very difficult to do low-latency queries if you cannot cache the compiled code and have to compile it over and over again. And once your JIT is slow, you need logic to decide whether to interpret or compile.

I think the best approach would be to start interpreting the query and kick off compilation in another thread; if the interpreter is still running when compilation finishes, stop the interpreter and switch to the JIT-compiled code. This would give the best latency, because there would be no waiting for the JIT compiler.


> It's very difficult to do low-latency queries if you cannot cache the compiled code

This is not too difficult, it just requires a different execution style. Salesforce's Hyper for example very heavily relies on JIT compilation, as does Umbra [1], which some people regard as one of the fastest databases right now. Umbra doesn't cache any IR or compiled code and still has an extremely low start-up latency; an interpreter exists but is practically never used.

Postgres is very robust and very powerful, but simply not designed for fast execution of queries.

Disclosure: I work in the group that develops Umbra.

[1]: https://umbra-db.com/


> I think it would be the best to start interpreting the query and start compilation in another thread

This technique is known as a "tiered JIT". It's how production virtual machines operate for high-level languages like JavaScript.

There can be many tiers, like an interpreter, baseline compiler, optimizing compiler, etc. The runtime switches into the faster tier once it becomes ready.

More info for the interested:

https://ieeexplore.ieee.org/document/10444855
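A minimal sketch of that tier-switch mechanic in Python (the function names are invented for illustration; real tiered VMs switch at safepoints and compile to machine code, not closures):

```python
# Toy sketch of tiered execution: interpret rows immediately, compile in
# the background, and hand the remaining rows to the compiled code once
# it becomes ready. The final result is the same whichever tier ran.
import threading

def run_tiered(rows, interpret_row, compile_fn):
    compiled_holder = {}
    ready = threading.Event()

    def background_compile():
        compiled_holder["fn"] = compile_fn()  # the slow "JIT" step
        ready.set()

    threading.Thread(target=background_compile).start()

    results = []
    for i, row in enumerate(rows):
        if ready.is_set():
            # Tier switch: compiled code processes everything left.
            results.extend(compiled_holder["fn"](rows[i:]))
            return results
        results.append(interpret_row(row))  # baseline tier
    return results

# Hypothetical "query": double each value. Correct output regardless of
# where (or whether) the tier switch happens.
out = run_tiered(
    list(range(100)),
    interpret_row=lambda r: r * 2,
    compile_fn=lambda: (lambda rs: [r * 2 for r in rs]),
)
assert out == [r * 2 for r in range(100)]
```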


It’s also common for JITs to sprout a tier and shed a tier over time, as the last and first tiers shift in cost/benefit. If the first tier works better you delay the other tiers. If the last tier gets faster (in run time or code optimization) you engage it sooner, or strip the middle tier entirely and hand half that budget to the last tier.

DB queries do get precompiled and cached if you use prepared statements. This is why you should always use prepared statements if you can.

It is not always necessary to explicitly use prepared statements, though. For example, the pgx library for Go [1] and the psycopg3 library for Python [2] will automatically manage prepared statements for you.

[1]: https://pkg.go.dev/github.com/jackc/pgx/v5#hdr-Prepared_Stat...

[2]: https://www.psycopg.org/psycopg3/docs/advanced/prepare.html
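The gist of that automatic management, sketched in plain Python (the class and method names are hypothetical stand-ins, not the psycopg API; see the links above for the real behavior): the driver counts executions per statement text and "prepares" a statement once it has been seen enough times.

```python
# Toy sketch of client-side automatic preparation, roughly in the
# spirit of psycopg's prepare_threshold: after a statement text has
# been executed N times, switch to a prepared statement.
class AutoPrepareCursor:
    def __init__(self, prepare_threshold=5):
        self.prepare_threshold = prepare_threshold
        self.seen = {}         # statement text -> times executed
        self.prepared = set()  # statement texts we have "prepared"

    def execute(self, sql, params=()):
        self.seen[sql] = self.seen.get(sql, 0) + 1
        if sql not in self.prepared and self.seen[sql] >= self.prepare_threshold:
            self.prepared.add(sql)  # a real driver sends Parse here
        return "prepared" if sql in self.prepared else "one-shot"

cur = AutoPrepareCursor(prepare_threshold=3)
modes = [cur.execute("SELECT * FROM t WHERE id = %s", (i,)) for i in range(4)]
assert modes == ["one-shot", "one-shot", "prepared", "prepared"]
```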


See prepared statements.

Postgresql uses a process-per-connection model and has no way to serialize a query plan into a form that can be shared between processes, so the time it takes to make the plan, including JIT, is very important.

Most other DBs cache query plans, including JITed code, so they are basically precompiled from one request to the next for the same statement.
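A toy sketch of that style of cache in Python: one server-wide cache keyed by normalized statement text, so textually identical queries plan once no matter which connection sends them (the normalization here is deliberately crude; real engines parse the statement).

```python
# Toy sketch: a shared, server-wide plan cache keyed by statement text.
# Literals are normalized away so similar queries share one plan.
import re

class PlanCache:
    def __init__(self):
        self.cache = {}    # normalized text -> "plan"
        self.compiles = 0

    def get_plan(self, sql):
        key = re.sub(r"\b\d+\b", "?", sql)  # crude auto-parameterization
        if key not in self.cache:
            self.compiles += 1              # the expensive plan + JIT step
            self.cache[key] = ("plan-for", key)
        return self.cache[key]

shared = PlanCache()  # one cache for the whole server, not per connection
for conn in range(3):
    for uid in range(100):
        shared.get_plan(f"SELECT * FROM users WHERE id = {uid}")
assert shared.compiles == 1  # planned once, reused 299 times
```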


What do you mean? Because the obvious thing is a shared cache, and if there is one thing the writers of a DB know, it's locking.

Sharing executable code between processes is not as easy as sharing data. AFAIK, unless something's changed recently, PG shares nothing about plans between processes and can't even share a cached plan between sessions/connections.

Write the binary to a file, call it `libquery-id1234.so`, and link that to whichever processes that need it?

Won't work well if it executes 20k+ queries per second; the filesystem will be a bottleneck, among other things.

You can put more than one function in one file.

Might want to take a look at some research like this [1] that goes over the issues:

"This obvious drawback of the current software architecture motivates our work: sharing JIT code caches across applications. During the exploration of this idea, we have encountered several challenges. First of all, most JIT compilers leverage both runtime context and profile information to generate optimized code. The compiled code may be embedded with runtime-specific pointers, simplified through unique class-hierarchy analysis, or inlined recursively. Each of these "improvements" can decrease the shareability of JIT compiled code."

Anything's doable here with enough dev time. It would be nice if PG could just serialize the query plan itself, maybe as a .so along with non-process-specific executable code that then gets dynamically linked again in other processes.

1. https://dl.acm.org/doi/10.1145/3276494


Executable code is literally just data that you mark as executable. The process produced the JIT code, and the idea that it can't then share it between processes is incomprehensible.

I was actually confused by this submission as it puts so much of an emphasis on initial compilation time, when every DB (apparently except for pgsql) caches that result and shares it/reuses it until invalidation. Invalidation can occur for a wide variety of reasons (data composition changing, age, etc), but still the idea of redoing it on every query, where most DBs see the same queries endlessly, is insane.


The emphasis on compilation time there is because the JIT provider that comes with Postgres (LLVM-based) is broken in that particular area. But you're right, JITed code can be cached if some conditions are met (it's position-independent, for one). Not all JIT providers do that, but many do. Caching is on the table, but if your JIT compilation takes microseconds, caching could be more of a burden in many cases. Still useful for some cases.

No, a lot of JITed code has pointers to addresses specific to that process, which make no sense in another process.

Making code shareable between processes takes effort and has a performance trade-off, since the code is no longer specialized to the process.

If the query plan were at least serializable (it's more like an AST), then at least that part could be reused, and each process could then cache JITed code in memory that the plan references by some key.
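That idea can be sketched in a few lines of Python (the plan AST shape and helper names are made up; a closure stands in for real codegen): serialize the plan as plain data, and key each process's local JIT cache on a hash of that serialized form.

```python
# Toy sketch: the plan is process-independent data (an AST), so its
# hash is a stable key; each process keeps its own JIT cache keyed on it.
import hashlib, json

def plan_key(plan_ast):
    return hashlib.sha256(
        json.dumps(plan_ast, sort_keys=True).encode()).hexdigest()

JIT_CACHE = {}  # per-process: plan hash -> compiled function

def get_compiled(plan_ast):
    key = plan_key(plan_ast)
    if key not in JIT_CACHE:
        # Stand-in for real codegen: specialize a closure to the plan.
        col, val = plan_ast["filter"]["column"], plan_ast["filter"]["value"]
        JIT_CACHE[key] = lambda rows: [r for r in rows if r[col] > val]
    return JIT_CACHE[key]

plan = {"filter": {"column": "age", "value": 30}}  # hypothetical plan AST
fn = get_compiled(plan)
assert fn([{"age": 25}, {"age": 40}]) == [{"age": 40}]
# A copy that round-tripped through serialization hits the same entry:
assert get_compiled(json.loads(json.dumps(plan))) is fn
```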

DBs like MSSQL avoid the problem because they run as a single OS process with multiple threads instead. This is also why they can handle more connections easily, since each connection is not a whole process.


> and it has no way to serialize a query plan to some form that can be shared between processes

https://www.postgresql.org/docs/current/parallel-query.html

"PostgreSQL can devise query plans that can leverage multiple CPUs in order to answer queries faster."


Nothing to do with plan caching; that's just talking about execution of parallel operations. Is that thread- or process-based in PG?

If it's process-based, then they can send small parts of a plan across processes.


Ah, didn't see the caching part.

Plans for prepared statements are cached though.


Only on a per-connection basis

Yes, if the client manually prepares the statement, it will be cached for just that connection (in PG a connection is a process), but it won't survive from one connection to the next.

Other databases like MSSQL have prepared statements, but they are rarely used nowadays, since plan caching based on query text was introduced decades ago.


The last time I looked into it my impression was that disabling the JIT in PostgreSQL was the better default choice. I had a massive slowdown in some queries, and that doesn't seem to be an entirely unusual experience. It does not seem worth it to me to add such a large variability to query performance by default. The JIT seemed like something that could be useful if you benchmark the effect on your actual queries, but not as a default for everyone.

That is quite strange, given that the big-boy RDBMSes (Oracle, SQL Server, DB2, Informix, ...) have all had JIT capabilities for several decades now.

The big boys all cache query plans, so the amount of time it takes to compile is not really a concern.

Postgres caches query plans too. The problem is you can only cache what you can share, and if your planner works well, you can share very little; there can be a lot of unique plans even for the same query.

No, it cannot cache query plans between processes (connections), and the only way it can cache within the same connection is by the client manually preparing the statement. This is how the big boys did it 30 years ago; not anymore.

It was common guidance back in the day to use stored procedures for all application access code, because they were cached in MSSQL (which PG doesn't even do). Then around 2000 it started caching based on statement text, and that became much less important.

You would only use prepared statements if doing a bunch of inserts in a loop or something, and even then the benefit nowadays is small: you're just not sending the same text over the network over and over and hashing it to look up the plan.


I didn't say it can cache between processes. The problem is not caching between processes, it's that caching itself is not very useful, because the planner creates different plans for different input parameters of the same query in the general case. So you can reliably cache plans only for the same sets of parameters. Or you can cache generic plans, which Postgres already does as well (and sharing that cache won't solve much of the problem too).

Other databases cache plans, and have for years, because it's very useful: many (most?) apps run the same statements over and over with differing parameters, so it's a big win. They do this without the client having to implement the statement-matching logic that various PG ORMs and connection poolers attempt.

They also do things like auto-parameterization when the statement doesn't have parameters, and parameter sniffing to build multiple plans based on different values where it makes sense.

https://learn.microsoft.com/en-us/sql/relational-databases/q...
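A toy sketch of the parameter-sniffing idea in Python (the stats shape and the 10% threshold are made up for illustration; real engines use histograms and cost models): the plan choice depends on the estimated selectivity of the actual parameter value, not just the query text.

```python
# Toy sketch of parameter sniffing: pick a plan per parameter value
# based on estimated selectivity, rather than one generic plan.
def choose_plan(table_stats, column, value):
    # Estimated fraction of rows matching `column = value`.
    selectivity = table_stats[column].get(value, 0) / table_stats["total"]
    return "index_scan" if selectivity < 0.1 else "seq_scan"

stats = {"total": 1000, "country": {"US": 700, "IS": 3}}
assert choose_plan(stats, "country", "US") == "seq_scan"    # common value
assert choose_plan(stats, "country", "IS") == "index_scan"  # rare value
```

The same text with different parameters gets different plans, which is exactly why caching purely by statement text needs this extra machinery.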

You can also add HINTs to control this behavior if you don't like it or it's causing a problem in production. Crazy, I know.

https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-tr...

PG is extremely primitive compared to these other systems in this area, and it has to be, since it doesn't cache anything unless specifically instructed to, and then only for a single connection.


You make some unsubstantiated claims here. I assure you that it isn't as simple as you claim. And what Postgres does here is (mostly) the right thing, you can't do much better. You simply can't decide what plan you need to use based on the query and its parameters alone, unless you already cached that plan for those parameters (and even in that case you need to watch out for possible dramatic changes in statistics). Prepared statements != cached execution plans.

Ah yes, so Microsoft and Oracle do these things for no good reason. You are the one making unsubstantiated claims, such as "you can't do much better" and "You simply can't decide what plan you need to use based on the query and its parameters alone", which is mostly what those systems do (along with statistics). If you had bothered to read what I linked, you'd see exactly how they do it.

I never said it was simple. In fact, I said how primitive PG is compared to the "big boys", because they put huge effort into making their systems fast back in the TPS wars of the early 2000s, on much slower hardware.

>Prepared statements != cached execution plans

That's exactly what a prepared statement is:

https://en.wikipedia.org/wiki/Prepared_statement


There are reasons for that, it's useful in a very narrow set of situations. Postgres cached plans exist for the same reason. If you're claiming Oracle and MSSQL do _much_ better in this area - that's what I call unsubstantiated. From what you write further it's pretty clear you don't have a lot of understanding what happens under the hood. And no, prepared statements are not what you read in Wikipedia. Not in all databases anyway. Go read it somewhere else.

>There are reasons for that, it's useful in a very narrow set of situations.

So narrow that it's enabled by default for all statements by the "big boy" commercial RDBMSes...

https://www.ibm.com/docs/en/i/7.4.0?topic=overview-plan-cach...

https://docs.oracle.com/en/database/oracle/oracle-database/1...

https://learn.microsoft.com/en-us/sql/relational-databases/p...

https://help.sap.com/docs/SAP_HANA_PLATFORM/6b94445c94ae495c...

>Postgres cached plans exist for the same reason.

Postgresql doesn't cache plans unless the client explicitly sends commands to do so. Applications can't take advantage of this unless they keep connections open, reuse them in a pool, and manage this themselves. The plan has to be built for every separate connection/process rather than there being a single cached plan, increasing server memory costs to plan cache size × number of connections.

It has no "reason" to cache plans; the client must do this for its own "reasons".

>If you're claiming Oracle and MSSQL do _much_ better in this area - that's what I call unsubstantiated.

You are making all sorts of claims with nary a link to back them up. Are you suggesting PG does better at planning than MSSQL, Oracle, and DB2 while being constrained to replan on every single statement? The PG planner is deliberately kept simple so that it is fast at its job, not thorough, or it would adversely affect execution time more than it already does; this is well documented and always a concern when new features are proposed for it.

>From what you write further it's pretty clear you don't have a lot of understanding what happens under the hood.

Sticks and stones. Is that all you have? How about something substantial?

> And no, prepared statements are not what you read in Wikipedia. Not in all databases anyway.

OK, Mr. Unsubstantiated, are we talking about PG or not? What does one use prepared statements for in PG, hmm? You know, the thing you call the PG plan cache? How about something besides your claim that prepared statements are not in fact plan caches? Are you talking about completely different DB systems? How about you substantiate that?


That's not generally correct. Compile-time is a concern for several databases.

Most systems submit many of the same queries over and over again.

Ad-hoc, one-off queries can usually accept a higher up-front compile cost, because the main results usually take much longer anyway versus worrying about an extra 100ms of compilation.

Maybe it was too strong to say it's not a concern at all, but nothing like PG, where every single request needs to replan and potentially JIT unless the client manually prepares and keeps the connection open.


What sort of things are people doing in their SQL queries that make them CPU bound? Admittedly I'm a meat-and-potatoes guy, but I like mine I/O bound.

Really amazed to see not one but several generic JIT frameworks though, no idea that was a thing.


Most databases in practice are sub-terabyte, and even sub-100 GB; their active dataset is almost fully cached. For most databases I've worked with, the cache hit rate is above 95%, and for almost all of them it's above 90%. In that situation, most queries are CPU-bound. It's completely different from typical OLAP in this sense.

Anything jsonb in my experience is quickly CPU bound...

Definitely. If you're doing regular queries with filters on jsonb columns, having the index directly on the JSON paths is really powerful. If I have a jsonb filter in the codebase at all, it probably needs an index, unless I know the result set is already very small.

Yeah, the other problem is I've really struggled to get Postgres to use multiple threads/cores on one query. It often maxes out one CPU thread while dozens go unused. I constantly have to fight loads of defaults to change this, and even then I never feel like I get it working quite right (probably operator error to some extent).

This compares to ClickHouse, which constantly uses the whole machine. Obviously that's easier to do in a columnar database, but it seems that Postgres is actively designed _not_ to saturate multiple cores, which may have been a good assumption in the past but definitely isn't now, IMO.


I've shaved 30s off queries by transforming JSON columns into strings after the first CTE is done with them.

I think read queries that are always served from cache are CPU-bound because they still involve locking the buffers etc., with no I/O involved.

PostgreSQL is Turing complete, so I guess they do whatever they want?

Have you tested this under high concurrency with lots of short OLTP queries? I’m curious whether the much faster compile time actually moves the point where JIT starts paying off, or if it’s still mostly useful for heavier queries.

> By default, jit_above_cost parameter is set to a very high number (100'000). This makes sense for LLVM, but doesn't make sense for faster providers. It's recommended to set this parameter value to something from ~200 to low thousands for pg_jitter (depending on what specific backend you use and your specific workloads).

Sadly, no Windows version yet, AFAICT.

Awesome! I wonder if it's possible to point AI at this problem and synthesize a bespoke compiler (per architecture?) for PostgreSQL expressions?

Two things hold back current LLM-style AI from being of value here:

* Latency. LLM responses are measured in thousands of milliseconds, where this project targets tens of milliseconds; that's off by almost two orders of magnitude.

* Determinism. LLMs are inherently non-deterministic. Even with temperature=0, slight variations of the input lead to major changes in output. You really don't want your DB to be non-deterministic, ever.


> LLMs are inherently non-deterministic.

This isn't true, and certainly not inherently so.

Changes to input leading to changes in output does not violate determinism.


> This isn't true

From what I understand, in practice it often is true[1]:

Matrix multiplication should be “independent” along every element in the batch — neither the other elements in the batch nor how large the batch is should affect the computation results of a specific element in the batch. However, as we can observe empirically, this isn’t true.

In other words, the primary reason nearly all LLM inference endpoints are nondeterministic is that the load (and thus batch-size) nondeterministically varies! This nondeterminism is not unique to GPUs — LLM inference endpoints served from CPUs or TPUs will also have this source of nondeterminism.

[1]: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...


Yes, lots of things can create nondeterminism. But nothing is inherent.

Quoting:

"But why aren’t LLM inference engines deterministic? One common hypothesis is that some combination of floating-point non-associativity and concurrent execution leads to nondeterminism based on which concurrent core finishes first."

From https://thinkingmachines.ai/blog/defeating-nondeterminism-in...



> 1000s of milliseconds

Better known as "seconds"...


The suggestion was not to use an LLM to compile the expression, but to use an LLM to build the compiler.


