I once worked on a project that used the Boehm collector for a long-running server process.

About once a year, we'd lose a week or two to tracking down a memory issue. While the root cause was never a GC bug, the GC would make it dramatically more difficult to figure out what the real problem was.

The closest we had to an actual GC bug was a leak involving a data structure containing a list of IP addresses that was incorrectly interpreted as containing pointers. It took forever to figure out what had happened, since the issue only appeared if you worked with a specific set of IP addresses--and even after we had a replication case, the GC made it extremely difficult to determine why memory was growing rapidly. The fix was simple once we identified the precise problem--flag that memory as "atomic" (i.e., guaranteed not to contain pointers)--but finding the precise data structure that was holding memory live was a nightmare.
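To make the failure mode concrete, here is a minimal sketch of how a conservative scan can mistake an IP address for a pointer. It is a toy model, not the Boehm collector's actual logic: the function names (`ipv4`, `count_false_pointers`), the 32-bit word size, the big-endian packing, and the heap address range are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a dotted quad into one 32-bit word (hypothetical helper; the byte
   order actually used in memory would affect which addresses collide). */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Toy model of a conservative scan: count the words whose value falls
   inside the heap range [heap_lo, heap_hi). A conservative collector has
   to treat any such word as a possible pointer and keep its target alive. */
static size_t count_false_pointers(const uint32_t *words, size_t n,
                                   uint32_t heap_lo, uint32_t heap_hi)
{
    assert(heap_lo < heap_hi);
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (words[i] >= heap_lo && words[i] < heap_hi)
            hits++;
    }
    return hits;
}
```

Under these assumptions, 8.4.32.16 packs to 0x08042010, which lands inside a heap spanning 0x08000000 to 0x09000000, while 10.0.0.1 and 192.168.1.1 do not. That is why only certain address sets trigger the problem. In the Boehm collector's API, memory that is guaranteed pointer-free can be requested with `GC_MALLOC_ATOMIC` instead of `GC_MALLOC`, so the collector never scans it.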

We eventually ran into a slow memory growth issue that we couldn't figure out. After much debugging, one developer became fed up with the situation and tore the GC out. It took him about two weeks to produce a version of the codebase that leaked less than the GC-enabled one, and another couple of weeks to eliminate virtually all memory leaks. We continued to find minor leaks for several more months, none of which were difficult to correct. (We used a debugging malloc library that flagged leaked memory on exit, so we always knew exactly where the leaked memory had been allocated.)

Not only did the GC-less version leak less, it used about 40% less memory.

I would strongly recommend against using the Boehm collector for any long-running process. The tradeoffs may be more acceptable in short-lived processes, where slow leakage over time is unimportant.