I have a somewhat popular HTTP server library for Zig [1]. It started off as a thread-per-connection server (with an optional thread pool), but when it became apparent that async wasn't going to be added back into the language any time soon, I switched to using epoll/kqueue.
Both APIs allow you to associate arbitrary data (a `void *` in kqueue, or a union of `int/uint32_t/uint64_t/void *` in epoll) with the event that you're registering. So when you're notified of the event, you can access this data. In my case, it's a big Conn struct. It contains things like the # of requests on this connection (to enforce a configured max requests per connection) and a timestamp for when it should time out if there's no activity. The Conn is part of an intrusive linked list, so it has a next: *Conn and prev: *Conn. But what you're probably most curious about is that it has a Request.State. This has a statically allocated buffer ([]u8) that can grow as needed to hold all the received data up until that point (or, if we're writing, the buffered data that we have left to write). It's important to have a max # of connections and a max request size so you can enforce an upper limit on the maximum memory the library might use. It also acts as a state machine to track up to what point it's parsed the request (since you don't want to re-parse the entire request as more bytes trickle in).
It's all half-baked. I can do receiving/sending asynchronously, but the application handler is called synchronously, and if that, for example, calls PG, that's probably also synchronous (since there's no async PG library in Zig). Makes me feel that any modern language needs a cohesive (as in standard library, or de facto standard) concurrency story.
I wasn't aware that you could store a reference when you register an fd with epoll. I have used select and poll in the past, and you'd need to maintain a mapping of fd->structure somehow, and you know C doesn't come with a handy data structure like a map to make this efficient and easy when scaling to, say, 100k+ fds. So being able to store and retrieve a reference is incredibly useful.
How do you handle the application handler code, would you run that in a separate thread to not block handling of other fds?
In my project, you can start N workers; each accepts connections directly, thanks to SO_REUSEPORT[_LB], and manages its own epoll/kqueue. When a request is complete, the application's handler is called directly by that worker.
I considered what you're suggesting: having a threadpool to dispatch application handlers on. It's obviously better. But you do have to synchronize a little more, especially if you don't trust the client. While dispatched, the client shouldn't be able to send another request, so you need to remove the READ notification for the socket and then once the response is written, re-add it. Seemed a bit tedious considering I'm hoping to throw it all out when async is re-added as a first class citizen to the language.
The main benefit of my half-baked solution is that a slow or misbehaving connection won't slow (or block!) other connections. Application latency is an issue (since the worker can't process more requests while the application handler is executing), but at least that's not open to an attack.
> and you know C doesn't come with a handy data structure like a map to make this efficient and easy when scaling to say 100k+ fds.
There are plenty of map-like things available (not in the language standard, but in libcs or OS includes). But fds are ints, and kernel APIs that return 'new' fds are contracted to return the lowest-numbered fd that isn't currently in use. So you can look up the max number of fds (the OS will tell you if you ask) and allocate an array of that size.
[1] https://github.com/karlseguin/http.zig