Hacker News

The project is very interesting, but the benchmark results must be completely wrong.

On my machine, using https://github.com/rakyll/hey to test:

1 concurrent request: node.js does ~8000 req/second, go does ~9000, japronto does ~10000

10 concurrent requests: node.js does ~24000 req/second, go does ~45000, japronto does ~55000

Nowhere near a million. That's without pipelining though. Can pipelining really add SO MUCH performance?

UPDATE: tried wrk with pipelining. japronto does 700000, golang 150000, node 35000. Holy shit, pipelining is epic.

But the benchmark is not fair when only one server supports pipelining.



I'd rather stick with Golang here because it's not hiding anything. Everything in the Go stack is written in Go, and all your http handler (business) logic is written in Go.

With this mix of C and Python it's impossible to tell what will happen to the system's performance when you actually include significant amounts of business logic written in Python on top of the C framework here. The author even makes a point about how little actual Python code and/or data structures are in use. If use of the host language is discouraged in the framework, how can I trust the performance of the framework with my code written in the host language?
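
To put a rough number on that concern, here's a hedged micro-benchmark sketch (payload and handler shapes invented for illustration): comparing a handler that returns static bytes, as in the benchmark, against one that does even trivial Python-side work such as serializing a small record to JSON. Absolute numbers are machine-dependent.

```python
import json
import timeit

# Hypothetical "business logic" payload, invented for this illustration.
record = {"user": "alice", "items": list(range(50)), "active": True}

# Baseline: a handler body that returns static bytes, like the benchmark.
static = timeit.timeit(lambda: b"Hello world!", number=100_000)

# A handler that does minimal real work: serialize the record to JSON.
dynamic = timeit.timeit(lambda: json.dumps(record).encode(), number=100_000)

print(f"static body : {static:.4f}s per 100k calls")
print(f"json handler: {dynamic:.4f}s per 100k calls")
```

On typical hardware the JSON version is an order of magnitude slower or more, before touching a database or template, which is why the fast C parsing layer stops mattering once any real Python code sits on top.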

I bet if you put anything non-trivial on top and try to connect with real-world http clients which mostly don't use pipelining, it will fall over hard. Enjoy debugging what you can't understand.

The emphasis on HTTP parsing using SSE intrinsics is odd, as HTTP parsing is rarely a bottleneck. I'm not saying it can't be, but that will only be when the rest of your stack is highly tuned, performance is predictable, and profiling has shown that HTTP parsing is your actual bottleneck. Even then, HTTP/2 in theory alleviates this bottleneck without processor-specific intrinsics, and its multiplexing of requests is a better solution to the problems pipelining addresses.

EDITED: "how can I trust the performance of the framework with my code written in the host language?" Original text lacked emphasized addition.


Python has always made it possible to use C; that's a design feature. It's ridiculous to characterize this as untrustworthy.


I only meant trustworthiness in terms of predictability of performance.

The point was that Go's http stack and standard library and runtime being written all in Go gives me confidence that any Go code I write on top will enjoy the same performance characteristics and that there will be very few surprises. It's predictable due to its uniformity.


I assume your OS and hardware firmware are also written in Go...


Performance is not a predictable quantity at macroscopic (application) scales.


This is also why I am looking into more analytics work in Go. Uniformity of performance across different data analysis processes.


I wouldn't say untrustworthy, but it's definitely misleading to say Python achieves that performance when it's actually calling into C for most of it.


It may be misleading to imply that this is pure Python, but if the C is encapsulated well enough that someone can write their server solely in Python, then it's easy to argue that this is Python. After all, Go has bits that are written in assembly, but we don't have to qualify every Go benchmark with an asterisk and a footnote to mention that.


Well, I guess that's a good argument. Though I feel like unless it's part of the base language implementation, you'd have to consider it an extension to the language.

So CPython cannot in fact deliver that performance. But CPython can be extended through Japronto to achieve it. I'd say it's still misleading, since casually when we refer to Python, it means CPython.

Go's primary implementation provides all the goodies needed for fast performance. There could be a slow Go compiler that didn't, but when you say Go, it refers to Google's implementation.

You could argue that, at least, CPython makes it easy to extend the language with C. So Python language extensions are trivial to use compared to some other languages. I'll give you that.


That's how scripting languages have always worked. That's typically the reason one language is considered a scripting language while another, like Go, is not. I tend to like the Python arrangement more because it's about as high-level as you can get, and even pure Python performance suits me. But they're both good setups.


Many parts of the Python stdlib are written in C, so C and Python already work closely together in Python.


Yes, because the real bottleneck is TCP. Pipelining reduces the number of packets sent. However, very few clients actually support pipelining, so it's almost useless.


Also, it's measuring the number of static responses to a single client. I guess there probably is a use case for this, but a more typical situation is many clients each making one or a few requests, and also doing the TLS handshake.


Yeah, HTTP/1.1 pipelining is useless. But the good news is, HTTP/2 is pipelined (and doesn't have the head-of-line blocking problem). And HTTP/2 is supported by modern browsers.

But anyway… HTTP/2, or 1.1 pipelining for that matter, is usually terminated at the reverse proxy level. So it's really not necessary in a web framework! Just makes these unfair benchmark results possible.


I think their benchmark isn't pushing the server hard enough (and a single "hey" load generator may not be enough either). If it were, you'd actually see the benefit of using fewer blocking OS threads.

If you look at the Plaintext TechEmpower benchmark [1], for instance, the "echo-prefork" Go benchmark hits 3.6M requests per second. fasthttp hits 2.8M. On a less powerful cloud server, fasthttp hits 850000 and echo-prefork hits 742000 (yes, fasthttp is faster on a slower system...the fun of benchmarks).

Not sure how fast your machine is, but I'm sticking with Go for my performance-critical code. As a parallel comment points out, Go is also fast no matter what code you're running, and Go's optimizations are throughout the stack, including some pretty extreme garbage collection optimizations, so when you have a complex, long-running server, GC won't be killing latency at a crucial time.

[1] https://www.techempower.com/benchmarks/#section=data-r13&hw=...


I measured single thread because the author measured single thread.


I missed that fact.

In that case, the author is measuring something that's completely useless. The entire advantage of Go is that it has really good multithreaded asynchronous behavior.

A benchmark that's measuring single thread performance of a task that's optimized to do well in a single thread (i.e., it doesn't actually do anything other than return static text) is entirely worthless.

If you're going to be returning static text, may as well compare to Nginx, which I'm certain can return static text even faster. If you're going to be doing processing in Python, then do at least something.

And run in multiple threads, since that's Go's native environment.

Rating: PANTS ON FIRE. [1]

[1] Not your comparison, but the original author's performance claims.


The author doesn't care about the advantage of Go, he cares about fair comparison to his project, which is single-threaded Python.


> But the benchmark is not fair when only one server supports pipelining.

Anyone know why Go went from 45k req/s to 150k when pipelining was enabled, without supporting pipelining? Or was your last run with yet more concurrent requests?


'coz enabling pipelining on the client side has some gains (all requests go out in one packet), even if the server won't pipeline.


But isn't the flow then:

  1) client: gimme 10
  2) server: here, have 1
  3) client: ok, gimme 9!
  4) server: here, have 1
  ...

?


No. The client sends one packet containing the 10 requests; the server sends 10 packets. Basically, the client avoids 9 extra round-trips and just sits and waits for the ten replies. However, this whole thing has many cons, which is why it's hardly used anywhere.

It's more like this:

    1) client: gimme 10, gimme 9...
    2) server: here, have 10
    3) server: here, have 9
    ...
Edit: There is not always a 1:1 relationship between packets and requests/responses, but it's easier to explain that way. You end up sending fewer packets when pipelining, plus you don't need to wait for a response before sending the next request, which also saves time (i.e., the client has already sent every request, and the server doesn't have to wait for the client to receive one response before sending the next).
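
The flow above can be sketched with nothing but the Python stdlib. This is a hypothetical demo, not Japronto's or anyone's actual benchmark code: a local HTTP/1.1 server (keep-alive enabled, which is what makes pipelining possible), and a client that writes three GET requests in a single send before reading any responses back, in order, on the same connection.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Hello(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # keep-alive, required for pipelining

    def do_GET(self):
        body = b"Hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # keep the demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Hello)
threading.Thread(target=server.serve_forever, daemon=True).start()

request = b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"
with socket.create_connection(server.server_address) as conn:
    conn.sendall(request * 3)        # three requests, one write
    data = b""
    while data.count(b"Hello") < 3:  # read until all three bodies arrive
        data += conn.recv(4096)

print(data.count(b"HTTP/1.1 200"), "pipelined responses")
server.shutdown()
```

The client here never waits for a response before sending the next request, which is exactly the round-trip saving described above; a non-pipelining client would need three send/receive cycles instead of one.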


Have you tried running the benchmark with Gatling? This load-testing tool seems promising and more accurate than others I've used.



