The YAPC::Asia hackathon is over but hacking is continuing ...
- Kazuho made the fast XS-based HTTP header parser HTTP::Parser::XS, and I wrote Plack::HTTPParser::PP, a pure-Perl fallback with the same API. The XS version is 10 times faster (but makes the whole server only about 20% faster, because header parsing takes a smaller part of the processing time than the other I/O work).
- Kazuho also patched Leont's Sys::Sendfile to support Mac OS X.
- With those two in my toolkit, I created Plack::Impl::Standalone, a pretty simple IO::Socket::INET-based standalone server, single-threaded and non-forking, that uses the XS parser and sendfile(2) if available (and falls back to pure Perl otherwise).
- I also made the first pass of the Apache2 backend.
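The gap between "10 times faster" and "about 20% faster" in the first item is what Amdahl's law predicts when the optimized step is a small share of the total. A rough back-of-the-envelope sketch, assuming the two figures mean a 10x speedup of header parsing and a 1.2x gain in overall throughput (both readings are assumptions, and Python is used here only to do the arithmetic):

```python
def parsing_fraction(part_speedup: float, overall_speedup: float) -> float:
    """Solve Amdahl's law for p, the share of time spent in the sped-up part.

    overall_speedup = 1 / ((1 - p) + p / part_speedup)
    """
    return (1 - 1 / overall_speedup) / (1 - 1 / part_speedup)


def overall_speedup(part_speedup: float, p: float) -> float:
    """Forward direction of Amdahl's law, used to sanity-check the result."""
    return 1 / ((1 - p) + p / part_speedup)


# Assumed inputs: parser is 10x faster, whole server is 1.2x faster.
p = parsing_fraction(10.0, 1.2)
print(f"header parsing share of request time: {p:.1%}")
```

Under those assumptions, header parsing would have accounted for a little under a fifth of the per-request time before the XS parser was swapped in.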
And now it's time to benchmark. Overall, the standalone daemon beats everything else even though it's a single-threaded, non-forking daemon: the simple Hello World gets about 2500 req/s on Standalone, 2000 req/s with Apache2 and 1200 req/s with ServerSimple. Catalyst's default Welcome page runs at 200 req/s with Standalone and 140 req/s with Apache2. The numbers don't really change if I increase the concurrency (-c) to a bigger number like 20.
Note that to run Catalyst via PSGI on Apache2 on Leopard I needed to re-install the CPAN modules as 64-bit builds; otherwise I could run Apache in 32-bit mode (as a PPC binary), but that makes the whole Apache really slow.
In other news, Catalyst::Engine::PSGI and Plack::Impl::Standalone now pass all live and aggregated tests from the Catalyst-Runtime distribution. In the process I found a pile of bugs in both Engine::PSGI and Impl::Standalone, and they're all fixed. This is a huge milestone!
The other implementations like ServerSimple also get very close to passing 100%, and the test suite has also revealed some weird bugs and corner cases around URI decoding etc. Fun.
I am hoping to get the HTTP::Parser module to be much more functional than it currently is. I plan to look closely at both HTTP::Parser::XS and Plack::HTTPParser::PP to see how I can best do this, so that the code is useful to the most people. Ideally, I'd like it to be a drop-in replacement for the several other choices. This will all be somewhat dependent on how strongly the original author feels about his current code and interface. I'm hoping I can sway him to my way of thinking.
Posted by: www.rjray.org | 2009.09.14 at 13:18
Yes, I planned to use HTTP::Parser for this (I actually did when I implemented HTTP::Engine::Interface::AnyEvent), but we created a new parser so that: a) the memory footprint is minimal and the speed is very high, b) parsing stops as soon as the headers are parsed, and c) it does not create an HTTP::Response object but a PSGI-like hash instead.
So our end result is a minimalistic, low-memory and very fast header-only parser that doesn't use objects. If you try to improve HTTP::Parser that's great (for instance it supports chunked requests as well, as far as I can see), but keep in mind our three requirements if you want us to switch :)
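To make the three requirements concrete, here is a minimal sketch of a header-only parser that stops at the end of the headers and fills a plain PSGI-like hash, with no request object. It is an illustration in Python, not the code of HTTP::Parser::XS or Plack::HTTPParser::PP; the function name `parse_request_head` and its return codes are choices made for this sketch.

```python
def parse_request_head(buf: bytes, env: dict) -> int:
    """Parse only the request line and headers of `buf` into `env`.

    Returns the number of bytes consumed (through the blank line) on
    success, -2 if the headers are not complete yet, -1 if malformed.
    These return codes are a convention of this sketch.
    """
    end = buf.find(b"\r\n\r\n")
    if end == -1:
        return -2  # need more data; the body is never inspected

    lines = buf[:end].decode("latin-1").split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return -1
    method, target, protocol = parts
    path, _, query = target.partition("?")

    parsed = {
        "REQUEST_METHOD": method,
        "REQUEST_URI": target,
        "PATH_INFO": path,  # left undecoded here to keep the sketch short
        "QUERY_STRING": query,
        "SERVER_PROTOCOL": protocol,
    }
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep or not name:
            return -1
        key = name.upper().replace("-", "_")
        if key not in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            key = "HTTP_" + key  # CGI-style naming for other headers
        value = value.strip()
        # Repeated headers are joined with a comma.
        parsed[key] = parsed[key] + ", " + value if key in parsed else value

    env.update(parsed)  # only touch env once the head parsed cleanly
    return end + 4


env = {}
request = b"GET /hello?a=32&b=11 HTTP/1.1\r\nHost: localhost\r\nContent-Length: 3\r\n\r\nabc"
consumed = parse_request_head(request, env)
body = request[consumed:]  # the caller reads the body separately
```

Because the function returns the offset where the headers end, the server can hand the rest of the buffer (and the socket) to whatever reads the body, which is the point of requirement b).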
Posted by: miyagawa | 2009.09.14 at 13:58
We are playing with running simple things under nginx PSGI and see that parsing the query string (we use CGI::PSGI) takes a long time. It occurred to me that we actually do double work here, because nginx has its own parameter parsing. There's just no easy way to pass the result into our code via PSGI.
Are there plans to include something like that in the PSGI spec? An optional 'args' hashref element in $env would make a lot of sense. Frameworks aware of the element would use it if it exists instead of parsing the query string themselves.
The Apache backend could have a very fast query parser via apr too, btw.
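The proposal amounts to a small lookup-with-fallback on the framework side. A sketch in Python, assuming a hypothetical `args` key that the server may or may not fill in (no such key exists in the PSGI spec; `request_params` is a name made up for this example):

```python
from urllib.parse import parse_qsl


def request_params(env: dict) -> dict:
    """Return request parameters, preferring ones the server already parsed."""
    args = env.get("args")  # hypothetical, optional server-provided hash
    if args is not None:
        return args  # skip parsing entirely
    # Fallback: parse QUERY_STRING ourselves. Converting to a dict keeps
    # only the last value of a repeated key, which is enough for a sketch.
    return dict(parse_qsl(env.get("QUERY_STRING", ""), keep_blank_values=True))


# A server without native parsing: the framework parses QUERY_STRING.
plain = request_params({"QUERY_STRING": "a=32&b=11"})

# A server that passes its own parsed parameters: no parsing is done.
native = request_params({"QUERY_STRING": "a=32&b=11", "args": {"a": "32", "b": "11"}})
```

Frameworks that don't know about the key keep working unchanged, since QUERY_STRING is still present in both cases.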
Posted by: Account Deleted | 2009.09.26 at 17:33
It probably won't be in the spec, but I see your point about having an (optional) parameter to bypass parsing if the web server has an API to parse the parameters itself. But I wonder how parsing QUERY_STRING could realistically become a bottleneck?
If you have numbers (a benchmark etc.) showing the difference between, say, APR/nginx native QUERY_STRING parsing and a pure-Perl parser, that'd be nice to see.
Posted by: miyagawa | 2009.09.26 at 17:50
Which version of CGI::PSGI, by the way? I'd be interested to see the actual code, and to know whether you use Yappo's nginx patches when you say "nginx PSGI". PSGI has lots of options to configure ... like the Standalone backend, embedded nginx perl or fastcgi, so :)
Posted by: miyagawa | 2009.09.26 at 17:57
OK I see what you're doing: http://github.com/kappa/perl-httpd-benchmarks
So you're calling Dancer, which uses CGI::PSGI, from PSGI-enabled nginx? Yeah, that definitely sounds like double work. I wonder how that compares to, say, running a raw PSGI-based app from nginx embedded perl? CGI::PSGI's sole purpose is to help transition CGI.pm-based web frameworks to PSGI, so it's probably not the best option for benchmarking. But you're correct that there might need to be double work anyway, although it could be faster than the slightly hacky CGI::PSGI subclassing.
Posted by: miyagawa | 2009.09.26 at 18:18
Actually, this is not the Dancer example that is interesting. Just skip it :)
(I use Yappo's nginx patches and the last CGI::PSGI from github.)
See the last three handlers in http://github.com/kappa/perl-httpd-benchmarks/blob/master/NginxSum.pm
The very last uses parameters directly from the $env -- this requires changes to Yappo's patch to pass them through from nginx (GET parameters are always parsed and available as $arg_NAME variable in nginx). This variant is also the fastest.
I tested them all via "ab -n 10000 -k 'http://127.0.0.1:8011/sum_raw?a=32&b=11'". The results on my box are:
handler_rawpsgi: ~3300 r/s
handler_raw2: ~8000 r/s
handler_raw: ~8100 r/s
Posted by: Account Deleted | 2009.09.27 at 04:28
As I think about this issue more, I have come to realize that this naive benchmark only shows that CGI::PSGI->new() is slow (as slow as CGI->new() is, naturally). It does so much more than simply parse the query string. Still, most of the work it does is only needed when processing complex requests. It would be interesting to devise a scheme to avoid it most of the time.
Posted by: Account Deleted | 2009.09.27 at 04:45
And one more thing. Is there a mailing list or something where you people doing these cool PSGI things communicate? I feel that a lot of issues I face should probably have been discussed to death :)
Posted by: Account Deleted | 2009.09.27 at 04:50
Yes, we run similar benchmarks with benchmarks/ab.pl and the like in our Plack repo, and doing the same with our (pretty fast :) Standalone server shows 3000 req/s without doing anything with $env, 1200 req/s with CGI::PSGI->new and 1100 req/s with CGI::PSGI->new->param with a reasonable QUERY_STRING.
So I agree with you that this is not really a QUERY_STRING problem but a cost of using CGI::PSGI->new, which might be slower than other approaches like Plack::Request. In the real world, though, this difference might be smaller than, say, your application's run time anyway.
Posted by: miyagawa | 2009.09.27 at 04:56
See our spec and FAQ, but we're chatting on #http-engine on irc.perl.org, and we have a pretty low traffic mailing list http://groups.google.com/group/psgi-plack
I'm preparing a website that has links to all of these and specifications when I upload Plack and PSGI 1.0 distribution to CPAN pretty soon :)
Posted by: miyagawa | 2009.09.27 at 04:57
FYI, I did a benchmark using Plack::Request (kind of like Apache::Request in mod_perl, but with few deps), and with some optimizations I just pushed minutes ago, Plack::Request now gives 2200 req/s while CGI::PSGI gives 1200 req/s and raw handler gives 3200 req/s in a simple GET request.
Posted by: miyagawa | 2009.09.27 at 06:26
Plack::Request looks interesting! I'm going to give it a try outside of Plack :)
Meanwhile, standalone Plack with AnyEvent/EV backend shows us very promising performance, too. This is very cool and congratulations on that!
Posted by: Account Deleted | 2009.09.27 at 07:58
Thanks! And yeah, despite the Plack:: name, Plack::Request can be used outside Plack server environments. We actually plan to make it a separate distribution from Plack core :)
Posted by: miyagawa | 2009.09.27 at 13:53