ssl_session_cache in Nginx and the ab benchmark

December 31, 2010
2 comments DoneCal, Linux

A couple of days ago I wrote about how blazing fast the DoneCal API can be on HTTP (1,400 requests/second) and how much slower it becomes when doing the same benchmark over HTTPS. As Chris Adams pointed out, it's possible to run ab with Keep-Alive on, and after some reading up it's clear that it's a good idea to switch on a shared ssl_session_cache so that Nginx can cache SSL sessions and avoid repeating some of the handshakes.

With ssl_session_cache shared:SSL:10m:


 Requests per second:    112.14 [#/sec] (mean)

Same cache size but with -k on the ab loadtest:


Requests per second:    906.44 [#/sec] (mean)

I'm fairly sure that most browsers will use Keep-Alive connections, so I guess it's realistic to use -k when running ab. But since this is a test of an API, it's perhaps more likely than not that clients (i.e. computer programs) don't use it. To be honest I'm not really sure, but it nevertheless feels right to be able to use ssl_session_cache to boost my benchmark by 40%.
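To get a feel for what Keep-Alive means for a programmatic API client, here is a minimal sketch in Python 3 using only the standard library. It is not part of the DoneCal code: the `CountingHandler` class, the `count_connections` helper and the `/api/events.json` path are made up for illustration, and it talks plain HTTP to a throwaway local server rather than HTTPS to Nginx. It only shows that a client which reuses its connection object opens one TCP connection, while a client that reconnects for every request opens one per request (and, over HTTPS, would pay for a handshake each time).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Tiny JSON endpoint that counts how many TCP connections it receives."""

    protocol_version = "HTTP/1.1"  # persistent connections need HTTP/1.1
    connections = 0

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        body = b'{"events": []}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_connections(requests=5, keep_alive=True):
    """Make `requests` GETs and return how many connections the server saw."""
    CountingHandler.connections = 0
    server = HTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    conn = None
    try:
        for _ in range(requests):
            if conn is None:
                conn = http.client.HTTPConnection(host, port)
            conn.request("GET", "/api/events.json")
            conn.getresponse().read()
            if not keep_alive:
                # Behave like a client that reconnects for every request.
                conn.close()
                conn = None
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    print("with keep-alive:   ", count_connections(keep_alive=True))
    print("without keep-alive:", count_connections(keep_alive=False))
```

Whether real API clients behave like the first or the second case depends on the HTTP library they use and on whether they hold on to the connection between calls.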

It's also worth noticing that the HTTP benchmark is CPU bound on the Tornado (Python) processes (I use 4), whereas the HTTPS benchmark is CPU bound on Nginx itself (I use 1 worker process).

Speed of DoneCal API (over 1,400 request/sec) and HTTPS (less than 100 request/sec)

December 27, 2010
4 comments DoneCal

DoneCal (my simple calendar and time sheet substitute web app) now has HTTPS support. It's not enabled yet as I'm still doing some more testing. Basically, HTTPS is, at least at the moment, only going to be available to premium users. Anyway, this is a performance story about the difference in speed between HTTP and HTTPS.

I'll let these unscientific benchmarks speak for themselves.

HTTP:


donecal:~# ab -n 1000 -c 10 "http://donecal.com/api/events.json?guid=xxx&start=1292999600&end=1293294812"
...
Document Length:        616 bytes
Failed requests:        0
...
Requests per second:    1432.40 [#/sec] (mean)
...
Transfer rate:          1184.81 [Kbytes/sec] received

HTTPS:


..
Server Port:            443
SSL/TLS Protocol:       TLSv1/SSLv3,DHE-RSA-AES256-SHA,2048,256

...
Document Length:        616 bytes
Failed requests:        0
...
Requests per second:    84.73 [#/sec] (mean)
...
Transfer rate:          70.08 [Kbytes/sec] received

That's quite a huge difference in requests per second: HTTPS is 17 times slower than HTTP. Is this the reality of HTTPS? Or is something wrong with my cert, or with running HTTPS through ab?
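For the record, the "17 times" figure is just the ratio of the two "Requests per second" lines in the ab output above. A quick sketch of the arithmetic in Python (the variable names are mine, the numbers are copied from the two runs):

```python
# "Requests per second" as reported by ab in the two runs above.
http_rps = 1432.40
https_rps = 84.73

# How many times fewer requests per second HTTPS managed.
slowdown = http_rps / https_rps
print("HTTPS is %.1f times slower than HTTP" % slowdown)
```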

Anyway, this is pretty good, methinks. The HTTP version is over 1,400 requests per second, and this is a full application involving a database, security and encoding. This particular test data (616 bytes of JSON) isn't big, but it sure is bigger than some of the "hello world" benchmarks you see on the interweb.

UPDATE

See this new entry about enabling ssl_session_cache in Nginx

To code or to pdb in Python

December 20, 2010
6 comments Python

This feels like a bit of a face-plant moment but I've never understood why anyone would use the code module when you can use the pdb when the pdb is like the code module but with less.

What you use it for is to create your own custom shell. Django does this nicely with its shell management command. I often find myself doing this:


$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from this import that
>>> from module import something
>>> from db import connection
>>> con = connection('0.0.0.0', bla=True)
>>> con.insert(something(that))

And there's certain things I almost always import (depending on the project). So use code to write your own little custom shell loader that imports all the stuff you need. Here's one I wrote real quick. Ultra useful time-saver:


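#!/usr/bin/env python
# Custom interactive shell: preloads the models and a database connection.
import code, re
if __name__ == '__main__':
    from apps.main.models import *
    from mongokit import Connection
    from pymongo.objectid import InvalidId, ObjectId
    con = Connection()
    db = con.worklog
    # list the capitalised names (models, classes) plus 'db' and 'con'
    print "AVAILABLE:"
    print '\n'.join(['\t%s' % x for x in locals().keys()
                     if re.findall(r'[A-Z]\w+|db|con', x)])
    print "Database available as 'db'"
    # drop into a prompt with everything above in scope
    code.interact(local=locals())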

This is working really well for me and saving me lots of time. Hopefully someone else finds it useful.
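If you want to try the idea without MongoKit or a project to import from, here is a dependency-free sketch of the same technique for Python 3. The helper names (`build_namespace`, `build_banner`) and the `settings` dictionary are made up for illustration and simply stand in for whatever you would normally import. It only starts the prompt when run from a terminal.

```python
#!/usr/bin/env python3
import code
import json
import re
import sys


def build_namespace():
    """Collect the names the custom shell should start with."""
    # Stand-ins for project-specific imports such as models or a db handle.
    return {"json": json, "re": re, "settings": {"debug": True}}


def build_banner(namespace):
    """List the preloaded names so you can see what is available."""
    names = "\n".join("\t%s" % name for name in sorted(namespace))
    return "AVAILABLE:\n%s" % names


if __name__ == "__main__" and sys.stdin.isatty():
    namespace = build_namespace()
    # code.interact() opens a normal Python prompt with `namespace` in scope.
    code.interact(banner=build_banner(namespace), local=namespace)
```

Passing an explicit dictionary instead of `locals()` keeps the shell's namespace limited to what you chose to preload.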

Page here about DoneCal

December 19, 2010
0 comments DoneCal

I've finally had the time to write a little bit about my latest web app project: DoneCal.

Hopefully the banner at the top of this page will yield a bit of traffic. Since my marketing budget for DoneCal is exactly $0.00 I'm going to try to go for nice organic SEO and just generally build a great app, so that people link to it not because they have to but because they want to.

Let me know if the text makes sense or if it makes you feel confused about what DoneCal is.

DoneCal gets a grade A (92)

November 27, 2010
3 comments DoneCal

All the hard work I've put into DoneCal pre-optimization has paid off: I got a Grade A with 92 percent on YSlow!

What's cool about this is that, unlike other sites I've built with a high YSlow score, this site is very Javascript intense. Rendering the home page depends on 9 different Javascript files weighing over 300 Kb, which when combined and packed for production use are reduced to 5 requests weighing in at just over 80 Kb. The reason it's still 5 and not just 1 is also important. This is deliberate: it loads only the minimum needed to render the calendar first, and then, after the DOM is fully rendered, more Javascript is pulled in depending on what's needed.

One annoying thing about YSlow is that it suggests that you use CDNs for Javascript and CSS files. What they perhaps don't appreciate is that most CDNs don't support negotiated gzipping like Nginx does. Being able to gzip a CSS or Javascript file generally means less waiting for the client than getting it un-gzipped from a CDN. One thing I will work on, though, is perhaps serving all the images that support the CSS from my Amazon Cloudfront CDN, since gzipping is not applicable to images.
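To see why gzipping text assets matters, here is a rough sketch using Python's standard gzip module. The Javascript snippet is made up and far more repetitive than a real file, so the ratio it prints is more flattering than what you'd see in practice. The point is only that text like Javascript and CSS shrinks a lot.

```python
import gzip

# Made-up, repetitive Javascript standing in for a real script file.
line = "function addEvent(day, title) { calendar.push({day: day, title: title}); }\n"
source = (line * 200).encode("utf-8")

compressed = gzip.compress(source)
percent = 100.0 * len(compressed) / len(source)
print("original: %d bytes, gzipped: %d bytes (%.1f%% of original)"
      % (len(source), len(compressed), percent))
```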

Gmail tip: Searching only for attachments

November 25, 2010
0 comments

I've many times seen people searching in Gmail when what they're looking for is an attachment. Often you search for something like "ProjectX" and find a huge thread full of emails without that one document attachment you're looking for.

Add to your normal search:


has:attachment

and it will only find emails with an attachment. See example screenshot on the right.

Welcome to the world: DoneCal.com

November 22, 2010
0 comments Python, Tornado

After about two months of evening hacking I'm finally ready to release my latest project: DoneCal.com

It's a simple calendar that doesn't get in your way. You just click on a day and type what you did that day. DoneCal can be an ideal replacement for boring spreadsheet-like timesheets. And unlike regular timesheets/timetrackers, with tags you immediately get statistics about how you've spent your time.

I'm personally excited about the Bookmarklet because I practically live in my web browser and now I can quickly type what I've just done (could be a piece of support work for a client) with one single click.

If you're a project manager trying to track what your developers are working on, ask them to start tracking time on DoneCal and then ask them to share their calendar with you. They can set up their share so that it only shares on relevant tags.

I'm going to be improving it more and more as feedback comes in. Hopefully later this week I'm going to be writing about the technical side of this, since this is my first web app built with the uber-fast Tornado framework.

jsonpprint - a Python script to format JSON data nicely

November 21, 2010
5 comments Python

This isn't rocket science but it might help someone else.

I often do testing of my various restful HTTP APIs on the command line with curl but often the format the server spits out is very compact and not easy to read. So I pipe it to a little script I've written. Used like this:


$ curl http://worklog/api/events.json?u=1234 | jsonpprint
{'events': [{'allDay': True,
            'end': 1290211200.0,
            'id': '4ce6a2096da6814e5b000000',
            'start': 1290211200.0,
            'title': '@DoneCal test sample'},
           {'allDay': True,
            'end': 1290729600.0,
            'id': '4ce6a22b6da6814e5b000001',
...
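The script itself isn't shown in this excerpt, so what follows is not the author's code. It's a minimal sketch, for Python 3, of how a filter like this could be written with the standard json and pprint modules. The `format_json` helper name is made up. It prints nothing when stdin is a terminal or empty.

```python
#!/usr/bin/env python3
import json
import sys
from pprint import pformat


def format_json(text):
    """Parse a JSON string and return it pretty-printed as a Python literal."""
    return pformat(json.loads(text))


if __name__ == "__main__" and not sys.stdin.isatty():
    # Read everything piped in, e.g. from curl, and pretty-print it.
    piped = sys.stdin.read()
    if piped.strip():
        print(format_json(piped))
```

Saved as an executable file called jsonpprint somewhere on your PATH, it could be used at the end of a curl pipe like in the example above.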


How to book a ticket on the Royal Academy of Music's website

November 13, 2010
1 comment Web development

I've finally managed to book my ticket to see Zappa. It's the Royal Academy of Music Manson Ensemble who play about 10 Frank Zappa classics. It's here in London on Baker Street.

The Royal Academy of Music website sucks. Its ticket booking part is completely broken. Fortunately I found a way to "hack" it so that I could get a ticket. And it only cost me £1 extra.

On that note, why isn't the box office open on weekends? And why is no one answering any of their phones on a Saturday?


Worst Flash site of the year 2010

November 8, 2010
2 comments Misc. links

If you ever wonder how to make a website that is just wrong on every front, turn up your volume and tune into http://industrialpainter.com/ Oh yeaaaahhh...

It's got it all.

  • completely irrelevant background music
  • epileptic flashing animations
  • uber-geeky loading counters showing you how many kilobytes you've downloaded
  • new Flash file for each annoying page
  • a counter
  • blurred mugshots
  • telling you what day it is, what date it is and what time it is in three completely different locations on the screen
  • marquee text scrolling by
  • being Flash