Damn lies and benchmark comparing Apache and Nginx

June 3, 2008
7 comments Linux

Today I moved a bunch of sites over from Apache to Nginx, still keeping Squid in between as an HTTP accelerator (I hope to replace Squid with Varnish soon). I did a quick benchmark of an HTML page that is cached by Squid, 4 times via Apache and 4 times via Nginx. The results:


Apache2
********
Requests per second:    1601.34 [#/sec] (mean)
Time per request:       6.268 [ms] (mean)
Time per request:       0.627 [ms] (mean, across all concurrent requests)
Transfer rate:          13020.50 [Kbytes/sec] received

Nginx
********
Requests per second:    1810.02 [#/sec] (mean)
Time per request:       5.6435 [ms] (mean)
Time per request:       0.5645 [ms] (mean, across all concurrent requests)
Transfer rate:          14591.35 [Kbytes/sec] received

That's "only" 13% faster. I had hoped for a bigger difference, but the test is very simple and depends on how Squid feels. The other important test would be to see how much less CPU and memory Nginx uses during the stress test period, but that's for another day.
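
For the curious, the 13% comes straight from the two "Requests per second" figures above. A quick sketch of the arithmetic in Python (the variable names are just mine):

```python
# Mean requests per second, copied from the benchmark output above
apache_rps = 1601.34
nginx_rps = 1810.02

# Relative difference of Nginx over Apache, as a percentage
speedup_percent = (nginx_rps / apache_rps - 1) * 100
print(round(speedup_percent, 1))  # 13.0
```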

One note: This is Nginx 0.4.3 on Debian Etch. The current stable release is Nginx 0.6.13. I'll need to talk to my sys admins to remedy this. Perhaps it makes a difference on the benchmark, I don't know.

zope-memory-readings - Tracking Zope2's memory usage by URL

May 30, 2008
0 comments Zope

I've just released a new little project in Python for tracking memory usage in Zope2 applications, with the added benefit that you can hopefully see which URL causes which memory usage "jumps". Hopefully this can help Zope2 developers find out what causes RAM bloat, and it can also help you optimize your application by finding out early in the development process what uses too much RAM. I wouldn't be surprised if there is already a program that does something like this. I've just never seen one. Also, by putting this out as an Open Source project and blogging about it, hopefully more clever people than me will come forward and point out the right way to do things.

I've also used Google Code this time to manage the project. I've used it before but only for hosting the public SVN repository for IssueTrackerProduct. I have to say that I was quite impressed with Google Code this second time. I think it's still fundamentally wrong to confuse people by offering both download and SVN checkout. I did both this time but I think I might give up on the downloads because who out there, who understands that he/she needs to debug RAM usage, doesn't know how to use SVN?

Finally a little disclaimer: By writing about this here, preparing it on Google Code and writing a README.txt file I've now spent more time "managing" the project than I have on coding it. It's an early test release which hopefully will stir up some ideas for genuinely important improvements. I had fun coding it as well since this is my first attempt with Flot, which has been great to work with. You get very quick and powerful results. Lastly, I haven't tested this in anything but 32-bit Ubuntu Linux and Firefox.

Here is a sample report: 2008-05-30_16.47.32__3.8_minutes

split_search() - A Python function for advanced search applications

May 15, 2008
0 comments Python

Inspired by Google's way of working I today put together a little script in Python for splitting a search. The idea is that you can search by entering certain keywords followed by a colon like this:


Free Text name:Peter age: 28

And this will be converted into two parts:


'Free Text'
{'name': 'Peter', 'age': '28'}

You can configure which keywords should be recognized. To keep things simple, you can basically set this to the columns you offer advanced search on in your application, for example (from_date, to_date).

Feel free to download and use it as much as you like. You might not agree completely with its purpose and design so you're allowed to change it as you please.

Here's how to use it:


$ wget https://www.peterbe.com/plog/split_search/split_search.py
$ python
>>> from split_search import split_search
>>> free_text, parameters = split_search('Foo key1:bar', ('key1',))
>>> free_text
'Foo'
>>> parameters
{'key1': 'bar'}
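
If you just want to see the idea without downloading anything, here's a rough sketch of how such a split could be done. This is not the code in split_search.py; split_search_sketch is a made-up name and the regular expression approach is only an assumption about one reasonable way to do it:

```python
import re

def split_search_sketch(q, keywords):
    """Hypothetical re-implementation of the idea, NOT the real split_search()."""
    if not keywords:
        return q.strip(), {}
    # Match any configured keyword followed by a colon, e.g. "name:" or "age: "
    alternatives = '|'.join(re.escape(k) for k in keywords)
    pattern = re.compile(r'\b(%s)\s*:\s*' % alternatives)
    matches = list(pattern.finditer(q))
    if not matches:
        return q.strip(), {}
    # Everything before the first keyword is the free text
    free_text = q[:matches[0].start()].strip()
    parameters = {}
    for i, match in enumerate(matches):
        # A value runs until the next keyword, or to the end of the string
        end = matches[i + 1].start() if i + 1 < len(matches) else len(q)
        parameters[match.group(1)] = q[match.end():end].strip()
    return free_text, parameters

print(split_search_sketch('Free Text name:Peter age: 28', ('name', 'age')))
# ('Free Text', {'name': 'Peter', 'age': '28'})
```

Unlike the real thing, this sketch doesn't try to be clever about quoting or about keywords that appear inside ordinary words.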

UPDATE

Version 1.3 fixes a bug when all you've entered is one of the keywords.

The importance of the TITLE attribute

April 23, 2008
2 comments Web development

Let's go back to basics of HTML development.

All A tags whose content isn't a text string should have a TITLE attribute

If your link is plain like this, adding a TITLE attribute is less über important but gives you a chance to help your poor user even more:


<a href="settings.html" 
   title="Change settings, language and preferred colour">Settings</a>

Where it really matters is when you use an icon instead of text as the link. Having an ALT attribute on the image isn't always good enough. Some browsers will not show the ALT attribute tooltip when you roll over an image that is wrapped in an A tag. Here's how you should do it:


<a href="settings.html"
   title="Change settings, language and preferred colour">
   <img src="wrench.gif" alt="Wrench" border="0" />
</a>

Sure, you should use the ALT attribute. In the example above, what happens in Firefox when you roll over the icon is that the TITLE attribute's content is shown in the tooltip. To get the same tooltip in browsers that show the ALT text instead, copy the TITLE attribute to the ALT attribute so it looks like this:


<a href="settings.html"
   title="Change settings, language and preferred colour">
   <img src="wrench.gif" border="0"
        alt="Change settings, language and preferred colour" />
</a>

Now you get the best user experience in both Firefox and IE. Your users can roll the mouse over the icon and be guided by a tooltip if they're uncertain what clicking the link means. You have probably, like me, been on tonnes of sites with mysterious icons you can click and you have no idea what they do. Sometimes they have tooltips, sometimes just a tooltip like "email" or something equally cryptic. There have been times when I hesitate to click and instead try to guess what the click means by looking at the URL it will go to. If it looks something like this .../change_password?user_id=1234 that gives away a lot. Other times, I've actually inspected the name of the icon file to understand what it actually does (you can do this in Firefox by right-clicking and selecting Copy Image Location).

Why does this matter? The ultimate gospel in web usability (if you belong to the Steve Krug school) is: Don't make me think! Not only is it painful to have to waste seconds on guesswork and forensic analysis, it's also a really bad user experience since you force your users to plunge into a click they're not entirely certain about.
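
If you want to hunt down offenders on your own pages, the rule is simple enough to check with a script. Here's a minimal sketch using Python's standard html.parser module; the TitleChecker class and links_missing_title function are names I made up, and it only handles plain, non-nested A tags:

```python
from html.parser import HTMLParser

class TitleChecker(HTMLParser):
    """Collects the href of every A tag that has no text and no TITLE."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self._link = None  # details of the A tag currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            attrs = dict(attrs)
            self._link = {'href': attrs.get('href'),
                          'title': attrs.get('title'),
                          'text': ''}

    def handle_data(self, data):
        # Accumulate any text found inside the open A tag
        if self._link is not None:
            self._link['text'] += data

    def handle_endtag(self, tag):
        if tag == 'a' and self._link is not None:
            link, self._link = self._link, None
            # No visible text and no TITLE attribute: the user has to guess
            if not link['text'].strip() and not link['title']:
                self.problems.append(link['href'])

def links_missing_title(html):
    checker = TitleChecker()
    checker.feed(html)
    checker.close()
    return checker.problems

print(links_missing_title(
    '<a href="settings.html"><img src="wrench.gif" alt="Wrench" /></a>'))
# ['settings.html']
```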

Whilst I'm at it, this appeared in front of my eyes today on a hotel booking site. None of them were links but just icons with no ALT attribute. Can you guess them all?

Hotel booking icons

I've put together a little demo.html page so you can see for yourself what happens when you roll your mouse over these.

What I like and dislike about Grok

April 11, 2008
2 comments Zope

Martijn Faassen is my hero. Not only is he an absolutely brilliant coder, he's also able to talk so that mortals understand.

This is why I like Grok

What he's replying to is mainly the question "What does Grok give me that, say, Django does not?"

And, this is why I dislike Grok

Yes, you clever people. It's the same link. For some reason all the great documentation goes into replies on the mailing list rather than into a concise web page with a cookbook, a book and styled, funny tutorials. Why is that? They've actually made it quite easy now to enter documentation on grok.zope.org with the new Plone site.

An equally important question is: Why don't I do something about it rather than complain? Well, I've written one how-to at least. My other "excuse" is that I'm not yet enough of an expert and hence writing good documentation takes a very long time.

I think there's an important philosophical and political issue at hand. The Grok community is filled with really clever people who are very senior in the web development industry, who like using mailing lists and, perhaps more importantly, don't need documentation since they can study source code and unit tests to answer their questions. I know this is a sensitive statement but I'll take my chances since it implies that these guys are smarter (or perhaps just have more time on their hands).

My internal battle of which new web framework to put my energy into continues. Today (thanks to Martijn's post) Grok earned one more point.

Mixing in new-style classes in Zope 2.7

April 9, 2008
0 comments Zope

Don't ask why I'm developing products for Zope 2.7 but I had to and I should have been more careful with these oldtimers.

I kept getting this error:


TypeError:  expected 2 arguments, got 1

(notice the strange double space after the : colon) This is different from the standard Python TypeError you get when you get the parameters wrong, which looks like this: TypeError: __init__() takes exactly 2 arguments (1 given).

The line it complained about looked like this:


class MyTool(SimpleItem, UniqueObject, OtherClass):
    id = 'some_tool'
    meta_type = 'some meta type'

    def __init__(self, id='some_tool'):
        self.id = id  # <--- THIS WAS THE CULPRIT LINE APPARENTLY!!

I couldn't understand what the hell was wrong on that line! Clearly it wasn't a normal Python error. Here's the explanation: That OtherClass was a new-style class inheriting from object. It looked like this:


class OtherClass(object):
   ...

When I changed that to:


class OtherClass:
   ...

The whole thing started to work. Lesson learnt the long way: don't mix new-style classes into Zope 2.7 classes.

pwdf - a mix of ls and pwd

April 7, 2008
2 comments Linux

I often need to know the path to a file so that I can put that in an email for example. The only way I know is to copy and paste the output of pwd followed by a slash / followed by the name of the file. This is too much work so I wrote a quick bash script to combine this into one. Now I can do this:


$ cd bin
$ pwdf todo.sh 
/home/peterbe/bin/todo.sh

I call it pwdf since it's pwd + file. Here's the code for the curious:


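#!/bin/bash
# Print the current directory, a slash and the given file name on one line.
# Quoted so that file names containing spaces come out intact.
echo "$(pwd)/$1"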
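
The same pwd + file idea can also be sketched in Python, in case bash isn't your thing. The function name pwdf is simply borrowed from the script; like the script, it just glues the current directory and the name together and doesn't check that the file exists:

```python
import os

def pwdf(filename):
    # Join the current working directory and the given name,
    # the same as the output of pwd + "/" + file name
    return os.path.join(os.getcwd(), filename)

print(pwdf('todo.sh'))
```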

Is there no easier way built into Linux already?

Lesson learnt creating DOM elements with jQuery

April 4, 2008
6 comments JavaScript

This took me some seriously wasted time to figure out yesterday. What I was trying to do was to create a DOM element of tag type A and insert it into the DOM tree of my page. As I was coding along, everything was working just fine in Firefox but the damn thing wouldn't show up anywhere in IE 6. I debugged and debugged and tried all kinds of different approaches and I just couldn't work it out. Then Karl Rudd gave the right hint on the jQuery mailing list.

Basically, what I was doing was something like this:


var a = $("<a>").attr('href','#').click(somefunction);
$('#toolbar').append(a);

What was then so strange is now less surprising. When I changed the <a> to a <span> it actually worked but just looked wrong with the rest of the site I was working on. Here's the correct way of doing it:


var a = $("<a></a>").attr('href','#').click(somefunction);
$('#toolbar').append(a);

Notice the difference between <a> and <a></a>. The strange thing is that to reproduce this I created this test.html page, and there I noticed that IE 6 won't let you add any enclosing elements that are written as a single opening tag. That's really strange since in the same JavaScript as the above stuff I did a $("<div>") which was working fine. I'll have to get back to figuring out why that one worked and the A one didn't.

One thing I hate about Linux: cron

March 31, 2008
6 comments Linux

First of all, I understand that the problem cron solves is a hard one but come on, it's been many years now without much progress. At least not in the usability field of cron jobs. Secondly, I don't know of an operating system that does this better. Perhaps there is one. All I'm saying here is that this aspect of Linux sucks. The issues I have with cron are:

Beef number 1: Is it root, user1 or user2 running a crontab job? I'll have to su into each suspected user and run crontab -l. Granted, some jobs require root access and others don't, but it nevertheless makes it hard to find the configured jobs when maintaining someone's server.

Beef number 2: Even though they do such a similar thing, it feels like /etc/cron.* is a different battlefield from crontab. Why can't this all be in one coherent place?

Beef number 3: The crontab syntax. How difficult would it be to allow an interface to accept user input such as "every 10 minutes" or "01.30 every day"?

Beef number 4: With there being 12 different ways (sarcasm) to write cron job scripts there's no coherent place to collect all logs and errors that come from cron. Couldn't the default be to always write to /var/log/cron/access.log, and have every execution that writes to stderr append to /var/log/cron/error.log?
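
To show that beef number 3 isn't asking for the moon, here's a tiny sketch in Python that turns the two example phrases into the five time fields of a crontab line. The to_cron_schedule function is entirely hypothetical, nothing like it ships with cron, and it only understands those two phrase shapes:

```python
import re

def to_cron_schedule(phrase):
    """Hypothetical helper: friendly phrase -> 'minute hour dom month dow'."""
    phrase = phrase.strip().lower()

    # "every 10 minutes" -> "*/10 * * * *"
    match = re.fullmatch(r'every (\d+) minutes?', phrase)
    if match:
        step = int(match.group(1))
        if not 1 <= step <= 59:
            raise ValueError('minutes must be between 1 and 59')
        return '*/%d * * * *' % step

    # "01.30 every day" -> "30 1 * * *" (crontab puts the minute first)
    match = re.fullmatch(r'(\d{1,2})[.:](\d{2}) every day', phrase)
    if match:
        hour, minute = int(match.group(1)), int(match.group(2))
        if hour > 23 or minute > 59:
            raise ValueError('not a valid time of day')
        return '%d %d * * *' % (minute, hour)

    raise ValueError('unsupported phrase: %r' % phrase)

print(to_cron_schedule('every 10 minutes'))  # */10 * * * *
print(to_cron_schedule('01.30 every day'))   # 30 1 * * *
```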

I don't think Anacron would make me any happier since the problem Anacron solves was not one of the problems I listed above. And lastly, I wouldn't be surprised if there's a semi-abandoned Open Source project on SourceForge that is user friendly but what I'm after is something to get into stock Linux. Kind of like apt/aptitude/dselect is for dpkg maybe?

How to uninstall nginx with apt

March 28, 2008
11 comments Linux

My colleague Jan showed me how to do this so I'm going to blog about it so I don't forget, and perhaps by being here other people might be able to search and find the solution too. I installed nginx because I wanted to play with it as an alternative to Apache on my laptop. Now I've played enough and I want to remove it. My first attempt didn't work:


peterbe@trillian:~ $ sudo apt-get --purge remove nginx
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
 nginx*
0 upgraded, 0 newly installed, 1 to remove and 116 not upgraded.
1 not fully installed or removed.
Need to get 0B of archives.
After unpacking 528kB disk space will be freed.
Do you want to continue [Y/n]?
(Reading database ... 242827 files and directories currently installed.)
Removing nginx ...
Stopping nginx: invoke-rc.d: initscript nginx, action "stop" failed.
dpkg: error processing nginx (--purge):
 subprocess pre-removal script returned error exit status 1
Starting nginx: invoke-rc.d: initscript nginx, action "start" failed.
dpkg: error while cleaning up:
 subprocess post-installation script returned error exit status 1
Errors were encountered while processing:
 nginx
E: Sub-process /usr/bin/dpkg returned an error code (1)

I tried this both before and after having stopped and started nginx. Nothing worked. The trick is to fiddle with the init script /etc/init.d/nginx and insert an exit 0 at the top so that it now starts like this:


#!/bin/sh
exit 0

Once you've saved that, apt-get --purge remove nginx will work. It might warn you that /var/log/nginx isn't removed because it's not empty, but you can safely remove it manually unless you want to keep the logs.