Nice testimonial about django-static

February 21, 2011
0 comments Django

My friend Chris is a Django newbie who has managed to build a whole e-shop site in Django. It will launch in a couple of days and when it launches I will blog about it here too. He sent me this today, which made me smile:

"I spent today setting up django_static for the site, and optimising it for performance. If there's one thing I've learned from you, it's optimisation.

So, my homepage is now under 100KB (was 330KB), and it loads in @5-6 seconds from hard refresh (was 13-14 seconds at its worst). And I just got a 92 score on Yslow. I do believe I have the fastest tea website around now, and I still haven't installed caching.

Wicked huh?"

He's talking about using django-static. Then I get another email shortly after with this:

"correction - I get 97 on YSlow if I use a VPN.

I just found that the Great Firewall tags extra HTTP requests onto every request I make from my browser, pinging a server in Shanghai with a PHP script which probably checks the page for its content or if its on some kind of blocked list. Cheeky buggers!"

Isn't that interesting! (Note: Chris is based in China but hosts the test site in the UK.)

How I profile my Nginx + proxy pass server

February 16, 2011
3 comments Web development, Python

Like so many others you probably have an Nginx server sitting in front of your application server (Django, Zope, Rails). The Nginx server serves static files right off the filesystem and when it doesn't do that it proxy passes the request on to the backend. You might be using proxy_pass, uwsgi or fastcgi_pass or at least something very similar. Most likely you have an Nginx site configured something like this:


server {
   access_log /var/log/nginx/mysite.access.log;
   location ^~ /static/ {
       root /var/lib/webapp;
       access_log off;
   }
   location / {
       proxy_pass http://localhost:8000;
   }
}

What I do is add a log format that appends $request_time to every access log line. This makes it possible to know how long every non-trivial request takes for the backend to complete:


# log_format is only valid in the http context, so it goes outside server { }
log_format timed_combined '$remote_addr - $remote_user [$time_local]  '
                          '"$request" $status $body_bytes_sent '
                          '"$http_referer" "$http_user_agent" $request_time';

server {
   access_log /var/log/nginx/timed.mysite.access.log timed_combined;

   location ^~ /css/ {
       root /var/lib/webapp/static;
       access_log off;
   }
   location / {
       proxy_pass http://localhost:8000;
   }
}
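
Once the timed log has collected some traffic, you need something to read it with. Here is a minimal Python sketch of how that could look, assuming log lines written in exactly the timed_combined format above, where $request_time is the last field. The function name slowest_paths is made up for this example.

```python
import re
from collections import defaultdict

# Matches the tail of a line written with the timed_combined format.
# $request_time (seconds, with millisecond resolution) is the last field.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '  # "$request"
    r'(?P<status>\d{3}) \S+ '                      # $status $body_bytes_sent
    r'"[^"]*" "[^"]*" '                            # referer and user agent
    r'(?P<seconds>\d+\.\d+)$'                      # $request_time
)


def slowest_paths(lines, top=10):
    """Return (path, average seconds, hits) tuples, slowest average first."""
    totals = defaultdict(lambda: [0.0, 0])
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match is None:
            continue  # skip lines that aren't in the expected format
        entry = totals[match.group("path")]
        entry[0] += float(match.group("seconds"))
        entry[1] += 1
    averages = [
        (path, total / hits, hits) for path, (total, hits) in totals.items()
    ]
    averages.sort(key=lambda item: item[1], reverse=True)
    return averages[:top]
```

Passing it an open file works, for example slowest_paths(open("/var/log/nginx/timed.mysite.access.log")). It groups by the full request path including any query string, which may or may not be what you want.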

Truncated! Read the rest by clicking the link below.

DoneCal homepage now able to do 10,000 requests/second

February 13, 2011
0 comments DoneCal

I've done some work refactoring the homepage of DoneCal so that it does no logic other than just serving HTML. What it used to do was some basic security checks and stuff so that it could say "Hi Peter" and show a log out link. Now all of that has been moved into one simple AJAX call.

BEFORE:


# ab -n 1000 -c 10 http://donecal.com/
...
Requests per second:    353.65 [#/sec] (mean)

AFTER:


# ab -n 1000 -c 10 http://donecal.com/
...
Requests per second:    9796.78 [#/sec] (mean)

# ab -n 1000 -c 10 http://donecal.com/auth/logged_in.json
...
Requests per second:    3756.25 [#/sec] (mean)

The reason loading the index.html can be so fast is that Nginx serves it directly as a static file. In my Nginx config I currently have to bypass the static file if the request isn't a GET request or if it has a query string. I'll need to remove that stuff too, and then I can push the index.html file out to my AWS CloudFront CDN using a CNAME.

DoneCal is my first web application that is this Javascript heavy. It raises the bar in terms of HTTP optimization to get the best user experience possible. I love learning this new way of working.

EditDistanceMatcher - NodeJS script for doing edit distance 1 matching

February 5, 2011
0 comments JavaScript

I needed a very basic spell correction string matcher in my current NodeJS project so I wrote a simple class called EditDistanceMatcher that compares a string against another string and matches if it's 1 edit distance away. With it you can do things like Google search's "Did you mean: poop?" when you search for pop.

Note, this code doesn't check the popularity of correct words (e.g. "pop" might appear much more often than "poop", so a popularity-aware matcher would suggest "pop" first if you enter "poup"). Anyway this simple snippet from the unit tests will reveal how it works:


     /* The match() method */
     var edm = new EditDistanceMatcher(["peter"]);
     // edm.match returns an array and remember,
     // in javascript ['peter'] == ['peter'] => false
     test.equal(edm.match("petter").length, 1);
     test.equal(edm.match("petter")[0], 'peter');
     test.equal(edm.match("junk").length, 0);

     /* the is_matched() method */
     var edm = new EditDistanceMatcher(["peter"]);
     test.equal(typeof edm.is_matched('petter'), 'boolean');
     test.equal(typeof edm.is_matched('junk'), 'boolean');
     test.ok(edm.is_matched("petter"));
     test.ok(!edm.is_matched("junk"));

The most basic use case is if you have a quiz and you want to accept some spelling mistakes. "What's the capital of Sweden?; STOKHOLM; Correct!"

For the unlazy this NodeJS code can very easily be used in a browser by simply removing the exports stuff.

edit_distance.js

tests/test_edit_distance.js

Note! I wrote this in an airport lounge so I'm sure it can be improved lots more.
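
If you just want the gist of the technique without opening edit_distance.js, here is a rough Python sketch of edit distance 1 matching. It is not a port of the class above. The function names are made up, and it assumes that one edit means a single insertion, deletion, substitution or swap of two adjacent characters, and that an exact match counts too.

```python
def within_one_edit(candidate, word):
    """True if candidate equals word or is exactly one edit away from it."""
    if candidate == word:
        return True
    if abs(len(candidate) - len(word)) > 1:
        return False

    # Skip past the prefix the two strings have in common.
    i = 0
    while i < min(len(candidate), len(word)) and candidate[i] == word[i]:
        i += 1

    if len(candidate) == len(word):
        # Substitution: everything after the differing character is equal.
        if candidate[i + 1:] == word[i + 1:]:
            return True
        # Transposition: two adjacent characters swapped, the rest equal.
        return (
            candidate[i:i + 2] == word[i:i + 2][::-1]
            and candidate[i + 2:] == word[i + 2:]
        )

    # Insertion or deletion: drop one character from the longer string.
    if len(candidate) > len(word):
        longer, shorter = candidate, word
    else:
        longer, shorter = word, candidate
    return longer[i + 1:] == shorter[i:]


def match(candidate, known_words):
    """Return the known words that are at most one edit from candidate."""
    return [word for word in known_words if within_one_edit(candidate, word)]
```

For the quiz use case you would probably lowercase both sides first, e.g. within_one_edit(answer.lower(), "stockholm").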

DoneCal on MumbaiMirror

February 3, 2011
1 comment DoneCal

Here's a nice write-up about DoneCal on MumbaiMirror.

"All in all, DoneCal is one of those Web 2.0 tools that you wouldn’t really miss if it wasn’t around, but once you use it, you can’t go back."

They don't make a link to DoneCal which I suspect is some sort of half assed attempt to avoid too many outgoing links. They've strangely spent time writing about another web page but can't make a link to it. If I've learned anything from Google it's that the ultimate mantra of SEO is: don't try to be smarter than us, just write great content and let us worry about ranking.

If these guys are worried about that, why don't they use a rel="nofollow" attribute on the link?

DoneCal.com international visitors

January 21, 2011
0 comments DoneCal

For the first time in my life I've launched a web site/app that isn't mostly popular in the United States. Yay!(?) Not that I care or that it matters but it's worth noting. For some reason it's currently most popular in France, followed closely by China, and the United States doesn't show up until 5th place.

Of the United States visitors I'm not surprised the California bunch is more prominent. The service is quite new and quite technically interesting for people in the industry so I guess a lot of those visitors are Silicon Valley type folks.

Fastest "boolean SQL queries" possible with Django

January 14, 2011
5 comments Django

Those familiar with the Django ORM know how easy it is to work with and that you can do lots of nifty things with the result (QuerySet in Django lingo).

So I was working on a report that basically just needed to figure out if a particular product has been invoiced. Not for how much or when, just if it's included in an invoice or not.

Truncated! Read the rest by clicking the link below.

django-static version 1.5 automatically taking care of imported CSS

January 11, 2011
1 comment Django

I just released django-static 1.5 (github page) which takes care of optimizing imported CSS files.

To explain, suppose you have a file called foo.css and do this in your Django template:


{% load django_static %}
<link href="{% slimfile "/css/foo.css" %}"
  rel="stylesheet" type="text/css" />

And in foo.css you have the following:


@import "bar.css";
body {
   background-image: url(/images/foo.png);
}

And in bar.css you have this:


div.content {
   background-image: url("bar.png");
}

The outcome is the following:


# foo.css
@import "/css/bar.1257701299.css";
body{background-image:url(/images/foo.1257701686.png)}

# bar.css
div.content{background-image:url("/css/bar.1257701552.png")}

In other words, not only does it parse your CSS content and give images unique names you can set aggressive caching headers on, it will also unfold imported CSS files and optimize them too.

I think that's really useful. With one single setting (settings.DJANGO_STATIC=True) you can get all your static resources massaged and prepared for the best possible HTTP optimization. Also, it's all automated so you never need to run any build scripts, and the definition of what static resources to use (and how to optimize them) is all in the template. This I think makes a lot more sense than maintaining static resources in a config file.

The coverage is 93% and there is an example app to look at if you prefer that over a README.
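
To make the renaming idea concrete, here is a rough Python sketch of how a timestamp could be slotted in before the file extension. This is not django-static's actual code, just an illustration of the naming scheme in the output above. Both function names are made up, and using the file's modification time as the timestamp is an assumption.

```python
import os


def timestamped_name(url_path, mtime):
    """Insert an integer timestamp before the extension.

    /css/foo.css with mtime 1257701299 becomes /css/foo.1257701299.css
    """
    root, ext = os.path.splitext(url_path)
    return "%s.%d%s" % (root, int(mtime), ext)


def timestamped_name_for_file(url_path, filesystem_path):
    """Same as above, taking the timestamp from the file on disk."""
    return timestamped_name(url_path, os.path.getmtime(filesystem_path))
```

Because the name changes whenever the file changes, the old URL can be cached for as long as you like.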

RequireJS versus HeadJS

January 9, 2011
4 comments JavaScript

I've spent a lot of time trying to figure out which Javascript script loading framework to use: RequireJS or HeadJS. I still don't have an answer. Neither website refers to the other.

In general

  • To me, it's important to be able to load and execute some Javascript before downloading Javascript modules that aren't needed to render the initial screen. Makes for a more responsive behaviour and gets pixels drawn quicker for Javascript-heavy sites.
  • A massive, understated benefit of combining multiple .js files into one is that it reduces your exposure to sporadic network bottlenecks. Fewer files to download means fewer things can go wrong. These bottlenecks can make a few KB of a Javascript file take 10 seconds to download.
  • Using public CDNs (e.g. jQuery from Google's CDN) is an extremely powerful optimization technique. Not only are they extremely fast, it's very likely the file is already in the browser cache because some other site uses the exact same URL.
  • Where does it say that Javascript has to be loaded in the head? Even html5-boilerplate loads Javascript just before the </body> tag.
  • Realistically, it's not uncommon that you can't combine all .js files into one. (This doesn't apply to web apps that consist of just one HTML file.) One page might require A.js, B.js and C.js but another page requires A.js, B.js and D.js. It takes manual thinking to decide whether you should combine A,B,C,D.js or A,B.js + C|D.js. No framework can predict this.
  • All loading and browser incompatibility hacks will eventually become obsolete as browsers catch up. Again, this requires manual thinking because supporting and ultra-boosting performance might have a different cost today compared to a year from now. The most guilty of this appears to be ControlJS.
  • I'm confident that optimization in terms of file concatenation and white space optimization does not belong to the framework.
  • Apparently the iPhone 3.x series can't cache individual files larger than 15KB (25KB for iPhone 4.x). That's a very small amount if you combine several large modules.
  • Accepting the fact of life that sporadic network bottlenecks can kill your page, think hard about asynchronous loading and preserved order. Perhaps ideal is a mix of both. What framework allows that? (both RequireJS and HeadJS it seems)
  • Loading frameworks are not for everything and everyone. If you're building something "simple", or a landing page like Google's search page, frameworks might just get in your way.

RequireJS

  • Author well known but his Dojoesque style shines through in RequireJS's syntax and patterns.
  • Is only about Javascript. No CSS hacks or other html5ish boilerplates.
  • Gets into the realm of module definitions. Neat but do you want the loading framework to get involved in how you prefer to write your code or do you just want it to load your files?
  • All the module definition stuff feels excessive for every single project I can imagine but we're entering an era of "web apps" (as opposed to "web sites") so this might need to change.
  • What you learn in using RequireJS you can reuse when building for NodeJS (a server-side framework). It's also possible to use RequireJS in Rhino (a server-side Javascript engine) but personally I haven't reached that level yet.

HeadJS

  • Author quite well known too. He is also the author of Flowplayer and jQuery Tools.
  • Contains a kitchen sink (CSS tricks, modernizer.js) but perhaps they're really quite useful. After all, you don't write your web site in Assembly.
  • There's a fork of HeadJS that does just the Javascript stuff. But will it be maintained? And does that defeat the whole point of using HeadJS?
  • With its CSS hacks (aka. kitchen sink) HeadJS seems great if you really care about combining HTML5 techniques with Internet Explorer.
  • This awesome experiment shows that HeadJS really works and that asynchronous loading can be really powerful. But ask yourself, are you ready to build in an asynchronous way?
  • With HeadJS I can label a combined and optimized bundle and load my code once that bundle is loaded. Can I do that with RequireJS? It seems to depend on the filename (minus the .js suffix).
  • Makes the assumption that just because a file is loaded the order of execution is a non-issue. This means you might have trouble controlling dependencies during execution. This is a grey area that might or might not matter depending on the complexity of your app.
  • A feeling I get is that HeadJS without the CSS kitchen sink stuff reduces to become LabJS or EnhanceJS.

Other alternatives

The ones I can think of are: ControlJS (feels too "hacky" for my taste), CommonJS (not sufficiently "in-browser specific" for my taste) and EnhanceJS (like HeadJS and LabJS but with less power/features).

The one I haven't studied as much is LabJS. It seems more similar to HeadJS in style. Perhaps it deserves more attention but the reason HeadJS got my attention is because it's got a better looking website.

In conclusion

Your mileage will vary. The deeper I look into this the more I feel personal taste comes into play. It's hard enough for a single framework author to write realistic benchmarks; even harder for "evaluators" like myself to benchmark them all. It gets incrementally harder when you take into account the effects of HTTP latency, sporadic network bottlenecks, browser garbage collection and user experience.

Personally I think HeadJS is a smoother transition for general web sites. RequireJS might be more appropriate when writing web apps with virtually no HTML and a single URL.

With the risk of starting a war... If you're a Rails/Django/Plone head, consider HeadJS. If you're a mobile web app/NodeJS head consider RequireJS.

UPDATE

Sorry, I now realise that Tero Piirainen actually has built a fair amount of powerful Javascript libraries.

ToDo apps I gave up on in 2010

January 3, 2011
4 comments Wondering

First I tried Things for the iPhone because some people I work with said it was good. It lasted about a week. I think it failed, for me, because I didn't feel how time slowly wipes away old stuff that isn't relevant any more. My todo lists are usually about work projects which mainly means writing code and sending emails to people on the project. Things being on the iPhone meant I had to take my hands off the computer.

The second one I tried is the app with perhaps the most brilliant UI I've seen in years: TeuxDeux. There's only three things you can do, enter events and mark them as done. I tried the iPhone version but even though it works well it wasn't as neat as the web version. Eventually I gave up because I think I couldn't keep up with moving past day events forward to today's date. That meant that new events entered "today" sort of got higher priority than old ones and that just felt wrong in the long run.

The third one wasn't really a todo list but that's how I ended up using it: Workflowy. Again, an absolutely brilliant UI and technical achievement. I had it as an open tab for about three weeks until I ended up not bothering any more. I love writing bullet point lists to the n'th degree but I felt that every time I came back to it I had to "search" for where I was and had to make a tonne of micro-decisions about where to put stuff. When I had a thought in my head I didn't want to first think and plan where to put it.

What did work?

It's far from applicable to everyone but one thing that has worked (for many years in fact) is our work issue tracker. We use IssueTrackerProduct, written by yours truly. It's not really a fair comparison because when multiple people are using the same tool the personal choices don't really matter. Also, I think project issue trackers like this have the added bonus that you don't clutter them with small basic things like "Check database log X".

The perhaps most successful todo list for me in 2010 was keeping a TODO.txt file in my project source directory. This is a personal file I rarely check in to git because my colleagues don't need to see mine (well, sometimes that's useful too). It's a simple text file and it looks something like this:


* (MEDIUM) render the shared classes in calendar.html on page load 

* (HIGH) Find out why all CSS is lost when an event is added

* (LOW) Experiment with http://vis.stanford.edu/protovis/ to write
 some nice stats

...

I guess it works because it's my own invention. From scratch. Generally todo list apps work best if you wrote the app yourself. It's immediately in context because each code project gets its own file. Its order is implied usually by writing to the top of the file but you can be a bit cowboy about it and just jot things down without doing it "the correct way".