Beach volleyball bums

August 2, 2012
2 comments Misc. links

Bums
My good friend @jonanmary brought this very amusing tweet to my attention:

I know you've all seen this, but it's awesome anyway: Photographing other sports like beach volleyball metro.us/newyork/sports... #olympics

FIVB

That's brilliant! The thing is, if you didn't already know it, beach volleyball gals are mandated to wear those Speedo-looking things they wear. What's even funnier is something another good friend, @trollkip on Twitter, pointed out:

@jonanmary @peterbe out of interest, earlier, I looked up who makes these rules. Governing body: fivb.org/EN/FIVB/Board_ Yeah.

UPDATE

Apparently, the rule about what beach volleyball players have to wear changed recently.
I love me some tanned sexy lady-skin but don't be an asshole about it. Let her choose.

Is Nginx obsolete now that we have Amazon CloudFront?

July 28, 2012
24 comments Web development

About 5 years ago I switched from Apache to Nginx. And with that switch I could practically stop stabbing my feet with HTTP accelerators like Squid and Varnish because Nginx serves files from the filesystem both faster and more efficiently than the accelerators. And it's one less moving part that can go wrong.

Then in late 2010 Amazon introduced Custom Origins on their Amazon CloudFront CDN service. Compared to other competing CDNs I guess CloudFront loses some benchmarks and wins some others. Nevertheless, network latency is the speed freak's biggest enemy and CDNs are awesome.

With Custom Origin all you do is tell CloudFront to act as a "proxy". It takes a URL and replaces the domain name to go and fetch the original from your own server. For example...

  1. You prepare http://mydomain.com/static/foo.css
  2. You configure CloudFront and get your new domain (aka. "Distribution")
  3. You request http://efac1bef32rf3c.cloudfront.net/static/foo.css
  4. CloudFront fetches the resource from http://mydomain.com/static/foo.css and saves a copy
  5. CloudFront observes which cache headers were used and repeats them. Forever.
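
The domain swap in those steps can be sketched in a few lines of Python. This only illustrates the URL mapping, not how CloudFront works internally; `origin_url` is a hypothetical helper and the host names are the ones from the example above.

```python
from urllib.parse import urlsplit, urlunsplit


def origin_url(cdn_url, origin_host):
    """Map a CDN URL to the origin URL by swapping only the host name."""
    parts = urlsplit(cdn_url)
    # scheme, path and query string are kept exactly as requested
    return urlunsplit(parts._replace(netloc=origin_host))


print(origin_url(
    'http://efac1bef32rf3c.cloudfront.net/static/foo.css',
    'mydomain.com',
))
```

The path is untouched, which is why the same /static/ URL layout works on both domains.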

So, if I make my Nginx server serve /static/foo.css with:

Expires: Thu, 31 Dec 2037 23:55:55 GMT
Cache-Control: max-age=315360000
Cache-Control: public

Then CloudFront will do the same and it means it will never come back to your Nginx again. In other words, your Nginx server serves the cacheable static assets once and all other requests are just the usual HTML and JSON and whatever your backend web server spits out.
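
For reference, that max-age value is simply ten 365-day years expressed in seconds. A quick sketch of the arithmetic, where `far_future_headers` is a made-up helper name and the two Cache-Control lines are folded into one header value:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def far_future_headers(years=10):
    """Build a far-future Cache-Control header (illustrative helper)."""
    max_age = years * 365 * SECONDS_PER_DAY
    return {'Cache-Control': 'public, max-age=%d' % max_age}


print(far_future_headers())
```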

So, what does this mean? It means that we can significantly re-think the way we write code that prepares and builds static assets. Instead of a complex build or a run-time process that ultimately writes files to the filesystem we can basically do it all in run-time and not worry about speed. E.g. something like this:


# urls.py
  url(r'/static/(.*\.css)', views.serve_css)

# views.py
import cssmin
from django import http

def serve_css(request, filename):
    response = http.HttpResponse(content_type="text/css")
    # headers are set with item assignment; note that max-age uses '='
    response['Cache-Control'] = 'public, max-age=315360000'
    with open(filename) as f:
        content = f.read()
    content = cssmin.cssmin(content)
    content = '/* copyright: you */\n%s' % content
    response.write(content)
    return response

That's untested code that can be vastly improved but I hope you get the idea. Obviously there are lots more things you can and should do, such as concatenating files.
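
Concatenation, for example, could look roughly like this. It's a minimal sketch, not part of the view above: `concatenate_css` is a hypothetical helper that reads the files in the order given and labels each part with its file name.

```python
import os


def concatenate_css(paths):
    """Join several CSS files into one string, labelling each part."""
    parts = []
    for path in paths:
        with open(path) as f:
            css = f.read().strip()
        parts.append('/* %s */\n%s' % (os.path.basename(path), css))
    return '\n'.join(parts)
```

The combined string can then be minified and written to the response the same way as a single file.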

So, what does this also mean? You don't need Nginx. At least not for serving static files faster. I've shown before that something like Nginx + uWSGI is "better" (faster and less memory) than something like Apache + mod_wsgi but oftentimes the difference is negligible.

I for one am not going to re-write all the various code I have for preparing optimal static assets hosting, but I'll definitely keep this stuff in mind. After all, there are other nifty things Nginx can do too.

By the way, here's a really good diagram that explains CloudFront

UPDATE

Want to read this in Serbian? Thank you Anja Skrba for the translation!

How to use premailer as a command line script

July 13, 2012
5 comments Python

(This post is a response to Richard Patchet's request for tips on how to actually use premailer)

First of all, premailer is a Python library that takes an HTML document and transforms its <style> tags into inline style attributes on the HTML elements themselves. This comes in very handy when you need to take a nicely formatted HTML newsletter template and prepare it before sending because when you send HTML emails you can't reference an external .css file.

So, here's how to turn it into a command line script.

First, install, then write the script:

$ pip install premailer
$ touch ~/bin/run-premailer.py
$ chmod +x ~/bin/run-premailer.py

Now, you might want to do this differently but this should get you places:


#!/usr/bin/env python

from premailer import transform

def run(files):
    try:
        base_url = [x for x in files if x.count('://')][0]
        files.remove(base_url)
    except IndexError:
        base_url = None

    for file_ in files:
        with open(file_) as f:
            html = f.read()
        # print() with parentheses works on both Python 2 and Python 3
        print(transform(html, base_url=base_url))

if __name__ == '__main__':
    import sys
    run(sys.argv[1:])

To test it, I've made a sample HTML page that looks like this:


<html>
    <head>
        <title>Test</title>
        <style>
        h1, h2 { color:red; }
        strong {
          text-decoration:none
          }
        p { font-size:2px }
        p.footer { font-size: 1px}
        p a:link { color: blue; }
        </style>
    </head>
    <body>
        <h1>Hi!</h1>
        <p><strong>Yes!</strong></p>
        <p class="footer" style="color:red">Feetnuts</p>
        <p><a href="page2/">Go to page 2</a></p>
    </body>
</html>

Cool. So let's run it: $ run-premailer.py test.html


<html>
  <head>
    <title>Test</title>
  </head>
  <body>
        <h1 style="color:red">Hi!</h1>
        <p style="font-size:2px"><strong style="text-decoration:none">Yes!</strong></p>
        <p style="{color:red; font-size:1px} :link{color:red}">Feetnuts</p>
    <p style="font-size:2px"><a href="page2/" style=":link{color:blue}">Go to page 2</a></p>
    </body>
</html>

Note that premailer supports converting relative URLs, so let's actually use that:
$ run-premailer.py test.html https://www.peterbe.com


<html>
  <head>
    <title>Test</title>
  </head>
  <body>
        <h1 style="color:red">Hi!</h1>
        <p style="font-size:2px"><strong style="text-decoration:none">Yes!</strong></p>
        <p style="{color:red; font-size:1px} :link{color:red}">Feetnuts</p>
    <p style="font-size:2px"><a href="https://www.peterbe.com/page2/" 
     style=":link{color:blue}">Go to page 2</a></p>
    </body>
</html>

I'm sure you can think of many many ways to improve that. Mayhaps use argparse or something fancy to allow for more options. Mayhaps make it so that you can supply named .css files on the command line that get automagically inserted on the fly.
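
As a starting point for the argparse idea, here is a rough sketch. The `--base-url` option name and the `build_parser`/`main` function names are my own assumptions, not anything premailer ships with. The only premailer call used is the same `transform(html, base_url=...)` as in the script above.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description='Inline style rules into style attributes with premailer')
    parser.add_argument('files', nargs='+', help='HTML files to transform')
    parser.add_argument('--base-url', default=None,
                        help='prefix for relative URLs')
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # imported here so the parser can be tried out without premailer installed
    from premailer import transform
    for file_ in args.files:
        with open(file_) as f:
            print(transform(f.read(), base_url=args.base_url))


# Parsing only, to show what the options end up as:
args = build_parser().parse_args(
    ['test.html', '--base-url', 'https://www.peterbe.com'])
print(args.files, args.base_url)
```

To use it as the script, call `main()` under the usual `if __name__ == '__main__':` guard. The base URL no longer has to be guessed by looking for '://' in the arguments.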

US License Plate Spotter (part 1)

July 9, 2012
4 comments JavaScript

This is part 1 in, hopefully, a series of blog articles about developing mobile apps with Javascript.

Screenshot
My app that I'm going to build is called "US License Plate Spotter". A dead-simple app where you tick off each US state once you see it.

In case you didn't know, in the USA, you mostly see "local" license plates because if you for example buy a car in Michigan but move to California, after about 3 months you have to re-license it with plates from the state you're living in. So here where I live, in California, you mostly see "California plates" but every now and then you see other plates such as Nevada, Washington or Florida. The further away, the less likely to be spotted.

Anyway, the first version is available right here: https://www.peterbe.com/uslicenseplates/index.html The code is on GitHub.
It works best on smartphones, like an Android or iOS app, since it's built with jQuery Mobile.

This is about 2 hours of work which is pretty quick but it was easy because I've used jQuery Mobile a lot in the past and this was more or less just getting familiar with the recent changes.
The app works fine and I even used it this last weekend to keep track of new license plates since Friday.

Next steps:

  1. Polish it a bit more with an icon, an about page and maybe add the date it was spotted
  2. Add "cloud storage" (at the moment it uses localStorage) using the Facebook API
  3. Compile an Android and iOS version with PhoneGap and see if I can launch it in some app stores.

UPDATE

Here's part 2 in the series.

Newfound love of @staticmethod in Python

July 2, 2012
6 comments Python

The @staticmethod decorator is nothing new. In fact, staticmethod was added in Python 2.2 (the @ decorator syntax followed in 2.4). However, it's not till now in 2012 that I have genuinely fallen in love with it.

First a quick recap to remind you how @staticmethod works.


class Printer(object):

    def __init__(self, text):
        self.text = text

    @staticmethod
    def newlines(s):
        return s.replace('\n','\r')

    def printer(self):
        return self.newlines(self.text)

p = Printer('\n\r')
assert p.printer() == '\r\r'

So, it's a function that has nothing to do with the instance but still belongs to the class. It belongs to the class from a structural point of view of the observer. Like, clearly the newlines function is related to the Printer class. The alternative is:


def newlines(s):
    return s.replace('\n','\r')

class Printer(object):

    def __init__(self, text):
        self.text = text

    def printer(self):
        return newlines(self.text)

p = Printer('\n\r')
assert p.printer() == '\r\r'

It's the exact same thing and one could argue that the function has nothing to do with the Printer class. But ask yourself (by looking at your code); how many times do you have classes with methods on them that take self as a parameter but never actually use it?

So, now for the trump card that makes it worth the effort of making it a staticmethod: object orientation. How would you do this neatly without OO?


class UNIXPrinter(Printer):

    @staticmethod
    def newlines(s):
        return s.replace('\n\r', '\n')

p = UNIXPrinter('\n\r')
assert p.printer() == '\n'  

Can you see it? It's ideal for little functions that should be domesticated by the class but have nothing to do with the instance (e.g. self). I used to think it made a pure-looking thing look more complex than it needs to be. But now, I think it looks great!
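
Two more things worth seeing in code: a static method can be called straight off the class without an instance, and a method that looks it up via self gets whichever version the subclass defines. A small self-contained sketch that repeats the two classes from above:

```python
class Printer(object):

    def __init__(self, text):
        self.text = text

    @staticmethod
    def newlines(s):
        return s.replace('\n', '\r')

    def printer(self):
        # looked up via self, so a subclass override is picked up
        return self.newlines(self.text)


class UNIXPrinter(Printer):

    @staticmethod
    def newlines(s):
        return s.replace('\n\r', '\n')


# no instance is needed to call a static method
assert Printer.newlines('a\nb') == 'a\rb'
assert UNIXPrinter.newlines('a\n\rb') == 'a\nb'

# the inherited printer() uses whichever newlines() the class defines
assert Printer('x\n').printer() == 'x\r'
assert UNIXPrinter('x\n\r').printer() == 'x\n'
```

With the module-level function variant, printer() would always call the one global newlines() unless each subclass also overrode printer().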

Difference between $.data('foo') and $.attr('data-foo') in jQuery

June 10, 2012
9 comments JavaScript

I learned something today thanks to my colleague Axel Hecht; the difference between $element.data('foo') and $element.attr('data-foo').

Basically, the .data() getter/setter is more powerful since it can do more things. For example:


<img id="image" data-number="42">

// numbers are turned to integers
assert($('#image').data('number') + 1 == 43);
// the more rudimentary way
assert($('#image').attr('data-number') + 1 == '421');

Integers are just one thing the .data() getter is able to parse. It can do other cool things too like booleans and JSON. Check out its docs.

So, why would you NOT use .data()?

One reason is that with .data(name, value) the original DOM element is not actually modified. This can cause trouble if other pieces of Javascript depend on the value of a data- attribute further along in the page rendering process.

To see it in action check out: test.html

In conclusion: just be aware of it. Feel free to use the .data() getter/setter because it's way better but be aware of the potential risks.

How I deal with deferred image loading in Javascript

June 8, 2012
3 comments Web development, JavaScript

First of all, this technique is only really applicable to apps where there's only one big HTML template which is then shuffled around, part hidden and part visible, thanks to lots of Javascript. Those familiar with jQuery Mobile will have seen this.

On Around The World there are a lot of images. The majority of them you don't need to see immediately because only one screen is shown at a time. The page structure looks like this:


<div class="section" id="page1">
  <h2>Page 1</h2>
  <img src="section-icon1.png">
</div>
<div class="section" id="page2" style="display:none">
  <h2>Page 2</h2>
  <img src="section-icon2.png">
</div>
<div class="section" id="page3" style="display:none">
  <h2>Page 3</h2>
  <img src="section-icon3.png">
</div>

So, if you load that you'll notice that your browser will download "section-icon1.png", "section-icon2.png" and "section-icon3.png" even though two of the images are not going to be displayed. Good for pre-loading the images when they're later needed but bad for the user experience since the browser will be busy downloading images rather than displaying the first visible section.

This is how I solve this; first I change the HTML to be this:


<div class="section" id="page1">
  <h2>Page 1</h2>
  <img src="." data-src="section-icon1.png" class="deferred">
</div>
<div class="section" id="page2" style="display:none">
  <h2>Page 2</h2>
  <img src="." data-src="section-icon2.png" class="deferred">
</div>
<div class="section" id="page3" style="display:none">
  <h2>Page 3</h2>
  <img src="." data-src="section-icon3.png" class="deferred">
</div>

And now for the magic that turns these img tags into real normal img tags. The truth is that the Javascript about loading individual sections is a bit more complicated but in its inner core it looks something like this:


// variable 'hash' is something like '#page2'
if ($(hash + '.section').length) {
  $('.section:visible').hide();
  $(hash + '.section').show();
  $('img.deferred', hash).each(function() {
    var el = $(this);
    el.attr('src', el.data('src'));
    el.removeClass('deferred');
  });
  ...

It makes the HTML slightly more complicated but the end result is great. It's not just useful for the first-time load but also applicable every time someone reloads the page.

Postgres collation errors on CITEXT fields when upgrading to 9.1

May 21, 2012
1 comment Web development

Just in case this hits you too: you can get this error when you use CITEXT fields that were originally defined in a Postgres before version 9.1.

ProgrammingError: could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

This can happen if you use something like:


WHERE name='peter'

when field name is a case insensitive text field.

After some googling around and shooting in the dark I found that the only way to crack this is to run this command:


CREATE EXTENSION citext FROM unpackaged;

Hope that helps some poor schmuck with the same problem.

UPDATE

If you have problems applying this to new tables in Postgres 9.1 you might need to run this instead:


CREATE EXTENSION citext WITH SCHEMA public;

Secs sell! How I cache my entire pages (server-side)

May 10, 2012
1 comment Python, Django

I've blogged before about how this site can easily push out over 2,000 requests/second using only 6 WSGI workers excluding latency. The reason that's possible is because the whole page(s) can be cached server-side. What actually happens is that the whole rendered HTML blob is stored in the cache server (Redis in my case) so that no database queries are needed at all.

I wanted my site to still "feel" dynamic in the sense that once you post a comment (and it's published), the page automatically invalidates the cache and thus, the user doesn't have to refresh his browser when he knows it should have changed. To accomplish this I used a hacked cache_page decorator that makes the cache key depend on the content it depends on. Here's the code I actually use today for the home page:


def _home_key_prefixer(request):
    if request.method != 'GET':
        return None
    prefix = urllib.urlencode(request.GET)
    cache_key = 'latest_comment_add_date'
    latest_date = cache.get(cache_key)
    if latest_date is None:
        # when a blog comment is posted, the blog modify_date is incremented
        latest, = (BlogItem.objects
                   .order_by('-modify_date')
                   .values('modify_date')[:1])
        latest_date = latest['modify_date'].strftime('%f')
        cache.set(cache_key, latest_date, 60 * 60)
    prefix += str(latest_date)

    try:
        redis_increment('homepage:hits', request)
    except Exception:
        logging.error('Unable to redis.zincrby', exc_info=True)

    return prefix


@cache_page_with_prefix(60 * 60, _home_key_prefixer)
def home(request, oc=None):
    ...
    try:
        redis_increment('homepage:misses', request)
    except Exception:
        logging.error('Unable to redis.zincrby', exc_info=True)
    ...

And in the models I then have this:


@receiver(post_save, sender=BlogComment)
@receiver(post_save, sender=BlogItem)
def invalidate_latest_comment_add_dates(sender, instance, **kwargs):
    cache_key = 'latest_comment_add_date'
    cache.delete(cache_key)

So this means:

  • whole pages are cached for a long time for fast access
  • updates immediately invalidate the cache for best user experience
  • no need to mess with ANY SQL caching
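
Stripped of Django and Redis, the trick is just that the latest modify date is part of the cache key. Here's a toy sketch with a plain dict as the cache; `PrefixedPageCache` and its `render` callback are invented for illustration and are not the real cache_page_with_prefix decorator.

```python
class PrefixedPageCache(object):
    """Toy page cache whose keys include the latest modify date."""

    def __init__(self, render):
        self.render = render
        self.store = {}
        self.misses = 0

    def get(self, path, latest_modify_date):
        # a new date means a new key, so the old page is simply never hit again
        key = '%s:%s' % (path, latest_modify_date)
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.render(path)
        return self.store[key]


pages = PrefixedPageCache(render=lambda path: '<html>%s</html>' % path)
pages.get('/', '2012-05-10T10:00')  # miss: the page is rendered
pages.get('/', '2012-05-10T10:00')  # hit: served from the cache
pages.get('/', '2012-05-10T10:05')  # a comment was posted: new key, rendered again
print(pages.misses)
```

Nothing ever deletes the stale entries in this toy version. In the real setup the 60 * 60 timeout takes care of that.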

So, the next question is, if posting a comment means that the cache is invalidated and needs to be populated, what's the ratio of hits versus hits where the cache is cleared? Glad you asked. That's why I made this page:

www.peterbe.com/stats/

It allows me to monitor how often a new blog comment or general time-out means poor Django needs to re-create the HTML using SQL.

At the time of writing, one in every 25 hits to the homepage requires the server to re-generate the page. And still the content is always fresh and relevant.
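
The ratio itself is simple arithmetic on the two counters. A sketch, assuming 'homepage:hits' counts every GET (it's incremented in the key prefixer) and 'homepage:misses' counts only actual re-renders. The numbers below are invented purely to show "one in 25".

```python
def miss_ratio(hits, misses):
    """Fraction of homepage requests that had to be re-rendered."""
    if hits == 0:
        return 0.0
    return float(misses) / hits


# made-up counter values: 100 re-renders out of 2,500 requests
print('%.0f%%' % (miss_ratio(2500, 100) * 100))
```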

The next level of optimization would be to figure out whether a particular page update (e.g. a blog comment posting on a page that isn't featured on the home page) should or should not invalidate the home page.

On the command line no one can hear you scream. Or can they?

May 3, 2012
2 comments Linux

This is how you check if a command (with or without any output) exited successfully or if it exited with something other than 0, in bash:

#!/bin/bash
./someprogram
WORKED=$?
if [ "$WORKED" != 0 ]; then
  echo "FAILED"
else
  echo "WORKED"
fi

But how do you inspect this on the command line? I actually didn't know, until it hit me. The simplest possible solution:

$ ./someprogram && echo worked || echo failed

What a great low-tech solution. It just works. If you're on OS X, you can nerd it up a bit more:

$ ./someprogram && say worked || say failed
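
If you ever need the same check from a Python script instead of the shell, the exit status is what subprocess.call returns. A small sketch, where `worked` is a made-up helper and the Python interpreter itself stands in for ./someprogram:

```python
import subprocess
import sys


def worked(command):
    """Return True if the command exited with status 0."""
    return subprocess.call(command) == 0


# the interpreter stands in for ./someprogram: once succeeding, once failing
ok = worked([sys.executable, '-c', 'import sys; sys.exit(0)'])
bad = worked([sys.executable, '-c', 'import sys; sys.exit(3)'])
print('worked' if ok else 'failed')
print('worked' if bad else 'failed')
```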