setuptools usability - not good, what can be done?

July 15, 2009
12 comments Python

Gun to your head: what would it take to make setuptools easy to use for a package author?

I've spent far too long today trying to create a package for a little piece of code I've written. Because I can never remember all the bizarre options and commands to setup.py, I tried to do it by following Tarek Ziade's wonderful Expert Python Programming, but I still got stuck.

Granted, I did not read the f**n manual. Why should I have to? I've got more important things to do such as eating cookies and watching tv.

Truncated! Read the rest by clicking the link below.

premailer.py - Transform CSS into inline style attributes with lxml.html

July 11, 2009
9 comments Python

By blogging about it I can pretty much guarantee that someone will comment and say "Hey, why didn't you use the pypi/alreadyexists package which does the same thing but better". I couldn't find one after a quick search and I felt the hacker mood creeping up on me, begging me to (re)invent it.

premailer.py takes an HTML page, finds all CSS blocks and transforms these into style attributes. For example, from this:


<html>
  <head>
    <title>Test</title>
    <style>
    h1, h2 { color:red; }
    strong {
      text-decoration:none
    }
    </style>
  </head>
  <body>
    <h1>Hi!</h1>
    <p><strong>Yes!</strong></p>
  </body>
</html>

You get this:


<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <h1 style="color:red">Hi!</h1>
    <p><strong style="text-decoration:none">Yes!</strong></p>
  </body>
</html>

Why is this useful? When you're writing HTML emails. Like this newsletter app that I'm working on.

I just wrote it late yesterday and it needs lots of work to impress but for the moment it works for me. If I take the time to tidy it up properly I'll turn it into a package. Assuming there isn't one already :)
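
The transformation can be sketched with nothing but the standard library. This is not premailer's actual code: it assumes well-formed markup, only understands plain tag selectors such as "h1, h2", and the names inline_styles, RULE_RE and SAMPLE are made up for the illustration. The real thing uses lxml.html, which copes with real-world HTML and proper CSS selectors.

```python
import re
import xml.etree.ElementTree as ET

# Matches "selector, selector { declarations }" in a flat stylesheet.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def inline_styles(markup):
    """Copy rules from <style> blocks into style attributes, then drop the blocks."""
    root = ET.fromstring(markup)
    for parent in list(root.iter()):
        for style in parent.findall("style"):
            for selectors, body in RULE_RE.findall(style.text or ""):
                # Normalise "color:red; " into "color:red"
                declarations = ";".join(
                    part.strip() for part in body.split(";") if part.strip()
                )
                for selector in selectors.split(","):
                    tag = selector.strip()
                    if not tag:
                        continue
                    # Assumption: every selector is a bare tag name
                    for element in root.iter(tag):
                        existing = element.get("style")
                        merged = existing + ";" + declarations if existing else declarations
                        element.set("style", merged)
            parent.remove(style)
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<html>
  <head>
    <title>Test</title>
    <style>
    h1, h2 { color:red; }
    strong { text-decoration:none }
    </style>
  </head>
  <body>
    <h1>Hi!</h1>
    <p><strong>Yes!</strong></p>
  </body>
</html>"""

print(inline_styles(SAMPLE))
```

Class selectors, IDs, descendant selectors and specificity are all ignored here, which is exactly the part a proper CSS selector engine takes care of.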

UPDATE

Now available on github.com and as a PyPI package

UPDATE #2

Two new copy-cats have been released:

  • python-premailer which seems to do the same thing but without lxml (which is sort of the whole point)
  • inline-styler which also uses lxml but I don't know what it does differently or better

Might be worth poking around at these if my premailer isn't good enough.

My first iPhone web app - Crosstips iPhone interface

July 1, 2009
2 comments iPhone

I've just finished my first fully iPhone-enabled web app (not to be confused with iPhone apps, which are installed onto the phone via the App Store). It's here:

crosstips.org/iphone

It's basically a wrapped version of Crosstips that uses these sample resources to imitate a native app but in the Safari web browser.

It works really well and wasn't too hard to do. I think the key is to remember that the iPhone Safari is after all a web browser but it also has excellent support for AJAX (and jQuery). There are probably some bugs that I haven't spotted yet and there is work left to do on the optimization but for now I'm happy.

So, if you like to solve crosswords in bed and you have an iPhone then bookmark crosstips.org/iphone

My dislike for booleans and that impact on the Django Admin

June 1, 2009
8 comments Django

I've got this model in Django:


class MyModel(models.Model):
   completed_date = models.DateTimeField(null=True)

By using a DateTimeField instead of a BooleanField I'm able to record if an instance is completed or not and when it was completed. A very common pattern in relational applications. Booleans are brief but often insufficient. (Check out Ned Batchelder's Booleans suck)

To make it a bit more convenient (and readable) to work with I added this method:


class MyModel(models.Model):
   completed_date = models.DateTimeField(null=True)

   @property
   def completed(self):
       return self.completed_date is not None

That's great! Now I can do this (use your imagination now):


>>> import datetime
>>> from myapp.models import MyModel
>>> instance = MyModel.objects.all()[0]
>>> instance.completed
False
>>> instance.completed_date = datetime.datetime.now()
>>> instance.save()
>>> instance.completed
True

I guess I could add a setter too.
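
A setter could look roughly like the sketch below. To keep it runnable without a Django project it uses a plain Python class as a stand-in for the model; the class name Task and the choice to keep an existing timestamp when setting completed to True twice are assumptions made for the example.

```python
import datetime


class Task(object):
    """Plain-Python stand-in for the Django model (hypothetical name)."""

    def __init__(self):
        self.completed_date = None

    @property
    def completed(self):
        return self.completed_date is not None

    @completed.setter
    def completed(self, value):
        if value:
            # Keep the original timestamp if it was already completed
            if self.completed_date is None:
                self.completed_date = datetime.datetime.now()
        else:
            self.completed_date = None


task = Task()
task.completed = True
print(task.completed)  # True
task.completed = False
print(task.completed_date)  # None
```

On a real model the same property and setter would sit next to the completed_date field, and save() would still have to be called afterwards.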

But Django's QuerySet machinery doesn't really tie in with the ORM Python classes until the last step so you can't use these property methods in your filtering/excluding. What I want to do is to be able to do this:


>>> from myapp.models import MyModel
>>> completed_instances = MyModel.objects.filter(completed=True)
>>> incomplete_instances = MyModel.objects.filter(completed=False)

To be able to do that I had to add a special manager which is sensitive to the parameters it gets and changes them on the fly. So, the manager plus model now looks like this:


class SpecialManager(models.Manager):
   """turn certain booleanesque parameters into date parameters"""

   def filter(self, *args, **kwargs):
       self.__transform_kwargs(kwargs)
       return super(SpecialManager, self).filter(*args, **kwargs)

   def exclude(self, *args, **kwargs):
       self.__transform_kwargs(kwargs)
       return super(SpecialManager, self).exclude(*args, **kwargs)

   def __transform_kwargs(self, kwargs):
       bool_name, date_name = 'completed', 'completed_date'
       # Loop over a copy because kwargs is modified inside the loop
       for key, value in list(kwargs.items()):
           if bool_name == key or key.startswith('%s__' % bool_name):
               if kwargs.pop(key):
                   kwargs['%s__lte' % date_name] = datetime.datetime.now()
               else:
                   kwargs[date_name] = None

class MyModel(models.Model):
   completed_date = models.DateTimeField(null=True)

   # Hook up the manager so filter() and exclude() go through it
   objects = SpecialManager()

   @property
   def completed(self):
       return self.completed_date is not None
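
To see what the manager does to the parameters without setting up Django, the rewriting step can be pulled out into a standalone function. This is only a sketch: transform_kwargs is a hypothetical helper that returns a new dict instead of changing the one passed in, but the rules are the same as in the manager.

```python
import datetime


def transform_kwargs(kwargs, bool_name="completed", date_name="completed_date"):
    """Return a copy of kwargs with the boolean-ish key turned into a date lookup."""
    result = {}
    for key, value in kwargs.items():
        if key == bool_name or key.startswith("%s__" % bool_name):
            if value:
                # completed=True becomes "has a completion date up to now"
                result["%s__lte" % date_name] = datetime.datetime.now()
            else:
                # completed=False becomes "has no completion date"
                result[date_name] = None
        else:
            result[key] = value
    return result


print(sorted(transform_kwargs({"completed": True, "name": "x"})))
# ['completed_date__lte', 'name']
print(transform_kwargs({"completed": False}))
# {'completed_date': None}
```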

Now, that's fine but there's one problem. For the application at hand, we're relying on the admin interface a lot. Because of the handy @property decorator I set on the method completed() I now can't include completed in the admin's list_display so I have to do this special trick:


class MyModelAdmin(admin.ModelAdmin):
   list_display = ('is_completed',)

   def is_completed(self, object_):
       return object_.completed
   is_completed.short_description = u'Completed?'
   is_completed.boolean = True

Now, I get the same nice effect in the admin view where this appears as a boolean. The information is still there about when it was completed if I need to extract that for other bits and pieces such as an advanced view or auditing. Pleased!

One last challenge with the Django admin interface was how to filter on these non-database fields. Django deliberately doesn't let you filter on methods, but that's slowly changing and with some hope it'll be in Django 1.2. I'm not interested in making my application depend on a patch to django.contrib, but I really want to filter in the admin. We've already added some custom links and widgets to the admin interface.

After a lot of poking around and hacking together with my colleague Bruno Renié we came up with the following solution:


class MyModelAdmin(admin.ModelAdmin):
   list_display = ('is_completed',)

   def is_completed(self, object_):
       return object_.completed
   is_completed.short_description = u'Completed?'
   is_completed.boolean = True

   def changelist_view(self, request, extra_context=None, **kwargs):
       from django.contrib.admin.views.main import ChangeList
       cl = ChangeList(request, self.model, list(self.list_display),
                       self.list_display_links, self.list_filter,
                       self.date_hierarchy, self.search_fields, 
                       self.list_select_related,
                       self.list_per_page,
                       self.list_editable, self)
       cl.formset = None

       if extra_context is None:
           extra_context = {}

       if kwargs.get('only_completed'):
           cl.result_list = cl.result_list.exclude(completed_date=None)
           extra_context['extra_filter'] = "Only completed ones"

       extra_context['cl'] = cl
       return super(MyModelAdmin, self).changelist_view(
           request, extra_context=extra_context)

   def get_urls(self):
       from django.conf.urls.defaults import patterns, url
       urls = super(MyModelAdmin, self).get_urls()
       my_urls = patterns('',
               url(r'^only-completed/$', 
                   self.admin_site.admin_view(self.changelist_view),
                    {'only_completed':True}, name="changelist_view"),
       )
       return my_urls + urls

Granted, we're not getting the nice filter widget on the right hand side in the admin interface this time but it's good enough for me to be able to make a special link to /admin/myapp/mymodel/only-completed/ and it works just like a normal filter.

Ticket 5833 is quite busy and has been going on for a while. It feels like a daunting task to dig in and contribute when so many people are already ahead of me. Hopefully this blog entry will help other people who're hacking on their Django admin interfaces and who, like me, hate booleans.

Introducing django-spellcorrector

May 28, 2009
0 comments Django

I've now made a vastly improved spellcorrector specifically tied into Django and its models. It's the same class as before but hooked up to models so Django can take care of persisting the trained words. Again, I have to give tribute to Peter Norvig for his inspirational blog How to Write a Spelling Corrector, on which a large majority of my code is based. At least in the tricky parts.

What's nice about this little app is that it's very easy to plug in and use. You just download it, put it on your PYTHONPATH and include it in your INSTALLED_APPS. Then from another app you do something like this:


from spellcorrector.views import Spellcorrector
sc = Spellcorrector()
sc.load() # nothing will happen the first time

sc.train(u"peter")
print sc.correct(u"petter") # will print peter
sc.save()

sc2 = Spellcorrector()
sc2.load()
print sc2.correct(u"petter") # will print peter

Truncated! Read the rest by clicking the link below.

Crossing the world - new feature on Crosstips

May 23, 2009
1 comment Django

I've added a very fun new feature on Crosstips called Crossing the world which shows real-time searches happening all over the world. Admittedly the traffic on Crosstips isn't particularly high (at the time of writing, 1 search every 2 minutes) so you might have to sit there for a while until something happens. It's strangely addictive to watch it.

To do this I had to use all sorts of buzz words. AJAX, function cache decorators, GeoIP and Google Maps. I'm currently using the free version of GeoIP City Lite which seems to work on a large majority of all captured IP addresses. And since the map is sufficiently zoomed out you can't really tell how inaccurate it is.

One little detail I'm quite proud of is how the AJAX code understands how to change the interval between lookups. Each time the server responds with something, the interval is reduced, but if there aren't any new searches the interval slowly increases again. This is done to minimize the number of useless server requests but at the same time make it react often if there are plenty of things to show. The next feature to add is Comet (like AJAX but push instead of pull).
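
The interval logic boils down to something like the sketch below. It is written in Python rather than the JavaScript that runs on the site, and the bounds and the halve/grow-by-10% factors are made-up numbers for illustration, not the ones Crosstips uses.

```python
MIN_INTERVAL = 2.0   # seconds; hypothetical lower bound
MAX_INTERVAL = 60.0  # seconds; hypothetical upper bound


def next_interval(current, got_results):
    """Return the delay before the next lookup."""
    if got_results:
        # New searches arrived: look again sooner
        return max(MIN_INTERVAL, current / 2.0)
    # Nothing new: back off slowly
    return min(MAX_INTERVAL, current * 1.1)


interval = 10.0
for got_results in [False, False, True, True, True]:
    interval = next_interval(interval, got_results)
    print(round(interval, 2))
```

Shrinking fast and growing slowly means a burst of searches shows up almost immediately while a quiet period costs very few requests.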

Now if we could only get some more action on the site!! Tell all your grand-people to use this site when they get stuck on solving crossword puzzles!

UPDATE

I've just learnt that GeoIP is already shipped in GeoDjango so I've basically reinvented half a wheel :(

Sequences in PostgreSQL and rolling back transactions

May 12, 2009
0 comments Linux

This behavior bit me today and caused me some pain, so hopefully sharing it will help someone else avoid the same pitfall.

Basically, I use Zope to manage a PostgreSQL database and since Zope is 100% transactional it rolls back queries when exceptions occur. That's great but what I didn't know is that when it rolls back it doesn't roll back the sequences. Makes sense in retrospect I guess. Here's proof of that:


test_db=# create table "foo" (id serial primary key, name varchar(10));
CREATE TABLE
test_db=# insert into foo(name) values('Peter');
INSERT 0 1
test_db=# select * from foo;
 id | name  
----+-------
  1 | Peter
(1 row)

test_db=#  select nextval('foo_id_seq');
 nextval 
---------
       2
(1 row)

test_db=# begin;
BEGIN
test_db=# insert into foo(id, name) values(2, 'Sonic');
INSERT 0 1
test_db=# rollback;
ROLLBACK
test_db=#  select nextval('foo_id_seq');
 nextval 
---------
       3
(1 row)

In my application I often use the sequences to predict what the new auto-generated ID is going to be, for things that the application can use such as redirecting or updating some other tables. As I wasn't expecting this it caused a bug in my web app.

Most unusual letters in English language

May 12, 2009
11 comments Python

I needed to find out which letters are the least used in the English language. I pulled down a list of 100,000+ English words, split them all into letters, which made a list of about 1,000,000 letters, sorted them by usage and came up with this as the result (most used first):


esiarntoldcugpmhbyfkwvzxjq

It would be interesting to make a heatmap of this over an image of a QWERTY keyboard.
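
The counting itself only takes a few lines. The sketch below is not the script used for the result above; letters_by_frequency is a hypothetical helper and the three-word sample stands in for the real word list, which would be read from a file.

```python
from collections import Counter


def letters_by_frequency(words):
    """Return the letters a-z found in words, most used first."""
    counts = Counter(
        char for word in words for char in word.lower() if "a" <= char <= "z"
    )
    return "".join(letter for letter, _ in counts.most_common())


sample = ["peter", "letter", "tree"]
print(letters_by_frequency(sample))  # starts with "etr"
```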

Truncated! Read the rest by clicking the link below.

To JSON, Pickle or Marshal in Python

May 8, 2009
4 comments Python

I was reading David Cramer's tip to use JSONField in Django to be able to store arbitrary fields in a SQL database. Nice. But is it fast enough? Well, I can't answer that but I did look into the difference in read/write performance between simplejson, cPickle and marshal.

Only reading:


JSON 0.00593531370163
PICKLE 0.0109532237053
MARSHAL 0.00413788318634

Reading and writing:


JSON 0.0434390544891
PICKLE 0.0289686655998
MARSHAL 0.00728442907333

Clearly marshal is faster but to quote the documentation:

"Warning: The marshal module is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source."

Clearly simplejson is a very fast reader and the JSON format has the delicious advantage that it's "human readable" (compared to the others).

NOTE! I spent about 5 minutes putting together the script and about 10 minutes writing this so feel free to doubt its scientific accuracy.
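
A comparison along these lines can be put together with timeit. This is not the script behind the numbers above: it assumes a modern Python, so the standard json and pickle modules stand in for simplejson and cPickle, and the sample record is invented. The timings it prints will differ from the ones quoted.

```python
import json
import marshal
import pickle
import timeit

# Hypothetical sample record; the original test data is not shown
record = {"name": "Peter", "tags": ["python", "django"], "count": 42, "ratio": 0.5}

CODECS = {
    "JSON": (json.dumps, json.loads),
    "PICKLE": (pickle.dumps, pickle.loads),
    "MARSHAL": (marshal.dumps, marshal.loads),
}


def benchmark(number=1000):
    """Time reading alone and writing plus reading for each codec."""
    results = {}
    for name, (dumps, loads) in CODECS.items():
        encoded = dumps(record)
        assert loads(encoded) == record  # sanity check the round trip
        read = timeit.timeit(lambda: loads(encoded), number=number)
        read_write = timeit.timeit(lambda: loads(dumps(record)), number=number)
        results[name] = (read, read_write)
    return results


for name, (read, read_write) in benchmark().items():
    print("%-8s read=%.6f read+write=%.6f" % (name, read, read_write))
```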

Truncated! Read the rest by clicking the link below.

Never seen before Google Server Error

May 7, 2009
1 comment

I've never seen a Server Error on Google before. I've seen errors before but they often indicate that the whole service is out for a brief moment. This time it feels like a bug that has caused it.

Don't get me wrong. I still think Google search is the best Internet invention since e-mail.