To JSON, Pickle or Marshal in Python

May 8, 2009
4 comments Python

I was reading David Cramer's tip to use JSONField in Django to be able to store arbitrary fields in a SQL database. Nice. But is it fast enough? Well, I can't answer that but I did look into the difference in read/write performance between simplejson, cPickle and marshal.

Only reading:


JSON 0.00593531370163
PICKLE 0.0109532237053
MARSHAL 0.00413788318634

Reading and writing:


JSON 0.0434390544891
PICKLE 0.0289686655998
MARSHAL 0.00728442907333

Clearly marshal is faster but to quote the documentation:

"Warning: The marshal module is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source."

Clearly simplejson is a very fast reader and the JSON format has the delicious advantage that it's "human readable" (compared to the others).

NOTE! I spent about 5 minutes putting together the script and about 10 minutes writing this so feel free to doubt its scientific accuracy.
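For anyone who wants to rerun the comparison, here is a minimal sketch of such a benchmark. It is not the script behind the numbers above: it assumes Python 3, where the standard-library json and pickle modules stand in for simplejson and cPickle, and the sample data and the time_read / time_read_write helper names are made up for illustration.

```python
import json
import marshal
import pickle
import timeit

# Made-up sample record; any JSON-compatible structure would do.
DATA = {
    "name": "Peter",
    "age": 28,
    "tags": ["python", "django"],
    "scores": [1.5, 2.25],
    "active": True,
    "extra": {"note": None},
}

# All three modules expose dumps() and loads(), so they can be timed alike.
SERIALIZERS = {"JSON": json, "PICKLE": pickle, "MARSHAL": marshal}


def time_read(module, number=1000):
    """Time only deserialization of an already serialized payload."""
    payload = module.dumps(DATA)
    return timeit.timeit(lambda: module.loads(payload), number=number)


def time_read_write(module, number=1000):
    """Time a full serialize-then-deserialize round trip."""
    return timeit.timeit(lambda: module.loads(module.dumps(DATA)), number=number)


if __name__ == "__main__":
    for label, module in SERIALIZERS.items():
        print(label, time_read(module), time_read_write(module))
```

The absolute numbers will depend on the machine, the Python version and the shape of the data, so only the relative ordering is worth comparing.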


Git + Twitter = Friedcode

April 22, 2009
10 comments Python, Linux

I've now written my first Git hook. If you don't know what Git is, you have either lived under a rock for the past few years or you're not into computer programming at all.

The hook is a post-commit hook that sends the last commit message to a Twitter account I called "friedcode". I guess it's not entirely useful, but for those of you who want to be loud about your work and the progress you make, it can make sense. Or if you're a team and you want to get a brief overview of what your team mates are up to. For me, it was mostly an experiment to try Git hooks and pytwitter. Here's how I did it:


To assert or assertEqual in Python unit testing

February 14, 2009
17 comments Python

When you write unit tests in Python you can use these widgets:


self.assertEqual(var1, var2, msg=None)
self.assertNotEqual(var1, var2, msg=None)
self.assertTrue(expr, msg=None)
self.assertRaises(exception, func, para, meters, ...)

That's fine but is it "pythonic" enough? The alternative is to do this with "pure python". E.g.:


assert var1 == var2, msg
assert var1 != var2, msg
assert expr, msg
try:
   func(para, meter)
except exception:
   pass  # the expected exception was raised
else:
   raise AssertionError("expected exception was not raised")

I'm sure there are several benefits to using the unittest methods that I don't understand, but I do understand the benefits of brevity and readability. The more tests you write, the more tedious it becomes to write self.assertEqual(..., ...) every time. In my own code I prefer to use simple assert statements rather than the verbose unittest alternative. Partly because I'm lazy and partly because they read better, and the word assert is highlighted in red in my editor so it just looks nicer from a distance.

Perhaps some much more clever people than me can explain what a cardinal sin it is to not use the unittest methods over the lazy more pythonic ones.

Incidentally, during the course of jotting down this blog I reviewed some old inherited code and changed this:


self.assertEqual(len(errors),0)

into this:


assert not errors

Isn't that just nicer to use/read/write?
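One concrete benefit of the unittest methods is the failure message. A minimal sketch, assuming Python 3 and a plain python interpreter (no test runner that rewrites assert statements); the Demo class and the failure_message helper are hypothetical names used only for this illustration:

```python
import unittest


class Demo(unittest.TestCase):
    # Empty test method so the class can be instantiated on its own.
    def runTest(self):
        pass


def failure_message(check):
    """Run check() and return the AssertionError text, or None if it passed."""
    try:
        check()
    except AssertionError as exc:
        return str(exc)
    return None


def bare_assert():
    # No msg given, so the AssertionError carries no detail of its own.
    assert [1, 2] == [1, 3]


case = Demo()
# assertEqual builds a message that shows both values and how they differ.
unittest_msg = failure_message(lambda: case.assertEqual([1, 2], [1, 3]))
bare_msg = failure_message(bare_assert)

if __name__ == "__main__":
    print("unittest:", repr(unittest_msg))
    print("bare assert:", repr(bare_msg))
```

Also note that assert not errors and self.assertEqual(len(errors), 0) are not strictly equivalent: the first one also passes when errors is None.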

bool is instance of int in Python

December 5, 2008
15 comments Python

I lost about half an hour just moments ago debugging this and pulling out a fair amount of hair. I had some code that looked like this:


result = []
for key, value in data.items():
   if isinstance(value, int):
       result.append(dict(name=key, value=value, type='int'))
   elif isinstance(value, float):
       result.append(dict(name=key, value=value, type='float'))
   elif isinstance(value, bool):
       result.append(dict(name=key, type='bool',
                          value=value and 'true' or 'false'))
...

It looked so simple but further up the tree I never got any entries with type="bool" even though I knew there were boolean values in the dictionary.

The pitfall I fell into was this:


>>> isinstance(True, bool)
True
>>> isinstance(False, bool)
True
>>> isinstance(True, int)
True
>>> isinstance(False, int)
True

Not entirely obvious if you ask me. The solution in my case was just to change the order of the if and the elif so that bool is tested first.
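A sketch of the reordered version, assuming data is a plain dictionary. The describe function name is only for illustration:

```python
def describe(data):
    """Classify values, testing bool before int because bool subclasses int."""
    result = []
    for key, value in data.items():
        if isinstance(value, bool):
            result.append(dict(name=key, type='bool',
                               value='true' if value else 'false'))
        elif isinstance(value, int):
            result.append(dict(name=key, value=value, type='int'))
        elif isinstance(value, float):
            result.append(dict(name=key, value=value, type='float'))
    return result
```

The underlying reason is that bool is a subclass of int, so issubclass(bool, int) is True and every boolean passes the int test.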

domstripper - A lxml.html test project

November 20, 2008
1 comment Python

I'm just playing with the impressive lxml.html package. It makes it possible to easily work with HTML trees and manipulate them.

I had this crazy idea of a "DOM stripper" that removes all but specified elements from an HTML file. For example you want to keep the contents of the <head> tag intact but you just want to keep the <div id="content">...</div> tag thus omitting <div id="banner">...</div> and <div id="nav">...</div>. domstripper now does that. This can be used for example as a naive proxy that transforms a bloated HTML page into a more stripped down smaller version suitable for say mobile web browsers. It's more a proof of concept than anything else.

To test it you just need a virtual python environment and the right system libs needed to install lxml. This worked for me:


$ sudo apt-get install cython libxslt1-dev zlib1g-dev libxml2-dev
$ cd /tmp
$ virtualenv --no-site-packages testenv
$ cd testenv
$ source bin/activate
$ easy_install domstripper

Now you can use it like this:


>>> from domstripper import domstripper
>>> help(domstripper)
...
>>> domstripper('bloat.html', ['#content', 'h1.header'])
<!DOCTYPE...
...

Best to just play with it and see if it makes sense. I'm not saying this is an amazing package but it goes to show what can be done with lxml.html and the extremely user friendly CSS selectors.

The importance of env (and how it works with virtualenv)

September 18, 2008
8 comments Python

I have for a long time wondered why I'm supposed to use this at the top of my executable python files:


#!/usr/bin/env python

Today I figured out why.

The alternative, which you see a lot, is something like this:


#!/usr/bin/python

Here's why it's better to use env rather than the direct path to the executable: virtualenv. Perhaps there are plenty of other reasons the Linux experts can teach me but this is now my first obvious benefit of doing it the way I'm supposed to do it.

If you create a virtualenv, enter it and activate it so that writing:


$ python 

starts the python executable of the virtual environment, then this will be respected if you use the env shebang header. Good to know.
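A quick way to see this for yourself is a tiny script that reports which interpreter is running it. A minimal sketch, assuming Python 3.3 or later for shutil.which; the interpreter_report function name is made up:

```python
#!/usr/bin/env python
import shutil
import sys


def interpreter_report():
    """Return the running interpreter and the `python` found first on PATH."""
    return {
        # The interpreter that is executing this script.
        "running": sys.executable,
        # Roughly what `env python` would pick; None if no `python` is on PATH.
        "first_on_path": shutil.which("python"),
        # Points into the virtualenv when its interpreter is the one running.
        "prefix": sys.prefix,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(key, value)
```

Made executable and run directly, the script should report the virtualenv's interpreter when the environment is activated, whereas a hard-coded #!/usr/bin/python header would always report the system one.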

The stupidity of 'id' as a variable name (or stupidity of me)

September 16, 2008
3 comments Python

Both in Zope2 and in Django you need to work with attributes called id. This is a shame since it's such a huge pitfall. Despite having done Python programming for so many years I today fell into this pitfall twice!! The pitfall is that id is a builtin function, not a suitable variable name. It happened while I was changing a complex app to use something called the UUID as the identifier instead of the ID, which happened to be the name of a primary key in a table.

This meant lots of changes and I tested and tested and kept getting really strange errors. I took the whole thing apart and put it back together when I discovered my error of checking if variable id was set or not. id, if undefined, defaults to the builtin function id() which will always return true on bool(id).
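The trap can be reproduced in a few lines. A minimal sketch; both function names and the record_id parameter are hypothetical, chosen only to show the pattern:

```python
def is_set_buggy():
    # No local or global `id` is defined here, so the name falls through
    # to the builtin function, and any function object is truthy.
    return bool(id)


def is_set(record_id=None):
    # Safer pattern: a distinct name with an explicit None default,
    # so "not provided" can be told apart from every real value.
    return record_id is not None
```

Because the builtin is always there, the buggy check never raises a NameError that would have pointed at the mistake.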

It's been a long day. I'm going home. Two newbie mistakes in one programming session. I'm sure I'm not the only one who's been trapped by this.

Python new-style classes and the super() function

July 12, 2008
5 comments Python

I've never really understood the impact of new-style Python classes and what it means to your syntax until now. With new-style classes you can use the super() builtin, otherwise you can't. This works for new-style classes:


class Farm(object):
   def __init__(self): pass

class Barn(Farm):
   def __init__(self):
       super(Barn, self).__init__()

If you want to do the same for old-style classes you simply can't use super() so you'll have to do this:


class Farm:
   def __init__(self): pass

class Barn(Farm):
   def __init__(self):
       Farm.__init__(self)

Strange that I've never realised this before. The reason I did now was that I had to back-port some code into Zope 2.7 which doesn't support setting security on new-style classes.

Now I need to do some reading on new-style classes because clearly I haven't understood it all.
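For comparison, a short sketch assuming Python 3, where every class is new-style and super() can be called without arguments. The initialised list is only there to make the call order visible:

```python
class Farm:
    def __init__(self):
        self.initialised = ["Farm"]


class Barn(Farm):
    def __init__(self):
        # Equivalent to super(Barn, self).__init__() in this context.
        super().__init__()
        self.initialised.append("Barn")
```

Here class Farm: and class Farm(object): mean the same thing, so the old-style workaround of calling Farm.__init__(self) directly is no longer forced on you.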

split_search() - A Python function for advanced search applications

May 15, 2008
0 comments Python

Inspired by Google's way of working I today put together a little script in Python for splitting a search. The idea is that you can search by entering certain keywords followed by a colon like this:


Free Text name:Peter age: 28

And this will be converted into two parts:


'Free Text'
{'name': 'Peter', 'age': '28'}

You can configure which keywords should be recognized and, to make things simple, you can basically set this to be the columns you want to offer advanced search on in your application. For example: (from_date, to_date)

Feel free to download and use it as much as you like. You might not agree completely with its purpose and design so you're allowed to change it as you please.

Here's how to use it:


$ wget https://www.peterbe.com/plog/split_search/split_search.py
$ python
>>> from split_search import split_search
>>> free_text, parameters = split_search('Foo key1:bar', ('key1',))
>>> free_text
'Foo'
>>> parameters
{'key1': 'bar'}

UPDATE

Version 1.3 fixes a bug when all you've entered is one of the keywords.
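To give an idea of how such a split can work, here is a minimal sketch built on a regular expression. It is not the code in split_search.py: the split_search_sketch name and the rule that a value runs until the next recognized keyword are assumptions made for this illustration.

```python
import re


def split_search_sketch(query, keywords):
    """Split a query into (free_text, {keyword: value}).

    Hypothetical re-implementation: a value runs from its keyword up to
    the next recognized keyword or the end of the query.
    """
    if not keywords:
        return query.strip(), {}
    # Match any recognized keyword followed by a colon and optional spaces.
    alternatives = '|'.join(re.escape(keyword) for keyword in keywords)
    pattern = re.compile(r'\b(%s):\s*' % alternatives)
    # With one capturing group, split() yields
    # [free text, keyword, value, keyword, value, ...].
    parts = pattern.split(query)
    free_text = parts[0].strip()
    parameters = {}
    for keyword, value in zip(parts[1::2], parts[2::2]):
        parameters[keyword] = value.strip()
    return free_text, parameters
```

Called with 'Free Text name:Peter age: 28' and the keywords ('name', 'age'), this sketch gives the free text and the dictionary shown earlier. Words that are not in the keyword list are left alone, even when followed by a colon.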

See you at PyCon 2008

March 11, 2008
0 comments Python

I'm going to Chicago on Wednesday for the PyCon 2008 conference. I'm going to stay at the Crowne Plaza (or whatever it was called) like many of the other people at the conference.

This is what I look like:

See you at PyCon 2008

If you see this mug, go up to it and say Hi. It speaks British, Swedish and some American and loves food, beer and tea, which might be helpful to know if you feel like talking more to it. Its interests for this conference are: Grok, Zope, Django, Plone, buildout, automated testing, agile development and Javascript. Its main claim-to-fame is an Open Source bug/issue tracker program called IssueTrackerProduct which it is more than delighted to talk about.

I've never been to Chicago before and I'm really excited about Tuesday night as I've bought tickets to a Chicago Bulls NBA game (basketball). All other nights I'm hoping to socialise, get drunk, get full and get down and dirty nerdy all week. See you there!