Fastest way to uniqify a list in Python

August 14, 2006
92 comments Python

SEE UPDATE BELOW

--

Suppose you have a list in python that looks like this:


['a','b','a']
# or like this:
[1,2,2,2,3,4,5,6,6,6,6]

and you want to remove all duplicates so you get this result:


['a','b']
# or
[1,2,3,4,5,6]

How do you do that, and what is the fastest way? I wrote a couple of alternative implementations and ran a quick benchmark loop over them to find out which was the fastest. (I haven't looked at memory usage.) The slowest function was 78 times slower than the fastest function.
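As a point of reference, here is a minimal sketch of one order-preserving way to do it, using a set to remember what has already been seen. The function name `uniqify` is made up for illustration, the code is written for Python 3, and it is not necessarily one of the implementations benchmarked in the post. It assumes every item in the list is hashable.

```python
def uniqify(seq):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()      # items encountered so far (requires hashable items)
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(uniqify(['a', 'b', 'a']))                      # ['a', 'b']
print(uniqify([1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 6]))    # [1, 2, 3, 4, 5, 6]
```

If the original order does not matter, `list(set(seq))` is shorter, but the order of the result is then not guaranteed.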


Unicode strings to ASCII ...nicely

August 8, 2006
20 comments Python

This has been a problem for me for a long time. Whenever someone enters a title in my CMS, the id of the document is derived from the title. Spaces are replaced with '-', '&' is replaced with 'and', etc. The final thing I wanted to do was to make sure the id is ASCII encoded when it's saved. My original attempt looked like this:


>>> title = u"Klüft skräms inför på fédéral électoral große"
>>> print title.encode('ascii','ignore')
Klft skrms infr p fdral lectoral groe

But as you can see, a lot of the characters are gone. I'd much rather that a word like "Klüft" is converted to "Kluft" which will be more human readable and still correct. My second attempt was to write a big table of unicode to ascii replacements.

It looked something like this:


u'\xe4': u'a',
u'\xc4': u'A',
etc...
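An alternative to a hand-written table, sketched here as an assumption rather than as the solution the post arrives at, is to let the standard `unicodedata` module split each accented character into a base letter plus combining marks, and then drop whatever is still not ASCII. The helper name `to_ascii` is hypothetical and the code is written for Python 3, where all strings are Unicode.

```python
import unicodedata


def to_ascii(text):
    """Approximate a Unicode string in ASCII by stripping diacritics."""
    # NFKD splits e.g. 'ü' into 'u' followed by a combining diaeresis
    decomposed = unicodedata.normalize("NFKD", text)
    # the combining marks are not ASCII, so 'ignore' silently drops them
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Klüft skräms inför på fédéral électoral große"))
# Kluft skrams infor pa federal electoral groe
```

Note that "ß" has no decomposition, so it still disappears ("große" becomes "groe"). Characters like that would still need an explicit replacement table.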


slim, a new free web service for white space optimisation

July 25, 2006
1 comment Python

If you have some code that you need to optimise, like some Javascript code that is well commented but costs your users too many bytes of download, then you might want to use my slimmer web service. I'll let a Python example speak for itself:


>>> import xmlrpclib
>>> s=xmlrpclib.Server('https://www.peterbe.com/')
>>> css='h1 { font-family: Arial, Verdana; }'
>>> s.slim(css)
'h1{font-family:Arial,Verdana}'


Nice stats added to RememberYourFriends.com

July 15, 2006
0 comments Python

I've just added some nice stats to RememberYourFriends.com. It's basically two line charts: one of how many reminders were sent each week, and one of how many reminders have been sent in total, from the very start up to a particular week.

To do this I use the wonderful ChartDirector package for Python which is really fast and very easy to implement. These graphs are created on the fly and apart from generating the image it obviously needs to generate the data. Pretty fast I'd say. Will be interesting to see how it will fare when the load starts to get interesting.

Once I get a few more users I'll start thinking of other funky charts to draw. It's fun.

DifferenceFinder (aka. humanreadablediff.py)

July 6, 2006
4 comments Python

I've just quickly put together a little script that computes the difference between two texts in a human readable format. The result when you run diff is a bit difficult to understand for a human being and I wanted something more "humane" that quickly summarises what's different on one simple line. E.g. "Added 2 lines, change 1 line".

This little script is going to be part of an undo function in our new CMS that I'm working on. Instead of just pinpointing which revision date you want to go back to, you'll also be able to see what the differences were between each revision in the undo history for the CMS.
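To give a rough idea of how such a one-line summary can be computed, here is a minimal sketch built on the standard `difflib` module. It is not the actual humanreadablediff.py; the function names `summarize_diff` and `describe` are invented for this example, and counting a replaced block as "changed" lines plus leftover added or removed lines is just one possible convention. Written for Python 3.

```python
import difflib


def summarize_diff(old_text, new_text):
    """Return (added, removed, changed) line counts between two texts."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    added = removed = changed = 0
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_count, new_count = i2 - i1, j2 - j1
        if tag == "insert":
            added += new_count
        elif tag == "delete":
            removed += old_count
        elif tag == "replace":
            # lines that pair up count as changed, the rest as added/removed
            overlap = min(old_count, new_count)
            changed += overlap
            added += new_count - overlap
            removed += old_count - overlap
    return added, removed, changed


def describe(added, removed, changed):
    """Turn the three counts into one human readable line."""
    parts = []
    for verb, count in (("Added", added), ("Removed", removed), ("Changed", changed)):
        if count:
            parts.append("%s %d line%s" % (verb, count, "" if count == 1 else "s"))
    return ", ".join(parts) or "No differences"


old = "a\nb\nc"
new = "a\nB\nc\nd\ne"
print(describe(*summarize_diff(old, new)))
```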


Geeking with tags file for Jed

May 29, 2006
0 comments Python, Linux

A little while ago I wrote about how I got Jed + TAGS to work thanks to the ntags library. I've been using it now for a while and I love it! I doubt there are any IDEs that beat a swift combination of Ctrl+2 followed by Alt+. and you get the definition of a function or variable without losing any focus.

If you're not into programming stop reading now because it's going to get even more technical.


Private functions in Javascript?

April 29, 2006
3 comments Web development, Python

In Python you indicate that a function/method is private by giving it a name that starts with an underscore (_), like this:


def _thisIsPrivate():
    return 'a'

def thisIsPublic():
    return 'b'

It means that if you have these two functions in a file called dummy.py and you do something like this:


>>> from dummy import *
>>> thisIsPublic()
'b'
>>> _thisIsPrivate()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name '_thisIsPrivate' is not defined

Seems fair, doesn't it? Now, is there a similar naming convention and functionality in Javascript?
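On the Python side it is worth seeing that the underscore is only a convention that `from ... import *` honours, not real access control. The sketch below builds the dummy module in memory (purely so the example is self-contained; in practice it would be the dummy.py file) and shows that the "private" function is skipped by the star import but can still be reached through the module. Written for Python 3.

```python
import sys
import types

# Build a stand-in for dummy.py in memory so this runs without extra files
source = (
    "def _thisIsPrivate():\n"
    "    return 'a'\n"
    "\n"
    "def thisIsPublic():\n"
    "    return 'b'\n"
)
dummy = types.ModuleType("dummy")
exec(source, dummy.__dict__)
sys.modules["dummy"] = dummy

# Run the star import in a fresh namespace and inspect what arrived
namespace = {}
exec("from dummy import *", namespace)

print("thisIsPublic" in namespace)     # True
print("_thisIsPrivate" in namespace)   # False

# The underscore name is still reachable if you ask for it explicitly
print(dummy._thisIsPrivate())          # a
```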


Interesting float/int casting in Python

April 25, 2006
21 comments Python

Casting is when you convert a variable value from one type to another. In Python this is done with functions such as int(), float() or str(). A very common pattern is converting a number that is currently stored as a string into a proper number.

Let me exemplify:


>>> x = '100'
>>> y = '-90'
>>> print x + y
100-90
>>> print int(x) + int(y)
10

That was the int() function. There's also another very common one which is float() which does basically the same thing:


>>> print float(x) + float(y)
10.0
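One difference between the two that is easy to trip over: int() refuses a string that contains a decimal point, while float() accepts both forms. The helper below is only an illustrative sketch (the name `to_number` is made up, and it is not taken from the rest of the post), written for Python 3.

```python
def to_number(text):
    """Convert a numeric string to int if possible, otherwise to float."""
    try:
        return int(text)
    except ValueError:
        # int('10.5') raises ValueError, float('10.5') does not
        return float(text)


print(to_number('100'))      # 100
print(to_number('-90.5'))    # -90.5

# Going from float to int truncates towards zero rather than rounding
print(int(float('10.9')))    # 10
print(int(float('-10.9')))   # -10
```

A string that is not numeric at all, such as 'abc', still raises ValueError from the float() call.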


Date formatting in Python or in PostgreSQL (part II)

April 19, 2006
0 comments Python

This is an update on Date formatting in python or in PostgreSQL where the test wasn't done very well. The solution using Python for the formatting created a new DateTime object each time for each formatting because the time_stamp extracted from the database was a string. That would be beneficial to the Python formatting alternative but that's not the whole point. I suspect that the way I did the experiment last time (code is lost by the way) was wrong and didn't focus on the correct benchmark.

In this, my second attempt, I've done a more correct test and tried it on 500 selects. 500 formatted in SQL and 500 formatted in Python. The results are even more outstanding for PostgreSQL than last time.

Here are the results:


times1 (python formatted)
0.113439083099
times2 (sql formatted)
0.00697612762451

That means that doing the date formatting in SQL is about 16 times faster!
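The factor comes straight from dividing the two timings above. The variable names here are just labels for this check:

```python
python_formatted = 0.113439083099   # times1 from the results above
sql_formatted = 0.00697612762451    # times2 from the results above

speedup = python_formatted / sql_formatted
print("SQL formatting is %.1f times faster" % speedup)
```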

Bear in mind that this is optimization, and you always have to be careful when doing optimization. For example, the SQL database shouldn't get involved in the presentation, and if you need to use a different locale you might have to change your application in two places, which is risky.

Case insensitive list remove call (part II)

April 11, 2006
1 comment Python

Yesterday I blogged about a simple algorithm for removing a string from a list of strings case insensitively. I was happy to see that several people joined in with suggestions just a few hours after I published it. I have now thought about it a bit more and to honour those who commented I've done a little benchmark to find out which one is the best.

What both my own solution and some people's suggestions forgot was to raise a ValueError if it fails, just like this would: list("abc").remove("d")

So, I've tidied up the suggestions and where need be I've added the exception throw. This is not the first time list comprehension in Python impresses me. The winner in terms of performance is Andrew's list comprehension suggestion. Here are the timing results:


f1 0.704859256744
f2 1.5358710289
f3 1.37636256218
f4 0.468783378601
f5 0.475452899933
f6 0.666154623032
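For readers who haven't seen the earlier post, a list comprehension version with the ValueError added could look roughly like the sketch below. This is an assumption about the general shape, not Andrew's actual code, and the function name `remove_case_insensitive` is made up. Unlike list.remove() it returns a new list and drops every match rather than only the first. Written for Python 3.

```python
def remove_case_insensitive(items, value):
    """Return a copy of items without any string equal to value, ignoring case."""
    target = value.lower()
    kept = [item for item in items if item.lower() != target]
    if len(kept) == len(items):
        # nothing was filtered out, so mimic what list.remove() does
        raise ValueError("%r is not in list" % value)
    return kept


print(remove_case_insensitive(['Peter', 'ANDREW', 'Zope'], 'andrew'))
# ['Peter', 'Zope']
```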
