Peterbe.com

Peter Bengtsson's blog

Filtered by JavaScript, Python

Page 45

Playing with Reverend Bayesian

October 19, 2005
0 comments Python

I've been playing around with Reverend a bit on getting it to correctly guess appropriate "Sections" for issues on the Real issuetracker. What I did was that I downloaded all 140 issuetexts and their "Sections" attribute which is a list (that is often of length 1). From list dataset I did a loop over each text and the sections within it (skipped the default section General) so something like this:


data = ({'sections':['General','Installation'], 
         'text':"bla bla bla..."}
        {'sections':['Filter functions'], 
         'text':"Lorem ipsum foo bar..."}
        ...)
for item in data:
    secs = [each for each item['sections'] if each != 'General']
    for section in secs:
        guesser.train(section, item['text'])

Truncated! Read the rest by clicking the link below.

Dream: python bindings for squidclient

October 11, 2005
3 comments Python, This site

At the moment I'm not running Squid for this site but if experimentation time permits I'll have it running again soon. One thing I feel uneasy about is how to "manually" purge cached pages that needs to be updated. For example, if you read this page (and it's cached for one hour) and post a comment, then I'd like to re-cache this page with a purge. Setting a HTTP header could be something but that I would only be able to do on the page where you have this in the URL:


?msg=Comment+added

which, because of the presence of a querystring, is not necessarily cached anyway. The effect is that as soon as the "?msg=Comment+added" is removed from the URL, the viewer will see the page as it was before she posted her comment. squidclient might be the solution. ...sort of.

Truncated! Read the rest by clicking the link below.

Ruby and Python benchmarked

September 25, 2005
27 comments Python

Some Ruby (the programming language) blogger has tried to implement the Edit-Distance algorithm in three different ways in both Python and in Ruby. His conclusion is more steered towards the difference between the algorithms and not so much the language. I guess that's fair because the implementation might be done better in Python than it was in Ruby. But, please notice that this is a Ruby coder and yet his Python implementations are always faster.

A quick glance tells me that the Python programs run about 3 times as fast as the Ruby ones. See the benchmarks

Smurl from Python

September 22, 2005
1 comment Python

If you thought the Web Service example on the about page of Smurl.name was complicated, here's a much simpler version. I use this code on my very own site for the email notifications which will contain long URLs.

Feel free to steal this code into your own projects:


from urllib import urlopen, quote

# variable 'url' is defined elsewhere
if len(url) > 80:
    url = urlopen('http://smurl.name/createSmurl?url=%s' % quote(url)).read()

Was that hard?

Python regular expression tester

September 19, 2005
1 comment Python

I've just discovered retest by Christof Hoeke which is a developers tool for testing and experimenting with regular expressions. It doesn't have a GUI so it uses SimpleHTTPServer to serve a web interface on http://localhost:8087 that uses AJAX to make the interface snappier. You use this if you feel uncertain how to write your regular expression syntax and need a helpful sandbox for playing in.

This is cool because as an application it's very modern. The source code is only 100 lines python code, some javascript code for the AJAX and a relatively simple HTML page. A genuine GUI app would be considerably much more code but would admittedly run faster. However, considering how "basic" this application is, speed is not an issue.

Truncated! Read the rest by clicking the link below.

Random ID generator for Zope

September 2, 2005
0 comments Python

I was working on a little application that is similar to tinyurl.com (where any URL gets a unique random id that can be used to redirect with long clumsy URL strings) and came up with this little algorithm that I wanted to share with the world for some feedback.

There are two loops, one nested inside one big loop that goes on for infinity. The inner loop creates a list of possible combinations from a sample of letters and numbers eg (abc, acb, bac, bca, cab, cba). For each such found combinations it does a check to see if this has been saved before and if not it exists the loop like this:


if not hasattr(self, combination):
    return combination

Truncated! Read the rest by clicking the link below.

\B in Python regular expressions

July 23, 2005
0 comments Python

Today I learnt about how to use the \B gadget in Python regular expressions. I've previously talked about the usefulness of \b but there's a big benefit to using \B sometimes too.

What \b does is that it is a word-boundary for alphanumerics. It allows you to find "peter" in "peter bengtsson" but not "peter" in "nickname: peterbe". In other words, all the letters have to be grouped prefixed or suffixed by a wordboundry such as newline, start-of-line, end-of-line or a non alpha character like (.

Truncated! Read the rest by clicking the link below.

SmartDict - a smart 'dict' wrapper

July 14, 2005
9 comments Python

For a work project we needed a convenient way to wrap our SQL recordset, instance objects and dictionary variables to share the same interface. The result is SmartDict which makes it possible to assert that access can be made in any which way you want; something that is very useful when you write templates and don't want to have to know if what you're working with is a dict or a recordset.

This doesn't work:


>>> d= {'name':"Peter"}
>>> print d.get('name') # fine
>>> print d.name # Error!!!

Likewise with some instance objects or record sets, this doesn't work:


>>> d = getRecordsetObject()
>>> print d.name # fine
>>> print d.get('name') # Error!!!

Truncated! Read the rest by clicking the link below.

Lisp compared to Python

July 7, 2005
1 comment Python

I was reading a thread about "Lisp development with macros faster than Python development?" on comp.lang.python when I stumbled across a little statement by Raymond Hettinger, a core Python developer:

"With Lisp or Forth, a master programmer has unlimited power and expressiveness. With Python, even a regular guy can reach for the stars."

ztar - my wrapper on tar -z

June 29, 2005
8 comments Python, Linux

Something I find myself doing very often is to download a .tar.gz or .tgz file that I want to unpack, but only in a subfolder. Some rather annoying gzips aren't collected in one folder so that when you unpack it lots of files are created in the current directory. Do you find yourself often doing this:


$ tar -ztvf Some-0.x.tar.gz
Some/file1.txt
Some/file2.txt
...
Some/file100.txt
$ tar -zxvf Some-0.x.tar.gz
Some/file1.txt
Some/file2.txt
...
Some/file100.txt

Or, in case they the gzip is badly organised:


$ tar -ztvf Foo-0.y.tar.gz
file1.txt
file2.txt
...
file100.txt
$ mkdir Foo; mv Foo-0.y.tar.gz Foo/; cd Foo/
$ tar -zxvf Foo-0.y.tar.gz
file1.txt
file2.txt
...
file100.txt
$ cd ..

Truncated! Read the rest by clicking the link below.