Free business idea: An app for figuring out the best car for you

January 18, 2013
0 comments Wondering

Here's a business idea that I've not seen implemented and which I likely won't have time to attempt:

An app for statistically figuring out which car you should buy.

Like Hot or Not it shows you one car at a time (at random) with a variable (also at random). The variable will be turned into a question. The question will be something like: "What about the price of this?" and it's a picture of a Toyota Prius 2013 with its price. Three buttons to choose: "Too expensive", "About right", "Too cheap".

Next, it's a different car and a different variable. For example, a Volvo XC90 with the question "What about the looks of this?" and, again, three buttons: "Too ugly", "About right", "Too sexy".

Car salesman
And so on... You can keep going, answering more questions, or you can stop and check out your result. Obviously, the more you answer the better the suggestion. You might want to help the user with this so they don't answer too few.

Then when you present the result you can, on that page, show a bunch of affiliate links to various local dealerships where you can buy the ideal car for you. Additionally, if the app becomes successful I'm sure you can easily sell advertising to car companies who would love to show their ads depending on certain variables. E.g. Honda Fits for those who answer that they want high MPG and small cars.

The algorithm shouldn't be too hard to figure out. I'm sure you can get a lot of mileage just by doing a weighted average on the totals. If you sit down and think about it some more I'm sure you can fit some better established algorithm, or something from neural networks, if you lay out your results as a matrix.
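To make the weighted average idea a bit more concrete, here is a minimal Python 3 sketch. Everything in it is an assumption for illustration: the tiny CARS catalogue and its numbers are made up, the verdict encoding (-1, 0, +1) is hypothetical, and so are the 0.5 weight and the 25% nudge. It estimates an "ideal" value per variable from the answers and then picks the car closest to those ideals.

```python
from collections import defaultdict

# Hypothetical catalogue with made-up specs; a real app would load ~100 cars.
CARS = {
    "Toyota Prius": {"price": 24000, "mpg": 50},
    "Volvo XC90": {"price": 40000, "mpg": 19},
    "Honda Fit": {"price": 16000, "mpg": 31},
}

def ideal_values(answers):
    """Estimate the user's ideal value per variable as a weighted average.

    answers is a list of (car, variable, verdict) tuples where verdict is
    -1 ("too little"), 0 ("about right") or +1 ("too much").
    """
    totals = defaultdict(float)
    weights = defaultdict(float)
    for car, variable, verdict in answers:
        value = CARS[car][variable]
        if verdict == 0:
            # "About right" counts fully, at the value that was shown.
            weight, target = 1.0, value
        else:
            # Otherwise count half, nudged 25% away from the complaint.
            weight, target = 0.5, value * (1 - 0.25 * verdict)
        totals[variable] += weight * target
        weights[variable] += weight
    return {v: totals[v] / weights[v] for v in totals}

def best_car(answers):
    """Return the car whose specs are relatively closest to the ideals."""
    ideal = ideal_values(answers)

    def distance(specs):
        return sum(abs(specs[v] - ideal[v]) / ideal[v] for v in ideal)

    return min(CARS, key=lambda name: distance(CARS[name]))
```

Calling best_car() with a list of answers returns the name of one car from the catalogue. Variables that are not plain numbers, such as looks, would need some numeric score before they fit this scheme.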

That's about it. I don't know where to get the pictures and specs for each car but I'm sure one can scrape from various sites and/or seed some of it manually.

It's the kind of app where you can start small (assuming you have at least 100 cars and 3-6 facts about each car). Also, it doesn't depend on having a bunch of traffic already so you don't need to worry so much about the chicken & egg predicament.

Do you think it could fly?

All your images are belong to data uris

January 6, 2013
12 comments Web development

If the number 1 rule for making faster websites is to "Minimize HTTP Requests", then, let's try it.

On this site, almost all pages are served entirely from memcache. Django renders the template with the database content and the generated HTML is cached. So I thought I'd insert a little post-processing script that converts all <img src="...something..."> into <img src="data:image/png;base64,iVBORw0KGgo..."> which basically means the HTML gets as fat as the sum of all referenced images combined.

It's either 10KB of HTML followed by (roughly) 10 x 30KB images or it's 300KB of HTML and 0 images. The result is here: https://www.peterbe.com/about2 (open and view source)

You can read more about the Data URI scheme here if you're not familiar with how it works.

The code is a hack but that's after all what a personal web site is all about :)
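The site's actual script isn't reproduced here, but the idea can be sketched in a few lines of Python 3. This is only an illustration under assumptions: the inline_images name, the regular expression and the read_bytes callable (which maps a src value to the raw image bytes) are all hypothetical, and a regex is a crude way to process HTML.

```python
import base64
import mimetypes
import re

# Matches <img ... src="..."> and captures the part before, the URL, and the closing quote.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")')

def inline_images(html, read_bytes):
    """Replace every <img src="..."> in html with a base64 data URI.

    read_bytes is a hypothetical callable that returns the raw bytes
    for a given src (from disk, a cache, or over HTTP).
    """
    def replace(match):
        src = match.group(2)
        if src.startswith("data:"):
            # Already inlined, leave it alone.
            return match.group(0)
        mime = mimetypes.guess_type(src)[0] or "application/octet-stream"
        payload = base64.b64encode(read_bytes(src)).decode("ascii")
        return f"{match.group(1)}data:{mime};base64,{payload}{match.group(3)}"

    return IMG_SRC.sub(replace, html)
```

Note that base64 turns every 3 bytes into 4 characters, so each inlined image takes up about a third more space than the original file.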

So, how much slower is it to serve? Well, actual server-side render time is obviously slower but it's a process you only have to do a small fraction of the total time since the HTML can be nicely cached.

Running..
ab -n 1000 -c 10 https://www.peterbe.com/about

BEFORE:

Document Path:          /about
Document Length:        12512 bytes

Concurrency Level:      10
Time taken for tests:   0.314 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      12779000 bytes
HTML transferred:       12512000 bytes
Requests per second:    3181.36 [#/sec] (mean)
Time per request:       3.143 [ms] (mean)
Time per request:       0.314 [ms] (mean, across all concurrent requests)
Transfer rate:          39701.75 [Kbytes/sec] received

AFTER:

Document Path:          /about2
Document Length:        306965 bytes

Concurrency Level:      10
Time taken for tests:   1.089 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      307117000 bytes
HTML transferred:       306965000 bytes
Requests per second:    918.60 [#/sec] (mean)
Time per request:       10.886 [ms] (mean)
Time per request:       1.089 [ms] (mean, across all concurrent requests)
Transfer rate:          275505.06 [Kbytes/sec] received

So, it's basically 292MB transferred instead of 12MB in the test and the requests per second is a little less than a third of what it used to be. But it's not too bad. And with web site optimization, what matters is the individual user's impression, not how much or how little the server can serve multiple users.
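For anyone who wants to check that arithmetic, here is a small Python 3 sketch that derives the figures from the two ab reports above. The summarize helper and the dictionary keys are made up for this example; the numbers are copied from the reports.

```python
MIB = 1024 * 1024  # bytes per mebibyte

# Values copied from the BEFORE and AFTER ab reports.
before = {"total_bytes": 12779000, "requests_per_sec": 3181.36}
after = {"total_bytes": 307117000, "requests_per_sec": 918.60}

def summarize(before, after):
    """Compare two ab runs: sizes in MiB and after/before ratios."""
    return {
        "before_mib": before["total_bytes"] / MIB,
        "after_mib": after["total_bytes"] / MIB,
        "size_ratio": after["total_bytes"] / before["total_bytes"],
        "throughput_ratio": after["requests_per_sec"] / before["requests_per_sec"],
    }

summary = summarize(before, after)
```

The size_ratio entry says how many times more bytes were transferred and throughput_ratio is the fraction of the original requests per second that remains.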

Next, how does the waterfall of this look?

Truncated! Read the rest by clicking the link below.

Gamification for me as a software developer

December 21, 2012
3 comments Web development

"Gamification is the use of game-thinking and game mechanics in non-game contexts in order to engage users and solve problems" -- wikipedia

Gamification sneaks into a software developer's life whether he/she likes it or not. Some of it works for me, some of it doesn't.

What works for me

  1. PyPI downloads on my packages
    Although clouded with inaccuracies and possible false positives (someone's build script could be pip installing overzealously), seeing your download count go up means that people actually depend on your code. Most likely, they're not just downloading it to admire it; they download it to use it.

  2. Github followers and Starred projects
    Being followed on Github means people see your activity on their dashboard (aka. home page). Every commit and every gist you push gets potential eyes on it.
    When people star your project it probably means that they're thinking "oh neat! this could come in handy some day so I'll star it for now". That's kinda flattering to be honest.

  3. Twitter followers
    This doesn't apply to everyone of course but to me it does. I really try my best to write about work or code related stuff on Twitter and personal stuff on Facebook. Whenever a blog post of mine gets featured on HN or if I present at some conference I get a couple of new followers.
    Some people do a great job curating their followers, responding and keeping it very relevant. They deserve their followers.
    Yes, there are a lot of bogus Twitter accounts that follow you but since that happens to everyone it's easy to overlook. Since you probably skim through most of the "You have new follower(s)" emails, it's quite flattering when it's a real human being who does what you do or something similar.

  4. Activity on Github projects
    This one is less about fame and fortune and more of a "damage prevention". Clicking into a project and seeing that the last commit was 3 years ago most definitely means the project is dead.
    I have some projects that I don't actively work on but the code might still be relevant and doesn't need much more maintenance. For those kinds of projects it's good to have some sporadic activity just to signal to people that it's not completely abandoned.

  5. Hacker News posts and comments "Show HN: ..."
    I've now had quite a few posts to HN that get promoted to the front page. Whenever this happens you get those almost embarrassing spikes in your Google Analytics account.
    However, it happened. Enough people thought it was interesting to vote it up to the front page.
    It's important to not count the number of comments as a measure of "success" because oftentimes comments aren't simply constructive feedback but just comments on other comments.
    Keep this one simple: the fact that you have built something that is "Show HN:..." means you have probably worked hard.

What does NOT work for me

  1. Unit test code coverage metrics
    Test coverage percentages are quite a private matter. Kinda like your stool. Unless something amazing happened, keep it to yourself.
    It's nice to see a general increase of the total percentage but do not dare to obsess about it. What matters is that you look through the report and take note that what matters is covered. Coverage on code that is allowed to break and isn't embarrassing if it does, does not need to be green all the way. Who are you trying to impress? The intern you're mentoring or the family you don't have time to spend time with because you're hunting perfection?
    I must, however, admit that I too have in the past inserted pragma: no cover in my code. Also, being able to say that you have 100% test coverage on a lib can be good "advertisement" in your README as it instills confidence in your potential users.

  2. Number of tests
    When you realize that 1 nicely packaged integration test can test just as much as 22 anally verbose unit tests you realize that number of tests is a stupid measure.
    A lot of junior test driven developers write tests that cover circumstances that are just absurd. For example "what if I pass a floating point number instead of a URL string which it's supposed to be??".
    Remember, results and quality count. Having too many tests also means more things to slow you down when you refactor.

  3. Commit counts
    On projects with multiple contributors the commit count is not a measure of anything. It has no valuable implications or deductions. Adding a newline character to a README can be 1 count.
    If you skim through the commit log on a Github project you'll notice that surprisingly many commits are trivial stuff such as style semantics or updating a CREDITS file.
    Yes, someone has to do that stuff too and we're always appreciative of that but it's not a measure of excellence over others. It's just a count.

  4. Resolved bugs/issues count
    If this mattered and was a measure of anything you could simply swallow everything with a quick turnaround and resolve or close it.
    But not every bug deserves your attention. Even if it is a genuine bug it might still be really low priority, so working on it costs time and pulls focus away from much more important work.

  5. Number of releases
    It's nice to see projects making releases (or tags) but don't measure things by this. There's so much good quality software that doesn't really fit the release model.

My new web marketing strategy: Begging

December 9, 2012
13 comments Web development

From one of the monthly summary emails
Building a side project is fun. Launching it is fun. Improving and measuring it is fun. But marketing it is awful!

Marketing your side project means you're not coding. Instead you're walking around the interwebs with your pants down, trying your hardest to get people to not only try your little project but to also go beyond that by tweeting about it, posting a Facebook status update about it, blogging about it or using whatever devices inside it help the viral spread. Now that! ...is freakin' hard.

I'm struggling to get even my best friends and my wife to try my side projects. I can't blame them; unlike a lemonade stand at a farmers market it's very impersonal. When I tried to get my buddies to try Around The World several did, but only very briefly. Granted, a few did give me feedback but it's really not much to go by.

So, today I'm launching the start of my new web marketing strategy: Begging

Or rather, politely asking people to help me. Instead of using the usual "we" or "our" language I'm referring to it in first person instead. The platform for this strategy experiment is on HUGEpic and it looks like this: hugepic.io/yourhelp/

I recently built a feature into HUGEpic that, once a month, emails everyone who uploaded a picture a little summary of their upload with the number of hits and comments. Boldly, in the footer of this email, there's a link to the /yourhelp/ page (see screenshot above).

Let's see how this works out. Most likely it'll be just more noise in the highways of people's internet lives but perhaps it can become successful too.

Mind you, the motive for all of this is for my "insert-sideproject-name-here" to become successful. And by successful I mean popular with lots of traffic. None of my side projects make me any money, which makes it easier to beg. However, none of them make any money for the people I'm asking for help either. Perhaps that's what could be the version 2.0 of my web marketing strategy.

Introducing: HUGEpic - a web app for showing massive pictures

November 3, 2012
19 comments Python

So here's my latest little fun side-project: HUGEpic.io http://hugepic.io

Zoomed in on Mona Lisa
It's a web app for uploading massive pictures and looking at them like maps.

The advantages with showing pictures like this are:

  • you only download what you need
  • you can send a permanent link to a picture at a particular location with a particular zoom level
  • you can draw annotations on a layer on top of the image

All the code is here on Github and as you can see it's a Tornado app that uses two databases: MongoDB and Redis. When it connects to MongoDB it uses the new Tornado-specific driver called Motor, which is great.

Truncated! Read the rest by clicking the link below.

Fastest way to thousands-commafy large numbers in Python/PyPy

October 13, 2012
15 comments Python

Here are two perfectly good ways to turn 123456789 into "123,456,789":



import locale

def f1(n):
    locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
    return locale.format('%d', n, True)

def f2(n):
    r = []
    for i, c in enumerate(reversed(str(n))):
        if i and (not (i % 3)):
            r.insert(0, ',')
        r.insert(0, c)
    return ''.join(r)

assert f1(123456789) == '123,456,789'
assert f2(123456789) == '123,456,789'    

Which one do you think is the fastest?
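One way to find out is to measure with the standard library's timeit module. Below is a minimal Python 3 sketch of such a harness. It repeats f2 from above so the block is self-contained, leaves out f1 because it only works where the en_US.UTF-8 locale is installed, and adds f3, which is an assumption of this example (the built-in format(n, ',')) and not one of the two candidates above. The benchmark helper is hypothetical too.

```python
import timeit

def f2(n):
    # Same digit-by-digit approach as f2 above.
    r = []
    for i, c in enumerate(reversed(str(n))):
        if i and not i % 3:
            r.insert(0, ',')
        r.insert(0, c)
    return ''.join(r)

def f3(n):
    # Assumption for comparison: the built-in thousands separator format.
    return format(n, ',')

def benchmark(funcs, n=123456789, number=10000):
    """Return the best of 3 timings (in seconds) per function name."""
    return {
        f.__name__: min(timeit.repeat(lambda: f(n), number=number, repeat=3))
        for f in funcs
    }
```

Running benchmark([f2, f3]) gives a dictionary of timings that will vary by machine and interpreter, so run it under both CPython and PyPy before drawing conclusions.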

Truncated! Read the rest by clicking the link below.

hastebinit - quickly paste snippets into hastebin.com

October 11, 2012
9 comments Python, Linux

I'm quite fond of hastebin.com. It's fast. It's reliable. And it's got nice keyboard shortcuts that work for my taste.

So, I created a little program to quickly throw things into hastebin. You can have one too:

First create ~/bin/hastebinit and paste in:


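#!/usr/bin/python
# Python 2 script: posts files (or stdin) to hastebin and prints the URLs.

import json
import os
import sys
import urllib2

URL = 'http://hastebin.com/documents'

def run(*args):
    if args:
        # One document per file; keep each extension for the final URL
        content = [open(x).read() for x in args]
        extensions = [os.path.splitext(x)[1] for x in args]
    else:
        # No file arguments: read a single document from stdin
        content = [sys.stdin.read()]
        extensions = [None]

    for i, each in enumerate(content):
        # Passing data makes urllib2 send a POST request
        req = urllib2.Request(URL, each)
        response = urllib2.urlopen(req)
        the_page = response.read()
        # The response is JSON containing the key of the new document
        key = json.loads(the_page)['key']
        url = "http://hastebin.com/%s" % key
        if extensions[i]:
            url += extensions[i]
        print url


if __name__ == '__main__':
    sys.exit(run(*sys.argv[1:]))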

Then run: chmod +x ~/bin/hastebinit

Now you can do things like:

$ cat ~/myfile | hastebinit
$ hastebinit < ~/myfile
$ hastebinit ~/myfile myotherfile

Hopefully it'll one day help at least one more soul out there!

How I stopped worrying about IO blocking Tornado

September 18, 2012
5 comments Tornado

So, the cool thing about Tornado, the Python web framework, is that it's based on a single-threaded IO loop, a.k.a. an event loop. This means that you can handle high concurrency with optimal performance. However, it also means that you can't do things that take a long time because then you're blocking all other users.
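The blocking problem is easy to demonstrate without Tornado at all. The Python 3 sketch below uses the standard library's asyncio as a stand-in for Tornado's IO loop (an assumption made purely for illustration; the handler, serve and compare names are hypothetical). Five "requests" each wait 0.05 seconds, once with a blocking sleep and once with a cooperative one.

```python
import asyncio
import time

async def handler(delay, blocking):
    if blocking:
        # Holds the only thread: no other handler can run meanwhile.
        time.sleep(delay)
    else:
        # Hands control back to the event loop while waiting.
        await asyncio.sleep(delay)

async def serve(n, delay, blocking):
    """Run n handlers concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay, blocking) for _ in range(n)))
    return time.perf_counter() - start

def compare(n=5, delay=0.05):
    blocked = asyncio.run(serve(n, delay, True))
    cooperative = asyncio.run(serve(n, delay, False))
    return blocked, cooperative
```

With the blocking version the waits add up one after another, whereas the cooperative version overlaps them, which is the difference between one slow user holding everyone up and not.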

The solution to the blocking problem is to switch to asynchronous callbacks, which means a task can churn away in the background whilst your web server can crack on with other requests. That's great but it's actually not that easy. Writing callback code in Tornado is much more pleasant than in, say, Node, where you actually have to "fork" off into different functions with different scope. For example, here's what it might look like:



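import tornado.web
from tornado import gen
from tornado.httpclient import AsyncHTTPClient

class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @gen.engine
    def get(self):
        http_client = AsyncHTTPClient()
        # Suspends this handler until the fetch is done; the IO loop keeps serving others
        response = yield gen.Task(http_client.fetch, "http://example.com")
        # do_something_with_response is a placeholder returning a dict of template variables
        stuff = do_something_with_response(response)
        self.render("template.html", **stuff)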

It's pretty neat but it's still work. And sometimes you just don't know if something is going to be slow or not. If it's not going to be slow (e.g. fetching a simple value from a fast local database) you don't want to do it async anyway.

Truncated! Read the rest by clicking the link below.

Introducing: League of Friends on Around The World

September 15, 2012
0 comments Web development

League of Friends
After about a month of weekend development the League of Friends is finally finished.

Usually on games like this, if it has a highscore list you might find yourself at number 3,405,912 and the people at the top of the highscore list are people you've never heard of so what's the point of comparing yourself with them?

Inviting someone by email
On Around The World, you select your own friends for your league. Everyone you invite gets an email asking if they want to accept it mutually. If you want to invite someone who isn't already on Around The World, you can type in their email address and complete an email that gets sent to that friend on your behalf from Around The World.

About Peter
Also with this, you can click on any of your travelling friends and get lots more details about their progress. It doesn't reveal anything about how smart or not smart that friend is so you never have to worry about looking stupid because it never reveals which easy questions you accidentally got wrong.

Real-timify Django with SockJS

September 6, 2012
4 comments Django, JavaScript, Tornado

In yesterday's DjangoCon BDFL keynote, Adrian Holovaty called out that Django needs a real-time story. Well, here's a response to that: django-sockjs-tornado

Immediately after the keynote I went and found a comfortable chair and wrote this app. It's basically a django app that allows you to run a socketserver with manage.py like this:

python manage.py socketserver

Chat Demo screenshot
Now, you can use all of SockJS to write some really flashy socket apps. In Django! Using Django models and stuff. The example included shows how to write a really simple chat application using Django models. Check out the whole demo here.

If you're curious about SockJS read the README and here's one of many good threads about the difference between SockJS and socket.io.

The reason I could write this app so quickly was that I had already written a production app using sockjs-tornado, so the concepts were familiar. However, this app has (at the time of writing) not been used in production. So mind you, it might still need some more love before you show your mom your Django app with WebSockets.