Introducing optisorl

August 18, 2015
0 comments Python

optisorl is a Python package for sorl-thumbnail which is a kick-ass Python package for Django. sorl-thumbnail is pretty popular and used by a lot of people who have images they want to display as thumbnails.

A problem you find is that oftentimes the PNG thumbnails aren't as optimized as they can be. A great tool for having a second optimization pass on a PNG file is pngquant. You basically run it like this:

$ ls -l bugzilla.png
-rw-r--r--@ 1 peterbe  staff  12188 Dec 12  2014 bugzilla.png
$ pngquant bugzilla.png
$ ls -l bugzilla-fs8.png
-rw-r--r--@ 1 peterbe  staff  6630 Aug 18 13:15 bugzilla-fs8.png

That's a 140x140 pixel PNG that became 5,558 bytes smaller (46% saving).

Anyway, this is where optisorl comes in. It's an extension to sorl-thumbnail that is able to execute pngquant on the PNG right after the thumbnail file has been created. It does so by calling out a sub-process command to pngquant. See the code here which is all the magic there is to it really.
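As a rough illustration of the idea (this is not optisorl's actual code), a post-processing step along these lines could shell out to pngquant. The function names are made up for this sketch, and overwriting the file in place with --force --ext .png is a choice made here, not necessarily what optisorl does:

```python
import shutil
import subprocess


def build_pngquant_command(png_path, pngquant_location="pngquant"):
    # --force together with --ext .png makes pngquant overwrite the input
    # file instead of writing a separate *-fs8.png file next to it.
    return [pngquant_location, "--force", "--ext", ".png", "--", png_path]


def optimize_png(png_path, pngquant_location="pngquant"):
    """Return True if pngquant ran successfully, False otherwise."""
    if shutil.which(pngquant_location) is None:
        # pngquant can't be found; leave the thumbnail untouched.
        return False
    result = subprocess.run(
        build_pngquant_command(png_path, pngquant_location),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0
```

Returning False instead of raising means a missing pngquant only costs you the optimization, not the thumbnail.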

The reason I built this was to reduce the images on Air Mozilla. At the time I did the measurement, the PNGs' total weight on the home page was 129KB and after running them all through optisorl the total weight was only 65KB.

To install it, just pip install it like so:

$ pip install optisorl

And you need to install pngquant like brew install pngquant or apt-get install pngquant.

Then, to activate it you need to set this Django setting:


THUMBNAIL_BACKEND = 'optisorl.backend.OptimizingThumbnailBackend'

If you decide to put the pngquant executable somewhere not on the PATH you can add to your settings.py file something like this:


PNGQUANT_LOCATION = '/path/to/bin/pngquant'

There's a bunch of features it doesn't have but we can work together on that. For example, there are certain PNG images that you might want to display as thumbnails but due to something about the image, e.g. its use of Alpha channels, you might want to explicitly disable optimizations.

Some tips on learning React

August 4, 2015
0 comments JavaScript, React

In the last month I've been playing with React in my spare time. Said time is extremely limited so I'm unable to read the documentation or source code as a way to learn. Instead, I follow various tutorials and snippets on stackoverflow etc. to get going. I have something now that will soon be production ready and I'm excited about it even though it only scratches the surface of what React can do.

If you're learning React, starting from more or less scratch, here are some of my tips:

  1. Start with skimming the official tutorial or this little gem on scotch.io. Both start with a simple index.html in which you include JSXTransformer.js and you write your React/JSX code in a <script type="text/jsx"> block. Not a bad idea. That helps you appreciate what JSX is and how it relates to turning your code into production code.

  2. Don't fear jQuery! Those who've leapt from jQuery-based apps to AngularJS or Ember are often told to stay away from jQuery and learn to do it the "new way". But don't fear jQuery. Suppose you're working on a widget/component with React code; then you can continue to let your trusted jQuery code handle some other effect or widget on the page like a bootstrap modal window or something. Apparently, at Facebook, the first production use of React was just the little commenting widget underneath posts. They didn't rewrite the whole site in React when it all started. Also, even the official tutorial advocates using jQuery to do an AJAX fetch. (Personally I prefer the built-in fetch and this polyfill for doing AJAX fetches.)

  3. Avoid "super-normalization" of components. A lot of React apps are about one master component rendering one or more other components that render themselves with other components. That's fine but can easily get messy when you're starting out. Don't fear writing extra rendering functions in your class instead of always relying on writing yet another deep component. For example, this is perfectly fine.
    It's good to split up distinct functionality into sub-components but don't go overboard. Ideal things for sub-components are things that have their own and different context. Just calling other local functions is for when your render function simply gets too many lines long.

  4. Avoid ES6 (aka Babel) unless you're comfortable with tooling like Webpack and Gulp. I personally jumped into the deep end straight away writing my first big React app in ES6 which has been fun but it's been hard sometimes to find matching resources. In particular around testing frameworks. A lot of stackoverflow posts and blog posts don't use ES6 so some things just don't work. I chose ES6 because I was curious about React and this project is not on any deadline so I was OK with getting stuck here and there.

  5. Remember, React is just Javascript. As someone who has done a lot of AngularJS, I sometimes have to stop and think when I'm in AngularJS, "How do I do this the AngularJS way?". For example, in AngularJS you can't just use var timer = setTimeout(function() { ... because you're leaving "the way AngularJS works". React has its own state-awareness pitfalls but it's not nearly as precarious. Just write your code. It's just Javascript. Use React for what it's good at but don't be scared to just write code. Having said that, it might help to be aware of how binding this works. Here's a good example. (And here's a good counter-example to avoid too much function() {...}.bind(this) noise.)

  6. Learn to distinguish between state and props. It's confusing at first, especially in terms of which to use when. My attempt at explaining it is that you can think of the state as the database and the props as that data being extracted and passed around to be shown and changed.

  7. Let render() just render. Every component has a render function. Its job is to render the current state and nothing else. It's tempting to do too much logic in there before it returns the JSX but you should avoid it. Suppose your render function renders a list of objects. You might be tempted to apply filtering or sorting of that list using other cues, like the state, before displaying. Try to avoid that; it means you can't change the state without causing that whole filtering/sorting logic to run again. Basically, keep the render function short and simple if you can.

Note, I'm a beginner too. My hope is that by sharing these tips more people will get a chance to enjoy React too without being too intimidated by all the things you think you need to learn and understand to build something.

Visual speed comparison of AngularJS and ReactJS

July 20, 2015
0 comments Web development, AngularJS, JavaScript

Last year I put together a little experiment called AJAX or Not? and blogged about it here. The basic idea was to display 1,000 rows in a table. There are several ways of doing it but I decided to compare the following three patterns:

  1. Rendering the whole table in Django server-side and returning the whole HTML document.
  2. Rendering a skeleton page, then loading the table content as a big chunk of HTML via AJAX.
  3. Rendering a skeleton page, and letting AngularJS load all the content of the table from the server as JSON and render it into the DOM.

It was clear as day that the server-side rendering version was hands down the fastest. And the AngularJS rendering the slowest.

Note! AngularJS is amazing and super flexible and powerful because you don't really need to worry about how to re-render once the data changes. This is really useful when you do things like loading more data from a remote endpoint or doing some in-page filtering.

Enter ReactJS

The point of AJAX or Not was not to compare Javascript frameworks but I had some time and I thought I'd write an equivalent version of the AngularJS one with ReactJS (version 0.13.3).

Anyway, here's the code and it's using the GitHub fetch polyfill to do the AJAX query. The AngularJS code is here and here and as you can see it's using track by on the ng-repeat.

WebPageTest
To measure the difference I ran a comparison in WebPageTest which I encourage you to open and study for a bit. You can watch the video and download the video here.

Also, note that the Django rendered version loads jQuery. That's because the functionality dictates that clicking on a link should show a confirmation box before going to the link. I know, it's silly but it's very realistic that every page needs some Javascript functionality.

Executive summary...

  • Django server-side takes 0.8 seconds, ReactJS version takes 2.0 seconds and the AngularJS version takes 2.9 seconds.
  • The ReactJS version is the fastest to display something. It displays the header and the image first. Only by 0.2 seconds before the Django server-side version.
  • The AngularJS version causes a lot more CPU utilization. This might really matter when you're on a low-end smartphone.
  • The ReactJS version causes twice as much CPU utilization as the server-side version. The AngularJS version causes twice as much CPU utilization as the ReactJS version.
  • AngularJS is slightly larger than ReactJS + fetch but I don't think this has a huge effect on the total load time.

Some other thoughts...

  • The ReactJS code is all in one place more or less. That's neat! But it's pretty darn big in terms of number of lines. AngularJS code is split half in the Javascript code and half in the HTML.
  • It's clear, if you want a fast loading page, avoid Javascript as much as you possibly can.
  • This experiment is very optimized in how it gets the data to be displayed. In fact, the server-side rendering time is close to 0 seconds because the whole HTML blob is stored in memcached. A more realistic thing is that extracting the data would take a lot longer if the query isn't so easy to cache. That would be a huge disadvantage for the fully server-side rendered version since if the data query takes a long time you'll sit and stare at a white screen longer. Doing the AJAX approach would definitely be a nicer experience.
  • The difference isn't that big. Both fancy Javascript frameworks have amazing features that leave jQuery in the dust but if you want your page to load crazy-fast, do as much server-side as you possibly can.

Premailer.io

July 8, 2015
3 comments Python, Web development, AngularJS, JavaScript

Premailer is a Python library for turning HTML + CSS into HTML with all the CSS embedded as inline style attributes. This is sadly very necessary to ensure that your fancy HTML emails look spiffy across all email clients and email webapps.

So, last week I put together a little site to test the library via a browser: Premailer.io

It's just a simple webapp with a form where you can enter HTML in three different ways; textarea, by URL and by file upload.

You can also override all the possible advanced options that premailer supports.

What's kinda cool is that you can get a preview of what the HTML document will look like in an iframe that is dynamically loaded with the result from the conversion.

The webapp is of course open source and available on github.com/peterbe/premailer.io. The front-end is an AngularJS app and the build system is Lineman.js. The server is a Falcon server running on uWSGI via Nginx.

There's very little fancy here. There are no limitations or protections. I just hope it becomes handy for people to test premailer out.

The inspiration came from MailChimp's CSS Inliner Tool which is cute but very basic and doesn't allow you the same kinds of input.

If anybody with some AngularJS or highlight.js chops has time, I'd love help figuring out why the HTML is not syntax highlighted.

Find what indentation your files use

July 7, 2015
0 comments Python

Over the years, style guides have come and gone. And contributors have come and gone.
Some people, at some times use 2-spaces indentation in JavaScript. Some prefer 4-spaces.

Even I have changed my mind over the years and now I'm content to do either. I just go by whatever the projectroot/.editorconfig config tells me.

So I wanted to clean up all the files so that they all use the same type of indentation (as dictated by the project's .editorconfig file). But which files are what indentation? I could open each file in turn and look at it and keep a tally of which is what. Or I can script it.

I wrote a script. Usage example included in the gist.

Now I easily see which files use what indentation. That makes it easy to file bugs for refactoring.

Some of the files in this grep search include vendor scripts that I'm not going to touch but as you can see, most files use 4 spaces but some still use 2 spaces.

4   base/static/angular/watchcounter.js
2   base/static/dropzone/dropzone.js
4   base/static/js/base.js
2   base/static/js/gallery_select.js
1   base/static/js/libs/include.js
4   base/static/js/libs/moment.js
4   base/static/select2/select2.js
4   comments/static/comments/js/comments.js
1   main/static/main/fullcalendar/gcal.js
4   main/static/main/js/autocompeter.js
4   main/static/main/js/calendar.js
4   main/static/main/js/discussion.js
4   main/static/main/js/download.js
4   main/static/main/js/edit.js
4   main/static/main/js/embed.js
4   main/static/main/js/event_video.js
4   main/static/main/js/eventstatus.js
4   main/static/main/js/include-tabzilla.js
4   main/static/main/js/jwplay.js
4   main/static/main/js/livehits.js
4   main/static/main/js/nav.js
4   main/static/main/js/playbackrate.js
4   main/static/main/js/tabzilla.js
4   main/static/main/js/tearout.js
4   manage/static/manage/js/autocompeter.js
1   manage/static/manage/js/bootstrap-datepicker.js
2   manage/static/manage/js/bootstrap-typeahead.js
4   manage/static/manage/js/channel-html-edit.js
4   manage/static/manage/js/confirm-delete.js
4   manage/static/manage/js/cronlogger.js
4   manage/static/manage/js/dashboard.js
4   manage/static/manage/js/dashboard_graphs.js
4   manage/static/manage/js/discussion-configuration.js
4   manage/static/manage/js/event-archive.js
4   manage/static/manage/js/event-assignment.js
4   manage/static/manage/js/event-edit.js
4   manage/static/manage/js/event-request.js
2   manage/static/manage/js/event-tweets.js
4   manage/static/manage/js/event-upload.js
4   manage/static/manage/js/event-vidly-submissions.js
4   manage/static/manage/js/eventmanager.js
4   manage/static/manage/js/events.js
4   manage/static/manage/js/form-errors.js
4   manage/static/manage/js/locations.js
4   manage/static/manage/js/mainmanager.js
4   manage/static/manage/js/manage.js
4   manage/static/manage/js/picture-add.js
4   manage/static/manage/js/picturegallery.js
4   manage/static/manage/js/staticpage-edit.js
4   manage/static/manage/js/suggestions.js
4   manage/static/manage/js/survey-edit.js
4   manage/static/manage/js/tagmanager.js
4   manage/static/manage/js/url-transforms.js
4   manage/static/manage/js/user-edit.js
4   manage/static/manage/js/usermanager.js
4   manage/static/manage/js/vidly-media-timings.js
4   manage/static/manage/js/vidly-media.js
4   new/static/new/js/RecordRTC.js
4   new/static/new/js/app.js
1   new/static/new/js/ccv.js
4   new/static/new/js/controllers.js
2   new/static/new/js/humanize-duration.js
4   new/static/new/js/services.js
4   starred/static/starred/js/star_event.js
4   starred/static/starred/js/starredevents.js
4   suggest/static/suggest/js/details.js
4   suggest/static/suggest/js/discussion.js
4   suggest/static/suggest/js/file.js
4   suggest/static/suggest/js/start.js
4   suggest/static/suggest/js/suggest.js
4   surveys/static/surveys/js/survey.js
2   uploads/static/uploads/js/s3upload.js
4   uploads/static/uploads/js/upload.js
2   webrtc/static/webrtc/js/camera.js
4   webrtc/static/webrtc/js/libs/RecordRTC.js
4   webrtc/static/webrtc/js/photobooth.js
4   webrtc/static/webrtc/js/summary.js
4   webrtc/static/webrtc/js/video.js
4   webrtc/static/webrtc/js/webrtc.js
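For a feel of how such a tally can be computed, here is a minimal sketch. It is not the script from the gist; it's just one plausible heuristic. The function name guess_indentation is made up, and the heuristic (the most common increase in leading spaces from one non-blank line to the next) is an assumption:

```python
from collections import Counter


def guess_indentation(source):
    """Return the most common indentation step in spaces, or None."""
    steps = Counter()
    previous = 0
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines say nothing about indentation
        stripped = line.lstrip(" ")
        if stripped.startswith("\t"):
            continue  # this sketch only looks at space-indented lines
        current = len(line) - len(stripped)
        if current > previous:
            # The line is indented deeper than the previous one;
            # remember by how many spaces.
            steps[current - previous] += 1
        previous = current
    if not steps:
        return None
    return steps.most_common(1)[0][0]
```

Looping over the files, calling this on each file's content and printing the result next to the file name would give a listing shaped like the one above.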

How I git

June 18, 2015
1 comment Python, Linux

tl;dr I use bgg to shortcut a lot of tedious git commands.

Once a certain pattern appears where you find yourself doing the same thing over and over the first thing that should spring to mind is: let's automate that!

So a couple of years ago I started writing simple Python scripts that would wrap various git operations so I could do things like G merge or G rebase. That has helped me tremendously and when I at first showed these scripts to some people I was amazed how unimpressed they were. I guess that's because they have their own scripts or a geeky reluctance to adopting someone else's shortcuts unless you've personally been a part of going from tedious to shortcut.

So, a crucial part of my work here at Mozilla is to look at a Bugzilla bug and start a topic branch based on it and when it's done, push that into a Pull Request on GitHub.

The first command is G start. It takes a single optional argument. If an argument is provided it has to be a Bugzilla bug number. If you supply a Bugzilla ID it will fetch the title of that bug (assuming you're online) and store that so that it can be used to mention it in the git commit message. For example:

(airmozilla):~/dev/MOZILLA/AIRMOZILLA/airmozilla (master)$ G start 1174316
You're currently on branch master
Summary ["Start duration fetching when stopping a live event"]:
Switched to a new branch 'bug-1174316-start-duration-fetching-when-stopping-a-live-event'

The git branch name becomes a "slugified" version of the bug summary. But note, it merely sets the default. I could override it if I want to.
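To make the "slugified" part concrete, here's a small sketch of how a default branch name like the one above could be derived. The function name default_branch_name and the exact rules (lowercase, collapse anything that isn't a letter or digit into a hyphen) are assumptions for illustration, not necessarily what the tool does:

```python
import re


def default_branch_name(bug_id, summary):
    # Lowercase the summary and turn every run of non-alphanumeric
    # characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    return "bug-{}-{}".format(bug_id, slug)


name = default_branch_name(
    1174316, "Start duration fetching when stopping a live event"
)
# name == 'bug-1174316-start-duration-fetching-when-stopping-a-live-event'
```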

Then you do some work on it and when you're done you type the next command; G commit. It basically runs git commit -a -m "..." using the bug number and the bug summary, optionally asks if you want to prefix the commit message with fixes, and then pushes it to your fork. The example speaks for itself:

(airmozilla):~/dev/MOZILLA/AIRMOZILLA/airmozilla (bug-1174316-start-duration-fetching-when-stopping-a-live-event *)$ G commit
MSG:
    bug 1174316 - Start duration fetching when stopping a live event

OK? [Y/n]
Add the 'fixes ' prefix? [N/y] y
NOW, feel free to run:

git checkout master
git merge bug-1174316-start-duration-fetching-when-stopping-a-live-event
git branch -d bug-1174316-start-duration-fetching-when-stopping-a-live-event

OR

git push peterbe bug-1174316-start-duration-fetching-when-stopping-a-live-event

Run that push? [Y/n]
To git@github.com:peterbe/airmozilla.git
 * [new branch]      bug-1174316-start-duration-fetching-when-stopping-a-live-event -> bug-1174316-start-duration-fetching-when-stopping-a-live-event

You get the picture. It's interactive and mostly you just hit enter and it does stuff saving you copious milliseconds.

Other noteworthy commands:

G rebase - whilst on a branch, jumps over to the master branch, updates from the origin, then goes back to the branch you were on preparing you for an interactive git rebase.

G merge - goes over to the master branch, merges the branch you were on and if it works out, deletes the branch.

G getback - you're in a branch you know was merged (using GitHub's green merge button), it switches to the master branch, updates master and deletes the local topic branch (that was merged) and deletes the remote topic branch on your fork.

G cleanup [search] - you're on some branch other than the one you search for. It finds that branch (if there's only 1 match) and does what G getback does.

G branches [search] - lists all your branches sorted by most recently worked on last. It also indicates how long ago you worked on each and whether it has already been merged.
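If you want to write something similar yourself, a wrapper like G rebase is really just a fixed sequence of ordinary git commands. The sketch below only builds that sequence. The function name rebase_plan is made up, the default names master and origin are assumptions, and the final interactive rebase is my reading of "preparing you for" rather than something the tool necessarily runs for you:

```python
def rebase_plan(current_branch, master="master", remote="origin"):
    """Return the git commands, in order, that a rebase wrapper might run."""
    return [
        ["git", "checkout", master],          # jump over to the master branch
        ["git", "pull", remote, master],      # update it from the origin
        ["git", "checkout", current_branch],  # go back to the topic branch
        ["git", "rebase", "-i", master],      # the interactive rebase itself
    ]
```

Each entry can be handed to subprocess.check_call, which stops the sequence as soon as one step fails.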

The reason I'm mentioning this isn't to convince you to use my tool to do your git but perhaps to inspire you to write your own scripts that wrap things you find yourself doing repetitively.

I know my own battle isn't over. I'm still finding things that I have to do additionally on an almost perfectly predictable basis. Thankfully I now have an infrastructure to add more scripting.

Python slow-down of exception handling or condition checking

May 14, 2015
0 comments Python

It's the old problem of "Do I seek permission or ask for forgiveness?". It's rarely easy to know which one to use in Python because working with exceptions in Python is so damn easy.

Generally I prefer neither. I.e. just do. Don't write defensive code if you don't have to. Only seek permission or ask for forgiveness if you expect the failure case to actually happen and consider that normal.

Consider the following three functions:


import math

PI = math.pi  # assumed; the post doesn't show how PI is defined


def f0(x):
    return PI / x


def f1(x):
    if x != 0:
        return PI / x
    else:
        return -1


def f2(x):
    try:
        return PI / x
    except ZeroDivisionError:
        return -1

Which one do you think is the fastest? If I run this 1,000,000 times and never pass in x=0, will it make any difference?

Before you look at it, what do you think the result will be?


The answer is below.


Read on.


Scroll down for the results.


Have you made a guess yet?


What do you think it's going to be?


Scroll some more.


Almost there!


Ok, the results are as follows when running each of the above mentioned functions ~33,000,000 times on my MacBook:

f0 4.16087803245
f1 4.84187698364
f2 4.73760977387
(smaller is better)

Conclusion: the difference is minuscule. The fastest is to not do any exception handling or condition checking but it's generally no big difference.

This test was done with Python 2.7.9. You can try the code for yourself.
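If you'd rather not dig up the original script, a comparable measurement can be sketched with the standard library's timeit module. This is not the code behind the numbers above; the benchmark helper, the fixed argument 3 and the default iteration count are all assumptions, and the lambda adds a little constant overhead to every function equally:

```python
import math
import timeit

PI = math.pi


def f0(x):
    return PI / x


def f1(x):
    if x != 0:
        return PI / x
    else:
        return -1


def f2(x):
    try:
        return PI / x
    except ZeroDivisionError:
        return -1


def benchmark(number=100000):
    """Time each function with a non-zero argument; smaller is better."""
    timings = {}
    for function in (f0, f1, f2):
        timer = timeit.Timer(lambda: function(3))
        timings[function.__name__] = timer.timeit(number)
    return timings
```

Calling benchmark() and printing the dictionary gives one number per function, in seconds. Expect the absolute values to differ from the ones above since they depend on the machine and the Python version.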

Just one more thought

As I wrote this post I started thinking more and more about the "code style aspect" rather than the performance.

Basically, I think it boils down to the following rules:

  1. If you're working with external I/O (e.g. network or a database) use the "ask for forgiveness" approach (aka. exception wrapping). I.e. don't do if requests.head(url).status_code == 200: stuff = requests.get(url)

  2. If you want to make a really user-friendly Python API, use the "seek permission" approach (aka. if-statement first). E.g. def calculate(guests): if isinstance(guests, basestring): guests = [guests]

  3. All else just do. That makes the code more Pythonic. If you have a sub-routine that sends in a variable of the totally crazy-wrong type to your function, don't change the function, change the sub-routine.
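Spelled out as code, the first two rules could look like the sketch below. To keep it runnable without a network it uses a file read as the external I/O, and the names read_config and calculate (with its made-up body) are purely illustrative. It's written for Python 3, where str takes the place of basestring:

```python
def read_config(path, default=""):
    # Rule 1: external I/O, so ask for forgiveness. Don't check whether
    # the file exists first; just try and handle the failure.
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return default


def calculate(guests):
    # Rule 2: a friendly API, so seek permission. Accept a single name
    # as well as a list of names.
    if isinstance(guests, str):
        guests = [guests]
    return len(guests)  # hypothetical body, just to return something
```

Here calculate("Peter") and calculate(["Peter"]) both return 1, and read_config() on a missing file quietly returns the default.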

UPDATE

Here are the numbers for PyPy:

f0 0.369750552707
f1 0.321069081624
f2 0.411438703537
(smaller is better)

That's after averaging 15 runs of the script.

Note that the function with the extra if statement is faster.

And here are the numbers of Python 3.4.2:

f0 4.99579153742
f1 5.77459328515
f2 5.38382162367
(smaller is better)

That's averaging 10 rounds.

One almost interesting thing about these numbers is that their sums differ, which tells us a tiny story about performance for the language:

Python 2.7.9   13.74036478996
PyPy 2.4.0     1.102258337868
Python 3.4.2   16.15420644624
(smaller is better)

UPDATE 2

Here's the node equivalent version and its times:

f0 0.215509441
f1 0.228280196357
f2 0.316222934714
(smaller is better)

That means that my Node v0.10.35 is 45% faster than PyPy. But please, don't take that seriously.

premailer 2.9.0 and new rules for `base_url`

May 11, 2015
0 comments Python

I just pushed out a new release of premailer which comes with a pretty big change.

The change is in the way the base_url and any href= or src= get combined. For example, you used to be able to set Premailer(html, base_url='http://example.com/subfolder') and combined with <img src="/images/foo.png"> it would become <img src="http://example.com/subfolder/images/foo.png">.

Not any more. The joining works exactly like the Python built-in urljoin() works. E.g.


>>> from urllib.parse import urljoin  # python 3
>>> urljoin('https://example.com', '/image.png')
'https://example.com/image.png'
>>> urljoin('https://example.com/subfolder', '/image.png')
'https://example.com/image.png'
>>> urljoin('https://example.com/subfolder/', '/image.png')
'https://example.com/image.png'
>>> urljoin('https://example.com/subfolder/', '//image.png')
'https://image.png'
>>> urljoin('https://example.com/subfolder/', '//mycdn.com/image.png')
'https://mycdn.com/image.png'
>>> urljoin('http://example.com/subfolder/', '//mycdn.com/image.png')
'http://mycdn.com/image.png'
>>> urljoin('https://example.com/subfolder', 'image.png')
'https://example.com/image.png'
>>> urljoin('https://example.com/subfolder/', 'image.png')
'https://example.com/subfolder/image.png'

So basically, if you think you tried to do something odd with your base_url, check it over carefully when you upgrade to version 2.9.0.
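Applied to the subfolder example from the top of this post, the same rules look like this. The snippet calls urljoin directly rather than going through Premailer, and the resolve helper is just a made-up name for this sketch:

```python
from urllib.parse import urljoin


def resolve(base_url, src):
    # Plain urljoin semantics, which is what the joining now follows.
    return urljoin(base_url, src)


# A src with a leading slash replaces the whole path of the base URL.
absolute = resolve("http://example.com/subfolder", "/images/foo.png")
# absolute == 'http://example.com/images/foo.png'

# To stay inside the subfolder, end the base URL with a slash and
# use a relative src.
relative = resolve("http://example.com/subfolder/", "images/foo.png")
# relative == 'http://example.com/subfolder/images/foo.png'
```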

Thank you @ewjoachim and @graingert for your help!

Use closure for your Django context processors

May 9, 2015
11 comments Python, Django

The idea with template context processors in Django is to inject some default things to be available when rendering a template that is rendered with a request.

I.e. instead of...:


from django.conf import settings
from django.shortcuts import render


def view1(request):
    context = {
        'name': 'View 1', 
        'on_dev_server': request.get_host() in settings.DEV_HOSTNAMES
    }
    return render(request, 'view1.html', context)

def view2(request):
    context = {
        'name': 'View 2', 
        'other': 'things', 
        'on_dev_server': request.get_host() in settings.DEV_HOSTNAMES
    }
    return render(request, 'view2.html', context)

And in your nominal templates/base.html you might have something like this:


  ...
  <footer>
  <p>&copy; You 2015</p>
  {% if on_dev_server %}
    <p color="red">Note! We're currently on a dev server!</p>
  {% endif %}
  </footer>
  ...

Instead you do this trick; in your settings.py you write down the list of defaults plus the one you want to always have available:


TEMPLATE_CONTEXT_PROCESSORS = (
    "django.contrib.auth.context_processors.auth",
    "django.template.context_processors.static",
    "myproject.myapp.context_processors.debug_info",
)

And to accompany that you define your myproject/myapp/context_processors.py like so:


from django.conf import settings


def debug_info(request):
    return {
        'on_dev_server': request.get_host() in settings.DEV_HOSTNAMES,
    }

So far so good.

However, there's a problem with this. Two problems in fact.

The first problem is that when all the templates in your big complicated website render, it's quite possible that some pages don't need everything you set up in your context processors. That might mean a heck of a lot of extra computation for things that won't ever be displayed.

For example, I have a project where most pages have a sidebar where I show "Trending Events" which is something I compute in a context_processors.py function called def sidebar_events(request):. But the sidebar is not always shown and on the pages where it's not shown it's a waste to compute the stuff that sidebar_events computes. Also, I have management pages which use a totally different base.html template. So there's a big chance you're wasting precious CPU.

Another problem is that of code-readability (aka. how frustrating is this to debug for someone else or yourself after months of inactivity). If you're skimming through your base.html and you see this "random" variable called on_dev_server it's very very hard to tell where the heck that's defined. Hopefully grepping the whole source code is a way to go. A much better way to solve that problem would be sensible namespace naming.

And also, by being too liberal with globally scoped variables there's a chance you might clash with a different piece of functionality that uses the same variable names. That chance is smaller when you use namespaces.

So, to remedy this, let your template context processor functions return closures. It wraps the request automagically.

Let's rewrite our trivial example from above, the context_processors.py should now look like this:


def debug_info(request):
    def inner():
        return {
            'on_dev_server': request.get_host() in settings.DEV_HOSTNAMES,
        }
    return {'debug_info': inner}

Now executing that becomes more optional and more deliberate in the template instead. E.g. (note that this template uses Jinja2 syntax):


  ...
  <footer>
  <p>&copy; You 2015</p>
  {% set debug_info = debug_info() %}
  {% if debug_info['on_dev_server'] %}
    <p color="red">Note! We're currently on a dev server!</p>
  {% endif %}
  </footer>
  ...

This makes it more explicit, which is a good thing. It also has the potential to be avoided if the stuff in there isn't needed in some templates.
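To see the laziness in isolation, here is a small sketch that runs without Django. FakeRequest and the module-level DEV_HOSTNAMES are stand-ins invented for this example (in a real project they'd be Django's request object and settings.DEV_HOSTNAMES):

```python
DEV_HOSTNAMES = {"localhost:8000"}  # stand-in for settings.DEV_HOSTNAMES


class FakeRequest:
    """Minimal stand-in for a request that counts get_host() calls."""

    def __init__(self, host):
        self.host = host
        self.calls = 0

    def get_host(self):
        self.calls += 1
        return self.host


def debug_info(request):
    def inner():
        # Only runs if the template actually calls debug_info().
        return {"on_dev_server": request.get_host() in DEV_HOSTNAMES}
    return {"debug_info": inner}


request = FakeRequest("localhost:8000")
context = debug_info(request)
calls_before = request.calls       # nothing has been computed yet
result = context["debug_info"]()   # what the template call does
calls_after = request.calls        # now the work has happened, once
```

The context processor itself is practically free. The cost is only paid by the templates that call the closure.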