Make .local domains NOT slow in macOS

January 29, 2018
19 comments Linux, macOS

Problem

I used to have a bunch of domains in /etc/hosts, like peterbecom.dev, for testing Nginx configurations locally. But then it became impossible to test local sites in Chrome because .dev domains are force-redirected to HTTPS. No problem, so I use .local instead. However, DNS resolution was horribly slow. For example:


▶ time curl -I http://peterbecom.local/about/minimal.css > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  1763    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
curl -I http://peterbecom.local/about/minimal.css > /dev/null  0.01s user 0.01s system 0% cpu 5.585 total

5.6 seconds to open a local file in Nginx.

Solution

Here's that one weird trick to solve it: Add an entry for IPv4 AND IPv6 in /etc/hosts.

So now I have:

cat /etc/hosts | grep peterbecom
127.0.0.1       peterbecom.local
::1             peterbecom.local

Verification

Ah! Much better. Things are fast again:


▶ time curl -I http://peterbecom.local/about/minimal.css > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  1763    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl -I http://peterbecom.local/about/minimal.css > /dev/null  0.01s user 0.01s system 37% cpu 0.041 total

0.04 seconds instead of 5.6.
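
Presumably this is because .local is reserved for multicast DNS (Bonjour) on macOS, so with only an IPv4 entry in /etc/hosts the IPv6 (AAAA) lookup falls through to mDNS and has to time out first. With both entries present, everything resolves straight from /etc/hosts. If you want to double-check the resolution itself, here's a little Python sketch of my own (not from the original post) that times it:


# Sketch: time name resolution for the test domain added to /etc/hosts above.
import socket
import time

host = "peterbecom.local"

start = time.time()
results = socket.getaddrinfo(host, 80)
elapsed = time.time() - start

# With both the 127.0.0.1 and ::1 entries in place you should see an IPv4
# and an IPv6 result, and the call should return in a few milliseconds.
for family, _, _, _, sockaddr in results:
    print(family.name, sockaddr[0])
print("Resolved in {:.3f} seconds".format(elapsed))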

Even more aggressively trying to preload your next page load

January 22, 2018
2 comments Web development, JavaScript

In 2014 I tried out an experiment called "Aggressively prefetching everything you might click". It was received with mixed reviews. Today, 4 years later, I stand by that experiment/solution and I like it so much that I've decided to extend it.

How it works

The gist of the solution is that if you hover over an internal link for 200ms, an XHR request is made to that URL as a simple GET. Suppose the XHR finishes loading in, say, 300ms and you eventually click the link: by the time the browser tries to load the page, it comes straight from your browser cache. You get that "instant load" feel and it makes navigating the site more enjoyable. Suppose instead that you're really fast with your mouse/trackpad and you click the link within 500ms of hovering (but after the 200ms delay): the XHR request gets automatically cancelled by the browser. When your browser loads the new page, it basically has to start from scratch. No harm done. Just not as fast.

Sure, there is a chance that you hover over a link, and stay hovering for more than 200ms but then decide to not click on it. Then the XHR preload was a waste of resources.
But!! If you even have a mouse cursor, the chances are that you're on a WiFi-connected laptop.

None of this "kicks in" when you're on a mobile device. The onMouseOver event won't trigger. And I dare say that it's only on mobile devices that it strongly matters to reduce what the client has to download. So what's the harm of forcing your laptop to download a couple of extra kilobytes? If you hover over the link, the chances are, after all, that you will click it.

Even more aggressive

Today I decided to step it up even more. Now, after the HTML has been downloaded, it is scanned with a regular expression for image URLs that sit on my CDN (where I host all images with far-future cache headers). The first 5 image URLs are preloaded so that when you eventually make that link click, not only is the page load instant, but most of the images are too.

What do you think? Too aggressive or genius?

Screenshot: before hovering over the "About" link

Screenshot: after hovering over the "About" link

Now, if I go ahead and make the click, the HTML load will be instant and the first 3 images will be instant too.

Show me the code!

It ain't pretty but it works: prefetcher.js

Yes, it's jQuery and I'm OK with that. Yes, the CDN domain name is hardcoded and if this was a work project I'd never do that. Heck, the reason I'm blogging about this is ultimately to share/teach. When you build something similar you can do it more robustly.

minimalcss 0.6.2 now strips all unused font faces

January 22, 2018
0 comments Web development, JavaScript, Node

minimalcss is a Node API and CLI app to analyze the minimal CSS needed for an initial load. One of its killer features is that all CSS parsing is done the "proper way". Meaning, it's reduced down to an AST that can be iterated over, mutated and serialized back to CSS as a string.

Thanks to this, together with my contributors @stereobooster and @lahmatiy, minimalcss can now figure out which @font-face rules are redundant and can be "safely" removed. It can make a big difference to web performance, either because it prevents an expensive network request for something like https://fonts.gstatic.com/s/lato/v14/hash.woff2, or because it avoids downloading base64-encoded fonts.

For example, this very blog uses Semantic UI, which is a wonderful CSS framework. But it's quite expensive and contains a bunch of base64-encoded fonts. The Ratings module uses a @font-face rule that weighs about 15KB.

Sure, you don't have to download and insert semanticui.min.css in your HTML, but it's just sooo convenient. Especially when there are tools like minimalcss that allow you to be "lazy" but still get that perfect first-load web performance.
So, the CSS when doing a search looks like this:

Unoptimized
126KB of CSS (gzipped) transferred and 827KB of CSS parsed.

Let's run this through minimalcss instead:

$ minimalcss.js --verbose -o /tmp/peterbe.search.css "https://www.peterbe.com/search?q=searching+for+something"
$ ls -lh /tmp/peterbe.search.css
-rw-r--r--  1 peterbe  wheel    27K Jan 22 09:59 /tmp/peterbe.search.css
$ head -n 14 /tmp/peterbe.search.css
/*
Generated 2018-01-22T14:59:05.871Z by minimalcss.
Took 4.43 seconds to generate 26.85 KB of CSS.
Based on 3 stylesheets totalling 827.01 KB.
Options: {
  "urls": [
    "https://www.peterbe.com/search?q=searching+for+something"
  ],
  "debug": false,
  "loadimages": false,
  "withoutjavascript": false,
  "viewport": null
}
*/

And let's simulate it being gzipped:

$ gzip /tmp/peterbe.search.css
$ ls -lh /tmp/peterbe.search.css.gz
-rw-r--r--  1 peterbe  wheel   6.0K Jan 22 09:59 /tmp/peterbe.search.css.gz

Wow! Instead of downloading 27KB you only need 6KB. CSS parsing isn't as expensive as JavaScript parsing, but it's nevertheless a saving of 827KB - 27KB = 800KB of CSS that the browser doesn't have to worry about. That's awesome!

By the way, the produced minimal CSS contains a lot of license preambles, left over from the fact that semanticui.min.css is made up of components. See the gist itself.
Out of the total size of 27KB (uncompressed), 8KB is just license preambles. minimalcss does not attempt to touch those when it minifies, but since there's a lot of repetition you could easily add your own little tooling to rewrite them and save another ~7KB. However, all that repetition compresses well, so it might not be worth it.

Conditional aggregation in Django 2.0

January 12, 2018
4 comments Python, Django, PostgreSQL

Django 2.0 came out a couple of weeks ago. It now supports "conditional aggregation", which is a SQL standard feature I didn't even know about.

Before

So I have a Django app which has an endpoint that generates some human-friendly stats about the number of uploads (and their total size) in various different time intervals.

First of all, this is how I set up the time intervals:


import datetime

from django.utils import timezone


today = timezone.now()
start_today = today.replace(hour=0, minute=0, second=0)
start_yesterday = start_today - datetime.timedelta(days=1)
start_this_month = today.replace(day=1)
start_this_year = start_this_month.replace(month=1)

And then, for each of these, there's a little function that returns a dict for each time interval:


def count_and_size(qs, start, end):
    sub_qs = qs.filter(created_at__gte=start, created_at__lt=end)
    return {
        'count': sub_qs.count(),
        'total_size': sub_qs.aggregate(size=Sum('size'))['size'],
    }

numbers['uploads'] = {
    'today': count_and_size(upload_qs, start_today, today),
    'yesterday': count_and_size(upload_qs, start_yesterday, start_today),
    'this_month': count_and_size(upload_qs, start_this_month, today),
    'this_year': count_and_size(upload_qs, start_this_year, today),
}

What you get is exactly 2 x 4 = 8 queries. One COUNT and one SUM for each time interval. E.g.

SELECT SUM("upload_upload"."size") AS "size" 
FROM "upload_upload" 
WHERE ("upload_upload"."created_at" >= ...

SELECT COUNT(*) AS "__count" 
FROM "upload_upload" 
WHERE ("upload_upload"."created_at" >= ...

...6 more queries...

Middle

Oops. I think this code comes from a slightly rushed job. We can do the COUNT and the SUM at the same time for each query.


# New, improved count_and_size() function!
def count_and_size(qs, start, end):
    sub_qs = qs.filter(created_at__gte=start, created_at__lt=end)
    return sub_qs.aggregate(
        count=Count('id'),
        total_size=Sum('size'),
    )

numbers['uploads'] = {
    'today': count_and_size(upload_qs, start_today, today),
    'yesterday': count_and_size(upload_qs, start_yesterday, start_today),
    'this_month': count_and_size(upload_qs, start_this_month, today),
    'this_year': count_and_size(upload_qs, start_this_year, today),
}

Much better: now there's only one query per time bucket. So 4 queries in total. E.g.

SELECT COUNT("upload_upload"."id") AS "count", SUM("upload_upload"."size") AS "total_size" 
FROM "upload_upload" 
WHERE ("upload_upload"."created_at" >= ...

...3 more queries...

After

But we can do better than that! Instead, we use conditional aggregation. The syntax gets a bit hairy because there are so many keyword arguments, but I hope I've indented it nicely so it's easy to see how it works:


from django.db.models import Count, Q, Sum


def make_q(start, end):
    return Q(created_at__gte=start, created_at__lt=end)

q_today = make_q(start_today, today)
q_yesterday = make_q(start_yesterday, start_today)
q_this_month = make_q(start_this_month, today)
q_this_year = make_q(start_this_year, today)

aggregates = upload_qs.aggregate(
    today_count=Count('pk', filter=q_today),
    today_total_size=Sum('size', filter=q_today),

    yesterday_count=Count('pk', filter=q_yesterday),
    yesterday_total_size=Sum('size', filter=q_yesterday),

    this_month_count=Count('pk', filter=q_this_month),
    this_month_total_size=Sum('size', filter=q_this_month),

    this_year_count=Count('pk', filter=q_this_year),
    this_year_total_size=Sum('size', filter=q_this_year),
)
numbers['uploads'] = {
    'today': {
        'count': aggregates['today_count'],
        'total_size': aggregates['today_total_size'],
    },
    'yesterday': {
        'count': aggregates['yesterday_count'],
        'total_size': aggregates['yesterday_total_size'],
    },
    'this_month': {
        'count': aggregates['this_month_count'],
        'total_size': aggregates['this_month_total_size'],
    },
    'this_year': {
        'count': aggregates['this_year_count'],
        'total_size': aggregates['this_year_total_size'],
    },
}

Voila! One single query to get all those pieces of data.
The SQL sent to PostgreSQL looks something like this:

SELECT 
  COUNT("upload_upload"."id") FILTER (WHERE ("upload_upload"."created_at" >= ...)) AS "today_count", 
  SUM("upload_upload"."size") FILTER (WHERE ("upload_upload"."created_at" >= ...)) AS "today_total_size", 

  COUNT("upload_upload"."id") FILTER (WHERE ("upload_upload"."created_at" >= ...)) AS "yesterday_count", 
  SUM("upload_upload"."size") FILTER (WHERE ("upload_upload"."created_at" >= ...)) AS "yesterday_total_size", 

  ...

FROM "upload_upload";

Is this the best thing to do? I'm starting to have my doubts.

Watch Out!

When I take this now-single monster query for a spin with an EXPLAIN ANALYZE prefix, I notice something worrying!

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=74.33..74.34 rows=1 width=16) (actual time=0.587..0.587 rows=1 loops=1)
   ->  Seq Scan on upload_upload  (cost=0.00..62.13 rows=813 width=16) (actual time=0.012..0.210 rows=813 loops=1)
 Planning time: 0.427 ms
 Execution time: 0.674 ms
(4 rows)

A sequential scan! That's terrible. The created_at column is indexed with a BTREE, so why can't it use the index?

The short answer is: I don't know!
I've uploaded a reduced, but still complete, example demonstrating this in a gist. It's very similar to the example in the Stack Overflow question I asked.

So what did I do? I went back to the "middle" solution. One SELECT query per time bucket. So 4 queries in total, but at least all 4 are able to use an index.

When Docker is too slow, use your host

January 11, 2018
3 comments Web development, Django, macOS, Docker

I have a side-project that is basically a React frontend, a Django API server and a Node universal React renderer. The killer feature is its Elasticsearch database that searches almost 2.5M large texts and 200K named objects. All the data is stored in a PostgreSQL and there's some Python code that copies that stuff over to Elasticsearch for indexing.

Chart: timings for searches in Songsearch
The PostgreSQL database is about 10GB and the Elasticsearch (version 6.1.0) indices are about 6GB. It's moderately big and even though individual searches take, on average ~75ms (in production) it's hefty. At least for a side-project.

On my MacBook Pro laptop, I use Docker for development. Docker makes it really easy to run one command that starts memcached, Django, an AWS Product API Node app, create-react-app for the search and a separate create-react-app for the stats web app.

At first I tried to run PostgreSQL and Elasticsearch in Docker too, but after many attempts I just had to give up. It was too slow. Elasticsearch would keep crashing even though I increased the memory available to Docker to 4GB.

This very blog (www.peterbe.com) has a similar stack. Redis, PostgreSQL, Elasticsearch all running in Docker. It works great. One single docker-compose up web starts everything I need. But when it comes to much larger databases, I found my macOS host to be much more performant.

So the dark side of this is that I have to remember to do more things when starting work on this project. My PostgreSQL was installed with Homebrew and is always running on my laptop. For Elasticsearch I have to open a dedicated terminal and go to a specific location to start the Elasticsearch for this project (e.g. make start-elasticsearch).

The way I do this is that I have this in my Django projects settings.py:


import dj_database_url
from decouple import config, Csv


DATABASES = {
    'default': config(
        'DATABASE_URL',
        # Hostname 'docker.for.mac.host.internal' assumes
        # you have at least Docker 17.12.
        # For older versions of Docker use 'docker.for.mac.localhost'
        default='postgresql://peterbe@docker.for.mac.host.internal/songsearch',
        cast=dj_database_url.parse
    )
}

ES_HOSTS = config('ES_HOSTS', default='docker.for.mac.host.internal:9200', cast=Csv())

(Actually, in reality the defaults in the settings.py code are localhost and I use docker-compose.yml environment variables to override this, but the point is hopefully still there.)
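
For completeness, here's a rough sketch of my own (not from the post) of how the ES_HOSTS setting might then be consumed, using the elasticsearch-py client, which in the versions from around that time accepts a list of host:port strings:


# Sketch: connect to Elasticsearch running on the macOS host from inside
# the Docker container, using the ES_HOSTS setting defined above.
from django.conf import settings
from elasticsearch import Elasticsearch

es = Elasticsearch(settings.ES_HOSTS)
print(es.info())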

And that's basically it. Now I get Docker to do what various virtualenvs and terminal scripts used to do, but with the performance of running the big databases on the host.

Understanding Redis hash-max-ziplist-entries

January 8, 2018
2 comments Python, Redis

This is an advanced topic for people who do serious stuff in Redis. I need to do serious stuff in Redis so I'm trying to learn about the best way to store lots of keys with hash maps.

This article by Salvatore Sanfilippo (the creator of Redis) seems to be a much-cited article on this topic. If you haven't read it, the gist is that Redis can employ some clever optimizations for storing hash maps in a very memory-efficient way instead of storing each key-value pair separately.

"Hashes, Lists, Sets composed of just integers, and Sorted Sets, when smaller than a given number of elements, and up to a maximum element size, are encoded in a very memory efficient way that uses up to 10 times less memory (with 5 time less memory used being the average saving)"

This efficient storage optimization is called a ziplist.
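
You can see the optimization in action with a quick sketch of my own (not from the article; the key names are made up and the exact threshold depends on your server's hash-max-ziplist-entries setting):


# Sketch: compare the internal encoding of a small hash vs. a big one.
import redis

r = redis.StrictRedis(host='localhost', port=6379)

r.delete('small_hash', 'big_hash')
r.hset('small_hash', 'field1', 'value1')

# Push a hash well past the default hash-max-ziplist-entries limit (128).
for i in range(1000):
    r.hset('big_hash', 'field%d' % i, 'value%d' % i)

print(r.object('encoding', 'small_hash'))  # b'ziplist' (b'listpack' on newer Redis)
print(r.object('encoding', 'big_hash'))    # b'hashtable'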


Display current React version

January 7, 2018
1 comment JavaScript, React

Usually you know what version of React your app is using by opening package.json, or by poking around in node_modules/react/index.js. But perhaps there are many packaging abstractions between your command line and the server. Especially if you have a continuous integration server that builds your static assets and that CI uses caching. It might get scary.

If you really want to print out what version of React is rendering your app, here's one way to do that:


import React from 'react'

class Introspection extends React.Component {
  render() {
    return <div>
      Currently using React {React.version}
    </div>
  }
}

Suppose that you want this display to depend on the app being in dev or prod mode:


import React from 'react'

class Introspection extends React.Component {
  render() {
    return <div>
      {
        process.env.NODE_ENV === 'development' ?
        <p>Currently using React {React.version}</p> : null
      }
    </div>
  }
}

Note that there's no need to import process.

See this CodeSandbox snippet for a live example.

Whatsdeployed facelift

January 5, 2018
0 comments Python, Web development, Mozilla, Docker

tl;dr; Whatsdeployed.io is an impressively simple web app to help web developers and web ops people quickly see what GitHub commits have made it into your Dev, Stage or Prod environment. Today it got a facelift.

The code is now more than 5 years old and has served me well. It's weird to talk too positively about the app because I actually wrote it, but because it's so simple in terms of design and effort, it feels less personal to talk about it.

Here's what's in the facelift

  • Upgraded to Bootstrap 4.
  • Instead of downloading a heavy Glyphicon web font just to display a single checkmark, that's now a simple image.
  • Ability to use a GitHub developer personal token to avoid rate limitations on GitHub's API.
  • The first lookup to get all commits is now done via the Flask app to use my auth token to avoid the rate limit.
  • Much better error handling if any of the underlying requests.get() calls that the Flask app makes fails. The error also includes which URL it failed on.
  • Basic validation to prevent submitting the main form without typing anything in.
  • You can hack on it with Docker. Thanks @willkg.
  • Improved the code that extracts Bugzilla bug numbers out of commit messages. Thanks @edmorely.
  • Refreshed screenshots in the README.md.
  • A brand new introduction text on the home page for people who end up on the site not knowing what it is.
  • If any XHR errors happen while figuring out the "culprits", you now get a pretty error describing this instead of it all being swallowed.

Please let me know if there's anything broken or missing.

How to rotate a video on OSX with ffmpeg

January 3, 2018
5 comments Linux, macOS

Every now and then, I take a video with my iPhone and even though I hold the camera in landscape mode, the video gets recorded in portrait mode. Probably because it somehow started in portrait and didn't notice that I rotated the phone.

So I'm stuck with a 90° video. Here's how I rotate it:

ffmpeg -i thatvideo.mov -vf "transpose=2" ~/Desktop/thatvideo.mov

then I check that ~/Desktop/thatvideo.mov looks like it should.

I can't remember where I got this command originally, but I've been relying on my bash history for a looong time, so it's best to write it down.
"transpose=2" means 90° counter-clockwise. "transpose=1" means 90° clockwise.

What is ffmpeg?

If you're here because you Googled it and you don't know what ffmpeg is: it's a command line program with which you can "programmatically" do almost anything to videos, such as converting between formats, overlaying text, and chopping and trimming videos. To install it, install Homebrew and then type:

brew install ffmpeg

Fastest way to uniquify a list in Python >=3.6

December 23, 2017
7 comments Python

This is an update to an old blog post from 2006 called Fastest way to uniquify a list in Python, but this time for Python 3.6. Why? Because Python 3.6 preserves the order in which keys are inserted into a dictionary. How? Because of the way dicts are implemented in 3.6; the order preservation is an implementation detail. Then, in Python 3.7, which isn't released at the time of writing, that order preservation is guaranteed.

Anyway, Raymond Hettinger just shared a neat little way to uniquify a list. I thought I'd update my old post from 2006 to add list(dict.fromkeys('abracadabra')).

Functions

A reminder: there are two ways to uniquify a list, order preserving and not order preserving. For example, the unique letters in peter are p, e, t, r in their "original order", as opposed to t, e, p, r.


def f1(seq):  # Raymond Hettinger
    hash_ = {}
    [hash_.__setitem__(x, 1) for x in seq]
    return hash_.keys()

def f3(seq):
    # Not order preserving
    keys = {}
    for e in seq:
        keys[e] = 1
    return keys.keys()

def f5(seq, idfun=None):  # Alex Martelli ******* order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen:
            continue
        seen[marker] = 1
        result.append(item)
    return result

def f5b(seq, idfun=None):  # Alex Martelli ******* order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker not in seen:
            seen[marker] = 1
            result.append(item)

    return result

def f7(seq):
    # Not order preserving
    return list(set(seq))

def f8(seq):  # Dave Kirby
    # Order preserving
    seen = set()
    return [x for x in seq if x not in seen and not seen.add(x)]

def f9(seq):
    # Not order preserving, even in Py >=3.6
    return {}.fromkeys(seq).keys()

def f10(seq, idfun=None):  # Andrew Dalke
    # Order preserving
    return list(_f10(seq, idfun))

def _f10(seq, idfun=None):
    seen = set()
    if idfun is None:
        for x in seq:
            if x in seen:
                continue
            seen.add(x)
            yield x
    else:
        for x in seq:
            x = idfun(x)
            if x in seen:
                continue
            seen.add(x)
            yield x

def f11(seq):  # f10 but simpler
    # Order preserving
    return list(_f10(seq))

def f12(seq):
    # Raymond Hettinger
    # https://twitter.com/raymondh/status/944125570534621185
    return list(dict.fromkeys(seq))
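
By the way, if you want to reproduce numbers like the ones below yourself, here's a minimal benchmark sketch (my own sketch, not the original harness, and the test data is made up):


# Sketch: time a few of the functions above on a list with many duplicates.
import random
import string
import timeit

seq = [random.choice(string.ascii_lowercase) for _ in range(1000)]

for func in (f12, f8, f11, f7, f9, f3):
    total = timeit.timeit(lambda: func(seq), number=1000)
    print('%-4s %8.1f ms per 1000 calls' % (func.__name__, total * 1000))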

Results

FUNCTION        ORDER PRESERVING     MEAN       MEDIAN
f12             yes                  111.0      112.2
f8              yes                  266.3      266.4
f10             yes                  304.0      299.1
f11             yes                  314.3      312.9
f5              yes                  486.8      479.7
f5b             yes                  494.7      498.0
f7              no                   95.8       95.1
f9              no                   100.8      100.9
f3              no                   143.7      142.2
f1              no                   406.4      408.4

Chart: timings for the not-order-preserving functions

Chart: timings for the order-preserving functions

Conclusion

The fastest way to uniquify a list of hashable objects (basically immutable things) is:


list(set(seq))

And the fastest way, if the order is important, is:


list(dict.fromkeys(seq))

Now we know.