This morning I came across this site on Hacker News. It's a cute site with some basic tips on how to make your sites faster.
It's very much a for-beginners document, as all the tips are quite basic. For example, it doesn't even mention the use of CDNs.
One tip in particular stood out to me: "it can be useful to minify your HTML with automated tools."
And it links to the htmlcompressor project. Ignore this advice.
What matters 10 times more is Gzip compression. This is usually very easy to set up with Nginx or Apache. It's not something you do in your web framework, and if you don't have a web framework, you don't need to manually Gzip HTML files on the filesystem.
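For what it's worth, here is a minimal sketch of what that can look like in Nginx. The specific values are only illustrative defaults, not the config running on this site:

    # In the http {} (or server {}) block of nginx.conf.
    gzip on;                # compress responses on the fly
    gzip_comp_level 5;      # reasonable CPU vs. size trade-off
    gzip_min_length 256;    # skip tiny responses where Gzip barely helps
    gzip_vary on;           # send "Vary: Accept-Encoding" for caches/proxies
    # text/html is always compressed once gzip is on; list other text types too:
    gzip_types text/css application/javascript application/json image/svg+xml;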
For example, the home page here on my blog is, at the time of writing, 66,770 bytes. Hefty, sure, but with all excess whitespace removed it shrinks down to 59,356 bytes. But that really doesn't matter once you Gzip.
Gzipped from original version: 18,470 bytes
Gzipped from whitespace trimmed version: 18,086 bytes
The gain is about 2%, which is definitely not worth the hassle of adding a whitespace compressor.
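If you want to reproduce this kind of comparison yourself, here is a rough sketch in Python. The whitespace trimming is deliberately crude and "homepage.html" is just a placeholder for a page you saved locally, so your numbers will differ:

    import gzip
    import re

    def sizes(path):
        with open(path, "rb") as f:
            html = f.read()
        # Crude whitespace "minification": collapse whitespace between tags.
        minified = re.sub(rb">\s+<", b"><", html)
        return (
            len(html),
            len(minified),
            len(gzip.compress(html)),
            len(gzip.compress(minified)),
        )

    raw, mini, raw_gz, mini_gz = sizes("homepage.html")
    print(f"original:            {raw:,} bytes")
    print(f"whitespace trimmed:  {mini:,} bytes")
    print(f"gzipped original:    {raw_gz:,} bytes")
    print(f"gzipped trimmed:     {mini_gz:,} bytes")
    print(f"gain from trimming:  {1 - mini_gz / raw_gz:.1%}")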
Comments
Ah you really made a post about it! :D
Well, a purist could argue that if you know about the waste, you should prevent it!
For the browser itself there is no need for the whitespace, and the XML can be indented automatically to your taste locally whenever you want to edit it.
Even gzipped, this saves about 4 MB of traffic per 10,000 page loads...
I've read people arguing against SEO-style readable URLs as a tremendous waste when counted on a large scale ;]
So why stop there? You're already doing stuff WAY beyond most web devs! And I think that's awesome!! :]
For big sites 2% is a lot though. I'm surprised the diff is so big.
How do you estimate all this stuff? (I mean with your old posts, fancy-cache, ...) Can you post the steps or a tutorial? Thank you
How does this sound?
Pages are quantized into packets. If that 2% happens to tip you over a TCP packet boundary, you've saved 1 RTT.