Brotli compression quality comparison in the real world

December 1, 2021
2 comments Node, JavaScript

At work, we use Brotli (using the Node builtin zlib) to compress these large .json files to .json.br files. When using zlib.brotliCompress you can set options to override the quality number. Here's an example of it at quality 6:


import { promisify } from 'util'
import zlib from 'zlib'
const brotliCompress = promisify(zlib.brotliCompress)

const options = {
  params: {
    [zlib.constants.BROTLI_PARAM_MODE]: zlib.constants.BROTLI_MODE_TEXT,
    [zlib.constants.BROTLI_PARAM_QUALITY]: 6,
  },
}

export async function compress(data) {
  return brotliCompress(data, options)
}

But what if you mess with that number. Surely, the files will become smaller, but at what cost? Well, I wrote a Node script that measured how long it would take to compress 6 large (~25MB each) .json file synchronously. Then, I put them into a Google spreadsheet and voila:

Size

Total size per level

Time

Total seconds per level

Miles away from rocket science but I thought it was cute to visualize as a way of understanding the quality option.

How to string pad a string in Python with a variable

October 19, 2021
1 comment Python

I just have to write this down because that's the rule; if I find myself googling something basic like this more than once, it's worth blogging about.

Suppose you have a string and you want to pad with empty spaces. You have 2 options:


>>> s = "peter"
>>> s.ljust(10)
'peter     '
>>> f"{s:<10}"
'peter     '

The f-string notation is often more convenient because it can be combined with other formatting directives.
But, suppose the number 10 isn't hardcoded like that. Suppose it's a variable:


>>> s = "peter"
>>> width = 11
>>> s.ljust(width)
'peter      '
>>> f"{s:<width}"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid format specifier

Well, the way you need to do it with f-string formatting, when it's a variable like that is this syntax:


>>> f"{s:<{width}}"
'peter      '

How to bulk-insert Firestore documents in a Firebase Cloud function

September 23, 2021
1 comment Node, Firebase, JavaScript

You can't batch-add/bulk-insert documents in the Firebase Web SDK. But you can with the Firebase Admin Node SDK. Like, in a Firebase Cloud Function. Here's an example of how to do that:


const firestore = admin.firestore();
let batch = firestore.batch();
let counter = 0;
let totalCounter = 0;
const promises = [];
for (const thing of MANY_MANY_THINGS) {
  counter++;
  const docRef = firestore.collection("MY_COLLECTION").doc();
  batch.set(docRef, {
    foo: thing.foo,
    bar: thing.bar,
    favNumber: 0,
  });
  counter++;
  if (counter >= 500) {
    console.log(`Committing batch of ${counter}`);
    promises.push(batch.commit());
    totalCounter += counter;
    counter = 0;
    batch = firestore.batch();
  }
}
if (counter) {
  console.log(`Committing batch of ${counter}`);
  promises.push(batch.commit());
  totalCounter += counter;
}
await Promise.all(promises);
console.log(`Committed total of ${totalCounter}`);

I'm using this in a Cloud HTTP function where I can submit a large amount of data and have each one fill up a collection.

I'm a GitHubber now

September 21, 2021
1 comment Work, GitHub

Starting today, I'm a Hubber. Meaning, I work for GitHub. I'll be joining the GitHub Docs team to help technical writers document all the various products that GitHub have. Since I haven't actually started coding anything yet, I don't want to claim I know how it works or exactly what I'll be working but on, but I do know that the site at hand is docs.github.com and I've previously taken a lot of inspiration from this site when building the MDN rewrite.

GitHub profile

If you are a Hubber, too, and reading this; Hi! Let's be friends! I'm a friendly guy. Please ping me and say hi.
I'll be working from home here in Mount Pleasant, South Carolina. I have 2 young kids; Tucker and Charlotte, who mean the world to me. And my backbone-of-life wife Ashley. My hobbies are coding, swimming, golf, and cooking.

TypeScript generic async function wrapper function

September 12, 2021
0 comments JavaScript

I find this so fiddly! I love TypeScript and will continue to use it if there's a choice. But I just wanted to write a simple async function wrapper and I had to Google for it and nothing was quite right. Here's my simple solution, as an example:


function wrappedAsyncFunction<T>(
    fn: (...args: any[]) => Promise<T>
  ): (...args: any[]) => Promise<T> {
    return async function(...args: any[]) {
      console.time("Took");
      try {
        return await fn(...args);
      } catch(error) {
        console.warn("FYI, an error happened:", error);
        throw error;
      } finally {
        console.timeEnd("Took");
      }

    };
  }

Here's a Playground demo

What I use it for is to wrap my Firebase Cloud Functions so that if any error happens, I can send that error to Rollbar. In particular, here's an example of it in use:


diff --git a/functions/src/cleanup-thumbnails.ts b/functions/src/cleanup-thumbnails.ts
index 46bdb34..a3e8d54 100644
--- a/functions/src/cleanup-thumbnails.ts
+++ b/functions/src/cleanup-thumbnails.ts
@@ -2,6 +2,8 @@ import * as admin from "firebase-admin";
 import * as functions from "firebase-functions";
 import { logger } from "firebase-functions";

+import { wrappedLogError } from "./rollbar-logger";
+
 const OLD_DAYS = 30 * 6; // 6 months
 // const ADDITIONAL_DAYS_BACK = 5;
 // const ADDITIONAL_DAYS_BACK = 15;
@@ -9,7 +11,7 @@ const PREFIX = "thumbnails";

 export const scheduledCleanupThumbnails = functions.pubsub
   .schedule("every 24 hours")
-  .onRun(async () => {
+  .onRun(wrappedLogError(async () => {
     logger.debug("Running scheduledCleanupThumbnails");

...

And my wrappedLogError looks like this:


export function wrappedLogError<T>(
  fn: (...args: any[]) => Promise<T>
): (...args: any[]) => Promise<T> {
  return async function(...args: any[]) {
    try {
      return await fn(...args);
    } catch (error) {
      logError(error);
      throw error;
    }
  };
}

I'm not sure it's the best or correct way to do it, but it seems to work. Perhaps there's a more correct solution but for now I'll ship this because it seems to work fine.

From photo of ingredients, to your shopping list

September 10, 2021
0 comments Web development, That's Groce!, Firebase

Today I launched a really cool new feature to That's Groce!: Ability to upload photos of ingredients and have the food words automatically suggested to be added to your shopping list.

It's best explained with this 26-second video.

The general idea

Food words found
The idea is that you know you're going to cook that Vegetarian Curry Lasagna on page 123 in Jamie's Summer Cookbook. Either you read the ingredients and type in each ingredient you're going to need to buy (because you know what's in your pantry and fridge) or just take a photo of the whole ingredient listing. Now, when you're in the store and wonder: "I remember I need red peppers, but how many was it again?!".
But not only that, once you've taken a photo of the list of ingredients, it can help you populate your shopping list.

How it works in That's Groce! is that your photo is turned into a block of text, and from that text certain "food words" are extracted and all else is ignored. The food words are based on a database of 3,600+ English food words that I've gathered and also manually curated. There are lots of caveats. The 3,600+ food words are not perfect and there are surprisingly many combinations and plural vs. singular that can make the list incomplete. But, that's where you can help! If you find that there's a word it correctly scanned but didn't suggest, you can type in your own suggestions for everyone to benefit from. If you type in something it didn't manage to spot, I'll review that and add it to the global database.

Food-word recognition

The word recognition is done using Google Cloud Vision API which is a powerful machine-learning-based service from Google Cloud that is stunningly accurate. Tidy camera photos of cookbooks with good light is nearly perfect, but it can also do an impressive job with photos of handwritten recipes, like this:

Handwritten recipe photo

One thing you will find is that it's often hard to only take a photo of the actual list of ingredients. Often, a chunk of the cooking instructions or the recipe "back story" gets into the photo frame. These words aren't actually ingredients and can lead to surprising food word suggestions that you definitely don't need to buy. It's not perfect but at least you'll have a visual memory of what you're cooking as you're standing there in the grocery store.

Options

Understandably, it's nearly impossible for the app to know what you have in your pantry/fridge/freezer/spice rack. But a lot of recipes spell out exactly everything you need, not for buying, but for cooking. E.g. 1 teaspoon salt. But salt (or pepper or sugar or butter) is the kind of foodstuff you probably always have at home. So then it doesn't make sense to suggest that you put "salt" on the shopping list. To solve for that, you can override your own preferred options of special keywords. For example, I've added "salt", "pepper", "sugar, "table salt", "black pepper" as words that can always be ignored. You can also train the system to set up aliases. For example, a lot of recipes call for "lemon zest" but what you actually purchase is a lemon and then grate it yourself on the grater. So you can add special keywords that act as aliases for other words. Here's one example:

Options

Help out!

Foodwords database sample
Over the last couple of days, I've been snapping lots of photos from lots of different cookbooks, and every time I begin to think the database is getting good, I stumble on some new food word. For example, I recently tested a photo of a recipe that called for "Jackfruit". What even is that?! Anyway, it is inevitable that certain words are missing. But that's where we can help each other. If you test it out and you notice that it correctly scanned the text but the word wasn't suggested, click the "Suggest" button and type in your suggestions. Together we can, one food word at a time, eradicate misses.

There's still some work to be done to make the database even stronger by setting up clever aliases for everyone. For example, a lot of recipes call for "(some number) cloves garlic" but it can easily get confused for the spice "cloves" and the root vegetable "garlic". So, perhaps we can train it to recognize "cloves garlic" to actually just mean "garlic".

Also, the database is currently only in (primarily American) English. The platform would support other languages but I would definitely need a hand to seed it with more words in other languages.

Try it out

If you haven't set up an account yet (it's still free!), to test it, you can go to https://thatsgroce.web.app, click "Get started without signing in", go into your newly created shopping list, and press the "Photos" button. Try uploading some snapped photos from your cookbooks. Please please let me know what you think!

TypeScript function keyword arguments like Python

September 8, 2021
0 comments Python, JavaScript

To do this in Python:


def print_person(name="peter", dob=1979):
    print(f"name={name}\tdob={dob}")


print_person() 
# prints: name=peter dob=1979

print_person(name="Tucker")
# prints: name=Tucker    dob=1979

print_person(dob=2013)
# prints: name=peter dob=2013

print_person(sex="boy")
# TypeError: print_person() got an unexpected keyword argument 'sex'

...in TypeScript:


function printPerson({
  name = "peter",
  dob = 1979
}: { name?: string; dob?: number } = {}) {
  console.log(`name=${name}\tdob=${dob}`);
}

printPerson();
// prints: name=peter    dob=1979

printPerson({});
// prints: name=peter    dob=1979

printPerson({ name: "Tucker" });
// prints: name=Tucker   dob=1979

printPerson({ dob: 2013 });
// prints: name=peter    dob=2013


printPerson({ gender: "boy" })
// Error: Object literal may only specify known properties, and 'gender' 

Here's a Playground copy of it.

It's not a perfect "transpose" across the two languages but it's sufficiently similar.
The trick is that last = {} at the end of the function signature in TypeScript which makes it possible to omit keys in the passed-in object.

By the way, the pure JavaScript version of this is:


function printPerson({ name = "peter", dob = 1979 } = {}) {
  console.log(`name=${name}\tdob=${dob}`);
}

But, unlike Python and TypeScript, you get no warnings or errors if you'd do printPerson({ gender: "boy" }); with the JavaScript version.

How to use letsencrypt-acme-challenge.conf in Nginx

September 5, 2021
1 comment Nginx

Because I always forget, if you're using certbot to create certs for your Nginx server, you'll need to it up so it works on HTTP as well as HTTPS. But once you're done, you're going to want all HTTP traffic to redirect to HTTPS. The correct syntax is:


server {
    server_name mydomain.example.com;
    include /etc/nginx/snippets/letsencrypt-acme-challenge.conf;
    location / {
      return 301 https://mydomain.example.com$request_uri;
    }
}

And that letsencrypt-acme-challenge.conf looks like this (code comments stripped):


location ^~ /.well-known/acme-challenge/ {
    default_type "text/plain";
    root         /var/www/html;
    break;
}
location = /.well-known/acme-challenge/ {
    return 404;
}

This way, a GET request for http://mydomain.example.com/.well-known/acme-challenge/test.html will be 200 OK if there's a file called /var/www/html/.well-known/acme-challenge/test.html. And http://mydomain.example.com/.well-known/acme-challenge/does-not-exist.html will 404 Not Found.

But all and any other GET request will redirect. E.g. http://mydomain.example.com/whatever -- 301 Moved Permanently --> https://mydomain.example.com/whatever.

How I upload Firebase images optimized

September 2, 2021
0 comments JavaScript, Web development, Firebase

I have an app that allows you to upload images. The images are stored using Firebase Storage. Then, once uploaded I have a Firebase Cloud Function that can turn that into a thumbnail. The problem with this is that it takes a long time to wake up the cloud function, the first time, and generating that thumbnail. Not to mention the download of the thumbnail payload for the client. It's not unrealistic that the whole thumbnail generation plus download can take multiple (single digit) seconds. But you don't want to have the user sit and wait that long. My solution is to display the uploaded file in a <img> tag using URL.createObjectURL().

The following code is most pseudo-code but should look familiar if you're used to how Firebase and React/Preact works. Here's the FileUpload component:


interface Props {
  onUploaded: ({ file, filePath }: { file: File; filePath: string }) => void;
  onSaved?: () => void;
}

function FileUpload({
  onSaved,
  onUploaded,
}: Props) => {
  const [file, setFile] = useState<File | null>(null);

  // ...some other state stuff omitted for example.

  useEffect(() => {
    if (file) {
      const metadata = {
        contentType: file.type,
    };

    const filePath = getImageFullPath(prefix, item ? item.id : list.id, file);
    const storageRef = storage.ref();

    uploadTask = storageRef.child(filePath).put(file, metadata);
    uploadTask.on(
      "state_changed",
      (snapshot) => {
        // ...set progress percentage
      },
      (error) => {
        setUploadError(error);
      },
      () => {
        onUploaded({ file, filePath });  // THE IMPORTANT BIT!

        db.collection("pictures")
          .add({ filePath })
          .then(() => { onSaved() })

      }
    }
  }, [file])

  return (
      <input
        type="file"
        accept="image/jpeg, image/png"
        onInput={(event) => {
          if (event.target.files) {
            const file = event.target.files[0];
            validateFile(file);
            setFile(file);
          }
        }}
      />
  );
}

The important "trick" is that we call back after the storage is complete by sending the filePath and the file back to whatever component triggered this component. Now, you can know, in the parent component, that there's going to soon be an image reference with a file path (filePath) that refers to that File object.

Here's a rough version of how I use this <FileUpload> component:


function Images() {

  const [uploadedFiles, setUploadedFiles] = useState<Map<string, File>>(
    new Map()
  );

  return (<div>  
    <FileUpload
      onUploaded={({ file, filePath }: { file: File; filePath: string }) => {
        const newMap: Map<string, File> = new Map(uploadedFiles);
        newMap.set(filePath, file);
        setUploadedFiles(newMap);
      }}
      />

    <ListUploadedPictures uploadedFiles={uploadedFiles}/>
    </div>
  );
}

function ListUploadedPictures({ uploadedFiles}: {uploadedFiles: Map<string, File>}) {

  // Imagine some Firebase Firestore subscriber here
  // that watches for uploaded pictures. 
  return <div>
    {pictures.map(picture => (
      <Picture picture={picture} uploadedFiles={uploadedFiles} />
    ))}
  </div>
}

function Picture({ 
  uploadedFiles,
  picture,
}: {
  uploadedFiles: Map<string, File>;
  picture: {
    filePath: string;
  }
}) {
  const thumbnailURL = getThumbnailURL(filePath, 500);
  const [loaded, setLoaded] = useState(false);

  useEffect(() => {
    const preloadImg = new Image();
    preloadImg.src = thumbnailURL;

    const callback = () => {
      if (mounted) {
        setLoaded(true);
      }
    };
    if (preloadImg.decode) {
      preloadImg.decode().then(callback, callback);
    } else {
      preloadImg.onload = callback;
    }

    return () => {
      mounted = false;
    };
  }, [thumbnailURL]);

  return <img
    style={{
      width: 500,
      height: 500,
      "object-fit": "cover",
    }}
    src={
      loaded
        ? thumbnailURL
        : file
        ? URL.createObjectURL(file)
        : PLACEHOLDER_IMAGE
    }
  />
}

Phew! That was a lot of code. Sorry about that. But still, this is just a summary of the real application code.

The point is that; I send the File object back to the parent component immediately after having uploaded it to Firebase Cloud Storage. Then, having access to that as a File object, I can use that as the thumbnail while I wait for the real thumbnail to come in. Now, it doesn't matter that it takes 1-2 seconds to wake up the cloud function and 1-2 seconds to perform the thumbnail creation, and then 0.1-2 seconds to download the thumbnail. All the while this is happening you're looking at the File object that was uploaded. Visually, the user doesn't even notice the difference. If you refresh the page, that temporary in-memory uploadedFiles (Map instance) is empty so you're now relying on the loading of the thumbnail which should hopefully, at this point, be stored in the browser's native HTTP cache.

The other important part of the trick is that we're using const preloadImg = new Image() for loading the thumbnail. And by relying on preloadImage.decode ? preloadImage.decode().then(...) : preload.onload = ... we can be informed only when the thumbnail has been successfully created and successfully downloaded to make the swap.

10 years a Mozillian, always a Mozillian

August 30, 2021
2 comments Web development, Mozilla, MDN

As of September 2021, I am leaving Mozilla after 10 years. It hasn't been perfect but it's been a wonderful time with fond memories and an amazing career rocket ship.

In April 2011, I joined as a web developer to work on internal web applications that support the Firefox development engineering. In rough order, I worked on...

  • Elmo: The web application for managing the state of Firefox localization
  • Socorro: When Firefox crashes and asks to send a crash dump, this is the storage plus website for analyzing that
  • Peekaboo: When people come to visit a Mozilla office, they sign in on a tablet at the reception desk
  • Balrog: For managing what versions are available for Firefox products to query when it's time to self-upgrade
  • Air Mozilla: For watching live streams and video archive of all recordings within the company
  • MozTrap: When QA engineers need to track what, and the results, of QA testing Firefox products
  • Symbol Server: Where all C++ debug symbols are stored from the build pipeline to be used to source-map crash stack traces
  • Buildhub: To get a complete database of all and every individual build shipped of Firefox products
  • Remote Settings: Managing experiments and for Firefox to "phone home" for smaller updates/experiments between releases
  • MDN Web Docs: Where web developers go to look up all the latest and most detailed details about web APIs

This is an incomplete list because at Mozilla you get to help each other and I shipped a lot of smaller projects too, such as Contribute.json, Whatsdeployed, GitHub PR Triage, Bugzilla GitHub Bug Linker.

Reflecting back, the highlight of any project is when you get to meet or interact with the people you help. Few things are as rewarding as when someone you don't know, in person, finds out what you do and they say: "Are you Peter?! The one who built XYZ? I love that stuff! We use it all the time now in my team. Thank you!" It's not a brag because oftentimes what you build for fellow humans it isn't engineering'ly brilliant in any way. It's just something that someone needed. Perhaps the lesson learned is the importance of not celebrating what you've built but just put you into the same room as who uses what you built. And, in fact, if what you've built for someone else isn't particularly loved, by meeting and fully interactive with the people who use "your stuff" gives you the best of feedback and who doesn't love constructive criticism so you can become empowered to build better stuff.

Mozilla is a great company. There is no doubt in my mind. We ship high-quality products and we do it with pride. There have definitely been some rough patches over the years but that happens and you just have to carry on and try to focus on delivering value. Firefox Nightly will continue to be my default browser and I'll happily click any Google search ads to help every now and then. THANK YOU everyone I've ever worked with at Mozilla! You are a wonderful bunch of people!