Introducing hylite - a Node code-syntax-to-HTML highlighter written in Bun

October 3, 2023
0 comments Node, Bun, JavaScript

hylite is a command line tool for syntax highlight code into HTML. You feed it a file or some snippet of code (plus what language it is) and it returns a string of HTML.

Suppose you have:


❯ cat example.py
# This is example.py
def hello():
    return "world"

When you run this through hylite you get:


❯ npx hylite example.py
<span class="hljs-keyword">def</span> <span class="hljs-title function_">hello</span>():
    <span class="hljs-keyword">return</span> <span class="hljs-string">&quot;world&quot;</span>

Now, if installed with the necessary CSS, it can finally render this:


# This is example.py
def hello():
    return "world"

(Note: At the time of writing this, npx hylite --list-css or npx hylite --css don't work unless you've git clone the github.com/peterbe/hylite repo)

How I use it

This originated because I loved how highlight.js works. It supports numerous languages, can even guess the language, is fast as heck, and the HTML output is compact.

Originally, my personal website, whose backend is in Python/Django, was using Pygments to do the syntax highlighting. The problem with that is it doesn't support JSX (or TSX). For example:


export function Bell({ color }: {color: string}) {
  return <div style={{ backgroundColor: color }}>Ding!</div>
}

The problem is that Python != Node so to call out to hylite I use a sub-process. At the moment, I can't use bunx or npx because that depends on $PATH and stuff that the server doesn't have. Here's how I call hylite from Python:


command = settings.HYLITE_COMMAND.split()
assert language
command.extend(["--language", language, "--wrapped"])
process = subprocess.Popen(
    command,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
    cwd=settings.HYLITE_DIRECTORY,
)
process.stdin.write(code)
output, error = process.communicate()

The settings are:


HYLITE_DIRECTORY = "/home/django/hylite"
HYLITE_COMMAND = "node dist/index.js"

How I built hylite

What's different about hylite compared to other JavaScript packages and CLIs like this is that the development requires Bun. It's lovely because it has a built-in test runner, TypeScript transpiler, and it's just so lovely fast at starting for anything you do with it.

In my current view, I see Bun as an equivalent of TypeScript. It's convenient when developing but once stripped away it's just good old JavaScript and you don't have to worry about compatibility.

So I use bun for manual testing like bun run src/index.ts < foo.go but when it comes time to ship, I run bun run build (which executes, with bun, the src/build.ts) which then builds a dist/index.js file which you can run with either node or bun anywhere.

By the way, the README as a section on Benchmarking. It concludes two things:

  1. node dist/index.js has the same performance as bun run dist/index.js
  2. bunx hylite is 7x times faster than npx hylite but it's bullcrap because bunx doesn't check the network if there's a new version (...until you restart your computer)

Shallow clone vs. deep clone, in Node, with benchmark

September 29, 2023
0 comments Node, JavaScript

A very common way to create a "copy" of an Object in JavaScript is to copy all things from one object into an empty one. Example:


const original = {foo: "Foo"}
const copy = Object.assign({}, original)
copy.foo = "Bar"
console.log([original.foo, copy.foo])

This outputs


[ 'Foo', 'Bar' ]

Obviously the problem with this is that it's a shallow copy, best demonstrated with an example:


const original = { names: ["Peter"] }
const copy = Object.assign({}, original)
copy.names.push("Tucker")
console.log([original.names, copy.names])

This outputs:


[ [ 'Peter', 'Tucker' ], [ 'Peter', 'Tucker' ] ]

which is arguably counter-intuitive. Especially since the variable was named "copy".
Generally, I think Object.assign({}, someThing) is often a red flag because if not today, maybe in some future the thing you're copying might have mutables within.

The "solution" is to use structuredClone which has been available since Node 16. Actually, it was introduced within minor releases of Node 16, so be a little bit careful if you're still on Node 16.

Same example:


const original = { names: ["Peter"] };
// const copy = Object.assign({}, original);
const copy = structuredClone(original);
copy.names.push("Tucker");
console.log([original.names, copy.names]);

This outputs:


[ [ 'Peter' ], [ 'Peter', 'Tucker' ] ]

Another deep copy solution is to turn the object into a string, using JSON.stringify and turn it back into a (deeply copied) object using JSON.parse. It works like structuredClone but full of caveats such as unpredictable precision loss on floating point numbers, and not to mention date objects ceasing to be date objects but instead becoming strings.

Benchmark

Given how much "better" structuredClone is in that it's more intuitive and therefore less dangerous for sneaky nested mutation bugs. Is it fast? Before even running a benchmark; no, structuredClone is slower than Object.assign({}, ...) because of course. It does more! Perhaps the question should be: how much slower is structuredClone? Here's my benchmark code:


import fs from "fs"
import assert from "assert"

import Benchmark from "benchmark"

const obj = JSON.parse(fs.readFileSync("package-lock.json", "utf8"))

function f1() {
  const copy = Object.assign({}, obj)
  copy.name = "else"
  assert(copy.name !== obj.name)
}

function f2() {
  const copy = structuredClone(obj)
  copy.name = "else"
  assert(copy.name !== obj.name)
}

function f3() {
  const copy = JSON.parse(JSON.stringify(obj))
  copy.name = "else"
  assert(copy.name !== obj.name)
}

new Benchmark.Suite()
  .add("f1", f1)
  .add("f2", f2)
  .add("f3", f3)
  .on("cycle", (event) => {
    console.log(String(event.target))
  })
  .on("complete", function () {
    console.log("Fastest is " + this.filter("fastest").map("name"))
  })
  .run()

The results:

❯ node assign-or-clone.js
f1 x 8,057,542 ops/sec ±0.84% (93 runs sampled)
f2 x 37,245 ops/sec ±0.68% (94 runs sampled)
f3 x 37,978 ops/sec ±0.85% (92 runs sampled)
Fastest is f1

In other words, Object.assign({}, ...) is 200 times faster than structuredClone.
By the way, I re-ran the benchmark with a much smaller object (using the package.json instead of the package-lock.json) and then Object.assign({}, ...) is only 20 times faster.

Mind you! They're both ridiculously fast in the grand scheme of things.

If you do this...


for (let i = 0; i < 10; i++) {
  console.time("f1")
  f1()
  console.timeEnd("f1")

  console.time("f2")
  f2()
  console.timeEnd("f2")

  console.time("f3")
  f3()
  console.timeEnd("f3")
}

the last bit of output of that is:

f1: 0.006ms
f2: 0.06ms
f3: 0.053ms

which means that it took 0.06 milliseconds for structuredClone to make a convenient deep copy of an object that is 5KB as a JSON string.

Conclusion

Yes Object.assign({}, ...) is ridiculously faster than structuredClone but structuredClone is a better choice.

Pip-Outdated.py with interactive upgrade

September 21, 2023
0 comments Python

Last year I wrote a nifty script called Pip-Outdated.py "Pip-Outdated.py - a script to compare requirements.in with the output of pip list --outdated". It basically runs pip list --outdated but filters based on the packages mentioned in your requirements.in. For people familiar with Node, it's like checking all installed packages in node_modules if they have upgrades, but filter it down by only those mentioned in your package.json.

I use this script often enough that I added a little interactive input to ask if it should edit requirements.in for you for each possible upgrade. Looks like this:


❯ Pip-Outdated.py
black               INSTALLED: 23.7.0    POSSIBLE: 23.9.1
click               INSTALLED: 8.1.6     POSSIBLE: 8.1.7
elasticsearch-dsl   INSTALLED: 7.4.1     POSSIBLE: 8.9.0
fastapi             INSTALLED: 0.101.0   POSSIBLE: 0.103.1
httpx               INSTALLED: 0.24.1    POSSIBLE: 0.25.0
pytest              INSTALLED: 7.4.0     POSSIBLE: 7.4.2

Update black from 23.7.0 to 23.9.1? [y/N/q] y
Update click from 8.1.6 to 8.1.7? [y/N/q] y
Update elasticsearch-dsl from 7.4.1 to 8.9.0? [y/N/q] n
Update fastapi from 0.101.0 to 0.103.1? [y/N/q] n
Update httpx from 0.24.1 to 0.25.0? [y/N/q] n
Update pytest from 7.4.0 to 7.4.2? [y/N/q] y

and then,


❯ git diff requirements.in | cat
diff --git a/requirements.in b/requirements.in
index b7a246e..0e996e5 100644
--- a/requirements.in
+++ b/requirements.in
@@ -9,7 +9,7 @@ python-decouple==3.8
 fastapi==0.101.0
 uvicorn[standard]==0.23.2
 selectolax==0.3.16
-click==8.1.6
+click==8.1.7
 python-dateutil==2.8.2
 gunicorn==21.2.0
 # I don't think this needs `[secure]` because it's only used by
@@ -18,7 +18,7 @@ requests==2.31.0
 cachetools==5.3.1

 # Dev things
-black==23.7.0
+black==23.9.1
 flake8==6.1.0
-pytest==7.4.0
+pytest==7.4.2
 httpx==0.24.1

That's it. Then if you want to actually make these upgrades you run:


❯ pip-compile --generate-hashes requirements.in && pip install -r requirements.txt

To install it, download the script from: https://gist.github.com/peterbe/a2b158c39f1f835c0977c82befd94cdf
and put it in your ~/bin and make it executable.
Now go into a directory that has a requirements.in and run Pip-Outdated.py

Parse a CSV file with Bun

September 13, 2023
0 comments Bun

I'm really excited about Bun and look forward to trying it out more and more.
Today I needed a quick script to parse a CSV file to compute some simple arithmetic on some numbers in it.

To do that, here's what I did:


bun init
bun install csv-simple-parser
code index.ts

And the code:


import parse from "csv-simple-parser";

console.time("total");
const numbers: number[] = [];
const file = Bun.file(process.argv.slice(2)[0]);
type Rec = {
  Pageviews: string;
};
const csv = parse(await file.text(), { header: true }) as Rec[];
for (const row of csv) {
  numbers.push(parseInt(row["Pageviews"] || "0"));
}
console.timeEnd("total");
console.log("Mean  ", numbers.reduce((a, b) => a + b, 0) / numbers.length);
console.log("Median", numbers.sort()[Math.floor(numbers.length / 2)]);

And running it:

wc -l file.csv
   13623 file.csv

❯ /usr/bin/time bun run index.ts file.csv
[8.20ms] total
Mean   7.205534757395581
Median 1
        0.04 real         0.03 user         0.01 sys

(On my Intel MacBook Pro...) The reading in the file and parsing the 13k lines took 8.2 milliseconds. The whole execution took 0.04 seconds. Pretty neat.

Hello-world server in Bun vs Fastify

September 9, 2023
4 comments Node, JavaScript, Bun

Bun 1.0 just launched and I'm genuinely impressed and intrigued. How long can this madness keep going? I've never built anything substantial with Bun. Just various scripts to get a feel for it.

At work, I recently launched a micro-service that uses Node + Fastify + TypeScript. I'm not going to rewrite it in Bun, but I'm going to get a feel for the difference.

Basic version in Bun

No need for a package.json at this point. And that's neat. Create a src/index.ts and put this in:


const PORT = parseInt(process.env.PORT || "3000");

Bun.serve({
  port: PORT,
  fetch(req) {
    const url = new URL(req.url);
    if (url.pathname === "/") return new Response(`Home page!`);
    if (url.pathname === "/json") return Response.json({ hello: "world" });
    return new Response(`404!`);
  },
});
console.log(`Listening on port ${PORT}`);

What's so cool about the convenience-oriented developer experience of Bun is that it comes with a native way for restarting the server as you're editing the server code:


❯ bun --hot src/index.ts
Listening on port 3000

Let's test it:


❯ xh http://localhost:3000/
HTTP/1.1 200 OK
Content-Length: 10
Content-Type: text/plain;charset=utf-8
Date: Sat, 09 Sep 2023 02:34:29 GMT

Home page!

❯ xh http://localhost:3000/json
HTTP/1.1 200 OK
Content-Length: 17
Content-Type: application/json;charset=utf-8
Date: Sat, 09 Sep 2023 02:34:35 GMT

{
    "hello": "world"
}

Basic version with Node + Fastify + TypeScript

First of all, you'll need to create a package.json to install the dependencies, all of which, at this gentle point are built into Bun:


❯ npm i -D ts-node typescript @types/node nodemon
❯ npm i fastify

And edit the package.json with some scripts:


  "scripts": {
    "dev": "nodemon src/index.ts",
    "start": "ts-node src/index.ts"
  },

And of course, the code itself (src/index.ts):


import fastify from "fastify";

const PORT = parseInt(process.env.PORT || "3000");

const server = fastify();

server.get("/", async () => {
  return "Home page!";
});

server.get("/json", (request, reply) => {
  reply.send({ hello: "world" });
});

server.listen({ port: PORT }, (err, address) => {
  if (err) {
    console.error(err);
    process.exit(1);
  }
  console.log(`Server listening at ${address}`);
});

Now run it:


❯ npm run dev

> fastify-hello-world@1.0.0 dev
> nodemon src/index.ts

[nodemon] 3.0.1
[nodemon] to restart at any time, enter `rs`
[nodemon] watching path(s): *.*
[nodemon] watching extensions: ts,json
[nodemon] starting `ts-node src/index.ts`
Server listening at http://[::1]:3000

Let's test it:


❯ xh http://localhost:3000/
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 10
Content-Type: text/plain; charset=utf-8
Date: Sat, 09 Sep 2023 02:42:46 GMT
Keep-Alive: timeout=72

Home page!

❯ xh http://localhost:3000/json
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 17
Content-Type: application/json; charset=utf-8
Date: Sat, 09 Sep 2023 02:43:08 GMT
Keep-Alive: timeout=72

{
    "hello": "world"
}

For the record, I quite like this little setup. nodemon can automatically understand TypeScript. It's a neat minimum if Node is a desire.

Quick benchmark

Bun

Note that this server has no logging or any I/O.


❯ bun src/index.ts
Listening on port 3000

Using hey to test 10,000 requests across 100 concurrent clients:

❯ hey -n 10000 -c 100 http://localhost:3000/

Summary:
  Total:    0.2746 secs
  Slowest:  0.0167 secs
  Fastest:  0.0002 secs
  Average:  0.0026 secs
  Requests/sec: 36418.8132

  Total data:   100000 bytes
  Size/request: 10 bytes

Node + Fastify


❯ npm run start

Using hey again:

❯ hey -n 10000 -c 100 http://localhost:3000/

Summary:
  Total:    0.6606 secs
  Slowest:  0.0483 secs
  Fastest:  0.0001 secs
  Average:  0.0065 secs
  Requests/sec: 15138.5719

  Total data:   100000 bytes
  Size/request: 10 bytes

About a 2x advantage to Bun.

Serving an HTML file with Bun


Bun.serve({
  port: PORT,
  fetch(req) {
    const url = new URL(req.url);
    if (url.pathname === "/") return new Response(`Home page!`);
    if (url.pathname === "/json") return Response.json({ hello: "world" });
+   if (url.pathname === "/index.html")
+     return new Response(Bun.file("src/index.html"));
    return new Response(`404!`);
  },
});

Serves the src/index.html file just right:


❯ xh --headers http://localhost:3000/index.html
HTTP/1.1 200 OK
Content-Length: 889
Content-Type: text/html;charset=utf-8

Serving an HTML file with Node + Fastify

First, install the plugin:

❯ npm i @fastify/static

And make this change:


+import path from "node:path";
+
 import fastify from "fastify";
+import fastifyStatic from "@fastify/static";

 const PORT = parseInt(process.env.PORT || "3000");

 const server = fastify();

+server.register(fastifyStatic, {
+  root: path.resolve("src"),
+});
+
 server.get("/", async () => {
   return "Home page!";
 });
 server.get("/json", (request, reply) => {
   reply.send({ hello: "world" });
 });

+server.get("/index.html", (request, reply) => {
+  reply.sendFile("index.html");
+});
+
 server.listen({ port: PORT }, (err, address) => {
   if (err) {
     console.error(err);

And it works great:


❯ xh --headers http://localhost:3000/index.html
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: public, max-age=0
Connection: keep-alive
Content-Length: 889
Content-Type: text/html; charset=UTF-8
Date: Sat, 09 Sep 2023 03:04:15 GMT
Etag: W/"379-18a77e4e346"
Keep-Alive: timeout=72
Last-Modified: Sat, 09 Sep 2023 03:03:23 GMT

Quick benchmark of serving the HTML file

Bun


❯ hey -n 10000 -c 100 http://localhost:3000/index.html

Summary:
  Total:    0.6408 secs
  Slowest:  0.0160 secs
  Fastest:  0.0001 secs
  Average:  0.0063 secs
  Requests/sec: 15605.9735

  Total data:   8890000 bytes
  Size/request: 889 bytes

Node + Fastify


❯ hey -n 10000 -c 100 http://localhost:3000/index.html

Summary:
  Total:    1.5473 secs
  Slowest:  0.0272 secs
  Fastest:  0.0078 secs
  Average:  0.0154 secs
  Requests/sec: 6462.9597

  Total data:   8890000 bytes
  Size/request: 889 bytes

Again, a 2x performance win for Bun.

Conclusion

There isn't much to conclude here. Just an intro to the beauty of how quick Bun is, both in terms of developer experience and raw performance.
What I admire about Bun being such a convenient bundle is that Python'esque feeling of simplicity and minimalism. (For example python3.11 -m http.server -d src 3000 will make http://localhost:3000/index.html work)

The basic boilerplate of Node with Fastify + TypeScript + nodemon + ts-node is a great one if you're not ready to make the leap to Bun. I would certainly use it again. Fastify might not be the fastest server in the Node ecosystem, but it's good enough.

What's not shown in this little intro blog post, and is perhaps a silly thing to focus on, is the speed with which you type bun --hot src/index.ts and the server is ready to go. It's as far as human perception goes instant. The npm run dev on the other hand has this ~3 second "lag". Not everyone cares about that, but I do. It's more of an ethos. It's that wonderful feeling that you don't pause your thinking.

npm run dev GIF

It's hard to see when I press the Enter key but compare that to Bun:

bun --hot GIF

UPDATE (Sep 11, 2023)

I found this: github.com/SaltyAom/bun-http-framework-benchmark
It's a much better benchmark than mine here. Mind you, as long as you're not using something horribly slow, and you're not doing any I/O the HTTP framework performances don't matter much.

ts-node vs. esrun vs. esno vs. bun

August 28, 2023
0 comments Node, JavaScript

UPDATE (Jan 31, 2024)

Since this was published, I've added tsx to the benchmark. The updated results, if you skip the two slowest are:


Summary
  bun src/index.ts ran
    4.69 ± 0.20 times faster than esrun src/index.ts
    7.07 ± 0.30 times faster than tsx src/index.ts
    7.24 ± 0.33 times faster than esno src/index.ts
    7.40 ± 0.68 times faster than ts-node --transpileOnly src/index.ts

END OF UPDATE

From the totally unscientific bunker research lab of executing TypeScript files on the command line...

I have a very simple TypeScript app that you can run from the command line:


// This is src/index.ts

import { Command } from "commander";
const program = new Command();
program
  .option("-d, --debug", "output extra debugging")
  .option("-s, --small", "small pizza size")
  .option("-p, --pizza-type <type>", "flavour of pizza");

program.parse(process.argv);

const options = program.opts();

console.log("options", options);

tsc

In the original days, there was just tsc which, when given your *.ts would create an equivalent *.js file. Remember this?:


> tsc src/index.ts
> node src/index.js
> rm src/index.js

(note, most likely you'd put "outDir": "./build", in your tsconfig.json so it creates build/index.js instead)

Works. And it checks potential faults in your TypeScript code itself. For example:

❯ tsc src/index.ts
src/index.ts:8:21 - error TS2339: Property 'length' does not exist on type 'Command'.

8 console.log(program.length);
                      ~~~~~~

I don't know about you, but I rarely encounter these kinds of errors. If you view a .ts[x] file you're working on in Zed or VS Code it's already red and has squiggly lines.

VS Code with active TypeScript error

Sure, you'll make sure, one last time in your CI scripts that there are no TypeScript errors like this:

ts-node

ts-node, from that I gather is the "original gangster" of abstractions on top of TypeScript. It works quite similarly to tsc except you don't bother dumping the .js file to disk to then run it with node.

tsc src/index.ts && node src/index.js is the same as ts-node src/index.ts

It also has error checking, by default, when you run it. It can look like this:

❯ ts-node src/index.ts
/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:859
    return new TSError(diagnosticText, diagnosticCodes, diagnostics);
           ^
TSError: ⨯ Unable to compile TypeScript:
src/index.ts:8:21 - error TS2339: Property 'length' does not exist on type 'Command'.

8 console.log(program.length);
                      ~~~~~~

    at createTSError (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:859:12)
    at reportTSError (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:863:19)
    at getOutput (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:1077:36)
    at Object.compile (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:1433:41)
    at Module.m._compile (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:1617:30)
    at Module._extensions..js (node:internal/modules/cjs/loader:1310:10)
    at Object.require.extensions.<computed> [as .ts] (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1119:32)
    at Function.Module._load (node:internal/modules/cjs/loader:960:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12) {
  diagnosticCodes: [ 2339 ]
}

But, suppose you don't really want those TypeScript errors right now. Suppose you are confident it doesn't error, then you want it to run as fast as possible. That's where ts-node --transpileOnly src/index.ts comes in. It's significantly faster. If you compare ts-node src/index.ts with ts-node --transpileOnly src/index.ts:

❯ hyperfine "ts-node src/index.ts" "ts-node --transpileOnly src/index.ts"
Benchmark 1: ts-node src/index.ts
  Time (mean ± σ):     990.7 ms ±  68.5 ms    [User: 1955.5 ms, System: 124.7 ms]
  Range (min … max):   916.5 ms … 1124.7 ms    10 runs

Benchmark 2: ts-node --transpileOnly src/index.ts
  Time (mean ± σ):     301.5 ms ±  10.6 ms    [User: 286.7 ms, System: 44.4 ms]
  Range (min … max):   283.0 ms … 313.9 ms    10 runs

Summary
  ts-node --transpileOnly src/index.ts ran
    3.29 ± 0.25 times faster than ts-node src/index.ts

In other words, ts-node --transpileOnly src/index.ts is 3 times faster than ts-node src/index.ts

esno and @digitak/esrun

@digitak/esrun and esno are improvements to ts-node, as far as I can understand, are improvements on ts-node that can only run. I.e. you still have to use tsc --noEmit in your CI scripts. But they're supposedly both faster than ts-node --transpileOnly:

❯ hyperfine "ts-node --transpileOnly src/index.ts" "esrun src/index.ts" "esno src/index.ts"
Benchmark 1: ts-node --transpileOnly src/index.ts
  Time (mean ± σ):     291.8 ms ±  10.5 ms    [User: 276.9 ms, System: 43.9 ms]
  Range (min … max):   280.3 ms … 309.1 ms    10 runs

Benchmark 2: esrun src/index.ts
  Time (mean ± σ):     226.4 ms ±   6.0 ms    [User: 187.9 ms, System: 42.8 ms]
  Range (min … max):   216.8 ms … 237.5 ms    13 runs

Benchmark 3: esno src/index.ts
  Time (mean ± σ):     237.2 ms ±   3.9 ms    [User: 222.8 ms, System: 45.2 ms]
  Range (min … max):   229.6 ms … 244.6 ms    12 runs

Summary
  esrun src/index.ts ran
    1.05 ± 0.03 times faster than esno src/index.ts
    1.29 ± 0.06 times faster than ts-node --transpileOnly src/index.ts

In other words, esrun is 1.05e times faster than esno and 1.29 times faster than ts-node --transpileOnly.

But given that I quite like running npm run dev to use ts-node without the --transpileOnly error for realtime TypeScript errors in the console that runs a dev server, I don't know if it's worth it.

(BONUS) bun

If you haven't heard of bun in the Node ecosystem, you've been living under a rock. It's kinda like deno but trying to appeal to regular Node projects from the ground up and it does things like bun install so much faster than npm install that you wonder if it even ran. It too can run in transpile-only mode and just execute the TypeScript code as if it was JavaScript directly. And it's fast!

Because ts-node --transpileOnly is a bit of a "standard", let's compare the two:

❯ hyperfine "ts-node --transpileOnly src/index.ts" "bun src/index.ts"
Benchmark 1: ts-node --transpileOnly src/index.ts
  Time (mean ± σ):     286.9 ms ±   6.9 ms    [User: 274.4 ms, System: 41.6 ms]
  Range (min … max):   272.0 ms … 295.8 ms    10 runs

Benchmark 2: bun src/index.ts
  Time (mean ± σ):      40.3 ms ±   2.0 ms    [User: 29.5 ms, System: 9.9 ms]
  Range (min … max):    36.5 ms …  47.1 ms    60 runs

Summary
  bun src/index.ts ran
    7.12 ± 0.40 times faster than ts-node --transpileOnly src/index.ts

Wow! Given its hype, I'm not surprised bun is 7 times faster than ts-node --transpileOnly.

But admittedly, not all programs work seamlessly in bun like my sample app did this in example.

Here's the complete result comparing all of them:

❯ hyperfine "tsc src/index.ts && node src/index.js" "ts-node src/index.ts" "ts-node --transpileOnly src/index.ts" "esrun src/index.ts" "esno src/index.ts" "bun src/index.ts"
Benchmark 1: tsc src/index.ts && node src/index.js
  Time (mean ± σ):      2.158 s ±  0.097 s    [User: 5.145 s, System: 0.201 s]
  Range (min … max):    2.032 s …  2.276 s    10 runs

Benchmark 2: ts-node src/index.ts
  Time (mean ± σ):     942.0 ms ±  40.6 ms    [User: 1877.2 ms, System: 115.6 ms]
  Range (min … max):   907.4 ms … 1012.4 ms    10 runs

Benchmark 3: ts-node --transpileOnly src/index.ts
  Time (mean ± σ):     307.1 ms ±  14.4 ms    [User: 291.0 ms, System: 45.3 ms]
  Range (min … max):   283.1 ms … 329.0 ms    10 runs

Benchmark 4: esrun src/index.ts
  Time (mean ± σ):     276.4 ms ± 121.0 ms    [User: 198.9 ms, System: 45.7 ms]
  Range (min … max):   212.2 ms … 619.2 ms    10 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (619.2 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.

Benchmark 5: esno src/index.ts
  Time (mean ± σ):     257.7 ms ±  14.3 ms    [User: 238.3 ms, System: 48.0 ms]
  Range (min … max):   238.8 ms … 282.0 ms    10 runs

Benchmark 6: bun src/index.ts
  Time (mean ± σ):      40.5 ms ±   1.6 ms    [User: 29.9 ms, System: 9.8 ms]
  Range (min … max):    36.4 ms …  44.8 ms    62 runs

Summary
  bun src/index.ts ran
    6.36 ± 0.44 times faster than esno src/index.ts
    6.82 ± 3.00 times faster than esrun src/index.ts
    7.58 ± 0.47 times faster than ts-node --transpileOnly src/index.ts
   23.26 ± 1.38 times faster than ts-node src/index.ts
   53.29 ± 3.23 times faster than tsc src/index.ts && node src/index.js

Bar chart comparing bun to esno, esrun, ts-node and tsc

Conclusion

Perhaps you can ignore bun. It might best fastest, but it's also "weirdest". It usually works great in small and simple apps and especially smaller ones that just you have to maintain (if "maintain" is even a concern at all).

I don't know how to compare them in size. ts-node is built on top of acorn which is written in JavaScript. @digitak/esrun is a wrapper for esbuild (and esno is wrapper for tsx which is also on top of esbuild) which is a fast bundler written in Golang. So it's packaged as a binary in your node_modules which hopefully works between your laptop, your CI, and your Dockerfile but it's nevertheless a binary.

Given that esrun and esno isn't that much faster than ts-node and ts-node can check your TypeScript that's a bonus for ts-node.
But esbuild is an actively maintained project that seems to become stable and accepted.

As always, this was just a quick snapshot of an unrealistic app that is less than 10 lines of TypeScript code. I'd love to hear more about what kind of results people are getting comparing the above tool when you apply it on much larger projects that have more complex tsconfig.json for things like JSX.

Switching from Next.js to Vite + wouter

July 28, 2023
0 comments React, Node, JavaScript

Next.js is a full front-end web framework. Vite is a build tool so they don't easily compare. But if you're building a single-page app ("SPA"), the difference isn't that big, especially if you bolt on a routing library which is something that Next.js has built in.

My SPA is a relatively straight forward one. It's a React app that uses wonderful Mantine UI framework. The app is CRM for real-estate agents that I've been hacking on with my wife. SEO is not a concern because you can't do anything until you've signed in. So server-side rendering is not a requirement. In that sense, it's like loading Gmail. Yes, users might want a speedy first load when they open it in a fresh new browser tab, but the static assets are most likely going to be heavily (browser) cached by the few users it has.

With that out of the way, let's skim through some of the differences.

Build times

Immediately, this is a tricky one to compare because Next.js has the ability to cache. You get that .next/cache/ directory which is black magic to me, but it clearly speeds things up. And it's incremental so the caching can help partially when only some of the code has changed.

Running, npm run build && npm run export a couple of times yields:

Next.js

Without no .next/cache/ directory

Total time to run npm run build && npm run export: 52 seconds

With the .next/cache/ left before each build

Total time to run npm run build && npm run export: 30 seconds

Vite

Total time to run npm run build: 12 seconds

A curious thing about Vite here is that its output contains a measurement of the time it took. But I ignored that and used /usr/bin/time -h ... instead. This gives me the total time.
I.e. the output of npm run build will say:

✓ built in 7.67s

...but it actually took 12.2 seconds with /usr/bin/time.

Build artifacts

Perhaps not very important because Next.js automatically code splits in its wonderfully clever way.

Next.js

❯ du -sh out
1.8M    out
❯ tree out | rg '\.js|\.css' | wc -l
      52

Vite

❯ du -sh dist
960K    dist

and

❯ tree dist/assets
dist/assets
├── index-1636ae43.css
└── index-d568dfbf.js

Again, it's probably unfair to compare at this point. Most of the weight of these static assets (particularly the .js files) is due to Mantine components being so heavy.

Routing

This isn't really a judgment in any way. More of a record how it differs in functionality.

Next.js

In my app, that I'm switching from Next.js to Vite + wouter, I use the old way of using Next.js which is to use a src/pages/* directory. For example, to make a route to the /account/settings page I first create:


// src/pages/account/settings.tsx

import { Settings } from "../../components/account/settings"

const Page = () => {
  return <Settings />
}
export default Page

I'm glad I built it this way in the first place. When I now port to Vite + wouter, I don't really have to touch that src/components/account/settings.tsx code because that component kinda assumes it's been invoked by some routing.

Vite + wouter

First I installed the router in the src/App.tsx. Abbreviated code:


// src/App.tsx

import { Routes } from "./routes"

export default function App() {
  const { myTheme, colorScheme, toggleColorScheme } = useMyTheme()
  return (
    <ColorSchemeProvider
      colorScheme={colorScheme}
      toggleColorScheme={toggleColorScheme}
    >
      <MantineProvider withGlobalStyles withNormalizeCSS theme={myTheme}>
        <Routes />
      </MantineProvider>
    </ColorSchemeProvider>
  )
}

By the way, the code for Next.js looks very similar in its src/pages/_app.tsx with all those contexts that Mantine make you wrap things in.

And here's the magic routing:


// src/routes.tsx

import { Router, Switch, Route } from "outer"

import { Home } from "./components/home"
import { Authenticate } from "./components/authenticate"
import { Settings } from "./components/account/settings"
import { Custom404 } from "./components/404"

export function Routes() {
  return (
    <Router>
      <Switch>
        <Route path="/signin" component={Authenticate} />
        <Route path="/account/settings" component={Settings} />
        {/* many more lines like this ... */}

        <Route path="/" component={Home} />

        <Route>
          <Custom404 />
        </Route>
      </Switch>
    </Router>
  )
}

Redirecting with router

This is a made-up example, but it demonstrates the pattern with wouter compared to Next.js

Next.js


const { push } = useRouter()

useEffect(() => {
  if (user) {
    push('/signedin')
  }
}, [user])

wouter


const [, setLocation] = useLocation()

useEffect(() => {
  if (user) {
    setLocation('/signedin')
  }
}, [user])

Linking

Next.js


import Link from 'next/link'

// ...

<Link href="/settings" passHref>
  <Anchor>Settings</Anchor>
</Link>

wouter


import { Link } from "wouter"

// ...

<Link href="/settings">
  <Anchor>Settings</Anchor>
</Link>

Getting a query string value

Next.js


import { useRouter } from "next/router"

// ...

const { query } = useRouter()

if (query.name) {
  const name = Array.isArray(query.name) ? query.name[0] : query.name
  // ...
}

wouter


import { useSearch } from "wouter/use-location"

// ...

const search = useSearch()
const searchParams = new URLSearchParams(search)

if (searchParams.get('name')) {
  const name = searchParams.get('name')
  // ...
}

Conclusion

The best thing about Next.js is its momentum. It gets lots of eyes on it. Lots of support opportunities and great chance of its libraries being maintained well into the future. Vite also has great momentum and adaptation. But wouter is less "common".

Comparing apples and oranges is often counter-productive if you don't take all constraints and angles into account and those are usually quite specific. In my case, I just want to build a single-page app. I don't want a Node server. In fact, my particular app is a Python backend that does all the API responses from a fetch in the JavaScript app. That Python app also serves the built static files, including the dist/index.html file. That's how my app can serve the app straight away if the current URL is something like /account/settings. A piece of Python code (more or less the only code that doesn't serve /api/* URLs) collapses all initial serving URLs to serve the dist/index.html file. It's a classic pattern and honestly feels a bit dated in 2023. But it works. And what's so great about all of this is that I have a multi-stage Dockerfile that first does the npm run build (and some COPY --from=frontend /home/node/app/dist ./server/out) and now I can "lump" together the API backend and the front-end code in just 1 server (which I host on Digital Ocean).

If you had to write a SPA in 2023 what would you use? In particular, if it has to be React. Remix is all about server-side rendering. Create-react-app is completely unsupported. Building it from scratch yourself rolling your own TypeScript + Eslint + Rollup/esbuild/Parcel/Webpack does not feel productive unless you have enough time and energy to really get it all right.

In terms of comparing the performance between Next.js and Vite + wouter, the time it takes to build the whole app is actually not that big a deal. It's a rare thing to do. It's something I do after a long coding/debugging session. What's more pressing is how npm run dev works.
With Vite, I type npm run dev and hit Enter. Faster than I can almost notice, after hitting Enter I see...

VITE v4.4.6  ready in 240 ms

  ➜  Local:   http://localhost:3000/
  ➜  Network: use --host to expose
  ➜  press h to show help

and I'm ready to open http://localhost:3000/ to play. With Next.js, after having typed npm run dev and Enter, there's this slight but annoying delay before it's ready.

Benchmark comparison of Elasticsearch highlighters

July 5, 2023
0 comments Elasticsearch

tl;dr; fvh is marginally faster than unified and unified is a bit faster than plain.

When you send a full-text search query to Elasticsearch, you can specify (if and) how it should highlight, with HTML tags, highlights. E.g.


The correct way to index data into <mark>Elasticsearch</mark> with (Python) <mark>elasticsearch</mark>-dsl

Among other configuration options, you can pick one of 3 different highlighter algorithms:

  1. unified (default)
  2. plain
  3. fvh

The last one, fvh, requires that you index more at index-time (in particular to add term_vector="with_positions_offsets" to the mapping). In a previous benchmark I did, the total document size on disk, as described by http://localhost:9200/_cat/indices?v grew by 38%.

I bombarded my local Elasticsearch 7.7 instance with thousands of queries collected from logs. Some single-word, some multi-word. The fields it highlights are things like title (~5-50 words) and body (~100-2,000 words).
Basically, I edited the search query by testing one at a time. For example:


search_query = search_query.highlight(
-   "title", fragment_size=120, number_of_fragments=1, type="unified"
+   "title", fragment_size=120, number_of_fragments=1, type="plain"
)

...etc.

After doing 1,000 searches 3 different times per each highlighter type option, and recording the times it took I recorded the following:

(milliseconds per query, lower is better)

UNIFIED:
  MEAN  18.1ms
  MEDIAN 19.0ms

PLAIN:
  MEAN  24.5ms
  MEDIAN 27.5ms

FVH:
  MEAN  16.1ms
  MEDIAN 17.6ms

Thin marginal win for fvh over unified.

Conclusion

Conclusion? Or should I say "Caveats" instead? There's a lot more to it than raw performance speed. In this benchmark, it takes ~20 milliseconds to search on 2 different indexes, each with a scoring function and indexes containing between 1,000 and 5,000 documents with hundreds of thousands of words. So it's pretty minor.

Each highlighter performs slightly differently too, so you'd have to study the outcome a bit more carefully to get a better feel for if it works the way you and your team prefer it to work.

If there's any conclusion, other than the boring usual "it depends on your setup and preferences", the performance difference is noticeable but not blowing you away. It makes sense that fvh is a bit faster because you've paid for it by indexing more upfront (the offsets) at the expense of memory.

How I used Parcel to "manually" bundle CSS files in a Remix app

May 31, 2023
0 comments JavaScript

I recently switch from Nextjs to Remix for my personal website. One thing I struggled with was to have it merge individual .css files into one. So I solved it with the Parcel CLI. This blog post demonstrates how.

The problem

Note, first of all, this talks about the global CSS. You can and should still employ CSS Modules or something equivalent for CSS that is tied directly to a React component.

But global CSS has its place and purpose. The problem is that there's no convenient way to bundle multiple little .css files into one which you can then nest into routes in Remix.

The way you inject CSS into a Remix page is like this:


import highlight from "~/styles/highlight.css";
import blogpost from "~/styles/blogpost.css";

...


export function links() {
  return [
    { rel: "stylesheet", href: highlight },
    { rel: "stylesheet", href: blogpost },
}

And for the record, suppose you have a nested route that needs those, and another one you do:


import banner from "~/styles/banner.css";
import { links as rootLinks } from "./_index";

...


export function links() {
  return [
    ...rootLinks().filter((x) => !x.extra),
    { rel: "stylesheet", href: banner },
  ];
}

This will nicely pick up those source .css files, minify them and produce in the final HTML SSR output:


<link rel="stylesheet" href="/build/_assets/highlight-KI4AX52K.css"/>
<link rel="stylesheet" href="/build/_assets/blogpost-75V4EYTP.css"/>

Nice. Http2 is famously good at parallel downloads. But even that has its physical limits. Especially if you have many little .css files that make up all the CSS you need. Now you have multiple files that can get stuck on the network. Yes, you might be able to update 1 and keep caching the others if their fingerprint don't change, but this is likely to be rare.

Parcel to the rescue

I solved it by using the Parcel CLI. In package.json I have:

"parcel:build": "parcel build --dist-dir app/styles/build app/*.css",

And in app/global.css I have this:


/* This is app/global.css */

@import "../node_modules/@picocss/pico/css/pico.css";
@import "./styles/globals.css";
@import "./styles/message.css";
@import "./styles/nav.css";
@import "./styles/comments.css";
@import "./styles/carbonads.css";
@import "./styles/carbonads-outer.css";
@import "./styles/modal-search.css";

That means, that Parcel will bundle all of these app/*.css files into 1 app/styles/build/global.css
Now, I can refer to that built on in the Remix app:


import global from "~/styles/build/global.css";

...

export function links() {
  return [
    { rel: "stylesheet", href: global },
  ]
}

Build vs. dev

Ok, so that explains how to bundle individual CSS files before you actually use the bundled CSS files. Remix doesn't care (a good thing).
At this point, we've modularized the problem. Now Parcel can do what it does best (CSS bundling (among other things it can do)) and Remix can do what it does (serving the .css files into the HTML).

But just like it's ergonomically pleasant to bundle CSS files like this, we still want it so that you don't have to manually run a separate step to build the bundle every time you edit an individual source .css file (e.g. app/styles/nav.css)

Here's how I solved that split up by Dev and Build

Build


  "scripts": {
-   "build": "remix build",
+   "build": "npm run parcel:build && remix build",
+   "parcel:build": "parcel build --dist-dir app/styles/build app/*.css",

Now, npm run build will do both things.

Dev


  "scripts": {
    "dev": "npm-run-all build --parallel \"dev:*\"",
    "dev:node": "cross-env NODE_ENV=development esrun --watch ./server.ts ",
    "dev:remix": "remix watch",
+   "dev:parcel": "parcel watch --dist-dir app/styles/build app/global.css",

In conclusion

I admit, I'm a CSS Modules fan-boy and it saddens me how much global CSS I have. One thing at a time, I guess. They both have their powers; global and modular CSS, but I'll admit that my own personal site still relies a bit too much on global CSS. At least, little goes to waste because Remix makes it relatively easy to pick exactly which files you need for individual routes.

Be careful with Date.toLocaleDateString() in JavaScript

May 8, 2023
4 comments Node, macOS, JavaScript

tl;dr; Always pass timeZone:"UTC" when calling Date.toLocaleDateString

The surprise

In my browser's web console:

>>> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
"26"

On my server located in the same time zone:

Welcome to Node.js v16.13.0.
Type ".help" for more information.
> process.env.TZ
undefined
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
'26'

Here on my laptop:

Welcome to Node.js v16.20.0.
Type ".help" for more information.
> process.env.TZ
undefined
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
'27'

What! Despite $TZ not being set, it formats according to something else.

02:50 Zulu means, to me, in the US Eastern time zone, the day before.

Why this matters

Web console server React errors
I kept getting this production error from React that the SSR-rendered HTML differed from the client-side rendered HTML. Strangely, I could never reproduce this locally and the error doesn't say what's different. All the Stack Overflow suggestions and Google results speak of the most basic easy things to check. It's not unusual that this happens when dealing with dates because even though the database (PostgreSQL) stores the dates in full UTC, sometimes when data travels via app servers through JSON pipelines, date formatting can drop important bits.
But here, '2014-11-27T02:50:49Z' is specific.

What made this so incredibly hard to debug was that it worked on one page but not on the other even though the two had the same exact component code. I broke it apart thinking there was something nasty in the content of the Markdown-rendered HTML. No. The reason it only happened on some pages was that I had a function that looked like this:


export function formatDateBasic(date: string) {
  return new Date(date).toLocaleDateString("en-us", {
    year: "numeric",
    month: "long",
    day: "numeric",
  });
}

And, different pages listed, almost non-deterministic, with different dates for related content which was referred to along with their dates. So on one page, there might be a single date that formats differently in EDT (Eastern daylight-saving time) compared to UTC. For example, Apr 1 at 18:00 Zulu, is still Apr 1 in EDT.

The explanation

I'm sorry that I don't understand this better, but Node's implementation of Date.toLocaleDateString does more than depend on process.env.TZ. I think $TZ is just a way to gain control.

For example, start the node REPL like this:

On my Ubuntu 20.04 server:

$ TZ=utc node
Welcome to Node.js v16.20.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
'27'

On my MacBook:

❯ TZ=utc node
Welcome to Node.js v16.13.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
'27'

To find out what timezone your computer has:

On Ubuntu:

$ timedatectl
               Local time: Mon 2023-05-08 12:42:03 UTC
           Universal time: Mon 2023-05-08 12:42:03 UTC
                 RTC time: Mon 2023-05-08 12:42:04
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

On macOS:

❯ sudo systemsetup -gettimezone
Password:
Time Zone: America/New_York

The solution

Setting TZ is probably a good thing. That can get a bit tricky though. Your code needs to run consistently on your laptop, in GitHub Actions, on a VPS server, in an Edge cloud function, etc.

A better way is to force Date.toLocaleString to be fed a timezone. Now it's controlled at the highest level:


export function formatDateBasic(date: string) {
  return new Date(date).toLocaleDateString("en-us", {
    year: "numeric",
    month: "long",
    day: "numeric",
+   timeZone: "UTC"
  });
}

Now, it no longer depends on the OS it runs on.

On my Ubuntu server:

Welcome to Node.js v16.20.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric", timeZone: "UTC"})
'27'

On my macOS:

Welcome to Node.js v16.13.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric", timeZone: "UTC"})
'27'

Fun fact

I once made it unnecessarily weird for me in the debugging session, when I figured out about the timeZone option. What I ran was this:

Welcome to Node.js v16.13.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric", zimeZone: "UTC"})
'26'

I expected it to be '27' now but why did it revert?? Notice the typo? And Date.toLocaleDateString won't throw an error for passing in options it doesn't expect.