
Be careful with Date.toLocaleDateString() in JavaScript

May 8, 2023
4 comments Node, macOS, JavaScript

tl;dr; Always pass timeZone:"UTC" when calling Date.toLocaleDateString

The surprise

In my browser's web console:

>>> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
"26"

On my server located in the same time zone:

Welcome to Node.js v16.13.0.
Type ".help" for more information.
> process.env.TZ
undefined
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
'26'

Here on my laptop:

Welcome to Node.js v16.20.0.
Type ".help" for more information.
> process.env.TZ
undefined
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
'27'

What! Despite $TZ not being set, it formats according to something else.

02:50 Zulu is, for me in the US Eastern time zone, still the day before.

Why this matters

(Screenshot: web console showing the server-side React errors)
I kept getting this production error from React saying that the SSR-rendered HTML differed from the client-side rendered HTML. Strangely, I could never reproduce it locally, and the error doesn't say what's different. All the Stack Overflow suggestions and Google results point to the most basic things to check. It's not unusual that this happens when dealing with dates, because even though the database (PostgreSQL) stores the dates in full UTC, sometimes when data travels via app servers through JSON pipelines, date formatting can drop important bits.
But here, '2014-11-27T02:50:49Z' is specific.

What made this so incredibly hard to debug was that it worked on one page but not on the other even though the two had the same exact component code. I broke it apart thinking there was something nasty in the content of the Markdown-rendered HTML. No. The reason it only happened on some pages was that I had a function that looked like this:


export function formatDateBasic(date: string) {
  return new Date(date).toLocaleDateString("en-us", {
    year: "numeric",
    month: "long",
    day: "numeric",
  });
}

And different pages listed, almost non-deterministically, related content along with its dates. So on one page, every date happened to format the same in EDT (Eastern Daylight Time) as in UTC, and on another page one of them didn't. For example, Apr 1 at 18:00 Zulu is still Apr 1 in EDT, but Nov 27 at 02:50 Zulu becomes Nov 26 in US Eastern time.

The explanation

I'm sorry that I don't understand this better, but Node's implementation of Date.toLocaleDateString depends on more than process.env.TZ: when $TZ isn't set, it falls back to the operating system's time zone. I think $TZ is just a way to take control.
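
One way to see which time zone the Intl machinery has actually picked up (whether from $TZ or from the OS) is the resolvedOptions() API:


// In the Node REPL or the browser console
Intl.DateTimeFormat().resolvedOptions().timeZone
// e.g. 'America/New_York'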

For example, start the node REPL like this:

On my Ubuntu 20.04 server:

$ TZ=utc node
Welcome to Node.js v16.20.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
'27'

On my MacBook:

❯ TZ=utc node
Welcome to Node.js v16.13.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric"})
'27'

To find out what timezone your computer has:

On Ubuntu:

$ timedatectl
               Local time: Mon 2023-05-08 12:42:03 UTC
           Universal time: Mon 2023-05-08 12:42:03 UTC
                 RTC time: Mon 2023-05-08 12:42:04
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

On macOS:

❯ sudo systemsetup -gettimezone
Password:
Time Zone: America/New_York

The solution

Setting TZ is probably a good thing. That can get a bit tricky though. Your code needs to run consistently on your laptop, in GitHub Actions, on a VPS server, in an Edge cloud function, etc.

A better way is to always feed Date.toLocaleDateString a time zone. Now it's controlled at the code level rather than by the environment:


export function formatDateBasic(date: string) {
  return new Date(date).toLocaleDateString("en-us", {
    year: "numeric",
    month: "long",
    day: "numeric",
+   timeZone: "UTC"
  });
}

Now, it no longer depends on the OS it runs on.

On my Ubuntu server:

Welcome to Node.js v16.20.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric", timeZone: "UTC"})
'27'

On my macOS:

Welcome to Node.js v16.13.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric", timeZone: "UTC"})
'27'

Fun fact

I made it unnecessarily hard for myself in the debugging session, right after I had figured out the timeZone option. What I ran was this:

Welcome to Node.js v16.13.0.
Type ".help" for more information.
> new Date('2014-11-27T02:50:49Z').toLocaleDateString("en-us", {day: "numeric", zimeZone: "UTC"})
'26'

I expected it to be '27' now, so why did it revert? Notice the typo? It says zimeZone instead of timeZone, and Date.toLocaleDateString won't throw an error when you pass in options it doesn't expect.
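
Incidentally, this is one place where TypeScript helps: if you pass the options as an object literal, the compiler's excess property check flags the unknown key. A minimal sketch:


new Date("2014-11-27T02:50:49Z").toLocaleDateString("en-us", {
  day: "numeric",
  // TypeScript error, roughly: "'zimeZone' does not exist in type 'DateTimeFormatOptions'"
  zimeZone: "UTC",
});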

Automatically 'npm install'

April 6, 2023
0 comments Node, JavaScript

I implemented this at work recently and although it felt like a hack, I've come to like it and it's been very helpful to our many contributors.
As (Node) engineers, we know that you should keep your node_modules up to date by running npm install periodically, or every time you git pull from upstream. It could be that some package got upgraded last night, since you last pulled.
But not everyone remembers to run npm install often enough. They might do git pull origin main && npm start and now the code that starts up depends on a newer version of something that was upgraded in package.json and package-lock.json.

How we solved it was that we added this script:


node script/cmp-files.js package-lock.json .installed.package-lock.json || npm install && cp package-lock.json .installed.package-lock.json

And it's hooked up as a script in package.json called prestart:

"scripts": {
  ...
  "prestart": "node script/cmp-files.js ...",
  ...
}

Now, every time you run npm start to start up the local development server, it will run that piece of bash. No more having to remember to run npm install after every git pull.

A note on performance

The npm install command is fast when all packages are already updated. You can see it with:


# First time
$ npm install

# Second time when nothing should happen
$ time npm install
...
2.53s user 0.37s system 134% cpu 2.166 total

So it only takes 2 seconds. Not bad.


$ time node script/cmp-files.js package-lock.json .installed.package-lock.json
...
0.08s user 0.03s system 100% cpu 0.110 total

But 0.08 seconds is better :)

The comparison script

The cmp-files.js script looks like this:


#!/usr/bin/env node

// Given N files. Exit 0 if they all exist and are identical in content.

import fs from 'fs'

import { program } from 'commander'

program.description('Compare N files').arguments('[files...]', '').parse(process.argv)

main(program.args)

function main(files) {
  if (files.length < 2) throw new Error('Must be at least 2 files')
  try {
    const contents = files.map((file) => fs.readFileSync(file, 'utf-8'))
    if (new Set(contents).size > 1) {
      process.exit(1)
    }
  } catch (error) {
    if (error.code === 'ENOENT') {
      process.exit(1)
    } else {
      throw error
    }
  }
}
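
For what it's worth, commander is only used here to collect the file arguments. A dependency-free variant (a sketch, not the script we actually use) could look like this:


#!/usr/bin/env node

// Given N files. Exit 0 if they all exist and are identical in content.

import fs from 'fs'

const files = process.argv.slice(2)
if (files.length < 2) throw new Error('Must be at least 2 files')
try {
  const contents = files.map((file) => fs.readFileSync(file, 'utf-8'))
  // More than one distinct content means the files differ
  if (new Set(contents).size > 1) process.exit(1)
} catch (error) {
  // A missing file counts as "different" too
  if (error.code === 'ENOENT') process.exit(1)
  throw error
}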

The .installed.package-lock.json file is added to the repo's .gitignore

Note: given how well this works for running before npm start, we could probably add this to a post-checkout git hook too.

Benchmarking npm install with or without audit

February 23, 2023
1 comment Node, JavaScript

By default, running npm install will do a security audit of your installed packages. That audit is fast but it still takes a bit of time. To disable it you can either add --no-audit or you can...:

cat .npmrc
audit=false

But how much time does the audit add when running npm install? To find out, I wrote this:


import random
import statistics
import subprocess
import time
from collections import defaultdict


def f1():
    subprocess.check_output("npm install".split())


def f2():
    subprocess.check_output("npm install --no-audit".split())


functions = f1, f2

times = defaultdict(list)
for i in range(25):
    f = random.choice(functions)

    t0 = time.time()
    f()
    t1 = time.time()
    times[f.__name__].append(t1 - t0)
    time.sleep(5)


for f_name in sorted(times.keys()):
    print(
        f_name,
        f"mean: {statistics.mean(times[f_name]):.1f}s".ljust(10),
        f"median: {statistics.median(times[f_name]):.1f}s",
    )

Note how it runs a lot of times in case there are network hiccups and it sleeps between each run just to spread out the experiment over a longer period of time. And the results are:

f1 mean: 2.81s median: 2.57s
f2 mean: 2.25s median: 2.21s

Going by the median time, --no-audit makes npm install 16% faster. Going by the mean time, skipping the audit makes it 25% faster.

The technology behind You Should Watch

January 28, 2023
0 comments You Should Watch, React, Firebase, JavaScript

I recently launched You Should Watch, a mobile-friendly web app for keeping a to-watch list of movies and TV shows, as well as being able to quickly share the links when you want to tell someone "you should watch" it.

I'll be honest, much of the motivation of building that web app was to try a couple of newish technologies that I wanted to either improve on or just try for the first time. These are the interesting tech pillars that made it possible to launch this web app in what was maybe 20-30 hours of total time.

All the code for You Should Watch is here: https://github.com/peterbe/youshouldwatch-next

The Movie Database API

The cornerstone that made this app possible in the first place. The API is free for developers who don't intend to earn revenue on whatever project they build with it. More details in their FAQ.

The search functionality is important. The way it works is that you do a "multisearch", which finds movies, TV shows, or people. Then, when you have each search result's id and media_type, you can fetch a lot more specific information. For example, that's how the page for a person displays things differently from the page for a movie.
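
Roughly, the two-step flow looks like this (a sketch; it assumes a TMDB API key in an environment variable and Node 18+ for the global fetch):


const API = "https://api.themoviedb.org/3";
const key = process.env.TMDB_API_KEY;

async function multisearch(query) {
  const res = await fetch(
    `${API}/search/multi?api_key=${key}&query=${encodeURIComponent(query)}`
  );
  const { results } = await res.json();
  // Each result has an `id` and a `media_type` ("movie", "tv", or "person")
  return results;
}

async function details({ id, media_type }) {
  // e.g. /movie/{id}, /tv/{id}, /person/{id}
  const res = await fetch(`${API}/${media_type}/${id}?api_key=${key}`);
  return res.json();
}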

Next.js and the new App dir

In Next.js 13 you have a choice between the regular pages directory and an app directory, where every page (which becomes a URL) has to be called page.tsx.

No judgment here. It was a bit of a brain-rewiring exercise to figure out how this works. In particular, head.tsx is now separate from page.tsx, and since both need some async data during server-side rendering, I have to duplicate the await getMediaData() instead of being able to fetch it once and share it with prop-drilling or context.
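
In other words, something like this (hypothetical file paths and component bodies, just to illustrate the duplication):


// app/media/[id]/head.tsx
export default async function Head({ params }: { params: { id: string } }) {
  const media = await getMediaData(params.id); // fetched here...
  return <title>{media.title}</title>;
}

// app/media/[id]/page.tsx
export default async function Page({ params }: { params: { id: string } }) {
  const media = await getMediaData(params.id); // ...and fetched again here
  return <h1>{media.title}</h1>;
}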

Vercel deployment

Wow! This was the most pleasant experience I've experienced in years. So polished and so much "just works". You sign in, with your GitHub auth, click to select the GitHub repo (that has a next.config.js and package.json etc) and you're done. That's it! Now, not only does every merged PR automatically (and fast!) get deployed, but you also get a preview deployment for every PR (which I didn't use).

I'm still using the free hobby tier but god forbid this app gets serious traffic, I'd just bump it up to $20/month which is cheap. Besides, the app is almost entirely CDN cacheable so only the search XHR backend would linearly increase its load with traffic I think.

Well done Vercel!

Playwright and VS Code

Not the first time I used Playwright but it was nice to return and start afresh. It definitely has improved in developer experience.

Previously I used npx and the terminal to run tests, but this time I tried "Playwright Test for VSCode" which was just fantastic! There are some slightly annoying things, in that I had to use the mouse cursor more than I'd hoped, but it genuinely helped me be productive. Playwright also has the ability to generate JS code based on me clicking around in a temporary incognito browser window. You do a couple of things in the browser, then paste the generated source code into tests/basics.spec.ts and do some manual tidying up. To run the code generator like that, you simply type pnpm dlx playwright codegen.

pnpm

It seems hip and a lot of people seem to recommend it. Kinda like yarn was hip and often recommended over npm (me included!).

Sure it works and it installs things fast but is it noticeable? Not really. Perhaps it's 4 seconds when it would have been 5 seconds with npm. Apparently pnpm does clever symlinking to avoid a disk-heavy node_modules/ but does it really matter ...much?
It's still large:

du -sh node_modules
468M    node_modules

A disadvantage with pnpm is that GitHub Dependabot currently doesn't support it :(
An advantage with pnpm is that pnpm up -i --latest is a great interactive CLI that works like yarn upgrade-interactive --latest

just

just is like make but written in Rust. Now I have a justfile in the root of the repo and can type shortcut commands like just dev or just emu[TAB] (to tab autocomplete).

In hindsight, my justfile ended up being just a list of pnpm run ... commands, but the idea is that just would gather any and all commands under one roof.

At the end of the day, it becomes a nifty little file of "recipes" of useful commands that you can easily string together. For example, just lint is the combination of typing pnpm run prettier:check and pnpm run tsc and pnpm run lint.
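
A sketch of what a justfile along those lines might look like (not my actual file):


dev:
    pnpm run dev

lint:
    pnpm run prettier:check
    pnpm run tsc
    pnpm run lint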

Pico.css

A gorgeously simple looking pure-CSS framework. Yes, it's very limited in components and I don't know how well it "tree shakes" but it's so small and so neat that it had everything I needed.

My favorite React component library is Mantine, but I definitely love the peace of mind that Pico.css is just CSS, so you're A) not stuck with React forever, and B) not shipping unnecessary JS code that slows things down.

Firebase

Good old Firebase. The bestest and easiest way to get a reliable and realtime database that is dirt cheap, simple, and has great documentation. I do regret not trying Supabase but I knew that getting the OAuth stuff to work with Google on a custom domain would be tricky so I stayed with Firebase.

react-lite-youtube-embed

A port of Paul Irish's Lite YouTube Embed which makes it easy to display YouTube thumbnails in a web performant way. All you have to do is:


import LiteYouTubeEmbed from "react-lite-youtube-embed";

<LiteYouTubeEmbed
   id={youtubeVideo.id}
   title={youtubeVideo.title} />

In conclusion

It's amazing how much time these tools saved compared to just years ago. I could build a fully working side-project with automation and high quality entirely thanks to great open source or well-tuned proprietary components, in just about one day if you sum up the hours.

Pip-Outdated.py - a script to compare requirements.in with the output of pip list --outdated

December 22, 2022
0 comments Python

Simply by posting this, there's a big chance you'll say "Hey! Didn't you know there's already a well-known script that does this? Better." Or you'll say "Hey! That'll save me hundreds of seconds per year!"

The problem

Suppose you have a requirements.in file that is used by pip-compile to generate the requirements.txt that you actually install in your Dockerfile or whatever server deployment. The requirements.in is meant to be the human-readable file and the requirements.txt is for the computers. You manually edit the version numbers in the requirements.in and then run pip-compile --generate-hashes requirements.in to generate a new requirements.txt. But the "first-class" packages in the requirements.in aren't the only packages that get installed. For example:

▶ cat requirements.in | rg '==' | wc -l
      54

▶ cat requirements.txt | rg '==' | wc -l
     102

In other words, in this particular example, there are 48 "second-class" packages that get installed. There might actually be more stuff installed that you didn't describe. That's why pip list | wc -l can be even higher. For example, you might have locally and manually done pip install ipython for a nicer interactive prompt.

The solution

The command pip list --outdated will list packages based on the requirements.txt not the requirements.in. To mitigate that, I wrote a quick Python CLI script that combines the output of pip list --outdated with the packages mentioned in requirements.in:


#!/usr/bin/env python

import subprocess


def main(*args):
    if not args:
        requirements_in = "requirements.in"
    else:
        requirements_in = args[0]
    required = {}
    with open(requirements_in) as f:
        for line in f:
            if "==" in line:
                package, version = line.strip().split("==")
                package = package.split("[")[0]
                required[package] = version

    res = subprocess.run(["pip", "list", "--outdated"], capture_output=True)
    if res.returncode:
        raise Exception(res.stderr)

    lines = res.stdout.decode("utf-8").splitlines()
    relevant = [line for line in lines if line.split()[0] in required]

    longest_package_name = max([len(x.split()[0]) for x in relevant]) if relevant else 0

    for line in relevant:
        p, installed, possible, *_ = line.split()
        if p in required:
            print(
                p.ljust(longest_package_name + 2),
                "INSTALLED:",
                installed.ljust(9),
                "POSSIBLE:",
                possible,
            )


if __name__ == "__main__":
    import sys

    sys.exit(main(*sys.argv[1:]))

Installation

To install this, you can just download the script and run it in any directory that contains a requirements.in file.

Or you can install it like this:

curl -L https://gist.github.com/peterbe/099ad364657b70a04b1d65aa29087df7/raw/23fb1963b35a2559a8b24058a0a014893c4e7199/Pip-Outdated.py > ~/bin/Pip-Outdated.py
chmod +x ~/bin/Pip-Outdated.py

Pip-Outdated.py

How to change the current query string URL in NextJS v13 with next/navigation

December 9, 2022
4 comments React, JavaScript

At the time of writing, I don't know if this is the optimal way, but after some trial and error, I got it working.

This example demonstrates a hook that gives you the current value of the ?view=... (or a default) and a function you can call to change it so that ?view=before becomes ?view=after.

In NextJS v13 with the pages directory:


import { useRouter } from "next/router";

type Options = "buttons" | "table";

export function useNamesView() {
    const KEY = "view";
    const DEFAULT_NAMES_VIEW = "buttons";
    const router = useRouter();

    let namesView: Options = DEFAULT_NAMES_VIEW;
    const raw = router.query[KEY];
    const value = Array.isArray(raw) ? raw[0] : raw;
    if (value === "buttons" || value === "table") {
        namesView = value;
    }

    function setNamesView(value: Options) {
        const [asPathRoot, asPathQuery = ""] = router.asPath.split("?");
        const params = new URLSearchParams(asPathQuery);
        params.set(KEY, value);
        const asPath = `${asPathRoot}?${params.toString()}`;
        router.replace(asPath, asPath, { shallow: true });
    }

    return { namesView, setNamesView };
}

In NextJS v13 with the app directory:


import { useRouter, useSearchParams, usePathname } from "next/navigation";

type Options = "buttons" | "table";

export function useNamesView() {
    const KEY = "view";
    const DEFAULT_NAMES_VIEW = "buttons";
    const router = useRouter();
    const searchParams = useSearchParams();
    const pathname = usePathname();

    let namesView: Options = DEFAULT_NAMES_VIEW;
    const value = searchParams.get(KEY);
    if (value === "buttons" || value === "table") {
        namesView = value;
    }

    function setNamesView(value: Options) {
        const params = new URLSearchParams(searchParams);
        params.set(KEY, value);
        router.replace(`${pathname}?${params}`);
    }

    return { namesView, setNamesView };
}

The trick is that you only want to change 1 query string value and respect whatever was there before. So if the existing URL was /page?foo=bar and you want that to become /page?foo=bar&and=also you have to consume the existing query string and you do that with:


const searchParams = useSearchParams();
...
const params = new URLSearchParams(searchParams);
params.set('and', 'also')

How much faster is Cheerio at parsing depending on xmlMode?

December 5, 2022
0 comments Node, JavaScript

Cheerio is a fantastic Node library for parsing HTML and then being able to manipulate and serialize it. But you can also just use it for parsing HTML and plucking out what you need. We use that to prepare the text that goes into our search index for our site. It basically works like this:


const body = await getBody('http://localhost:4002' + eachPage.path)
const $ = cheerio.load(body)
const title = $('h1').text()
const intro = $('p.intro').text()
...

But it hit me, can we speed that up? cheerio actually ships with two different parsers:

  1. parse5
  2. htmlparser2

parse5 (the default) is the more spec-compliant and slower of the two; htmlparser2 is faster but less strict, and it's what cheerio uses when you pass xmlMode: true.
But I wanted to see this in a real-world example.

So I made two runs where I used:


const $ = cheerio.load(body)

in one run, and:


const $ = cheerio.load(body, { xmlMode: true })

in another.
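
A minimal way to time each mode per page might look like this (a sketch of the approach, not the exact script that produced the numbers below):


import { performance } from "perf_hooks";
import * as cheerio from "cheerio";

function timeLoad(body, options) {
  const t0 = performance.now();
  cheerio.load(body, options);
  return performance.now() - t0;
}

// For each page body, record timeLoad(body) in one run
// and timeLoad(body, { xmlMode: true }) in the other.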

After having parsed 1,635 pages of HTML of various sizes the results are:

FILE: load.txt
MEAN:   13.19457640586797
MEDIAN: 10.5975

FILE: load-xmlmode.txt
MEAN:   3.9020372860635697
MEDIAN: 3.1020000000000003

So, using {xmlMode: true} leads to roughly a 3.4x speedup, by both mean and median.

I think it pretty much confirms the original benchmark, but now I know based on a real application.

First impressions trying out Rome to format/lint my TypeScript and JavaScript

November 14, 2022
1 comment Node, JavaScript

Rome is a new contender to compete with Prettier and eslint, combined. It's fast and its suggestions are much easier to understand.

I have a project that uses .js, .ts, and .tsx files. At first I thought I'd just use rome for formatting, but the linter part felt nice as I was experimenting, so I thought I'd kill two birds with one stone.

Things that worked well

It is fast

My little project only has 28 files, but time rome check lib scripts components *.ts consistently takes 0.08 seconds.

The CLI looks great

You get this nice prompt after running npx rome init the first time:

(Screenshot of the rome init prompt)

Suggestions just look great

They're easy to understand and need no explanation, because the suggested fix tells a story: it's immediately clear what the warning is trying to say.

(Screenshot of a rome suggestion)

It is smaller

If I run npx create-next-app@latest, say yes to Eslint, and then run npm i -D prettier, the node_modules becomes 275.3 MiB.
Whereas if I run npx create-next-app@latest, say no to Eslint, and then run npm i -D rome, the node_modules becomes 200.4 MiB.

Editing rome.json with its JSON schema works in VS Code

I don't know how this magically worked, but I'm guessing it just does when you install the Rome VS Code extension. Neat with autocomplete!

(Screenshot: editing the rome.json file with autocomplete)

Things that didn't work so well

Almost all the things I'm going to "complain" about are down to usability. I might look back at this in a year (or tomorrow!) and laugh at myself for being dim, but it nevertheless was part of my experience so it's worth pointing out.

Lint, check, or format?

It's confusing what is what. If lint means checking without modifying, what is check then? I'm guessing rome format means run the lint but with permission to edit my files.

What is rome format compared to rome check --apply then??

I guess rome check --apply doesn't just complain but actually applies the things it spots. So what is rome check --apply-suggested?? (if you're reading this and feel eager to educate me with a comment, please do, but I'm trying to point out that it's not user-friendly)

How do I specify wildcards?

Unfortunately, in this project, not all files are in one single directory (e.g. rome check src/ is not an option). How do I specify a wildcard expression?


▶ rome check *.ts
Checked 3 files in 942µs

Cool, but how do I do all .ts files throughout the project?


▶ rome check "**/*.ts"
**/*.ts internalError/io ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  ✖ No such file or directory (os error 2)


Checked 0 files in 66µs

Clearly, it's not this:


▶ rome check **/*.ts

...

The number of diagnostics exceeds the number allowed by Rome.
Diagnostics not shown: 1018.
Checked 2534 files in 1387ms
Skipped 1 files
Error: errors where emitted while running checks

...because bash will include all the files from node_modules/**/*.ts.

In the end, I ended up with this (in my package.json):

"scripts": {
    "code:lint": "rome check lib scripts components *.ts",
    ...

There's no documentation about how to ignore certain rules

Yes, I can contribute this back to the documentation, but today's not the day to do that.

It took me a long time to find out how to disable certain rules (in the rome.json file) and finally I landed on this:

{
  "linter": {
    "enabled": true,
    "rules": {
      "recommended": true,
      "style": {
        "recommended": true,
        "noImplicitBoolean": "off"
      },
      "a11y": {
        "useKeyWithClickEvents": "off",
        "useValidAnchor": "warn"
      }
    }
  }
}

Much better than having to write inline code comments with the source files themselves.

However, it's still not clear to me what "recommended": true means. Is it shorthand for listing all the default rules all set to true? If I remove that, are no rules activated?

The rome.json file is JSON

JSON is cool for many things, but writing comments is not one of them.

For example, I don't know what would be better, Yaml or Toml, but it would be nice to write something like:

"a11y": {
    # Disabled because of issue #1234
    # Consider putting this back in December after the refactor launch
    "useKeyWithClickEvents": "off",

Nextjs and rome need to talk

When create-react-app first came onto the scene, the coolest thing was the zero-config webpack. But, if you remember, it also came with a really nice zero-config eslint configuration for React apps. It would even print warnings while the dev server was running. Now, many years later, good linting config is something you depend on in a framework. Like it or not, there are specific things in Nextjs that are exclusive to that framework. It's obviously not an easy people-problem to solve, but it would be nice if Nextjs and rome could be best friends, so you get all the good linting ideas from the Nextjs framework but all done using rome instead.

Spot the JavaScript bug with recursion and incrementing

September 28, 2022
0 comments JavaScript

What will this print?


function doSomething(iterations = 0) {
  if (iterations < 10) {
    console.log("Let's do this again!")
    doSomething(iterations++)    
  }
}
doSomething()

The answer is it will print

Let's do this again!
Let's do this again!
Let's do this again!
Let's do this again!
Let's do this again!
Let's do this again!
Let's do this again!
Let's do this again!
...forever...

The bug is the use of a "postfix increment". It's a bug I almost had in some production code (it never shipped).

The solution is simple:


     console.log("Let's do this again!")
-    doSomething(iterations++)
+    doSomething(++iterations)    

That's called a "prefix increment", which means it not only changes the variable but also returns the value it became, rather than the value it had before the increment.
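
To spell it out with plain counters:


let i = 0;
console.log(i++); // prints 0 - the old value is returned, then i becomes 1
console.log(i);   // prints 1
console.log(++i); // prints 2 - i is incremented first, then returned

That's why doSomething(iterations++) passes the old value (always 0) into every recursive call.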

The beautiful solution is actually the simplest solution:


     console.log("Let's do this again!")
-    doSomething(iterations++)
+    doSomething(iterations + 1)    

Now, you don't even mutate the value of the iterations variable; you pass a new value to the recursive call.

All in all, a pretty simple mistake, but it can easily happen. Particularly if you feel inclined to look cool by using the spiffy ++ shorthand because it looks neater or something.

Programmatically render a NextJS page without a server in Node

September 6, 2022
1 comment Web development, Node, JavaScript

If you use getServerSideProps() in Next you can render a page by visiting it. E.g. GET http://localhost:3000/mypages/page1
Or if you use getStaticProps() with getStaticPaths(), you can use npm run build to generate the HTML files (e.g. in the .next/server/pages directory).
But what if you don't want to start a server? What if you have a particular page/URL in mind that you want to generate, but without starting a server and sending an HTTP GET request to it? This blog post shows a way to do this with a plain Node script.

Here's a solution to programmatically render a page:


#!/usr/bin/env node

import http from "http";

import next from "next";

async function main(uris) {
  const nextApp = next({});
  const nextHandleRequest = nextApp.getRequestHandler();
  await nextApp.prepare();

  const htmls = Object.fromEntries(
    await Promise.all(
      uris.map((uri) => {
        try {
          // If it's a fully qualified URL, make it its pathname
          uri = new URL(uri).pathname;
        } catch {}
        return renderPage(nextHandleRequest, uri);
      })
    )
  );
  console.log(htmls);
}

async function renderPage(handler, url) {
  const req = new http.IncomingMessage(null);
  const res = new http.ServerResponse(req);
  req.method = "GET";
  req.url = url;
  req.path = url;
  req.cookies = {};
  req.headers = {};
  await handler(req, res);
  if (res.statusCode !== 200) {
    throw new Error(`${res.statusCode} on rendering ${req.url}`);
  }
  for (const { data } of res.outputData) {
    const [, body] = data.split("\r\n\r\n");
    if (body) return [url, body];
  }
  throw new Error("No output data has a body");
}

main(process.argv.slice(2)).catch((err) => {
  console.error(err);
  process.exit(1);
});

To demonstrate I created this sample repo: https://github.com/peterbe/programmatically-render-next-page

Note that you need to run npm run build first so Next has all the static assets ready.

In conclusion

The alternative, in automation, would be to run something like this:


▶ npm run build && npm run start &
▶ sleep 5  # give the server a chance to start
▶ xh http://localhost:3000/aboutus
HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html; charset=utf-8
Date: Tue, 06 Sep 2022 12:23:42 GMT
Etag: "m8ff9sdduo1hk"
Keep-Alive: timeout=5
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Powered-By: Next.js

<!DOCTYPE html><html><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><title>About Us page</title><meta name="description" content="We do things. I hope."/><link rel="icon" href="/favicon.ico"/><meta name="next-head-count" content="5"/><link rel="preload" href="/_next/static/css/ab44ce7add5c3d11.css" as="style"/><link rel="stylesheet" href="/_next/static/css/ab44ce7add5c3d11.css" data-n-g=""/><link rel="preload" href="/_next/static/css/ae0e3e027412e072.css" as="style"/><link rel="stylesheet" href="/_next/static/css/ae0e3e027412e072.css" data-n-p=""/><noscript data-n-css=""></noscript><script defer="" nomodule="" src="/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js"></script><script src="/_next/static/chunks/webpack-7ee66019f7f6d30f.js" defer=""></script><script src="/_next/static/chunks/framework-db825bd0b4ae01ef.js" defer=""></script><script src="/_next/static/chunks/main-3123a443c688934f.js" defer=""></script><script src="/_next/static/chunks/pages/_app-deb173bd80cbaa92.js" defer=""></script><script src="/_next/static/chunks/996-f1475101e84cf548.js" defer=""></script><script src="/_next/static/chunks/pages/aboutus-41b1f037d974ef60.js" defer=""></script><script src="/_next/static/REJUWXI26y-lp9JVmzJB5/_buildManifest.js" defer=""></script><script src="/_next/static/REJUWXI26y-lp9JVmzJB5/_ssgManifest.js" defer=""></script></head><body><div id="__next"><div class="Home_container__bCOhY"><main class="Home_main__nLjiQ"><h1 class="Home_title__T09hD">About Use page</h1><p class="Home_description__41Owk"><a href="/">Go to the <b>Home</b> page</a></p></main><footer class="Home_footer____T7K"><a href="/">Home page</a></footer></div></div><script id="__NEXT_DATA__" type="application/json">{"props":{"pageProps":{}},"page":"/aboutus","query":{},"buildId":"REJUWXI26y-lp9JVmzJB5","nextExport":true,"autoExport":true,"isFallback":false,"scriptLoader":[]}</script></body></html>

There are probably many great ideas that this can be used for. At work we use getServerSideProps() and we have too many pages to build them all statically. We need a solution like this to do custom analysis of the rendered HTML to check for broken links by analyzing every generated <a href> tag.
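
For example, the link extraction part could be as simple as loading each rendered body into cheerio (a sketch, reusing the renderPage function from above):


import * as cheerio from "cheerio";

function extractLinks(html) {
  const $ = cheerio.load(html);
  // Collect the href of every <a href> tag in the rendered page
  return $("a[href]")
    .map((_, el) => $(el).attr("href"))
    .get();
}

// const [, html] = await renderPage(nextHandleRequest, "/aboutus");
// console.log(extractLinks(html));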