Inspecting the index size in PostgreSQL

Monday, Apr 21, 2025
0 comments PostgreSQL

tl;dr ...


SELECT 
  indexrelname, 
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size 
FROM pg_stat_user_indexes 
WHERE relname LIKE 'search_terms%';

I'm working on a new table that is "search terms". A search term is constructed from extracting combinations of words, from titles, on my blog. For example, if the title is "Inspecting the index size in PostgreSQL" the search terms for that are:

inspecting
inspecting index
inspecting index size
inspecting index size postgresql
index
index size
index size postgresql
size
size postgresql
postgresql

It's created by lower casing the input and taking one, two, three, and four non-stopwords.

To search these in PostgreSQL I use trigrams using pg_trgm. For example,


SELECT id, term
FROM search_terms
WHERE
  term LIKE 'conf%'
  OR term LIKE '% conf%'
LIMIT 10

There's more to it, but this is a good start.

Anyway, I was curious how much space this is going to take up.
By indexing trigrams, you're indexing a lot of small individual strings. So lots of combinations.

The index is defined like this:


CREATE INDEX search_terms_term_trgm 
ON search_terms 
USING GIST (term gist_trgm_ops)

After inserting a bit over 10,000 of these "search terms" the size of the index becomes:


peterbecom=#
SELECT
  indexrelname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname LIKE 'search_terms%';
      indexrelname      | index_size
------------------------+------------
 search_terms_pkey      | 248 kB
 search_terms_term_key  | 488 kB
 search_terms_term_trgm | 856 kB
(3 rows)

Curiously, if you set a higher siglen (e.g. GIST (term gist_trgm_ops(siglen=24))), it actually becomes smaller.

Conclusion(?)

If anything, it's not that large. In this particular example, there are only 10k of these smaller word combinations. It would be curious to see what the numbers become at 100k and 1M.

Comments

Previous:: Adding client-to-server sync to PissueTracker March 20, 2025 React, JavaScript, Bun
Next:: How to SSG a Vite SPA April 26, 2025 Web Performance, React, JavaScript, Bun

Related by category:: The 3 queries I use with pg_stat_statements to analyze slow PostgreSQL queries September 30, 2024 PostgreSQL; Select all relations in PostgreSQL December 10, 2015 PostgreSQL; How much faster is Redis at storing a blob of JSON compared to PostgreSQL? September 28, 2019 PostgreSQL; How to sort case insensitively with empty strings last in Django April 3, 2022 PostgreSQL

Related by keyword:: The correct way to index data into Elasticsearch with (Python) elasticsearch-dsl May 14, 2021 Python, MDN, Elasticsearch; DateIndex in Zope doesn't have indexed attributes October 28, 2007 Zope

Inspecting the index size in PostgreSQL

Conclusion(?)

Comments

Related posts