Thursday, September 4 at 16:15 – 17:00

Room: Sophie

1 Billion Row Challenge: Comparing Postgres, DuckDB, and Extensions

In late 2023, the Java community started a challenge to find the most efficient way to process a file with 1 billion rows of data. Unsurprisingly, many database communities quickly took on the same challenge, with varying results. Postgres often performed worst when run without close attention to settings and efficient resource utilization. But with a little more effort, could it compete head-to-head?

In this session, we’ll look at the original challenge and how to approach it with vanilla Postgres beyond the basics. Next, we’ll see how the increasingly popular in-memory analytics database DuckDB handles the same challenge. Finally, we’ll explore recent opportunities to integrate the two databases, providing a powerful analytical engine with Postgres for the best of both worlds.
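For following along at a smaller scale, the challenge’s input format (one semicolon-delimited `station;measurement` pair per line, temperatures with one decimal place) is easy to fake. A minimal Python sketch; the station names and row count here are arbitrary placeholders, not the official challenge data set:

```python
import random

def write_measurements(path: str, rows: int = 1_000) -> None:
    """Write a small file in the 1BRC input format: station;measurement."""
    stations = ["Helsinki", "Hamburg", "Abha", "Lisbon"]  # arbitrary sample names
    with open(path, "w") as f:
        for _ in range(rows):
            station = random.choice(stations)
            temp = round(random.uniform(-99.9, 99.9), 1)  # one decimal, like the challenge
            f.write(f"{station};{temp}\n")

write_measurements("measurements_sample.txt")
```

Pointing the COPY statements below at a file like this lets you test the queries before committing to the full billion-row load.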

Postgres.sql script:

--CREATE DATABASE obrc;

SHOW shared_buffers;

CREATE UNLOGGED TABLE obrc
(
  station_name VARCHAR(26),
  measurement  NUMERIC(3,1)
);

-- Override the server default for this table
ALTER TABLE obrc SET (parallel_workers=16);
-- Loading into a single table took ~3.0 minutes
COPY obrc(station_name, measurement)
 FROM '/mnt/1brc/measurements.txt'
 WITH
 (
  FORMAT CSV,
  DELIMITER ';'
 );

SELECT * FROM obrc LIMIT 20;

-- This will take a very long time to complete
SELECT count(*) FROM obrc;

ANALYZE obrc;
-- If ANALYZE is up to date
SELECT reltuples::numeric as count
FROM pg_class
WHERE relname='obrc';
/*
 * ~34 seconds
 */
EXPLAIN ANALYZE
SELECT
  station_name,
  MIN(measurement) AS min_measurement,
  ROUND(AVG(measurement),1) AS mean_measurement,
  MAX(measurement) AS max_measurement
 FROM obrc
 GROUP BY station_name
 ORDER BY station_name;

/*
 * ~34 seconds
 */
EXPLAIN ANALYZE
SELECT
  '{' ||
    STRING_AGG(station_name || '=' || min_measurement || '/' || mean_measurement || '/' || max_measurement, ', ' ORDER BY station_name) ||
  '}' AS result
 FROM
  (SELECT station_name,
          MIN(measurement) AS min_measurement,
          ROUND(AVG(measurement), 1) AS mean_measurement,
          MAX(measurement) AS max_measurement
    FROM obrc
     GROUP BY station_name
  ) AS stats;
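The challenge’s expected output is exactly the string the STRING_AGG query above builds: stations sorted by name, each formatted as `name=min/mean/max`. As a cross-check, the same aggregation in plain Python (a sketch over the semicolon-delimited input format; note Python’s round() uses banker’s rounding, which can differ from Postgres ROUND() in edge cases):

```python
from collections import defaultdict

def aggregate(lines):
    """Compute min/mean/max per station and format as the 1BRC result string."""
    readings = defaultdict(list)
    for line in lines:
        station, value = line.split(";")
        readings[station].append(float(value))
    parts = []
    for station in sorted(readings):
        vals = readings[station]
        mean = round(sum(vals) / len(vals), 1)
        parts.append(f"{station}={min(vals)}/{mean}/{max(vals)}")
    return "{" + ", ".join(parts) + "}"

print(aggregate(["Abha;12.0", "Abha;14.0", "Helsinki;-3.5"]))
# → {Abha=12.0/13.0/14.0, Helsinki=-3.5/-3.5/-3.5}
```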

-- partitioned by station name
CREATE UNLOGGED TABLE obrc_partitioned
(
  station_name VARCHAR(26),
  measurement  NUMERIC(3,1)
) PARTITION BY hash(station_name);

-- For re-runs: clear any previously loaded data
TRUNCATE obrc_partitioned;

CREATE UNLOGGED TABLE obrc_p0
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 0);

CREATE UNLOGGED TABLE obrc_p1
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 1);

CREATE UNLOGGED TABLE obrc_p2
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 2);

CREATE UNLOGGED TABLE obrc_p3
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 3);

CREATE UNLOGGED TABLE obrc_p4
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 4);

CREATE UNLOGGED TABLE obrc_p5
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 5);

CREATE UNLOGGED TABLE obrc_p6
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 6);

CREATE UNLOGGED TABLE obrc_p7
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 7);

CREATE UNLOGGED TABLE obrc_p8
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 8);

CREATE UNLOGGED TABLE obrc_p9
PARTITION OF obrc_partitioned 
FOR VALUES WITH (modulus 10, remainder 9);
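The ten partition definitions above differ only in the remainder, so they can be generated rather than typed out. A small sketch (the table names match the script above; adjust the modulus to taste):

```python
def partition_ddl(parent="obrc_partitioned", child_prefix="obrc_p", modulus=10):
    """Generate hash-partition DDL matching the hand-written statements above."""
    return [
        f"CREATE UNLOGGED TABLE {child_prefix}{r}\n"
        f"PARTITION OF {parent}\n"
        f"FOR VALUES WITH (modulus {modulus}, remainder {r});"
        for r in range(modulus)
    ]

print("\n\n".join(partition_ddl()))
```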

-- This doesn't make a difference because COPY is 
-- single threaded
COPY obrc_partitioned(station_name, measurement)
 FROM '/tmp/measurements.txt'
 WITH
 (
  FORMAT CSV,
  DELIMITER ';'
 );

EXPLAIN ANALYZE
SELECT
  station_name,
  MIN(measurement) AS min_measurement,
  ROUND(AVG(measurement),1) AS mean_measurement,
  MAX(measurement) AS max_measurement
 FROM obrc_partitioned
 GROUP BY station_name
 ORDER BY station_name;


EXPLAIN ANALYZE
SELECT
  '{' ||
    STRING_AGG(station_name || '=' || min_measurement || '/' || mean_measurement || '/' || max_measurement, ', ' ORDER BY station_name) ||
  '}' AS result
 FROM
  (SELECT station_name,
          MIN(measurement) AS min_measurement,
          ROUND(AVG(measurement), 1) AS mean_measurement,
          MAX(measurement) AS max_measurement
    FROM obrc_partitioned
     GROUP BY station_name
  ) AS stats;


/*
 * DuckDB
 * Stop Postgres container
 */
.duckdb/cli/latest/duckdb
.timer on
 
 SELECT * FROM READ_CSV_AUTO('/tmp/measurements.txt') LIMIT 5;
 
 CREATE OR REPLACE TABLE measurements AS
        SELECT * FROM READ_CSV('/tmp/measurements.txt', 
        		header=false, 
        		columns= {'station_name':'VARCHAR','measurement':'double'}, 
        		delim=';');

SELECT station_name,count(*) FROM measurements
GROUP BY 1;

SELECT station_name, 
           MIN(measurement),
           AVG(measurement),
           MAX(measurement)
    FROM measurements 
    GROUP BY station_name
    LIMIT 5;

   
WITH src AS (SELECT station_name, 
                    MIN(measurement) AS min_measurement,
                    CAST(AVG(measurement) AS DECIMAL(8,1)) AS mean_measurement,
                    MAX(measurement) AS max_measurement
            FROM measurements 
            GROUP BY station_name
            LIMIT 5)
    SELECT station_name || '=' || CONCAT_WS('/',min_measurement, mean_measurement, max_measurement)
    FROM src;
 
   
/*
 * Stop duckdb
 * Start pg_duckdb container
 */
--CALL duckdb.recycle_ddb();
--ALTER SYSTEM SET duckdb.max_memory='15GB';
--SELECT * FROM pg_available_extensions ORDER BY name;
--CREATE EXTENSION pg_duckdb;
--truncate obrc;
--SHOW duckdb.max_memory;

-- Can still use duckdb functions for reading data
SELECT * FROM read_csv('/mnt/1brc/measurements.txt') LIMIT 20;
SELECT count(*) FROM read_csv('/mnt/1brc/measurements.txt');

-- Not supported yet
INSERT INTO obrc
SELECT * FROM read_csv('/mnt/1brc/measurements.txt') r LIMIT 20;

CREATE TEMPORARY TABLE obrc_temp(
	station_name text,
	measurement NUMERIC(10,2)) USING duckdb;

/*
 * also takes ~1.5 minutes
 */
COPY obrc_temp FROM '/mnt/1brc/measurements.txt';

SELECT count(*) FROM obrc_temp;

/*
 * ~7 seconds
 */
SELECT station_name, count(*) FROM obrc_temp
GROUP BY 1;

/*
 * On my Docker image this didn't have an impact
 * for some reason.
 */
SET duckdb.max_workers_per_postgres_scan=12;
SET duckdb.worker_threads=12;
SET duckdb.force_execution=true;

/*
 * We can utilize the native file formats
 */
COPY (SELECT * FROM obrc_temp) TO '/mnt/1brc/measurements.parquet' (FORMAT parquet);

/*
 * Pull station names out for later transformation and JOIN queries
 */
COPY (SELECT DISTINCT station_name FROM obrc_temp) TO '/mnt/1brc/stations.txt';

CREATE TABLE station(
	id INT GENERATED ALWAYS AS IDENTITY,
	station_name text);

COPY station(station_name) FROM '/mnt/1brc/stations.txt';

--DROP TABLE station;
SELECT * FROM station;

/*
 * 12 seconds
 */
--EXPLAIN ANALYZE
SELECT
  id, obrc_temp.station_name,
  MIN(measurement) AS min_measurement,
  ROUND(AVG(measurement),1) AS mean_measurement,
  MAX(measurement) AS max_measurement
 FROM obrc_temp JOIN station USING (station_name)
 GROUP BY id, station_name
 ORDER BY station_name;

SELECT station_name, count(*) FROM obrc_temp
GROUP BY 1;

/*
 * playing with timestamps
 */
CREATE TEMPORARY TABLE obrc_temp2(
	ts timestamptz,
	station_name text,
	measurement numeric(10,2)
) USING duckdb;

WITH dates AS (
	--SELECT o,t FROM generate_series('2024-01-01','2025-12-31','30 seconds'::interval) WITH ORDINALITY AS g(t,o)
	SELECT ROW_NUMBER() OVER() AS o, t FROM generate_series('2024-01-01','2025-12-31','30 seconds'::interval) AS g(t)
),
m AS (
	SELECT ROW_NUMBER() OVER() AS o, station_name, measurement
	FROM obrc_temp
	WHERE station_name='Helsinki'
)
INSERT INTO obrc_temp2 SELECT t, station_name, measurement
FROM dates
JOIN m USING (o);

SELECT time_bucket('1 day', ts) AS bucket, station_name, AVG(measurement)
FROM obrc_temp2
GROUP BY 1,2
ORDER BY 1;

/*
 * OOM error trying to process
 */
SELECT ROW_NUMBER() OVER(PARTITION BY station_name), * FROM obrc_temp LIMIT 10;
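The time_bucket('1 day', ts) aggregation above simply truncates each timestamp to its day bucket before grouping and averaging. The same daily averaging, sketched in Python over (timestamp, station, measurement) tuples (sample data is made up for illustration):

```python
from collections import defaultdict
from datetime import datetime

def daily_avg(rows):
    """Average measurements per (day bucket, station), like time_bucket('1 day', ts)."""
    buckets = defaultdict(list)
    for ts, station, value in rows:
        bucket = ts.date()  # truncate the timestamp to its day bucket
        buckets[(bucket, station)].append(value)
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}

rows = [
    (datetime(2024, 1, 1, 9, 30), "Helsinki", -3.0),
    (datetime(2024, 1, 1, 18, 0), "Helsinki", -1.0),
    (datetime(2024, 1, 2, 12, 0), "Helsinki", 0.5),
]
print(daily_avg(rows))
# → {(datetime.date(2024, 1, 1), 'Helsinki'): -2.0, (datetime.date(2024, 1, 2), 'Helsinki'): 0.5}
```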




/*
 * Stop pg_duckdb container
 * Start pg_mooncake container
 */
CREATE DATABASE obrc;
CREATE EXTENSION pg_mooncake;
DROP TABLE obrc;
CREATE TABLE obrc
(
  station_name VARCHAR(26),
  measurement  NUMERIC(3,1)
);

CALL mooncake.create_table('obrc_iceberg', 'obrc');

-- not supported
ALTER TABLE obrc SET (parallel_workers=16);


-- Not supported yet
INSERT INTO obrc
SELECT station_name,measurement 
FROM mooncake.read_csv('/mnt/obrc/measurements.txt') 
	AS (station_name text, measurement float4) LIMIT 20;


-- Works as normal
-- Back-of-napkin estimate: ~20% faster
COPY obrc(station_name, measurement)
 FROM '/mnt/obrc/measurements.txt'
 WITH
 (
  FORMAT CSV,
  DELIMITER ';'
 );

-- This didn't seem to work while COPY was happening.
SELECT * FROM pg_stat_progress_copy;


select pg_size_pretty(pg_total_relation_size('obrc'));
SELECT count(*) FROM obrc;

 
SELECT * FROM obrc LIMIT 10;

--EXPLAIN ANALYZE
SELECT
  station_name,
  MIN(measurement) AS min_measurement,
  ROUND(AVG(measurement),1) AS mean_measurement,
  MAX(measurement) AS max_measurement
 FROM obrc_iceberg
 GROUP BY station_name
 ORDER BY station_name;

TRUNCATE obrc;

postgres_config.sql script:

ALTER SYSTEM SET max_worker_processes=16;
ALTER SYSTEM SET max_parallel_workers=8;
ALTER SYSTEM SET max_parallel_workers_per_gather=8;
ALTER SYSTEM SET work_mem='1GB';
ALTER SYSTEM SET shared_buffers='20GB';
ALTER SYSTEM SET random_page_cost=1.1;
ALTER SYSTEM SET jit=OFF;

Thursday, September 4 at 15:00 – 15:45

Room: Sophie

Securing Postgres with streaming replication

Join this talk to learn how to secure your database in Postgres using streaming replication, from the very BASIC to the very ADVANCED. After all, even if your database is small, your data might still be precious. And as your project grows, your Postgres environment will need to adapt to scale with it.

This talk will give you a beginner’s guide to streaming replication in Postgres. More specifically, you will learn about WAL (write-ahead logging), how to prepare for DR, and how to use the `pg_receivewal` utility in a large setup with cascading replicas, continuous backup, etc. You’ll also learn about the concepts of RTO and RPO and how to scale your data protection architecture as your application grows.

Why attend this talk? Because the amazing power of streaming replication in PostgreSQL is too under-appreciated.

Based on over two decades of practical experience, you will be guided through some of the pitfalls and get introduced to the full range of Postgres Disaster Resilience techniques.

Thursday, September 4 at 14:10 – 14:55

Room: Sophie

Anatomy of Table-Level Locks in PostgreSQL

In PostgreSQL, managing schema changes without downtime can be a challenging task. Table-level locks, which control access during Data Definition Language (DDL) operations like ALTER or DROP TABLE, can result in unintended application slowdowns or even service interruptions when not fully understood. This talk will provide a comprehensive dive into table-level locking and lock queueing in PostgreSQL, helping attendees gain the insights they need to perform efficient schema changes.

We’ll start by explaining the various types of table-level locks in PostgreSQL such as Access Share, Exclusive, and Access Exclusive and how they are automatically managed during common DDL operations. Then, we’ll break down lock queuing: how PostgreSQL organizes lock requests, what happens when transactions wait for locks, and how deadlocks can arise in complex environments.

Next, we’ll focus on practical approaches to managing table-level locks for near-zero downtime. Attendees will learn techniques to minimize locking impact, including understanding lock conflicts, using online schema migration patterns, and identifying lock-heavy queries. We’ll introduce open-source tools like pgroll, which utilizes the expand/contract pattern to make schema changes in small, lock-free steps.

By the end of this session, attendees will be equipped with practical strategies and knowledge to control lock behavior during schema changes, ensuring data integrity and reducing operational disruptions. This talk will provide the tools needed to manage PostgreSQL schema changes with confidence and minimal impact on production environments.

Thursday, September 4 at 13:20 – 14:05

Room: Sophie

Crash to Recovery: The PostgreSQL DBA’s Emergency Playbook

When a critical production database can’t serve the majority of its users, we can call this a “crisis”. These problems show up in different forms: data corruption, data loss arising from user error, post-change issues (upgrades, patches, migrations, etc.), hardware problems, load issues, and more. It’s crucial to head in the right direction once the “downtime” clock starts ticking. While supporting large customer systems for many years, I experienced these crises first-hand and prepared this presentation to help PostgreSQL DBAs get ready for these unwanted but possible cases.

This session will cover how to overcome crisis situations from both technical and non-technical angles, such as behavioral approaches, problem-solving diplomacy, and crisis-management rules. We will start with example scenarios to understand and accept the risk in every condition. Then, we will cover how to prepare organizationally (preparing a crisis-management plan and technical procedures, performing periodic tests, etc.) and technically (best practices for scenarios like database backup/restore/recovery, failover, and database upgrade/migration). The last part presents golden rules to follow, distilled from tens of real-life lessons learned.

People attending this session will enhance their readiness for possible crisis situations on their database systems. This readiness must include both organizational and technical considerations, and both sides will be explained together.

Thursday, September 4 at 11:25 – 12:10

Room: Sophie

Upgrades don’t have to hurt

PostgreSQL is notorious for having a painful upgrade process. So much so that many users push PostgreSQL way past its end-of-life date, but it doesn’t have to be this way. In this talk, I will take a tour through the tools and tricks a modern DBA can use to get PostgreSQL clusters upgraded with a minimum of hassle, and what you need to look out for to ensure your upgrades go smooth like butter. With the correct procedures, upgrading your PostgreSQL cluster can be something to look forward to rather than something to dread.

Thursday, September 4 at 10:35 – 11:20

Room: Sophie

Operational hazards of managing PostgreSQL DBs over 100TB

Picture this: you start a new role, eager to learn and contribute with your ideas! Your next task is to get familiar with the database setup, and then you start encountering these massive PostgreSQL databases — 100TB, 200TB, 300TB…

And you start questioning yourself: how do you back up (and restore) a 100TB+ database? And what about HA? Upgrades?

It should work the same way as for a 100GB database, right? Well, maybe not exactly.

When reading blog articles, best practices, and “how-to” guides, things seem straightforward. But when you start bending PostgreSQL’s limits, you end up needing to question the most fundamental assumptions about how PostgreSQL works.

Over the last few years, my team at Adyen has been exploring the boundaries of what PostgreSQL can do, and today I will share our findings with you (at least the ones I can!).