[syndicated profile] planetposgresql_feed

PG Day France 2025 took place on June 3 and 4 in Mons, Belgium, organized by Leila Bakkali, Matt Cornillon, Stefan Fercot, Flavio Gurgel, Anthony Nowocien, and Julien Riou, with the help of Jean-Paul Argudo, Laetitia Avrot, Sylvain Beorchia, Damien Clochard, Yves Collin, Bertrand Drouvot, Cédric Duprez, Maxime Longuet, Helene Nguyen, and Anaïs Oberto as volunteers.

Speakers:

[syndicated profile] planetposgresql_feed

Most applications start with a single PostgreSQL database, but over time, the need to scale out, distribute the load, or integrate with other systems naturally arises. PostgreSQL's logical replication is one of the features that meets these demands by streaming row-level changes from one PostgreSQL instance to another, all using a publish-subscribe model. Logical replication is more than an advanced feature; it provides a flexible framework you can build on to further distribute and integrate PostgreSQL within your architecture.

In this article, we will start with the foundation, explore the core ideas behind logical replication, and learn how to use it.

Physical vs. Logical Replication

Before we can dive deeper, let's understand the role of replication in PostgreSQL and how it's built on top of the Write-Ahead Log (WAL).

The WAL is a sequential, append-only log that records every change made to the cluster's data. For durability, all modifications are first written to the WAL and only later applied to the data files on disk. This allows PostgreSQL to recover from crashes by replaying the logged changes.

Versioned changes, necessitated by concurrent transactions, are managed through Multi-Version Concurrency Control (MVCC). Instead of overwriting data directly, MVCC creates multiple versions of rows, allowing each transaction to see a consistent snapshot of the database. It is the WAL that captures these versioned changes along with the transactional metadata to ensure data consistency at any given point in time.

Physical replication is built directly on the Write-Ahead Log. It streams the binary WAL data from the primary server to one or more standby servers (replicas), effectively creating a byte-for-byte copy of the entire cluster. Because a standby applies an exact binary copy, it is read-only, which makes it an ideal candidate for failover or read scaling.

Logical replication, while also built on top of the WAL, takes a fundamentally different approach. Instead of streaming raw WAL data, it decodes the WAL into logical, row-level changes – such as INSERT, UPDATE, and DELETE – and only then sends them to subscribers using a publish-subscribe model. Compared to physical replication, this allows selective replication and writable subscribers that are not strictly tied to a single publisher, which greatly increases the flexibility of possible setups; however, logical replication does not replicate DDL changes.

                      Physical                                     Logical
Data Streamed         Binary WAL segments                          Row-level SQL changes
Scope                 Byte-for-byte stream                         Selected tables
Node Type             Read-only standby                            Fully writable instance
PostgreSQL Version    All servers must match major version         Works across major versions
Database Schema       Changes automatically replicated             Changes must be applied on subscriber(s) separately
Use Case              Failover, high availability, read scaling    Integration, zero-downtime upgrades and schema migrations, complex topologies

Physical replication is your go-to for high availability and disaster recovery, where you want a fast, exact copy of the entire database cluster that can take over in case of failure. It’s simple to set up and very efficient but limited in flexibility.

Logical replication shines when you need fine-grained control over what data is replicated, require writable replicas, or want to integrate PostgreSQL with other systems or versions. It’s ideal for zero-downtime upgrades, multi-region deployments, and building scalable, modular architectures.

Setting up Logical Replication

Enough boring theory for now. To get started, you will need two instances of PostgreSQL – it's up to you whether you provision two virtual machines or two clusters running on the same computer. We will call one the publisher and the other the subscriber.

Preparing the Publisher

First, you need to prepare the publisher to emit logical changes. You will need to modify postgresql.conf (or add a dedicated file under conf.d) with the following parameters:

wal_level = logical
max_replication_slots = 10
max_wal_senders = 10

Your publisher also needs to be reachable by the subscriber, in most cases via a TCP socket.

listen_addresses = '*'

Here, the configuration is:

  • wal_level set to logical is the crucial piece of the configuration. It tells PostgreSQL how much information to write to the WAL – in this case, enough to support logical decoding.
  • max_replication_slots defines the maximum number of replication slots that can be created on the server. Each logical replication subscription needs its own slot.
  • max_wal_senders should be set high enough to accommodate all expected concurrent replication connections from subscribers, since each active subscription consumes a WAL sender process. Keep it greater than or equal to max_replication_slots to avoid replication problems.
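
If you prefer managing settings from SQL rather than editing files by hand, the same parameters can be applied with ALTER SYSTEM (a minimal sketch; wal_level still requires a server restart to take effect):

ALTER SYSTEM SET wal_level = 'logical';
ALTER SYSTEM SET max_replication_slots = 10;
ALTER SYSTEM SET max_wal_senders = 10;
ALTER SYSTEM SET listen_addresses = '*';

-- confirm after restarting
SELECT name, setting, pending_restart
FROM pg_settings
WHERE name IN ('wal_level', 'max_replication_slots', 'max_wal_senders', 'listen_addresses');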

You will also need to configure client authentication in pg_hba.conf to allow the subscriber to connect for replication (for simplification, we allow all users):

# TYPE      DATABASE        USER            ADDRESS                 METHOD
host        replication     all             subscriber_ip/32        scram-sha-256

For the purposes of this article we will assume the use of a superuser, which keeps things convenient:

CREATE USER my_user_name SUPERUSER PASSWORD 'my_secure_password';

It is recommended to use a dedicated replication user for real-life deployments:

CREATE USER replication_user WITH REPLICATION ENCRYPTED PASSWORD 'my_secure_password';

Once configured, restart PostgreSQL for the configuration changes to take effect. After that, connect to the server and prepare the environment.

-- example for psql

CREATE DATABASE logical_demo_publisher;
\c logical_demo_publisher;

Create a sample table and seed initial data:

CREATE TABLE products (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL,
    category TEXT,
    price DECIMAL(10, 2) NOT NULL,
    stock_quantity INT DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    description TEXT,
    is_active BOOLEAN DEFAULT TRUE
);

-- seed some data (10 records)
INSERT INTO products (
    name,
    category,
    price,
    stock_quantity,
    description,
    is_active
)
SELECT
    'Product Batch ' || s.id AS name,
    CASE (s.id % 5)
        WHEN 0 THEN 'Electronics'
        WHEN 1 THEN 'Books'
        WHEN 2 THEN 'Home Goods'
        WHEN 3 THEN 'Apparel'
        ELSE 'Miscellaneous'
    END AS category,
    ROUND((RANDOM() * 500 + 10)::numeric, 2) AS price,
    FLOOR(RANDOM() * 200)::int AS stock_quantity,
    'Auto-generated description for product ID ' || s.id || '. Lorem ipsum dolor sit amet, consectetur adipiscing elit.' AS description,
    (s.id % 10 <> 0) AS is_active
FROM generate_series(1, 10) AS s(id);

The final step for the publisher is to create a publication to define what data we want to publish.

CREATE PUBLICATION my_publication FOR TABLE products;

Preparing the Subscriber

Next, we will use our subscriber instance to receive the logical changes. There's no need to make any configuration changes just for subscribing, since the subscriber does not emit changes itself (note the "just" – a subscriber that also publishes changes would need the publisher settings too).

Then create the target database and schema. Creating the schema manually is an important step, because logical replication does not copy table definitions:

-- example for psql

CREATE DATABASE my_subscriber_db;
\c my_subscriber_db;

CREATE TABLE products (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name TEXT NOT NULL,
    category TEXT,
    price DECIMAL(10, 2) NOT NULL,
    stock_quantity INT DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    description TEXT,
    is_active BOOLEAN DEFAULT TRUE
);

Now we have the foundation for testing logical replication. The simplest way is to create a subscription using connection details and the name of the publication to subscribe to.

CREATE SUBSCRIPTION my_subscription
    CONNECTION 'host=publisher_ip_address port=5432 user=your_replication_user password=my_secure_password dbname=logical_demo_publisher'
    PUBLICATION my_publication;
  • my_subscription should be a descriptive name for your subscription.
  • CONNECTION defines how to connect to the publisher (which is a regular connection string and could also be a service).
  • PUBLICATION specifies the name of the publication you created earlier on the publisher.

If the connection details are correct, the subscription will, by default, start with an initial data sync and then listen for incoming changes. If you have used the queries above, you can validate that the data is now on the subscriber.

# SELECT count(1) FROM products;
 count
---------
    10
(1 row)

The number of records should match the number of records created using generate_series above (10 in our example). You can go ahead and insert a single row again on your publisher instance and validate the data being replicated to the subscriber.
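
For example, a quick end-to-end check might look like this (a sketch using the sample products table from above):

-- on the publisher
INSERT INTO products (name, category, price)
VALUES ('Replication smoke test', 'Electronics', 42.00);

-- on the subscriber, a moment later
SELECT id, name, price FROM products WHERE name = 'Replication smoke test';

-- subscriber-side status of the apply worker
SELECT subname, received_lsn, latest_end_time FROM pg_stat_subscription;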

Congratulations! You have set up your first logical replication in PostgreSQL!

Core Concepts of Logical Replication

That was easy, right? And that's the goal. Now that you've seen logical replication in action, let's delve deeper into the core concepts that made this work.

Publication

As the name implies, a publication is where it all starts. It's essentially a catalogue of data you offer from the publisher. You can publish a number of objects:

--- specific tables
CREATE PUBLICATION my_publication FOR TABLE products, orders;

--- everything in your database (make sure you really want to do this)
CREATE PUBLICATION all_data FOR ALL TABLES;

--- replicate specific columns only (PostgreSQL 15 and higher)
CREATE PUBLICATION generic_data FOR TABLE products (id, name, price);

--- filter published data (PostgreSQL 15 and higher)
CREATE PUBLICATION active_products FOR TABLE products WHERE (is_active = true);

--- filter different operations
CREATE PUBLICATION my_publication FOR TABLE products, orders
WITH (publish = 'insert');

--- or you can mix & match it to get exactly what you need
CREATE PUBLICATION eu_customers FOR
    TABLE customers (id, email, country, created_at)
        WHERE (country IN ('DE', 'FR', 'IT')),
    TABLE orders (id, customer_id, total_amount)
        WHERE (total_amount > 100)
WITH (publish = 'insert, update');

As you can see, logical replication really gives you quite a lot of options and is far from the rigid, byte-for-byte copying of physical replication. This is exactly the characteristic that allows you to build complex topologies.

Once you set up your publication(s), you can review them:

--- in psql
\dRp
\dRp+ my_publication

--- using pg_catalog
SELECT * FROM pg_publication_tables WHERE pubname = 'my_publication';

From these options, it's easy to think of a publication as a customisable data feed.
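
Publications can also evolve after creation, so you rarely need to drop and recreate them. A sketch of common adjustments (note that existing subscribers must run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to pick up newly added tables):

--- add or remove tables from an existing publication
ALTER PUBLICATION my_publication ADD TABLE orders;
ALTER PUBLICATION my_publication DROP TABLE orders;

--- or replace the table list entirely
ALTER PUBLICATION my_publication SET TABLE products, orders;

--- change which operations are published
ALTER PUBLICATION my_publication SET (publish = 'insert, update, delete');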

Subscription

The other side of logical replication is a subscription. It defines what and how to consume events from a publisher. You subscribe to a publication using connection details and a number of options.

--- basic subscription with full connection detail
CREATE SUBSCRIPTION my_subscription
    CONNECTION 'host=publisher_host port=5432 user=repl_user password=secret dbname=source_db'
    PUBLICATION my_publication;

-- subscription using service definition
CREATE SUBSCRIPTION realtime_only
    CONNECTION 'service=my_source_db'
    PUBLICATION my_publication;

The default behaviour of the subscription is to copy existing data, start immediately, and continue with streaming data changes. Once you start experimenting with logical replication, you can control that behaviour.

--- subscribe without initial copy of the data
CREATE SUBSCRIPTION streaming_only
    CONNECTION 'service=my_source_db'
    PUBLICATION my_publication
    WITH (copy_data = false);

--- or defer the start for later
CREATE SUBSCRIPTION manual_start
    CONNECTION 'service=my_source_db'
    PUBLICATION my_publication
    WITH (enabled = false);

You can monitor your subscriptions:

--- in psql
\dRs
\dRs+ my_subscription

--- or their status using pg_catalog
SELECT subname, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;
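
If the publisher and subscriber clocks are reasonably in sync, a rough idea of apply lag can be derived from the same view (a sketch; latest_end_time can be NULL while the initial sync is still running):

SELECT subname,
       now() - latest_end_time AS approx_apply_lag
FROM pg_stat_subscription;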

Replication Slot

Publications and subscriptions establish the data flow, but there's a critical piece missing: how does the publisher keep track of multiple subscribers reading the WAL at different speeds? Replication slots are the answer. They act as a persistent bookmark in the WAL stream that tracks exactly where each subscriber is (or was, the last time it connected), ensuring no changes are lost even if the connection drops.

Let's have a look at what a replication slot can tell us about itself.

# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+---------------
slot_name           | my_subscription
plugin              | pgoutput
slot_type           | logical
datoid              | 16390
database            | my_db_source
temporary           | f
active              | t
active_pid          | 3162
xmin                |
catalog_xmin        | 777
restart_lsn         | 0/1DEFBF0
confirmed_flush_lsn | 0/1DEFC28
wal_status          | reserved
safe_wal_size       |
two_phase           | f
inactive_since      |
conflicting         | f
invalidation_reason |
failover            | f
synced              | f

The most interesting attributes are:

  • slot_name
  • plugin used for logical replication
  • slot_type confirming it's for logical replication
  • active indicates whether there's a subscriber reading from this slot

The most important of these are:

  • restart_lsn identifies the LSN from which this slot still "holds" WAL files, preventing them from being released.
  • confirmed_flush_lsn is the last LSN position the subscriber confirmed it has successfully processed (i.e., applied).

While we won't go into details about WAL, it's enough to say LSN (Log Sequence Number) is a unique address or position in the WAL that identifies the position in the stream of database changes.

In most cases, you don't have to create replication slots manually – subscriptions create and manage them. The persistent nature of the slots guarantees they survive the restart of both publisher and subscriber.

Please note: if a subscription is not actively consuming changes, the publisher won't release WAL files containing changes the slot has not yet consumed. This prevents data loss, but can easily fill up the disk if any subscriber falls too far behind.

You can monitor the WAL retention for each replication slot.

# SELECT
    slot_name,
    active,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS wal_retained
FROM pg_replication_slots;

-[ RECORD 1 ]+--------------------
slot_name    | my_subscription
active       | t
wal_retained | 265 MB

Replication Identity

As we have already covered, logical replication works with row-level changes, such as INSERT, UPDATE, and DELETE. And not all operations are equal. While INSERT is relatively straightforward (a new row is sent to the subscriber), for both UPDATE and DELETE operations, PostgreSQL needs to be able to uniquely identify the target row to modify on the subscriber.

This is managed by the REPLICA IDENTITY property of each published table. By default it uses the table's primary key (your table has one, right?); alternatively, you can point it at a unique index with USING INDEX, or specify FULL (generally to be avoided), which uses all of the old row's column values to find the target row. Please note that if a table has no primary key and no replica identity configured, UPDATE and DELETE operations on the published table will fail on the publisher.

Whenever possible, please always:

  • Use REPLICA IDENTITY DEFAULT (the primary key) – the most efficient choice. If your table doesn't have a primary key, consider adding one (a surrogate key is always an option).
  • Use REPLICA IDENTITY USING INDEX index_name when a smaller or better-suited unique index matches the business logic or architecture of your database model (as sketched below).
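
A minimal sketch of how those choices translate to SQL (products_name_key is a hypothetical unique index):

-- rely on the primary key (the default behaviour)
ALTER TABLE products REPLICA IDENTITY DEFAULT;

-- or point logical replication at a specific unique index
ALTER TABLE products REPLICA IDENTITY USING INDEX products_name_key;

-- check the current setting: d = default, i = index, f = full, n = nothing
SELECT relname, relreplident FROM pg_class WHERE relname = 'products';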

There are special cases of what not to do when it comes to replication identity, but we will touch on those cases in the advanced section of this guide.

Schema Changes

Although it was already mentioned, it's vital to reiterate that logical replication, as it streams only row-level changes rather than full WAL segments, does not automatically replicate Data Definition Language (DDL) changes.

Any schema changes made on the publisher must be manually applied to all subscribers before data reflecting the change is replicated.

We will cover schema specifically in later parts of this guide, but to sum it up, this is the recommended order of operations when it comes to schema changes:

  1. (Optional, but recommended for breaking changes) Pause replication where applicable.
  2. Apply and verify the DDL changes on the subscriber(s).
  3. Apply and verify the DDL changes on the publisher.
  4. Resume replication if applicable.

While this might seem like a major drawback compared to physical replication, it's the characteristic that gives us the most flexibility.

Other Considerations

As logical replication reliably handles row-level changes, it's crucial to understand other specific limitations and behaviours. Specifically:

  • It's important to understand that sequences are not replicated. While a sequence is used to generate values on the publisher, and its value advances there, the corresponding sequence (if it exists) on the subscriber won't be updated (one way to re-align it manually is sketched after this list).
  • Other database objects aren't replicated either. While this might be expected for views, stored procedures, triggers, and rules, you also can't rely on it for materialized views.
  • Special consideration (similar to schema changes) must be paid to user-defined data types. If a replicated table uses a user-defined type (e.g., a column using an ENUM), the type must already be available on the subscriber(s) and have exactly the same name and structure. As ENUMs are represented as ordered sets, even the order of the values matters!
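
For the sequence point above, a common workaround is to re-align the subscriber's sequence manually, for example before promoting the subscriber to take writes. A sketch for the products table (pg_get_serial_sequence also works for identity columns):

-- run on the subscriber
SELECT setval(
    pg_get_serial_sequence('products', 'id'),
    COALESCE((SELECT max(id) FROM products), 1)
);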

Living with Logical Replication

In our previous example, we set up the publisher and subscriber, relied on the initial copy of the data, and let things run. Both of them will survive a restart, and thanks to the replication slot, they will keep going as soon as they come online.

But what if you need to perform maintenance or do regular things like a schema update on the subscriber or the publisher? Sometimes you might need to temporarily stop the replication. PostgreSQL makes this straightforward with subscription management, allowing you to stop consuming the changes.

ALTER SUBSCRIPTION my_subscription DISABLE;

As we hinted earlier, when disabled, the subscription stops consuming changes from the publisher, while keeping the replication slot. This means:

  • No new changes are applied on the subscriber.
  • WAL files will start to accumulate on the publisher (since the replication slot holds its position).

While the subscription is disabled, there is not much status to check on the subscriber; the relevant information lives with the replication slot on the publisher. You can use the query we showed above on the publisher to check the status.

SELECT
    slot_name,
    active,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS wal_retained
FROM pg_replication_slots;

When ready, you can resume the subscription.

ALTER SUBSCRIPTION my_subscription ENABLE;

Coordinating Schema Changes

As we mentioned earlier, since logical replication does not automatically replicate DDL changes, you need to coordinate schema updates manually. Here's a basic overview of how to perform such a change, step by step.

First, disable the replication on the subscriber and apply your schema changes.

--- disable replication
ALTER SUBSCRIPTION my_subscription DISABLE;

--- apply changes on the subscriber
ALTER TABLE products ADD COLUMN category_id INTEGER;
CREATE INDEX products_by_category ON products(category_id);

Only then can you apply the changes to the publisher.

ALTER TABLE products ADD COLUMN category_id INTEGER;
CREATE INDEX products_by_category ON products(category_id);

And resume the replication.

ALTER SUBSCRIPTION my_subscription ENABLE;

If you applied the schema change to the publisher first, any new data using the new column would fail to replicate to the subscriber that doesn't have the column yet.

Handling Incompatible Changes

PostgreSQL's default conflict resolution is very simple. When a conflict occurs, logical replication stops, and the subscription enters an error state. For example, if a product row with the same primary key already exists on the subscriber, you will see something like this in the log files:

ERROR: duplicate key value violates unique constraint "products_pkey"
DETAIL: Key (id)=(452) already exists.
CONTEXT:  processing remote data for replication origin "pg_16444" during message type "INSERT" for replication target relation "public.products" in transaction 783, finished at 0/1E020E8
LOG:  background worker "logical replication apply worker" (PID 457) exited with exit code 1

As you can see, logical replication stopped with an error and won't continue until you manually resolve the conflict. To do so, you need to identify the conflicting data on the subscriber and decide what to do with it.

-- option 1: remove the conflicting data
DELETE FROM products WHERE id = 452;

-- option 2: update the data to match the expected state
UPDATE products SET name = 'New Product', price = 29.99 WHERE id = 452;
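
On PostgreSQL 15 and newer, there is a third option: skip the entire conflicting transaction using the "finished at" LSN reported in the error message. Use this with care, since every change in the skipped transaction is lost on the subscriber; a sketch based on the log output above:

-- option 3: skip the whole remote transaction (PostgreSQL 15+)
ALTER SUBSCRIPTION my_subscription SKIP (LSN = '0/1E020E8');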

Only after the conflict is resolved can logical replication continue. This was only a very simple example of a conflict that might arise during logical replication. Other examples might involve:

  • Data problems, constraint violations, or type mismatch
  • Schema conflicts (missing or renamed table/column)
  • Permissions issues or row-level security

In all those cases, you still need to go and fix the problem before logical replication can resume.

Wrap Up

Logical replication in PostgreSQL opens up powerful possibilities beyond traditional physical replication. We have covered the foundations that can get you started with setting up your own logical replication environment, including understanding the differences between physical and logical replication, configuring publishers and subscribers, and managing core components like publications, subscriptions, and replication slots.

This article is part of the upcoming guide Mastering Logical Replication in PostgreSQL. If you are interested in the topic, please consider subscribing to get the latest articles as they are published.

[syndicated profile] planetposgresql_feed

One of the things I admire most about PostgreSQL is its ease of getting started.

I have seen many developers and teams pick it up, launch something quickly, and build real value without needing a DBA or complex tooling. That simplicity is part of what makes PostgreSQL so widely adopted.

However, over time, as the application grows and traffic increases, new challenges emerge. Queries slow down, disk usage balloons, or a minor issue leads to unexpected downtime.

This is a journey I have witnessed unfold across many teams. I don’t think of it as a mistake or an oversight; it is simply the natural progression of a system evolving from development to production scale.

The idea behind this blog is to help you assess your current situation and identify steps that can enhance the robustness, security, and scalability of your PostgreSQL deployment.

1. Architecture: Is Your Deployment Designed to Withstand Failure?

As systems grow, so does the need for resilience. What worked fine on a single node during development might not hold up in production.

Questions to ask:

  • Are you still on a single-node setup?
  • Do you have at least one streaming replica?
  • Is failover possible — and tested?

Setting up high availability is about pre-emptive measures to ensure that your users continue to be serviced even in the face of software or hardware failures. Even a basic primary-replica setup can make a big difference. Add a failover tool like Patroni or repmgr, and you are well on your way to building a more resilient PostgreSQL foundation.

2. Configuration: Is PostgreSQL Tuned for Your Workload?

PostgreSQL’s defaults are intentionally conservative — they prioritize compatibility, not performance. That is great for getting started, but less ideal for scaling.

What to look for:

  • Is shared_buffers still set to 128MB?
  • Have you tuned work_mem or maintenance_work_mem based on query complexity?
  • Are your autovacuum settings helping or hurting your performance?

In many consulting engagements, tuning these parameters is the first step we take — and it often yields immediate, measurable improvements.

If your workload is increasing, your configuration should grow accordingly. This tuning is not just for performance — it is for predictability and peace of mind.
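
As a starting point, the relevant settings can be inspected straight from SQL (a sketch; adjust the parameter list to the knobs you care about):

SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem',
               'autovacuum_naptime', 'autovacuum_vacuum_scale_factor');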

3. Security: Is Your Database Properly Locked Down?

Security often takes a back seat when speed is the priority. That is understandable, but once your system handles sensitive data or is exposed to the internet, you need stronger guardrails.

Areas to review:

  • Is SCRAM-SHA-256 authentication enabled?
  • Are you using SSL/TLS for encrypted connections?
  • Are roles and privileges clearly defined?

If your current authentication method still relies on trust or MD5, or if all apps use the same superuser login, it may be time to revisit your setup.

Even small changes, such as separating roles or enabling detailed logging with pgaudit, can make a significant difference.

4. Backups and Recovery: Can You Restore with Confidence?

Backups are often assumed to be “handled” — until you need them.

Healthy backup practices include:

  • Using physical backups with pg_basebackup or pgBackRest
  • Archiving WAL files for point-in-time recovery
  • Testing restores regularly (not just taking backups)

We often guide teams through recovery simulations as part of our PostgreSQL consulting. Many are surprised by how long recovery takes, or to find that a backup is sometimes just plain unusable.

Having a backup is not the same as being able to restore reliably. In production, the distinction matters.
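
If you archive WAL for point-in-time recovery, the archiver's health is also worth checking regularly (a sketch; a growing failed_count deserves immediate attention):

SELECT archived_count, last_archived_time, failed_count, last_failed_time
FROM pg_stat_archiver;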

5. Monitoring and Observability: Are You Seeing What Matters?

Monitoring goes beyond CPU and memory. A production-grade PostgreSQL deployment needs visibility into database health, query performance, and replication status.

Recommended stack:

  • Prometheus with postgres_exporter for metrics
  • Grafana for dashboards and alerts
  • Log analysis tools like pgBadger

You do not need everything on day one, but you should have visibility into slow queries, replication lag, vacuum activity, and disk usage. Observability helps you identify and resolve problems before they impact users.

6. Performance: Is Your Database Keeping Up with Your Growth?

Performance degradation tends to creep in slowly, and it is usually not the fault of PostgreSQL itself.

Common causes we see:

  • Missing or misused indexes
  • Slow JOINs on large datasets
  • Ineffective partitioning strategies

If you are not regularly reviewing query plans with EXPLAIN (ANALYZE) or tracking unused indexes, there is likely room for improvement.
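
As one concrete starting point, candidate unused indexes can be listed from the statistics views (a sketch; treat the results as candidates only, since statistics resets and replica-only reads can make idx_scan misleading):

SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;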

Some performance issues require deep analysis, but many are fixable with good indexing and tuning strategies. This is where targeted PostgreSQL consulting can deliver immediate value.

7. Maintenance and Upgrades: Are You Staying Current?

Keeping PostgreSQL healthy requires regular upkeep, not just when something breaks.

Things to check:

  • Are you running a supported version?
  • Are you on the latest minor version?
  • Is your upgrade process documented and tested?

A smooth upgrade path reduces risk and gives you access to performance improvements and security fixes. We have helped many teams upgrade from legacy versions — often with less friction than they expected.

The key is to plan, test, and document. Maintenance is less stressful when it is routine.

8. Support: Do You Have a Trusted PostgreSQL Partner?

You do not need a full-time DBA to run PostgreSQL in production, but you do need support when issues arise.

If you’re relying on community forums during an outage, consider seeking outside help.

We work with teams that want:

  • A second opinion on architecture or performance
  • Help tuning and scaling their workload
  • Peace of mind through health checks and 24/7 support

Even a short consulting engagement can provide clarity and help your team move forward with confidence.

Final Word: A Healthy PostgreSQL Deployment Is Built Over Time

If you are reading this and realize that your setup does not yet cover certain things, that is entirely okay.

No one gets everything right on their first try. Most PostgreSQL deployments evolve organically, and what starts as a simple instance often grows into something much more critical. Recognizing when it is time to harden your setup is a sign of maturity, not failure.

This checklist is intended to serve as a guide. Use it to reflect on the current state of your PostgreSQL deployment and identify areas that may require additional attention, such as implementing more robust backups, optimizing parameters, strengthening security, or establishing a comprehensive failover plan.


The post Checklist: Is Your PostgreSQL Deployment Production-Grade? appeared first on Stormatics.

Ian Barwick: PgPedia Week, 2025-06-08

Jun. 10th, 2025 10:11 am
[syndicated profile] planetposgresql_feed
PostgreSQL 18 changes this week

Of note:

  • new GUC log_lock_failure renamed to log_lock_failures
  • commit e5a3c9d9 ("postgres_fdw: Inherit the local transaction's access/deferrable modes.") was reverted

PostgreSQL 18 articles

  • What's New in PostgreSQL 18 - a DBA's Perspective (2025-05-23) - Tianzhou

more...

[syndicated profile] planetposgresql_feed

pgsql_tweaks is a bundle of functions and views for PostgreSQL

The source code is available on GitLab, a mirror is hosted on GitHub.
One could install the whole package, or just copy what is needed from the source code.

The extension is also available on PGXN.

General changes

PostgreSQL 18 Support

No code has been changed. Tests against PostgreSQL 18 beta 1 have been successful.

Therefore, PostgreSQL 18 is now supported by pgsql_tweaks.

The reason it took some time to test against the upcoming version is that there were some problems getting the Docker containers to run PostgreSQL 18; changes in that version made it a bit hard to get it running. I always execute the tests against Docker containers.

[syndicated profile] planetposgresql_feed

Regular expressions and PostgreSQL have been a great team for many, many years. The same is true for PostgreSQL arrays, which have been around for a long time as well. However, what people rarely do is combine the two into something more powerful that can be used for various purposes.

Using ANY and ALL

One of the most widely adopted ways of dealing with arrays is the idea of ANY and ALL, which has been supported by PostgreSQL since God knows when. Those two keywords allow us to figure out if a certain value and an array are a match.

Here are some examples:

```sql
test=# SELECT array_agg(x) FROM generate_series(1, 5) AS x;
  array_agg  
-------------
 {1,2,3,4,5}
(1 row)

test=#  SELECT 3 = ANY(array_agg(x)) 
	FROM   generate_series(1, 5) AS x;
 ?column? 
----------
 t
(1 row)

test=#  SELECT 3 = ALL(array_agg(x)) 
	FROM   generate_series(1, 5) AS x;
 ?column? 
----------
 f
(1 row)
```

The first statement simply generates an array of numbers, which I use here to demonstrate how ANY and ALL work. The idea is simple: ANY checks whether at least one of the values in the array matches, while ALL checks whether all of them do. So far this is quite common.

However, what happens if we try to apply this concept to regular expressions?

PostgreSQL and regular expressions

Many readers might be surprised to learn that combining those two techniques is indeed possible and actually fairly straightforward. Note that in the example above ANY and ALL were essentially used in combination with the = operator. However, we can also apply the ~ operator, which is the PostgreSQL way of handling regular expressions:

```sql
test=# SELECT 'my fancy string' ~ '.*ancy.*ri.+$';
 ?column? 
----------
 t
(1 row)
```

What it essentially does is match the regular expression on the right-hand side of the operator against the string on the left. So far so good, but what happens if we expand on this a bit?

Here is what we can do:

```sql
test=# SELECT   array_agg(exp)
       FROM (
	 SELECT  '.*SELECT.*' AS exp
	 UNION ALL
	 SELECT  '.*HAVING.*'
	 UNION ALL
	 SELECT  '.*GROUP\s+BY.*'
	) AS x;
		array_agg                 
-------------------------------------------
 {.*SELECT.*,.*HAVING.*,".*GROUP\\s+BY.*"}
(1 row)
```

What we have just created is an array of regular expressions that we can easily apply:

```sql
test=# SELECT 'SELECT name, count(*) 
	 FROM   tab
	 GROUP BY 1
	 HAVING count(*) > 2' ~ALL(array_agg(exp))
FROM (
	SELECT  '.*SELECT.*' AS exp
	UNION ALL
	SELECT  '.*HAVING.*'
	UNION ALL
	SELECT  '.*GROUP BY.*'
     ) AS x;
 ?column? 
----------
 t
(1 row)
```

Voilà, we can see that all expressions have matched successfully. All we had to do was utilize ~ALL instead of =ALL to do the trick. While this might be obvious to some, it does not seem to be common knowledge out there.
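
For completeness, the same trick works with ~ANY when it's enough for at least one expression in the array to match. A quick sketch:

```sql
test=# SELECT 'SELECT 1' ~ ANY(ARRAY['.*SELECT.*', '.*HAVING.*']);
 ?column? 
----------
 t
(1 row)
```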

Finally ...

If you want to learn more about pattern matching and fuzzy search, consider reading my blog post dealing with this topic and hopefully learn more about some of those techniques.

The post Using regular expressions and arrays in PostgreSQL appeared first on CYBERTEC PostgreSQL | Services & Support.

[syndicated profile] planetposgresql_feed

Introduction

In this post, we’ll cover some database design principles and package them up into a catchy mnemonic acronym.

Software engineering is loaded with acronyms like this. For example, the SOLID acronym describes five principles – Single responsibility, Open-closed, Liskov substitution, Interface segregation, and Dependency inversion – that promote good object-oriented design.

Databases are loaded with acronyms, for example “ACID” for the properties of a transaction, but I wasn’t familiar with one the schema designer could keep in mind while they’re working.

Thus, the motivation for this acronym was to help the schema designer, by packaging up some principles of good design practices for database schema design. It’s not based in research or academia though, so don’t take this too seriously. That said, I’d love your feedback!

Let’s get into it.

Picking a mnemonic acronym

In picking an acronym, I wanted it to be short and have each letter describe a word that’s useful, practical, and grounded in experience. I preferred a real word for memorability!

The result was “CORE.” Let’s explore each letter and the word behind it.

Constraint-Driven

The first word (technically two) is "constraint-driven." Relational databases offer rigid structures, but also the ability to be changed while online – a form of flexibility in their evolution. We evolve their structure through DDL, changing data types and constraints as new entities and relationships are added.

Constraint-driven refers to leveraging all the constraint objects available, designing for our needs today, but also, in a more general sense, applying restrictions to designs in the pursuit of data consistency and quality.

Let’s look at some examples. Choose the appropriate data types, like a numeric data type and not a character data type when storing a number. Use NOT NULL for columns by default. Create foreign key constraints for table relationships by default.

Validate expected data inputs using check constraints. For small databases, use integer primary keys. If the data gets huge later, the structure and data can be migrated.
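
A small sketch of what that looks like in practice, using a hypothetical orders table (and assuming a customers table already exists):

CREATE TABLE orders (
    id           integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id  integer NOT NULL REFERENCES customers (id),
    status       text NOT NULL DEFAULT 'new'
                 CHECK (status IN ('new', 'paid', 'shipped', 'cancelled')),
    total_cents  integer NOT NULL CHECK (total_cents >= 0),
    created_at   timestamptz NOT NULL DEFAULT now()
);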

The mindset is to prefer rigidity, design for today, then leverage the ability to evolve flexibly later, as opposed to designing for a hypothetical future state.

Optimized

Databases present loads of optimization opportunities. Relational data is initially stored in a normalized form to eliminate duplication, but later denormalizations can be performed when read access is more important than eliminating duplication.

When our use cases are not known at the outset, plan to iterate on the design, changing the structure to better support the use cases that have become better known over time. This will mean evolving the schema design through DDL changes.

This applies to tables, columns, constraints, indexes, parameters, queries, and anything that can be optimized to better support real use cases.

Queries are restructured and indexes are added to reduce data access. Pursue highly selective access – a small proportion of rows – on high-cardinality (mostly unique) data to reduce latency.

Critical background processes like VACUUM get optimized too. Resources (workers, memory, parallelization) are increased to support use cases.

Responsive

When problems emerge – unexpected data at the column or row level, missing referential integrity, or query performance issues – engineers inspect logs, catalog statistics, and parameters from the core engine and third-party extensions to diagnose them.

When DDL changes are ready, the engineer applies them in a non-blocking way, in multiple steps as needed. Operations are performed “online” by default when practical.

DDL changes are in a source code file, reviewed, tracked, and a copy of the schema design is kept in sync across environments.

Parameter (GUC) tuning (Postgres: work_mem, etc.) happens in a trackable way. Parameters are tuned online when possible, and scoped narrowly, to optimize their values for real queries and use cases.

Efficient

It’s costly to store data in the database! The data consumes space and accessing data unnecessarily adds latency.

Data that’s stored is queried later or it’s archived. There’s a continual process to consider use cases, and archive any unneeded data.

To minimize space consumption and latency, tables, columns, constraints, and indexes are removed continually by default, when they no longer are required, to reduce system complexity.

Server software is upgraded at least annually so that performance and security benefits can be leveraged.

Huge tables are split into smaller tables using table partitioning for more predictable administration.

CORE Database Design

There’s lots more to evolving a database schema design, but these principles are a few I keep in mind.

Did you notice anything missing? Do you have other feedback? Please contact me with your thoughts.

Thank You

Over the years, I’ve learned a lot from Postgres.fm hosts Nikolay and Michael, and other community leaders like Lukas and Franck, as they’ve shaped my database design choices.

I’m grateful to them for sharing their knowledge and experience with the community.

Thanks for reading!

[syndicated profile] planetposgresql_feed
PostgreSQL Person of the Week Interview with Teresa Lopes: I am Teresa Lopes, born in Lisbon, Portugal. While living in Portugal, I always divided my time between Lisbon (house/school/work) and Sertã, Castelo Branco (weekends/holidays). At the end of 2023, I decided to move to Berlin and more recently relocated to Amsterdam (got to love the EU 💙💛).
[syndicated profile] planetposgresql_feed

The Call for Presentations for PostgreSQL Conference Europe that will take place in Riga, Latvia from October 21 to 24 is now open.

Key dates

  • Submission deadline: 30 June (at midnight, local time Riga, Latvia; GMT+3)
  • Speakers notified: before 1 August
  • Conference: 21-24 October

To submit your proposals and for more information see our website. Speakers can submit up to three (3) proposed talks each.

Session format

There will be a mix of shorter (25 min) and longer (45 min) sessions, which will be held in English. They may be on any topic related to PostgreSQL.

Selection process

The proposals will be considered by a committee, which will produce a schedule to be published nearer the conference date. The members of the talk selection committee will be listed soon on the conference website.

For more details about sponsoring the event, see the website.

For any questions, contact us at contact@pgconf.eu.

[syndicated profile] planetposgresql_feed

I recently had a customer that wanted to leverage read replicas to ensure that their read queries would not be impeded by work being done on the primary instance; they also required an SLA of, at worst, a few seconds. Ultimately they weren't meeting the SLA, and my colleagues and I were asked to look at what was going on.

The first thing we came to understand is that the pattern of work on the primary is a somewhat frequent large DELETE statement followed by a data refresh accomplished by a COPY from STDIN command against a partitioned table with 16 hash partitions.

The problem being observed was that periodically the SELECTs occurring on the read replica would time out and not meet the SLA. Upon investigation, we found that the "startup" process on the read replica would periodically request an exclusive lock on some random partition. This exclusive lock would block the SELECT (which is partition unaware) and cause the timeout. But what was causing the lock?

After spending some time investigating, the team was able to correlate the exclusive lock with a routine autovacuum occurring on the primary. But why was it locking? After inspecting the WAL, it turned out that the issue was due to a step in the vacuum process whereby it tries to return free pages at the end of the table back to the OS – truncation of the High Water Mark (HWM). Essentially the lock is requested on the primary and then transmitted to the replica via the WAL so that the tables can be kept consistent.

To confirm that it was in fact the step in VACUUM that truncates the HWM, we decided to alter each partition of the table to allow VACUUM to skip that step:

ALTER TABLE [table name / partition name] SET (vacuum_truncate = false);
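
Since the table has 16 hash partitions, it's worth confirming the option actually landed on every one of them. A sketch using the catalogs (large_table is the table name used in the examples later in this post):

-- list each partition and its storage options
SELECT c.relname, c.reloptions
FROM pg_class c
JOIN pg_inherits i ON i.inhrelid = c.oid
WHERE i.inhparent = 'public.large_table'::regclass;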

After letting this run for 24 hours, we in fact saw no further blocking locks causing SLA misses on the replicas. Should we worry about shrinking the High Water Mark (HWM)? Well, as with everything in IT, it depends. Other DBMS engines like Oracle do not shrink the HWM automatically; maintenance operations such as DBMS_REDEF or ALTER TABLE … SHRINK SPACE / SHRINK SPACE COMPACT typically deal with that. So, now that we are talking about PostgreSQL, do we need to worry about it? This is where the pg_freespacemap extension can help. We can use this extension and a script to check whether the HWM is in fact growing or staying put. If it is growing, we can just execute a regular VACUUM with an additional option called TRUNCATE to handle it:

VACUUM (verbose, truncate) [schema].[table name];

When you do this, you will see one additional message in the VACUUM output signifying that the VACUUM truncated the High Water Mark (HWM):

INFO:  table "large_table": truncated 302534 to 302233 pages

As I stated earlier, we can use pg_freespacemap to see if we actually need to worry about the High Water Mark (HWM) growing. I could have taken a lot of time to write a script to figure it out, but instead, I enlisted Google Gemini to see what it would come up with. After a few iterations, the output was nearly perfect!

CREATE EXTENSION pg_freespacemap;

CREATE OR REPLACE FUNCTION show_empty_pages(p_table_name TEXT)
RETURNS VOID AS $$
DECLARE
    -- Core processing variables
    table_oid_regclass  REGCLASS;
    block_size          BIGINT;
    fsm_granularity     BIGINT;
    max_fsm_free_space  BIGINT;
    total_pages         BIGINT;
    high_water_mark     BIGINT := 0;

    -- Variables for the final summary
    first_empty_block   BIGINT;
    free_pages_at_end   BIGINT;
    free_space_at_end   TEXT;
BEGIN
    -- Setup
    table_oid_regclass := p_table_name::regclass;
    block_size  := current_setting('block_size')::bigint;
    SELECT relpages INTO total_pages FROM pg_class WHERE oid = table_oid_regclass;
    fsm_granularity    := block_size / 256;
    max_fsm_free_space := floor((block_size - 24) / fsm_granularity) * fsm_granularity;

    --------------------------------------------------------------------------------
    -- PASS 1: FIND THE HIGH-WATER MARK (last page with data)
    --------------------------------------------------------------------------------
    FOR i IN REVERSE (total_pages - 1)..0 LOOP
        IF pg_freespace(table_oid_regclass, i) < max_fsm_free_space THEN
            high_water_mark := i;
            EXIT;
        END IF;
    END LOOP;

    --------------------------------------------------------------------------------
    -- FINAL STEP: CALCULATE AND RAISE THE SUMMARY NOTICE
    --------------------------------------------------------------------------------
    first_empty_block := high_water_mark + 1;
    free_pages_at_end := total_pages - first_empty_block;
    IF free_pages_at_end < 0 THEN
        free_pages_at_end := 0;
    END IF;
    free_space_at_end := pg_size_pretty(free_pages_at_end * block_size);

    RAISE NOTICE '-------------------------------------------------------------';
    RAISE NOTICE 'Summary for table: %', p_table_name;
    RAISE NOTICE '-------------------------------------------------------------';
    RAISE NOTICE 'The High Water Mark (HWM) is at page: %', total_pages;
    IF total_pages <> first_empty_block THEN
    	RAISE NOTICE 'First potentially empty page is at: %', first_empty_block;
    	RAISE NOTICE 'Total Pages in Table: %', total_pages;
    	RAISE NOTICE 'Number of potentially truncatable pages at the end: %', free_pages_at_end;
    	RAISE NOTICE 'Amount of free space at the end of the table: %', free_space_at_end;
    ELSE
    	RAISE NOTICE 'There are no empty pages to truncate';
    END IF;
    RAISE NOTICE '-------------------------------------------------------------';
END;
$$ LANGUAGE plpgsql;

This handy script could be periodically executed to check the High Water Mark (HWM) and will produce the following output:

(postgres@10.3.1.17:5432) [postgres] > SELECT * FROM show_empty_pages('public.large_table');
NOTICE:  -------------------------------------------------------------
NOTICE:  Summary for table: public.large_table
NOTICE:  -------------------------------------------------------------
NOTICE:  The High Water Mark (HWM) is at page: 302534
NOTICE:  First potentially empty page is at: 302233
NOTICE:  Total Pages in Table: 302534
NOTICE:  Number of potentially truncatable pages at the end: 301
NOTICE:  Amount of free space at the end of the table: 2408 kB
NOTICE:  -------------------------------------------------------------

If there is no freespace after the last full block the output will look like this:

NOTICE:  -------------------------------------------------------------
NOTICE:  Summary for table: public.large_table
NOTICE:  -------------------------------------------------------------
NOTICE:  The High Water Mark (HWM) is at page: 302233
NOTICE:  There are no empty pages to truncate
NOTICE:  -------------------------------------------------------------

So while there is no single right answer on how to deal with this, make sure you know the implications of each step in the process. In this case, we decided to set the vacuum_truncate option to false, but another option might be to tune vacuum in a different way, such as making it more or less frequent. Always evaluate your own situation, but in any case it's always good to know what happens in your database when certain commands are executed.

Enjoy!

[syndicated profile] planetposgresql_feed

Chicago is better in summer! Our inaugural event, PG DATA 2026, will kick off 363 days from today, and we want you to be a part of it!

PG DATA’s mission is to encourage the growth of the PostgreSQL community in the Midwest through learning and networking opportunities, to foster collaboration with the developers’ community and academia, and to bring the best global Postgres speakers to our local community.

More details are coming soon! Watch for further announcements, and do not plan your vacation for the first week of June 2026. Alternatively, make Chicago your vacation destination—our beautiful city has a lot to offer, especially in summer!

Call for Sponsors is open!

[syndicated profile] planetposgresql_feed

Logical replication is a versatile feature offered in PostgreSQL. I have discussed the theoretical background of this feature in detail in my POSETTE talk. At the end of the talk, I emphasize the need for monitoring a logical replication setup. If you are using logical replication and have set up monitoring, you will be familiar with pg_stat_replication_slots. In some cases this view shows high amounts of spill_txns, spill_count and spill_bytes, which indicates that the WAL sender corresponding to that replication slot is using a large amount of disk space. This increases load on the IO subsystem, affecting performance. It also means that there is less disk space available for user data and regular transactions to operate. This is an indication that logical_decoding_work_mem has been configured too low. That's the subject of this blog: how to decide the right configuration value for logical_decoding_work_mem. Let's first discuss the purpose of this GUC; this blog might serve as good background before reading further.

Reorder buffer and logical_decoding_work_mem

When decoding WAL, a logical WAL sender accumulates the transactions in an in-memory data structure called the reorder buffer. For every transaction that the WAL sender encounters, it maintains a queue of changes in that transaction. As it reads each WAL record, it finds the transaction ID the record belongs to and adds the change to the corresponding queue. As soon as it sees a COMMIT record for a transaction, it decodes all the changes in the corresponding queue and sends them downstream. If the reorder buffer fills up with transactions whose COMMIT records are yet to be seen, it spills the queues to disk. We see such disk spills accounted for in spill_txns, spill_count and spill_bytes. The amount of memory allocated to the reorder buffer is decided by the logical_decoding_work_mem GUC. If the value is too low, it will cause heavy disk spills; if it is too high, it will waste memory. Every WAL sender in the server will allocate logical_decoding_work_mem amount of memory, thus the total memory consumed for maintaining reorder buffers is {number of WAL senders} * logical_decoding_work_mem, which can go up to max_wal_senders * logical_decoding_work_mem.

Setting logical_decoding_work_mem optimally

It's clear that the reorder buffer should be able to hold the WAL records of all concurrent transactions to avoid disk spills. How many concurrent transactions can there be? Every backend in PostgreSQL, client as well as worker, can potentially start a transaction, and there can be only one transaction active at a given time in a given backend. Thus the upper bound on the number of concurrent transactions in a server is decided by max_connections (the maximum number of client backends), max_prepared_transactions (prepared transactions in addition to those in client backends), and max_worker_processes plus autovacuum_max_workers (which together cover the non-client backends that may execute transactions). The sum of all these GUCs gives the upper bound on the number of concurrent transactions that can be running at a time in a server. Assuming the average amount of WAL produced by each transaction is known, the total amount of WAL that may get added to the reorder buffers is {maximum number of transactions in the system} * {average amount of WAL produced by each transaction}. The question is: how do we find the average?
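
Before tackling the average, the first factor – the upper bound on concurrent transactions – can be read straight from the running server (a sketch summing the GUCs mentioned above):

SELECT current_setting('max_connections')::int
     + current_setting('max_prepared_transactions')::int
     + current_setting('max_worker_processes')::int
     + current_setting('autovacuum_max_workers')::int
     AS max_concurrent_transactions;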

Transactions by different applications and transactions by worker processes may all have different characteristics and thus produce different amounts of WAL. But they all compete for space in the reorder buffer, and they are all part of a single WAL stream, which can be examined with pg_waldump. There are a few ways we can use this utility to estimate logical_decoding_work_mem.
  1. Count the number of commits or aborts in a given set of WAL segments and divide the total size of those WAL segments by that count. The total size of the WAL segments is {number of WAL segments} * {size of each WAL segment}. If you are seeing transactions spilled to disk, the total amount of WAL generated by concurrent transactions is higher than logical_decoding_work_mem, which by default is 64MB – equivalent to four WAL segments of the default 16MB size. So you will need to analyze a good number of WAL segments, not just a few.
  2. pg_waldump reports WAL records by transaction. It can be used for a better estimate by sampling typical transactions from pg_waldump and estimating sizes of each such typical transactions and their counts.
  3. Modify pg_waldump to keep a running count of the amount of WAL accumulated in the reorder buffer. The algorithm would look like below
    1. T = 0
    2. Read a WAL record. If the record belongs to transaction x, Cx = Cx + size of WAL record, where Cx maintains the total size of WAL records of transaction x so far. If x is a new transaction, Cx = size of WAL record
    3. T = T + size of WAL record, where T is the total size of WAL records accumulated in the reorder buffer when that record was read.
    4. When a COMMIT or ABORT WAL record of transaction x is read, T = T - Cx.

      This way T tracks the size of WAL records accumulated in the reorder buffer at a given point in time. Maximum over T can be used to estimate logical_decoding_work_mem.
  4. If you are not comfortable with C or pg_waldump, the above option can be implemented by parsing the output of pg_waldump using a higher-level language like Python.
Once you have estimated the maximum amount of WAL that may accumulate in the reorder buffer, add about 5% for the overhead of the other data structures associated with the reorder buffer, and you have your first estimate of logical_decoding_work_mem. It can be refined further by setting the GUC and monitoring pg_stat_replication_slots, for example with a query like the one below.
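
A minimal sketch of that monitoring query (the spill_* columns are cumulative since the last statistics reset):

SELECT slot_name, spill_txns, spill_count,
       pg_size_pretty(spill_bytes) AS spilled
FROM pg_stat_replication_slots
ORDER BY spill_bytes DESC;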

However, remember that each WAL sender will consume logical_decoding_work_mem amount of memory, which may affect the total memory available for regular server operation. You may find an optimal value which leaves enough memory for regular server operation while reducing disk spills. Options 3 and 4 can help you with that: if you plot the curve of T against time, you will find the memory consumed by the WAL senders in the steady state, eliminating any peaks or troughs in memory usage by logical decoding. logical_decoding_work_mem should be configured keeping this steady-state consumption in mind.

If, even after doing all this, the disk spill is high or there's too much memory consumed by WAL senders, your best bet is to stream in-progress transactions by specifying the streaming parameter of the logical replication protocol. Find more about that in this blog.

If you know other ways to estimate logical decoding work memory or avoiding disk spill, please comment on this blog.
[syndicated profile] planetposgresql_feed

Learn how to set up a PostgreSQL cluster with the pgvector extension on Kubernetes using CloudNativePG—all in a fully declarative way. This article walks you through the process in just a few minutes, from cluster creation to extension installation.

[syndicated profile] planetposgresql_feed

I do a fair number of benchmarks, not only to validate patches, but also to find interesting (suspicious) stuff to improve. It’s an important part of my development workflow. And it’s fun ;-) But we’re dealing with complex systems (hardware, OS, DB, application), and that brings challenges. Every now and then I run into something that I don’t quite understand.

Consider a read-only pgbench, the simplest workload there is, with a single SELECT doing lookup by PK. If you do this with a small data set on any machine, the expectation is near linear scaling up to the number of cores. It’s not perfect, CPUs have frequency scaling and power management, but it should be close.

Some time ago I tried running this on a big machine with 176 cores (352 threads), using scale 50 (about 750MB, so tiny - it actually fits into L3 on the EPYC 9V33X CPU). And I got the following chart for throughput with different client counts:

results for read-only pgbench on a system with 176 cores

This is pretty awful. I still don’t think I entirely understand why this happens, or how to improve the behavior. But let me explain what I know so far, what I think may be happening, and perhaps someone will correct me or have an idea how to fix it.

[syndicated profile] planetposgresql_feed

Cover photo by Luca Upper

I recently saw a feature request for pg_stat_statements to be able to track percentile performance of queries, for example the p95 (95th percentile) or p99 (99th percentile).

That would be fantastic, but isn’t yet possible. In the meantime, there is a statistically-dodgy-but-practically-useful (my speciality) way to approximate them using the mean and standard deviation columns in pg_stat_statements.

Why bother?

When wondering what our user experience is like across different queries, we can miss issues if we only look at things by the average time taken.

For example, let’s consider a query that takes on average 100ms but 1% of the time it takes over 500ms (its p99), and a second query that takes on average 110ms but with a p99 of 200ms. It is quite possible that the first query is causing more user dissatisfaction, despite being faster on average.

Brief statistics refresher

The standard deviation is a measure of the amount of variation from the mean. Wider distributions of values have larger standard deviations.

pg_stat_statements has mean_exec_time (mean execution time) and mean_plan_time (mean planning time) columns, but no median equivalents. The other columns we’ll be using for our approximation calculation are stddev_exec_time and stddev_plan_time.

In a perfectly normally distributed data set, the p90 is 1.28 standard deviations above the mean, the p95 is 1.65, and the p99 is 2.33.

Our query timings are probably not normally distributed, though. In fact, many will have a longer tail on the slow end, and some will have a multimodal distribution (with clustering due to things like non-evenly distributed data and differing query plans).

Having said that, even though many of our query timings are not normally distributed, queries with a high p99 are very likely to also have a high mean-plus-a-couple-of-standard-deviations, so if we approximate the p99 assuming a normal distribution, the results should be directionally correct.

Just give me the query already

Here’s a simple query to get our top 50 queries by their approximate p99 timings:

select	mean_exec_time::int,
	mean_plan_time::int,
	stddev_exec_time::int,
	stddev_plan_time::int,
	((mean_exec_time + mean_plan_time) + 
	2.33 * (stddev_exec_time + stddev_plan_time))::int 
	as approx_p99, 
	calls,
	query
from	pg_stat_statements
where	calls > 100
order by approx_p99 desc
limit 50;

Here I’ve summed the mean execution and planning times, and added 2.33 times the sum of their standard deviations. If you’d like to approximate a different percentile, simply substitute the multiplier (for example, 1.65 for the p95 or 1.28 for the p90).
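
For instance, a p95 variant of the same query just swaps the multiplier to 1.65:

select	((mean_exec_time + mean_plan_time) +
	1.65 * (stddev_exec_time + stddev_plan_time))::int
	as approx_p95,
	calls,
	query
from	pg_stat_statements
where	calls > 100
order by approx_p95 desc
limit 50;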

Since the timings statistics are in milliseconds, I like to round them to integers to make them easier to scan (at the cost of a little precision, of course).

I also like to filter out queries that have not been executed much, since we usually want to focus on our more common queries. Also, standard deviations and percentile metrics naturally make less sense at low volumes.

If you run the above and notice that your planning times are all zero, you may want to look into the pg_stat_statements track_planning setting, which is off by default.
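
If you do want planning statistics, one way to enable it (as a superuser, and assuming pg_stat_statements is already loaded via shared_preload_libraries) is:

-- Enable planning-time tracking; a configuration reload should be enough
ALTER SYSTEM SET pg_stat_statements.track_planning = on;
SELECT pg_reload_conf();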

Alternatives

While I think this provides a nice proxy metric, and is likely more useful than ordering by mean time alone, there are some very reasonable alternatives.

For example, it may simply be better to monitor p95 and p99 type metrics from the application or user perspective, for example in an APM (Application Performance Monitoring) or RUM (Real User Monitoring) tool.

On the database-centric side, another option is the pg_stat_monitor extension, which includes a histogram feature that makes approximating percentiles more accurate. The extension is relatively new though (at least compared to pg_stat_statements!) and as such is both less battle-tested and not yet available in many environments.

Conclusion

In short, we can approximate percentile statistics like p95 and p99 using the mean and stddev columns in pg_stat_statements. For some use cases, this will likely be a more useful thing to order by (or monitor) than the mean time alone.

As discussed, this method does have obvious flaws, not least that the data is unlikely to be normally distributed. As such, it’s wise to clearly label it as an approximation.

[syndicated profile] planetposgresql_feed

POSETTE: An Event for Postgres 2025 is back for its 4th year—free, virtual, and packed with deep expertise. No travel needed, just your laptop, internet, and curiosity.

This year’s 45 speakers are smart, capable Postgres practitioners—core contributors, performance experts, application developers, Azure engineers, extension maintainers—and their talks are as interesting as they are useful.

The four livestreams (42 talks total) run from June 10-12, 2025. Every talk will be posted to YouTube afterward (un-gated, of course). But if you can join live, I hope you do! On the virtual hallway track on Discord, you’ll be able to chat with POSETTE speakers—as well as other attendees. And yes, there will be swag.

This “ultimate guide” blog post is your shortcut to navigating POSETTE 2025. In this post you’ll get a “by the numbers” summary, an overview of the two keynotes and the talk categories, where to find the schedule, and how to watch the livestreams and join the virtual hallway track on Discord.

“By the numbers” summary for POSETTE 2025

Here’s a quick snapshot of what you need to know about POSETTE:

About POSETTE: An Event for Postgres 2025
  • 3 days: June 10-12, 2025
  • 4 livestreams: in Americas & EMEA time zones (but of course you can watch from anywhere)
  • 42 talks: all free, all virtual
  • 2 keynotes: from Bruce Momjian & Charles Feddersen
  • 45 speakers: PG contributors, users, application developers, community members, & Azure engineers
  • 17.4% CFP acceptance rate: 40 talks selected from 230 submissions
  • 26% Azure-focused talks: 11 talks out of 42 feature Azure Database for PostgreSQL
  • 74% general Postgres talks: 31 talks are not cloud-specific at all
  • 16 languages: published videos will have captions available in 16 languages, including English, Czech, Dutch, French, German, Italian, Japanese, Korean, Polish, Portuguese, Russian, Spanish, Turkish, Ukrainian, Chinese Simplified, and Chinese Traditional

And to give you a feel for the high-level categories and detailed “tags” that will help you navigate all 42 talks, maybe this diagram will help.

POSETTE 2025 by the numbers
Figure 1: Ultimate Guide for POSETTE 2025, with high-level categories and detailed tags for all 42 talks.

2 Amazing Keynotes

If you’re interested in what Microsoft is building for Postgres these days, then Charles Feddersen’s keynote is a must-watch. And in spite of all the hype about AI, you’re guaranteed to enjoy Bruce Momjian’s keynote about databases in the AI trenches.

18 Postgres Core talks

Performance

Postgres internals

Replication

Community

Fun

12 Postgres Ecosystem talks

Analytics

App dev

Extensions

Patroni

VS Code

10 Azure Database for PostgreSQL talks

AI-related talks

Customer talks

Flexible Server talks

Oracle to Postgres talks

Where to find the POSETTE Schedule

You may be thinking, “I know how to use a website, Claire.” Fair. But hear me out: the POSETTE 2025 Schedule page has 4 tabs—one for each livestream—and it always opens to Livestream 1 by default.

So if you’re looking for talks in Livestreams 2, 3, or 4:

  • Head to the POSETTE Schedule page
  • Click the tab for the livestream you want
  • Voila—talks for that stream
POSETTE 2025 schedule page
Figure 2: Screenshot of the POSETTE 2025 Schedule with separate tabs for the 4 livestreams

How to watch & how to participate on Discord

Here’s how to tune in—and how to participate in the conference.

How to watch the livestreams

  • All 4 livestreams will be watchable on the PosetteConf 2025 home page
  • Pro tip: If you’ve left the page open since the last stream, refresh your browser to see the next livestream.

How to join the virtual hallway track

  • Head to the #posetteconf channel on Discord (on the Microsoft Open Source Discord)
  • That’s where speakers and attendees hang out during the livestreams—it’s where you can ask questions, share reactions, and just say hi.

What’s new in POSETTE 2025

If you attended POSETTE last year (or back when it was called Citus Con), you might be wondering, what’s different this year?

In many ways, the POSETTE playbook is the same: useful and delightful Postgres talks in a virtual, accessible format. But here’s what’s new:

  • New website: And, a new domain too: PosetteConf.com
  • Only 2 keynotes: instead of 4 keynotes last year. We’re honored that Bruce Momjian & Charles Feddersen accepted the invitation to be keynote speakers. Each keynote will be repeated twice.
  • 58% speakers new to POSETTE: 26 out of 45 speakers (58%) are brand new to POSETTE
  • New livestream hosts: 3 of the 7 livestream hosts are brand new to hosting POSETTE livestreams: welcome to Adam Wølk, Derk van Veen, & Thomas Munro
  • Same name: The POSETTE: An Event for Postgres name is here to stay—and we still love the name

Big thank you to our 45 amazing speakers

Every great event starts with great talks—and great talks start with great speakers. Want to learn more about the people behind these talks?

POSETTE 2025 speakers
Figure 3: Bio pics for all 45 speakers in POSETTE: An Event for Postgres 2025, along with our gratitude.

Join us for POSETTE 2025! Mark your calendars

I hope you join us for POSETTE 2025. Consider yourself officially invited. As part of the talk selection team, I’m definitely biased—but I truly believe these speakers and talks are worth your time.

I’ll be hosting Livestream 1 and Livestream 2 and you’ll find me in the #posetteconf Discord chat. I hope to see you there.

And please—tell your Postgres friends, so they don’t miss out!

🗓️ Add the livestreams to your calendar

Watch last year’s talks in advance: if you want to get ready, check out the POSETTE 2024 playlist on YouTube. Lots of gems in there.

Acknowledgements & Gratitude

I’ve already thanked the amazing speakers above. In addition, thanks go to Daniel Gustafsson, Teresa Giacomini, and My Nguyen for reviewing parts of this post before publication. And of course, big thank you to the POSETTE 2025 organizing team and POSETTE talk selection team—without you, there would be no POSETTE!

Join the virtual hallway track on Discord
Figure 4: Visual invitation to join the virtual hallway track for POSETTE 2025 on the Microsoft Open Source Discord. So you can chat with the speakers & others in the Postgres community.

This article was originally published on citusdata.com.

[syndicated profile] planetposgresql_feed
Learn how pgstream v0.6 simplifies complex data transformations with custom templates, enhances observability and improves snapshot performance.
[syndicated profile] planetposgresql_feed

PostgreSQL versions follow a well-defined five-year support lifecycle. Each major release receives bug fixes, security patches, and minor updates for five years from its initial release date. After that point, the version reaches end-of-life (EOL) and no longer receives official updates.

Staying on an EOL version exposes your systems to security risks, potential compatibility issues, and missing performance improvements introduced in later releases. You can always check the current support status of PostgreSQL versions on the official PostgreSQL Versioning Policy page.

Upgrading to the latest version ensures long-term stability, access to new features, and better support. Recently, I worked on upgrading a critical production PostgreSQL environment from version 11 to 15. Version 15 was chosen because the client’s application had only been tested up to that release.  The system supported large batch workloads and live applications, so we had to be meticulous. While this article draws from that specific project, the steps are broadly applicable to anyone planning a major PostgreSQL upgrade, especially when crossing several versions.

This guide outlines a generalized, production-ready approach for performing major version upgrades using the pg_dump/pg_restore method.

Upgrade Methods

PostgreSQL provides two primary upgrade options, each with distinct advantages.

1. In-place upgrade using pg_upgrade
This method is designed for rapid transitions and minimal downtime. It upgrades the system catalog in place and reuses existing data files, making it highly efficient. However, it requires careful compatibility checks, especially around tablespaces, file system layout, and extensions.

2. Logical upgrade using pg_dump and pg_restore
This method involves exporting the database schema and data from the old cluster and importing them into a new one. While it involves longer downtime and more disk I/O, it avoids binary compatibility issues and is well-suited for multi-version jumps and cross-platform migrations.

If you have a downtime window and are upgrading across multiple versions, the dump/restore method is often the simpler and safer path. In our case, we had a one-day downtime window and also needed to migrate to a new server, so using the pg_dump/pg_restore method was the most practical and reliable approach. It gave us full control over the migration process and allowed us to verify the restored data and performance on the new instance before final cutover.

Pre-Upgrade Preparation

A major PostgreSQL version upgrade can be performed either on the same host or by migrating to a different server. In our case, we opted for a two-server setup:

  • Source: PostgreSQL 11 (actively serving the application)
  • Target: PostgreSQL 15 (fresh install on a separate server)

At the time of migration, the application was actively connected to the PostgreSQL 11 instance. The goal of this upgrade was to migrate the database from version 11 to 15 on a new server. The migration was carried out on Red Hat Enterprise Linux 9, though the overall approach can be adapted to other operating systems depending on your environment and tooling.

Stop Application 

Prior to the upgrade, all client connections, batch jobs, and scheduled processes must be stopped. This guarantees a consistent state and prevents any post-dump changes from being lost. Application access to the source database should be disabled entirely for the duration of the backup.
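
One way to confirm that nothing is still connected to the source database before taking the dump is to check pg_stat_activity (the database name mydb below is a placeholder):

-- List any remaining client sessions connected to the source database
SELECT pid, usename, application_name, client_addr, state
FROM pg_stat_activity
WHERE datname = 'mydb';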

Prepare Target Server

If PostgreSQL is not yet installed on the target server, you’ll need to set it up before proceeding with the upgrade. The following instructions demonstrate how to install PostgreSQL 15 on a Red Hat 9 system. You may adjust the version number as needed based on your upgrade target.

First, install the official PostgreSQL repository and disable the system’s default PostgreSQL module:

sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm

sudo dnf -qy module disable postgresql

Next, install the PostgreSQL server package for your desired version:

sudo dnf install -y postgresql15-server

Initialize the database cluster and configure the service to start automatically on boot:

sudo /usr/pgsql-15/bin/postgresql-15-setup initdb

sudo systemctl enable postgresql-15

sudo systemctl start postgresql-15

Some applications rely on specific PostgreSQL extensions that must be installed on the target server prior to restoration. During the restore process, you may encounter warnings or errors if these extensions were present in the source database but are missing from the target environment.

In our case, the only extension we were using was pg_stat_statements, which did not impact the restore itself and could safely be added afterward. However, if your application or schema depends on certain extensions (for custom functions, data types, or triggers), it’s important to ensure those extensions are available before the restore begins to avoid failures or broken dependencies.
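
To see exactly which extensions the source database uses, you can query pg_extension on the source before starting the restore; for example:

-- Run on the source database: list installed extensions and their versions
SELECT extname, extversion FROM pg_extension ORDER BY extname;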

Take Dump – Using the target version tools

To ensure compatibility during a major version upgrade, it is strongly recommended to use the pg_dump and pg_dumpall binaries from the target PostgreSQL version (in our case, version 15). This helps avoid potential issues that can arise from using outdated dump formats when restoring to a newer server. If the target binaries are not already available on the source (older) server, you can install just the client tools without the server package using the following command:

sudo dnf install -y postgresql15

If installing PostgreSQL 15 tools on the source server is not possible due to system constraints or compatibility issues, you can run the dump commands remotely from the target server (or any server that has PostgreSQL 15 binaries installed), using the -h flag to connect to the source database over the network. In our scenario, we encountered compatibility issues while trying to install PostgreSQL 15 tools on the production server. Instead, we executed both dump commands remotely from the target Red Hat 9 server using PostgreSQL 15 binaries. This approach worked reliably after setting a password for the postgres user to allow authenticated remote access. 

Export the main database using custom format:

/usr/pgsql-15/bin/pg_dump -Fc -h <source-host> -U postgres -d <database> -f /path/to/backup.dump

The custom format is recommended because it allows greater control during restoration, such as selective restores and parallel jobs. Note that backup time will vary depending on database size and hardware. In our case, backing up an 800 GB database took approximately two hours on moderately provisioned infrastructure.
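
To get a rough sense of how long the dump might take, it helps to check the database size up front (the database name is a placeholder):

-- Size of the database to be dumped
SELECT pg_size_pretty(pg_database_size('mydb'));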

Next, export global objects such as roles, tablespaces, and ownership metadata separately:

/usr/pgsql-15/bin/pg_dumpall -g -h <source-host> -U postgres > /path/to/globals.sql

Once the backup is complete, copy both files to the target server. Additionally, store a copy of both on a separate host (outside of the source and target environments) to serve as a recovery fallback in case of unexpected failure during the upgrade process.

Upgrade Execution Plan

Once the backup files have been transferred to the target server and validated, proceed with the following steps to complete the database restoration.

Begin by restoring global objects such as roles, tablespaces, and their associated privileges. These were captured using pg_dumpall -g and are essential for preserving access control and ownership:

psql -U postgres -f /path/to/globals.sql

Next, create a fresh, empty database with the same name as the original source database:

createdb -U postgres <database>

With the empty database created, restore the main database dump using pg_restore. For improved performance, enable parallel restore mode using the -j flag, which can greatly speed up the process. The number of parallel jobs should be adjusted based on available CPU and I/O capacity on the target system:

nohup pg_restore -U postgres -d <database> -j 4 -v /path/to/backup.dump > restore.log 2>&1 &

Using nohup allows the command to continue running in the background even if the terminal session is closed. The -v flag enables verbose output, and restore.log captures both standard output and error messages for review.

Monitor the restore.log file to track progress and check for any errors during the restoration process. Depending on the database size and server resources, this step can take significant time. In our case, the restore of an 800 GB dump completed in approximately 2.5 hours.

After the restoration is complete, run ANALYZE on the database to refresh PostgreSQL’s planner statistics. This ensures the query planner can make informed decisions based on the current data distribution:

psql -U postgres -d <database> -c "ANALYZE;"

Install Required Extensions 

For extensions provided by PostgreSQL's contrib modules, such as pg_stat_statements, you must first install the appropriate package:

sudo dnf install -y postgresql15-contrib

Next, configure PostgreSQL to preload the extension by modifying the postgresql.conf file:

shared_preload_libraries = 'pg_stat_statements'

After updating the configuration, restart the PostgreSQL service for changes to take effect:

sudo systemctl restart postgresql-15

Finally, enable the extension within the database:

psql -U postgres -d <database> -c "CREATE EXTENSION pg_stat_statements;"

To verify that the extension is successfully installed and active, connect to the database and run:

\dx

This command lists all extensions installed in the current database. You should see pg_stat_statements or any others you’ve enabled in the output.

Validate Schema and Structural Integrity

After restoring the database, it is important to validate that the schema and object structure match the original environment. Start by verifying that the number and types of database objects (tables, indexes, views, etc.) match the expected counts. To do this effectively, ensure that you have captured and stored the corresponding object counts from the original production database (source version) prior to the upgrade.

You can run queries like the following to review object distributions by type:

SELECT
    n.nspname AS schema_name,
    CASE
        WHEN c.relkind = 'r' THEN 'TABLE'
        WHEN c.relkind = 'i' THEN 'INDEX'
        WHEN c.relkind = 'S' THEN 'SEQUENCE'
        WHEN c.relkind = 't' THEN 'TOAST TABLE'
        WHEN c.relkind = 'v' THEN 'VIEW'
        WHEN c.relkind = 'm' THEN 'MATERIALIZED VIEW'
        WHEN c.relkind = 'c' THEN 'COMPOSITE TYPE'
        WHEN c.relkind = 'f' THEN 'FOREIGN TABLE'
        WHEN c.relkind = 'p' THEN 'PARTITIONED TABLE'
        WHEN c.relkind = 'I' THEN 'PARTITIONED INDEX'
        ELSE 'OTHER'
    END AS object_type,
    COUNT(*) AS count
FROM
    pg_class c
JOIN
    pg_namespace n ON c.relnamespace = n.oid
WHERE
    n.nspname IN ('public')
GROUP BY
    n.nspname, object_type
ORDER BY
    n.nspname, object_type;

This query aggregates object counts grouped by schema and object type using pg_class and pg_namespace. By default, the WHERE clause filters for the public schema. You can either replace ‘public’ with a specific schema name you want to inspect or remove the WHERE clause entirely to include all schemas.

You may also run count checks on critical tables and compare key constraint definitions. Be aware that some catalog-level differences between PostgreSQL versions may lead to minor, expected variations in metadata.
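
For the row-count checks, exact counts on the most critical tables can be compared against numbers captured on the source, and approximate counts give a quick overall sanity check (the orders table below is just an illustration):

-- Exact row count for a critical table
SELECT count(*) FROM orders;

-- Approximate per-table row counts; reasonably accurate right after ANALYZE
SELECT relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;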

Application Cutover

Once the database has been restored, validated, and tested, the final step is to point the application to the new PostgreSQL server.

Update the application’s connection strings or service configurations to reference the new database host, port, and credentials. On the PostgreSQL side, update the pg_hba.conf file to allow connections from the application hosts, ensuring that appropriate authentication methods are used. Also verify the listen_addresses and port settings in postgresql.conf to confirm that the database is accessible from external systems.
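
A quick way to double-check these settings from psql (pg_hba_file_rules is available from PostgreSQL 10 onwards):

-- Confirm the server is listening where you expect
SHOW listen_addresses;
SHOW port;

-- Inspect the parsed pg_hba.conf rules without leaving psql
SELECT type, database, user_name, address, auth_method
FROM pg_hba_file_rules;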

Rollback Strategy

Having a rollback plan is essential for any major database upgrade. The rollback approach will differ depending on whether the issue occurs during the upgrade window or after the application has gone live on the new system.

If Issues Occur During the Upgrade Window

Plan A: Redirect the application back to the original PostgreSQL production server. Since no writes would have taken place on the new server at this stage, this provides a clean and immediate fallback with minimal risk.

Plan B: If the original production server is inaccessible or compromised, restore the most recent logical backup (ideally stored on a separate, secure host) to a recovery server. This ensures that a known, consistent version of the database remains accessible, even in the event of infrastructure failure.

If Issues Occur After Go-Live (Writes Have Occurred)

Plan A: Resolve the issue directly on the new PostgreSQL instance. This is the preferred approach, as it preserves any new data written since the cutover and avoids complex recovery operations.

Plan B: Revert to the old PostgreSQL server. This is a last-resort option and involves identifying and manually transferring any data that was created or modified on the new server back to the old environment. This process is time-consuming and introduces risk, and should only be considered when all other remediation efforts have failed.

Mitigation Strategy

To reduce the risk of data inconsistency and simplify rollback procedures, it is advisable to initially run the new PostgreSQL instance in read-only mode after the upgrade. This allows for application-level validation in a production-like environment without making irreversible changes. Once the application has been fully tested and confirmed stable, read/write access can be enabled, completing the transition.
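
A minimal sketch of such a soft read-only mode uses default_transaction_read_only; note that individual sessions can still override it, so it is a guard rail rather than a hard lock:

-- Make new transactions read-only by default during validation
ALTER SYSTEM SET default_transaction_read_only = on;
SELECT pg_reload_conf();

-- Re-enable writes once the application has been validated
ALTER SYSTEM SET default_transaction_read_only = off;
SELECT pg_reload_conf();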

Summary

Upgrading between major PostgreSQL versions requires careful planning. Always test your upgrade process in a staging environment before performing it on production. In our upgrade from PostgreSQL 11 to 15, we prioritized safety and transparency. This approach allowed us to validate the schema, minimize risk, and transition to a supported version with confidence.

Choose the upgrade method that best aligns with your environment, downtime tolerance, and operational requirements.

The post How to Upgrade Major PostgreSQL Versions: A Practical Production Guide appeared first on Stormatics.

[syndicated profile] planetposgresql_feed

Many companies these days are thinking about migrating their databases from legacy or proprietary systems to PostgreSQL. The primary aim is to reduce costs, enhance capabilities, and ensure long-term sustainability. However, even just the idea of migrating to PostgreSQL can be overwhelming. Very often, knowledge about the legacy applications is limited or even lost. In some cases, vendor support is diminishing, and expert pools and community support are shrinking. Legacy databases are also often running on outdated hardware and old operating systems, posing further risks and limitations.
