PostgreSQL Replica Fails to Start After Major Version Upgrade (WAL-G Issue) #2957

The operator was upgraded from version 1.9 to 1.11, and PostgreSQL was upgraded in place from 13 to 14 using `python3 /scripts/inplace_upgrade.py N`.

During this process, the leader node was upgraded successfully, but the replica failed to start due to a WAL-G timeline/history issue. I tried several approaches: deleting the replica pod, deleting the PVC, and reinitialising the replica from the leader pod. None of them worked.

The only workaround that resolved the issue was to delete the entire backup from Azure Blob Storage, create a fresh backup from the leader, and then restart the replica.

At first, I assumed this was a random error. However, when I attempted a similar upgrade on another PostgreSQL cluster, I encountered the same problem: the replica consistently failed to start after the major version upgrade.

Unfortunately, in my current environment, deleting the entire backup (as I did in the development cluster) is not an option.

Could you please suggest an alternative solution that does not require deleting the existing backups?

Spilo image: ghcr.io/zalando/spilo-16:3.2-p2
