PostgreSQL Replica Fails to Start After Major Version Upgrade (WAL-G Issue) #2957

@devRamsheed

The operator was upgraded from version 1.9 to 1.11, and PostgreSQL was upgraded in place from 13 to 14 with the command python3 /scripts/inplace_upgrade.py N.

During this process, the leader node was upgraded successfully, but the replica failed to start because of a WAL-G timeline/history issue. I tried several approaches: deleting the replica pod, deleting the PVC, and even reinitialising the replica from the leader pod. None of them worked.

The only workaround that resolved the issue was to delete the entire backup from Azure Blob Storage, create a fresh backup from the leader, and then restart the replica.

At first, I assumed this was a random error. However, when I attempted a similar upgrade on another PostgreSQL cluster, I encountered the same problem: the replica consistently failed to start after the major version upgrade.

Unfortunately, in my current environment, deleting the entire backup (as I did in the development cluster) is not an option.

Could you please suggest an alternative solution to this issue?

Image: ghcr.io/zalando/spilo-16:3.2-p2
