Skip to content

Commit cbc47ad

Browse files
committed
Fix possibility of self-deadlock in ResolveRecoveryConflictWithBufferPin().
The tests added in 9f8a050 failed nearly reliably on FreeBSD in CI, and occasionally on the buildfarm. That turns out to be caused not by a bug in the test, but by a longstanding bug in recovery conflict handling. The standby timeout handler, used by ResolveRecoveryConflictWithBufferPin(), executed SendRecoveryConflictWithBufferPin() inside a signal handler. A bad idea, because the deadlock timeout handler (or a spurious latch set) could have interrupted ProcWaitForSignal(). If unlucky that could cause a self-deadlock on ProcArrayLock, if the deadlock check is in SendRecoveryConflictWithBufferPin()->CancelDBBackends(). To fix, set a flag in StandbyTimeoutHandler(), and check the flag in ResolveRecoveryConflictWithBufferPin(). Subsequently the recovery conflict tests will be backpatched. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-
1 parent e8a0cf9 commit cbc47ad

File tree

1 file changed

+12
-10
lines changed

1 file changed

+12
-10
lines changed

src/backend/storage/ipc/standby.c

Lines changed: 12 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -44,6 +44,7 @@ static HTAB *RecoveryLockLists;
4444

4545
/* Flags set by timeout handlers */
4646
static volatile sig_atomic_t got_standby_deadlock_timeout = false;
47+
static volatile sig_atomic_t got_standby_delay_timeout = false;
4748
static volatile sig_atomic_t got_standby_lock_timeout = false;
4849

4950
static void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
@@ -593,10 +594,15 @@ ResolveRecoveryConflictWithBufferPin(void)
593594
enable_timeouts(timeouts, cnt);
594595
}
595596

596-
/* Wait to be signaled by UnpinBuffer() */
597+
/*
598+
* Wait to be signaled by UnpinBuffer() or for the wait to be interrupted
599+
* by one of the timeouts established above.
600+
*/
597601
ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
598602

599-
if (got_standby_deadlock_timeout)
603+
if (got_standby_delay_timeout)
604+
SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
605+
else if (got_standby_deadlock_timeout)
600606
{
601607
/*
602608
* Send out a request for hot-standby backends to check themselves for
@@ -622,6 +628,7 @@ ResolveRecoveryConflictWithBufferPin(void)
622628
* individually, but that'd be slower.
623629
*/
624630
disable_all_timeouts(false);
631+
got_standby_delay_timeout = false;
625632
got_standby_deadlock_timeout = false;
626633
}
627634

@@ -681,8 +688,8 @@ CheckRecoveryConflictDeadlock(void)
681688
*/
682689

683690
/*
684-
* StandbyDeadLockHandler() will be called if STANDBY_DEADLOCK_TIMEOUT
685-
* occurs before STANDBY_TIMEOUT.
691+
* StandbyDeadLockHandler() will be called if STANDBY_DEADLOCK_TIMEOUT is
692+
* exceeded.
686693
*/
687694
void
688695
StandbyDeadLockHandler(void)
@@ -692,16 +699,11 @@ StandbyDeadLockHandler(void)
692699

693700
/*
694701
* StandbyTimeoutHandler() will be called if STANDBY_TIMEOUT is exceeded.
695-
* Send out a request to release conflicting buffer pins unconditionally,
696-
* so we can press ahead with applying changes in recovery.
697702
*/
698703
void
699704
StandbyTimeoutHandler(void)
700705
{
701-
/* forget any pending STANDBY_DEADLOCK_TIMEOUT request */
702-
disable_timeout(STANDBY_DEADLOCK_TIMEOUT, false);
703-
704-
SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
706+
got_standby_delay_timeout = true;
705707
}
706708

707709
/*

0 commit comments

Comments
 (0)