flake: Database cleaner subprocess timeout on macOS - TestScheduleShow, TestCli #1026

@blink-so

CI Failure Details

CI Run: https://github.com/coder/coder/actions/runs/18002107653
Failed Job: test-go-pg (macos-latest)
Commit: da467bad2af6922488faef40470f2e3150472694
Date: 2025-09-25

Failing Tests

  1. TestScheduleShow (cli/schedule_test.go:83)
  2. TestCli (cli/clitest/clitest_test.go:18)

Root Cause Analysis

Both tests are failing with the same error during database setup:

    github.com/coder/coder/v2/coderd/database/dbtestutil.(*Broker).init
        /Users/runner/work/coder/coder/coderd/database/dbtestutil/broker.go:155
  - context deadline exceeded

Analysis

  • The failure occurs in the recently introduced database cleaner subprocess (PR #19844, commit e2f5401fb)
  • The cleaner uses go run github.com/coder/coder/v2/coderd/database/dbtestutil/cleanercmd to start a subprocess
  • This subprocess has a 20-second timeout to start and send an "OK" response
  • This failure is specific to macOS runners - Ubuntu and Windows tests pass
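  • The timeout suggests one of the following:
    1. Compilation is slow on macOS: go run builds the cleaner package before executing it, so a cold or slow build cache could consume most of the 20 seconds
    2. Subprocess startup itself is slow on macOS
    3. Network or dependency resolution problems on the macOS runners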

Assignment Analysis

Using git blame on the failing test functions:

TestScheduleShow (lines 83-90):

$ git blame -L 83,90 cli/schedule_test.go
a4f1319108 (Cian Johnston 2023-11-10 13:51:49 +0000 83) func TestScheduleShow(t *testing.T) {

TestCli (lines 18-25):

$ git blame -L 18,25 cli/clitest/clitest_test.go
154b9bce57 (Kyle Carberry    2022-02-12 13:34:04 -0600 18) func TestCli(t *testing.T) {

However, this is not a test-specific issue. The failure is in the database infrastructure code added by @spikecurtis in commit e2f5401fb. The actual issue is in the database cleaner startup process, not the individual test logic.

Recommendations

  1. Increase the timeout on macOS specifically, if slower startup there is expected
  2. Pre-compile the cleaner binary once instead of invoking go run for every startup
  3. Add logging around compilation and subprocess startup to show where the delay occurs
  4. Consider macOS-specific handling if this turns out to be a platform limitation

Related Issues

This cleaner was added to fix #927 (database leaks), but the fix is causing reliability issues on macOS.
