[CI] DocsClientYamlTestSuiteIT class failing #124671

Closed
elasticsearchmachine opened this issue Mar 12, 2025 · 6 comments · Fixed by #124684
Assignees
Labels
:Data Management/Data streams Data streams and their lifecycles low-risk An open issue or test failure that is a low risk to future releases Team:Data Management Meta label for data/management team >test-failure Triaged test failures from CI

Comments

@elasticsearchmachine
Collaborator

Build Scans:

Reproduction Line:

undefined

Applicable branches:
8.x

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

undefined

Issue Reasons:

  • [8.x] 30 failures in class org.elasticsearch.smoketest.DocsClientYamlTestSuiteIT (3.2% fail rate in 933 executions)
  • [8.x] 9 failures in step windows-2022_checkpart1_platform-support-windows (42.9% fail rate in 21 executions)
  • [8.x] 8 failures in step windows-2019_checkpart1_platform-support-windows (40.0% fail rate in 20 executions)
  • [8.x] 9 failures in step part-1-windows (90.0% fail rate in 10 executions)
  • [8.x] 11 failures in pipeline elasticsearch-periodic-platform-support (52.4% fail rate in 21 executions)
  • [8.x] 5 failures in pipeline elasticsearch-pull-request (4.6% fail rate in 108 executions)

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.

@elasticsearchmachine elasticsearchmachine added :Data Management/Data streams Data streams and their lifecycles >test-failure Triaged test failures from CI Team:Data Management Meta label for data/management team labels Mar 12, 2025
@elasticsearchmachine
Collaborator Author

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added the needs:risk Requires assignment of a risk label (low, medium, blocker) label Mar 12, 2025
@nielsbauman
Contributor

I had a quick look at all the linked build scans and I'm seeing a few different things:

  • The most common one is
     java.lang.AssertionError: Failure at [reference/cat/nodes:15]: field [$body] was expected to match the provided regex but didn't	
     Expected: ip        \s+heap.percent \s+ram.percent \s+cpu \s+load_1m \s+load_5m \s+load_15m \s+node.role \s+master \s+name\s* 127.0.0.1           \s+\d+ \s+\d+ \s+\d+    \s+(\d+\.\d+( \s+\d+\.\d+ \s+(\d+\.\d+)?)?)?                  \s+.+       \s+[*]      \s+.+\s*	
          but: was "ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name\n127.0.0.1           59          39  -1                          cdfhilmrstw *      node-0\n"
    
    See this example build scan. Someone should have a look at this to check what's going on.
  • But I'm also seeing
     java.lang.AssertionError: Failure at [reference/troubleshooting/common-issues/disk-usage-exceeded:78]: got unexpected warning header [	
     	299 Elasticsearch-8.19.0-a85f65f951ce7132d699cc42230404060f876dc0 "this request accesses system indices: [.security-7], but in a future major version, direct access to system indices will be prevented by default"	
     ]
    
    and
     java.lang.AssertionError: Failure at [reference/rest-api/common-options:235]: Expected a map containing	
        metadata: a map containing	
           indices: a map containing	
     my-index-000003: a map containing	
                 state: "open"	
     my-index-000002: a map containing	
                 state: "open"	
     my-index-000001: a map containing	
                 state: "open"	
         .security-7: <unexpected> but was <{state=open}>
    
    in this and this build scan respectively. Both of these are caused by the recent changes to the security index. @slobodanadamovic, I fear we're going to keep running into issues (also considering the ones that are still open). Do you happen to have any suggestions for a more systematic approach to tackling or preventing these test failures? They're only test failures (not actual production issues), so the "only" impact they have is on us ES engineers, but I'd like to prevent them from showing up if possible.
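
To illustrate why the `common-options` assertion above is sensitive to the security index, here is a minimal sketch in plain Java. It is not the YAML test framework's matcher: `strictMatch` and `matchIgnoringDotIndices` are hypothetical helpers, and the maps are a simplified stand-in for the `metadata.indices` section of the response. The point is only that an exact-match comparison fails as soon as an extra index such as `.security-7` has been created by the time the request runs.

```java
import java.util.Map;
import java.util.TreeMap;

public class StrictIndicesMatch {
    // Hypothetical helper: behaves like an exact "map containing" assertion,
    // so any unexpected key in the actual map is a failure.
    static boolean strictMatch(Map<String, String> expected, Map<String, String> actual) {
        return expected.equals(actual);
    }

    // Hypothetical helper: drops dot-prefixed indices before comparing.
    // This is only an illustration, not the approach taken in the thread.
    static boolean matchIgnoringDotIndices(Map<String, String> expected, Map<String, String> actual) {
        Map<String, String> filtered = new TreeMap<>();
        for (Map.Entry<String, String> entry : actual.entrySet()) {
            if (!entry.getKey().startsWith(".")) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return expected.equals(filtered);
    }

    public static void main(String[] args) {
        // Index name -> state, as the docs snippet expects it.
        Map<String, String> expected = Map.of(
            "my-index-000001", "open",
            "my-index-000002", "open",
            "my-index-000003", "open");

        // Same response, but the security index already exists.
        Map<String, String> actual = new TreeMap<>(expected);
        actual.put(".security-7", "open");

        System.out.println(strictMatch(expected, actual));             // false
        System.out.println(matchIgnoringDotIndices(expected, actual)); // true
    }
}
```

Whether the extra key is present depends on timing, which would be consistent with the failure being intermittent rather than constant.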

@nielsbauman
Contributor

I'm not changing the risk yet because I'm not sure what the cause/impact of the first failure type is.

@slobodanadamovic
Contributor

Sorry for not handling these earlier. I've been swamped with other tasks.
I've raised #124684, which disables the feature for docs YAML tests.
There is no real need for this feature to be enabled in all YAML tests.

@nielsbauman
Contributor

@slobodanadamovic no worries! I figured you were swamped, so I was planning to have a go at resolving the other currently open test failures caused by the security index. Sorry, I forgot that turning off the eager index creation was an option for test suites that don't need it; otherwise I would have opened a PR myself and saved you some work.

Thanks for opening the PR!

jfreden pushed a commit to jfreden/elasticsearch that referenced this issue Mar 13, 2025
The .security index is created asynchronously on a cluster startup. This
affects some of the docs YAML tests in a way that they need to account
for the existence of the .security index or wait for the index to be
created and green. This PR disables the feature for docs YAML tests.
Disabling the feature in docs YAML tests will solve the flakiness
without affecting the coverage.

Resolves elastic#122343
Resolves elastic#121748
Resolves elastic#121611
Resolves elastic#121345
Resolves elastic#121338
Resolves elastic#121337
Resolves elastic#121288
Resolves elastic#121287
Resolves elastic#121867
Resolves elastic#122335
Resolves elastic#122681
Resolves elastic#121976
Resolves elastic#123094
Resolves elastic#123192
Resolves elastic#122983
Resolves elastic#124671
Resolves elastic#124103
@nielsbauman nielsbauman added low-risk An open issue or test failure that is a low risk to future releases and removed needs:risk Requires assignment of a risk label (low, medium, blocker) labels Mar 13, 2025
@nielsbauman
Contributor

Regarding the first failure type, that's being addressed in #124103.
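
For reference, the regex mismatch in the first failure type can be reproduced outside the test suite. The sketch below is plain Java and makes two assumptions: that the test runner ignores the layout whitespace in the expected pattern (as `Pattern.COMMENTS` would), and that only the data row matters, so the header row is left out. `CatNodesRowCheck` and its members are names made up for this example. It shows that the `cpu` value `-1` cannot satisfy `\d+`, while the empty load-average columns are tolerated because that group is optional. It says nothing about why `cpu` is reported as `-1`.

```java
import java.util.regex.Pattern;

public class CatNodesRowCheck {
    // Data-row part of the expected regex from the failure message,
    // with the layout whitespace removed (assumption: the runner ignores it).
    static final Pattern ROW = Pattern.compile(
        "127.0.0.1\\s+\\d+\\s+\\d+\\s+\\d+"
            + "\\s+(\\d+\\.\\d+(\\s+\\d+\\.\\d+\\s+(\\d+\\.\\d+)?)?)?"
            + "\\s+.+\\s+[*]\\s+.+\\s*");

    // Data row copied from the "but: was" part of the failure message.
    static final String FAILING_ROW =
        "127.0.0.1           59          39  -1                          cdfhilmrstw *      node-0\n";

    // Same row with a made-up non-negative cpu value, for comparison.
    static final String ROW_WITH_CPU =
        "127.0.0.1           59          39   7                          cdfhilmrstw *      node-0\n";

    static boolean rowMatches(String row) {
        return ROW.matcher(row).matches();
    }

    public static void main(String[] args) {
        System.out.println(rowMatches(FAILING_ROW));  // false: "-1" is not matched by \d+
        System.out.println(rowMatches(ROW_WITH_CPU)); // true: load averages may be absent
    }
}
```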

slobodanadamovic added a commit to slobodanadamovic/elasticsearch that referenced this issue Mar 13, 2025
The .security index is created asynchronously on a cluster startup. This
affects some of the docs YAML tests in a way that they need to account
for the existence of the .security index or wait for the index to be
created and green. This PR disables the feature for docs YAML tests.
Disabling the feature in docs YAML tests will solve the flakiness
without affecting the coverage.

Resolves elastic#122343
Resolves elastic#121748
Resolves elastic#121611
Resolves elastic#121345
Resolves elastic#121338
Resolves elastic#121337
Resolves elastic#121288
Resolves elastic#121287
Resolves elastic#121867
Resolves elastic#122335
Resolves elastic#122681
Resolves elastic#121976
Resolves elastic#123094
Resolves elastic#123192
Resolves elastic#122983
Resolves elastic#124671
Resolves elastic#124103

(cherry picked from commit cac356a)

# Conflicts:
#	muted-tests.yml
elasticsearchmachine pushed a commit that referenced this issue Mar 13, 2025

The .security index is created asynchronously on a cluster startup. This
affects some of the docs YAML tests in a way that they need to account
for the existence of the .security index or wait for the index to be
created and green. This PR disables the feature for docs YAML tests.
Disabling the feature in docs YAML tests will solve the flakiness
without affecting the coverage.

Resolves #122343
Resolves #121748
Resolves #121611
Resolves #121345
Resolves #121338
Resolves #121337
Resolves #121288
Resolves #121287
Resolves #121867
Resolves #122335
Resolves #122681
Resolves #121976
Resolves #123094
Resolves #123192
Resolves #122983
Resolves #124671
Resolves #124103

(cherry picked from commit cac356a)

# Conflicts:
#	muted-tests.yml
3 participants