[CI] MlWithSecurityIT test {yaml=ml/3rd_party_deployment/Test start and stop multiple deployments} failing #124315

Open
elasticsearchmachine opened this issue Mar 7, 2025 · 2 comments
Labels
  • low-risk (An open issue or test failure that is a low risk to future releases)
  • :ml (Machine learning)
  • Team:ML (Meta label for the ML team)
  • >test-failure (Triaged test failures from CI)

Comments

elasticsearchmachine commented Mar 7, 2025

Build Scans:

Reproduction Line:

./gradlew ":x-pack:plugin:ml:qa:ml-with-security:yamlRestTest" --tests "org.elasticsearch.smoketest.MlWithSecurityIT.test {yaml=ml/3rd_party_deployment/Test start and stop multiple deployments}" -Dtests.seed=36C1F9B1BBB9F987 -Dtests.locale=shi-Tfng-MA -Dtests.timezone=Europe/Bucharest -Druntime.java=17 -Dtests.fips.enabled=true

Applicable branches:
8.18

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

java.lang.AssertionError: Failure at [ml/3rd_party_deployment:633]: expected [2xx] status code but api [ml.infer_trained_model] returned [408 Request Timeout] [{"error":{"root_cause":[{"type":"status_exception","reason":"timeout [10s] waiting for inference result","stack_trace":"org.elasticsearch.ElasticsearchStatusException: timeout [10s] waiting for inference result\n\tat org.elasticsearch.ml@8.18.0-SNAPSHOT/org.elasticsearch.xpack.ml.inference.deployment.AbstractPyTorchAction.onTimeout(AbstractPyTorchAction.java:68)\n\tat org.elasticsearch.server@8.18.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:977)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\n\tat java.base/java.lang.Thread.run(Thread.java:833)\n"}],"type":"status_exception","reason":"timeout [10s] waiting for infere
[truncated]

Issue Reasons:

  • [8.18] 2 failures in test test {yaml=ml/3rd_party_deployment/Test start and stop multiple deployments} (1.5% fail rate in 137 executions)
  • [8.18] 2 failures in pipeline elasticsearch-periodic (40.0% fail rate in 5 executions)
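
The percentages above can be sanity-checked with a short sketch. It assumes the fail rate is simply failures divided by executions, rounded to one decimal place; the triage automation's actual formula is not shown in this issue, and `fail_rate` is a hypothetical helper name.

```python
def fail_rate(failures: int, executions: int) -> float:
    """Return failures/executions as a percentage, rounded to one decimal.

    Assumption: this mirrors how the reported percentages are derived;
    the real triage automation may compute them differently.
    """
    if executions <= 0:
        raise ValueError("executions must be positive")
    return round(100.0 * failures / executions, 1)


# Figures reported for the 8.18 branch in the list above.
test_level = fail_rate(2, 137)    # failures of this test across its executions
pipeline_level = fail_rate(2, 5)  # failures of the elasticsearch-periodic pipeline

print(f"test: {test_level}%  pipeline: {pipeline_level}%")
```

Under that assumption both reported figures are reproduced. The 40.0% pipeline rate rests on only 5 executions, so it says much less about flakiness than the 1.5% test-level rate.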

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.

elasticsearchmachine added the :ml and >test-failure labels Mar 7, 2025
elasticsearchmachine commented

This has been muted on branch main

Mute Reasons:

  • [main] 3 failures in test test {yaml=ml/3rd_party_deployment/Test start and stop multiple deployments} (0.4% fail rate in 813 executions)
  • [main] 2 failures in step part-4 (0.6% fail rate in 350 executions)
  • [main] 2 failures in pipeline elasticsearch-pull-request (0.6% fail rate in 353 executions)

Build Scans:

elasticsearchmachine added a commit that referenced this issue Mar 7, 2025
…arty_deployment/Test start and stop multiple deployments} #124315
elasticsearchmachine added the Team:ML and needs:risk labels Mar 7, 2025
elasticsearchmachine commented

Pinging @elastic/ml-core (Team:ML)

georgewallace pushed a commit to georgewallace/elasticsearch that referenced this issue Mar 11, 2025
…arty_deployment/Test start and stop multiple deployments} elastic#124315
davidkyle added the low-risk label and removed the needs:risk label Mar 21, 2025