System data streams are not being upgraded in the feature migration API #122949

Closed
masseyke opened this issue Feb 19, 2025 · 14 comments · May be fixed by #125437
Assignees
Labels
blocker >bug :Core/Infra/Core Core issues without another label stateful Marking issues only relevant for stateful releases Team:Core/Infra Meta label for core/infra team v8.18.0

Comments

@masseyke
Member

See elastic/kibana#211614 for an example of what this looks like to the end user. We have a system data stream, .fleet-actions-results, that has one or more old backing indices. The user calls the feature migration API (POST /_migration/system_features), which is supposed to upgrade all system indices and data streams. That call does not return any errors. Then the deprecation info API (GET /_migration/deprecations) tells them that they need to reindex the .fleet-actions-results data stream (POST _data_stream/_modify). Since it is a system data stream, though, the user is not allowed to reindex it. They get stuck in a position where they cannot upgrade.
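
To make the symptom easier to spot, here is a minimal Python sketch that filters a deprecation info response for data streams that still carry a critical issue after the feature migration has run. It assumes the response of GET /_migration/deprecations contains a `data_streams` object that maps each data stream name to a list of issues with a `level` field. The helper name `data_streams_needing_action` and the sample payload (including its message text and the `logs-app` entry) are hypothetical and only illustrate the shape.

```python
from typing import Any


def data_streams_needing_action(deprecations: dict[str, Any], level: str = "critical") -> list[str]:
    """Return the data streams that have at least one issue at the given level.

    Assumes `deprecations` is the parsed JSON body of GET /_migration/deprecations
    and that it has a "data_streams" key (an assumption, not confirmed in this thread).
    """
    flagged = []
    for name, issues in deprecations.get("data_streams", {}).items():
        if any(issue.get("level") == level for issue in issues):
            flagged.append(name)
    return sorted(flagged)


# Hypothetical response fragment, trimmed to the fields the helper reads.
sample = {
    "cluster_settings": [],
    "index_settings": {},
    "data_streams": {
        ".fleet-actions-results": [
            {"level": "critical", "message": "hypothetical: old backing indices must be reindexed"},
        ],
        "logs-app": [
            {"level": "warning", "message": "hypothetical warning"},
        ],
    },
}

print(data_streams_needing_action(sample))
```

If a system data stream still appears in this list after POST /_migration/system_features has completed, the cluster is in the stuck state described above.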

@masseyke masseyke added :Core/Infra/Core Core issues without another label >bug blocker v8.18.0 labels Feb 19, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Feb 19, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@alexey-ivanov-es alexey-ivanov-es self-assigned this Feb 20, 2025
@alexey-ivanov-es alexey-ivanov-es added the stateful Marking issues only relevant for stateful releases label Feb 21, 2025
@tomsonpl

Hey team, wanted to check if there's any timeline for it already by any chance. Unfortunately the issue is stopping us from upgrading to 9.0. Thanks in advance for putting some priority on it :) cc: @dasansol92 @ferullo

@alexey-ivanov-es
Contributor

@tomsonpl I'm investigating the issue right now, but I can't provide an ETA yet. As soon as I have more information, I'll update you here

@tomsonpl

Thanks 👍

@alexey-ivanov-es
Contributor

@tomsonpl We've identified the problem - /_migration/system_features doesn't take system data streams into account. I'm working on a fix, but it may be a big one and may take a few days to get merged

alexey-ivanov-es added a commit that referenced this issue Mar 3, 2025
It seems the best way to fix #122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.
alexey-ivanov-es added a commit to alexey-ivanov-es/elasticsearch that referenced this issue Mar 3, 2025
It seems the best way to fix elastic#122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.

(cherry picked from commit 0a769c8)
elasticsearchmachine pushed a commit that referenced this issue Mar 4, 2025
…3934)

* Move system indices migration to migrate plugin (#123551)

It seems the best way to fix #122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.

(cherry picked from commit 0a769c8)

* Restore tests
@tomsonpl

tomsonpl commented Mar 5, 2025

Hey @alexey-ivanov-es 👋
Thanks for the work you've done!
Wanted to check up and see if the issue is resolved, or just partly resolved, since the PR mentions another PR to add system data streams migration.

Wondering if I could run my tests again, and not face the .fleet-actions-results issue? :)

@alexey-ivanov-es
Contributor

Hi @tomsonpl,

It ended up requiring more work than I initially anticipated, but we're pretty close to resolving it (PR). I expect it to be done by the end of this week. The PR that was merged didn't actually change the system's behavior but was required for the final changes.

@tomsonpl

tomsonpl commented Mar 5, 2025

Awesome, thank you 👍

alexey-ivanov-es added a commit to alexey-ivanov-es/elasticsearch that referenced this issue Mar 13, 2025
It seems the best way to fix elastic#122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.

Port of elastic#123551
alexey-ivanov-es added a commit that referenced this issue Mar 14, 2025
It seems the best way to fix #122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.

Port of #123551
@tomsonpl

tomsonpl commented Mar 18, 2025

Hey team, the issue seems partially resolved in my case. Initially the issue appeared when I tried to use the Upgrade Assistant, and the .fleet-actions-results system index showed up as an ES critical issue, blocking the upgrade.

Tested today with 8.18 - the blocker on Upgrade assistant does not exist 👍
But when trying to upgrade to 9.0 - ES doesn't stand up and the deployment crashes.


Looking at the logs, there seems to be an issue with the same index:
https://platform-logging.kb.us-west2.gcp.elastic-cloud.com/app/r/s/QaW23

To me it seems connected to the previous issue, but please share your thoughts.
Thanks in advance!

cc: @alexey-ivanov-es @masseyke @dasansol92

@alexey-ivanov-es
Contributor

The exception is:
The index [.ds-.fleet-actions-results-2025.03.17-000001/yh1rfZLtQz2utrMw5lAYzw] created in version [7.17.25-8.0.0] with current compatibility version [7.17.25-8.0.0] must be marked as read-only using the setting [index.blocks.write] set to [true] before upgrading to 9.0.0
And the data stream wasn't reindexed during migration.

The version 7.17.25-8.0.0 looks suspicious to me. I'm investigating what it means and whether it might be the root cause of the problem.
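
For anyone triaging similar startup failures from logs, the following Python sketch pulls the relevant fields out of the exception message quoted above and maps the backing index back to its data stream. The regular expressions are written against this one message only, and the `.ds-<data-stream>-<yyyy.MM.dd>-<generation>` naming pattern is assumed for the backing index. The function names are hypothetical.

```python
import re
from typing import Optional

# Matches the message quoted in this comment; other Elasticsearch versions may word it differently.
READ_ONLY_ERROR = re.compile(
    r"The index \[(?P<index>[^/\]]+)/(?P<uuid>[^\]]+)\] created in version \[(?P<created>[^\]]+)\]"
    r".*?using the setting \[(?P<setting>[^\]]+)\] set to \[(?P<value>[^\]]+)\]"
    r" before upgrading to (?P<target>\S+)"
)

# Assumed backing index naming: .ds-<data-stream>-<yyyy.MM.dd>-<6-digit generation>
BACKING_INDEX = re.compile(r"^\.ds-(?P<stream>.+)-\d{4}\.\d{2}\.\d{2}-\d{6}$")


def parse_read_only_error(message: str) -> Optional[dict[str, str]]:
    """Return the named fields of the error message, or None if it does not match."""
    match = READ_ONLY_ERROR.search(message)
    return match.groupdict() if match else None


def data_stream_of(backing_index: str) -> Optional[str]:
    """Return the data stream name for a backing index, or None if the name does not fit the pattern."""
    match = BACKING_INDEX.match(backing_index)
    return match.group("stream") if match else None


message = (
    "The index [.ds-.fleet-actions-results-2025.03.17-000001/yh1rfZLtQz2utrMw5lAYzw] "
    "created in version [7.17.25-8.0.0] with current compatibility version [7.17.25-8.0.0] "
    "must be marked as read-only using the setting [index.blocks.write] set to [true] "
    "before upgrading to 9.0.0"
)

fields = parse_read_only_error(message)
print(fields["index"], fields["created"], fields["setting"], fields["target"])
print(data_stream_of(fields["index"]))
```

The second line of output ties the failing index back to the .fleet-actions-results data stream, which is the one the migration was expected to reindex.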

@tomsonpl

Update:

We debugged this with @alexey-ivanov-es, and apparently the fixes haven't made it into the 8.18 GA build yet.
I double-checked, and the issue does not occur on the snapshot build...
Sorry for the confusion :bows:

alexey-ivanov-es added a commit to alexey-ivanov-es/elasticsearch that referenced this issue Mar 21, 2025
It seems the best way to fix elastic#122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.

Port of elastic#123551
@tomsonpl

tomsonpl commented Mar 24, 2025

Hi @alexey-ivanov-es, unfortunately another issue occurred with the above-mentioned data stream. It was found on 9.0.0-rc1.

Is this still under your ownership, or does it belong to Fleet now?

{
    "statusCode": 500,
    "error": "Internal Server Error",
    "message": "illegal_argument_exception\n\tRoot causes:\n\t\tillegal_argument_exception: Data stream(s) [.fleet-actions-results] may not be accessed by product [kibana]"
}
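
As a triage aid, this Python sketch extracts the data stream names and the product from an error body like the one above. It assumes the body is the JSON shown here and that multiple data streams, if present, are comma-separated inside the brackets (the example only shows one). The function name `parse_access_error` is hypothetical.

```python
import json
import re

ACCESS_ERROR = re.compile(
    r"Data stream\(s\) \[(?P<streams>[^\]]+)\] may not be accessed by product \[(?P<product>[^\]]+)\]"
)


def parse_access_error(body: str) -> tuple[list[str], str]:
    """Return (data stream names, product) from the error body, or ([], "") if no match."""
    message = json.loads(body).get("message", "")
    match = ACCESS_ERROR.search(message)
    if match is None:
        return [], ""
    # Assumption: several names would be comma-separated inside the brackets.
    streams = [name.strip() for name in match.group("streams").split(",")]
    return streams, match.group("product")


# Raw string so that \n and \t stay as JSON escapes and are decoded by json.loads.
body = r'''{
    "statusCode": 500,
    "error": "Internal Server Error",
    "message": "illegal_argument_exception\n\tRoot causes:\n\t\tillegal_argument_exception: Data stream(s) [.fleet-actions-results] may not be accessed by product [kibana]"
}'''

streams, product = parse_access_error(body)
print(streams, product)
```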

@tomsonpl tomsonpl reopened this Mar 24, 2025
@tomsonpl

Does this seem to be causing the issue?

[2025-03-24T14:59:05,612][ERROR][org.elasticsearch.bootstrap.ElasticsearchUncaughtExceptionHandler] [instance-0000000000] uncaught exception in thread [elasticsearch[instance-0000000000][generic][T#17]]
java.lang.IllegalArgumentException: Data stream(s) [.fleet-actions-results] may not be accessed by product [kibana]
	at org.elasticsearch.indices.SystemIndices.dataStreamAccessException(SystemIndices.java:561) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.indices.SystemIndices.dataStreamAccessException(SystemIndices.java:535) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$SystemResourceAccess.handleMatchedSystemIndices(IndexNameExpressionResolver.java:2268) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$SystemResourceAccess.doCheckSystemIndexAccess(IndexNameExpressionResolver.java:2247) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$SystemResourceAccess.checkSystemIndexAccess(IndexNameExpressionResolver.java:2215) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:550) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:476) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:129) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.xpack.deprecation.TransportDeprecationInfoAction.checkAndCreateResponse(TransportDeprecationInfoAction.java:205) ~[?:?]
	at org.elasticsearch.xpack.deprecation.TransportDeprecationInfoAction.lambda$checkAndCreateResponse$0(TransportDeprecationInfoAction.java:161) ~[?:?]
	at org.elasticsearch.action.ActionListener$1.lambda$onFailure$0(ActionListener.java:226) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:64) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:226) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.action.ActionRunnable.onFailure(ActionRunnable.java:152) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onFailure(ThreadContext.java:1027) ~[elasticsearch-9.0.0-rc1.jar:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:29) ~[elasticsearch-9.0.0-rc1.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1095) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:619) ~[?:?]
	at java.lang.Thread.run(Thread.java:1447) ~[?:?]
	Suppressed: java.lang.IllegalArgumentException: Data stream(s) [.fleet-actions-results] may not be accessed by product [kibana]
		at org.elasticsearch.indices.SystemIndices.dataStreamAccessException(SystemIndices.java:561) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.indices.SystemIndices.dataStreamAccessException(SystemIndices.java:535) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$SystemResourceAccess.handleMatchedSystemIndices(IndexNameExpressionResolver.java:2268) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$SystemResourceAccess.doCheckSystemIndexAccess(IndexNameExpressionResolver.java:2247) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$SystemResourceAccess.checkSystemIndexAccess(IndexNameExpressionResolver.java:2215) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:550) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:476) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:129) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.xpack.deprecation.TransportDeprecationInfoAction.checkAndCreateResponse(TransportDeprecationInfoAction.java:205) ~[?:?]
		at org.elasticsearch.xpack.deprecation.TransportDeprecationInfoAction.lambda$checkAndCreateResponse$0(TransportDeprecationInfoAction.java:161) ~[?:?]
		at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:217) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.action.support.ThreadedActionListener$1.doRun(ThreadedActionListener.java:40) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044) ~[elasticsearch-9.0.0-rc1.jar:?]
		at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27) ~[elasticsearch-9.0.0-rc1.jar:?]
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1095) ~[?:?]
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:619) ~[?:?]
		at java.lang.Thread.run(Thread.java:1447) ~[?:?]

@tomsonpl

Since it was closed and I reopened it - I am closing it again. The new issue: #125560

4 participants