Refactor data stream lifecycle to use the template paradigm #124593

gmarouli · 2025-03-11T20:00:33Z

In this PR we migrate the DataStreamLifecycle configuration to the new "template" framework we introduced in #117357. Effectively, we split the code that is used by the data stream and the templates. This is better for the following reasons:

The explicit nulls are scoped only to the template code and will no longer leak into the data stream.
The explicit nulls are configured in a predictable way using the ResettableValue and can be composed more easily as well.
In a follow-up PR, when it's time to incorporate the lifecycle under the failure store, it will be relatively easy since the failure store config already uses this template framework.
We removed dedicated wrapper classes that were necessary to represent the explicit nulls.
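
To illustrate the idea behind the reasons above, here is a minimal, self-contained sketch of a three-state value (undefined, explicitly reset, or set) and of how template layers might be composed before being resolved into a plain nullable value for the data stream. The names `Resettable`, `mergeWith` and `LifecycleTemplateDemo.composeRetention` are hypothetical and chosen for illustration only; this is not the actual `ResettableValue` class or the code in this PR.

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical three-state holder, for illustration only. */
final class Resettable<T> {
    private static final Resettable<?> UNDEFINED = new Resettable<>(null, false);
    private static final Resettable<?> RESET = new Resettable<>(null, true);

    private final T value;
    private final boolean defined;

    private Resettable(T value, boolean defined) {
        this.value = value;
        this.defined = defined;
    }

    /** The template says nothing about this field. */
    @SuppressWarnings("unchecked")
    static <T> Resettable<T> undefined() {
        return (Resettable<T>) UNDEFINED;
    }

    /** The template carries an explicit null: clear whatever earlier layers set. */
    @SuppressWarnings("unchecked")
    static <T> Resettable<T> reset() {
        return (Resettable<T>) RESET;
    }

    /** The template sets a concrete value. */
    static <T> Resettable<T> of(T value) {
        return new Resettable<>(Objects.requireNonNull(value), true);
    }

    boolean isDefined() {
        return defined;
    }

    boolean shouldReset() {
        return defined && value == null;
    }

    /** Returns the value, or null when undefined or reset. */
    T get() {
        return value;
    }

    /** A later layer wins whenever it defines the field (value or reset). */
    Resettable<T> mergeWith(Resettable<T> later) {
        return later.isDefined() ? later : this;
    }
}

final class LifecycleTemplateDemo {
    /** Composes template layers in order and resolves to a plain nullable value. */
    static String composeRetention(List<Resettable<String>> layers) {
        Resettable<String> result = Resettable.undefined();
        for (Resettable<String> layer : layers) {
            result = result.mergeWith(layer);
        }
        // Only the resolved value leaves the template code; the explicit-null state does not.
        return result.get();
    }
}
```

In this sketch, composing a layer that sets `"7d"` with an undefined layer resolves to `"7d"`, while composing it with a reset layer resolves to `null`, so the data stream side only ever sees an ordinary nullable value.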

elasticsearchmachine · 2025-03-11T20:01:08Z

Pinging @elastic/es-data-management (Team:Data Management)

gmarouli · 2025-03-11T20:40:03Z

I will work on the CI in the meantime, but I wanted to open it already for review because it's quite big.

gmarouli · 2025-03-11T21:21:30Z

server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamLifecycle.java

    }

+    private final boolean enabled;


I have been going back and forth about this one: should we keep it as boolean or convert it to Boolean?

When it comes to the failure store, we had a specific use case where we enable it via a cluster setting. Here, we do not have such a use case yet, so I am leaning towards keeping it as is. Any thoughts?

cc @dakrone

I think keeping it as is until we have a more concrete need for it to change is probably the best approach here.
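
For readers weighing the same trade-off, the following sketch shows what a nullable `Boolean` would buy: a third "not configured" state that can fall back to an external default such as a cluster setting. The class and method names are hypothetical and are not taken from this PR.

```java
final class EnabledFlagDemo {
    /**
     * With a nullable Boolean, null means "not configured", so the caller can
     * fall back to some external default (for example a cluster setting).
     */
    static boolean effectiveEnabled(Boolean configured, boolean clusterDefault) {
        return configured != null ? configured : clusterDefault;
    }

    /**
     * With a primitive boolean there is no "not configured" state: the value
     * is always decided when the object is built, and no fallback is possible.
     */
    static boolean effectiveEnabled(boolean configured) {
        return configured;
    }
}
```

Without a fallback use case, the primitive keeps the model simpler, which matches the decision reached in this thread.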

jbaiera

LGTM!

jbaiera · 2025-03-17T20:24:28Z

server/src/main/java/org/elasticsearch/cluster/metadata/DataStreamLifecycle.java

+                || out.getTransportVersion().isPatchFrom(TransportVersions.INTRODUCE_LIFECYCLE_TEMPLATE_8_19)) {
+                out.writeOptionalTimeValue(dataRetention);
+            } else {
+                out.writeBoolean(dataRetention != null);


You were right, this is indeed somewhat confusing. I understand that we're replicating writing an optional object that was writing an optional field of its own. Maybe we should add some comments to this method to mirror those added in the read method.
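
A rough sketch of why the legacy branch looks confusing, assuming the old wire format was an optional wrapper object that in turn wrote its own optional field. It uses plain `java.io` streams and a `Long` of milliseconds instead of the Elasticsearch stream classes; all names here are hypothetical, and the continuation of the legacy branch is an assumption based on the comment above, not the actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class LegacyOptionalWireDemo {
    /** An "optional" field: a presence flag, then the value if present. */
    static void writeOptionalLong(DataOutputStream out, Long value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            out.writeLong(value);
        }
    }

    static Long readOptionalLong(DataInputStream in) throws IOException {
        return in.readBoolean() ? in.readLong() : null;
    }

    static void writeRetention(DataOutputStream out, Long retentionMillis, boolean newFormat) throws IOException {
        if (newFormat) {
            // New format: the retention is a single optional field.
            writeOptionalLong(out, retentionMillis);
        } else {
            // Legacy format (assumed): first the presence flag of the old wrapper object...
            out.writeBoolean(retentionMillis != null);
            if (retentionMillis != null) {
                // ...then the wrapper's own optional field, which repeats a presence flag.
                writeOptionalLong(out, retentionMillis);
            }
        }
    }

    static Long readRetention(DataInputStream in, boolean newFormat) throws IOException {
        if (newFormat) {
            return readOptionalLong(in);
        }
        // Legacy format: outer wrapper flag first, then the inner optional field.
        return in.readBoolean() ? readOptionalLong(in) : null;
    }

    static byte[] encode(Long retentionMillis, boolean newFormat) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                writeRetention(out, retentionMillis, newFormat);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Long decode(byte[] data, boolean newFormat) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return readRetention(in, newFormat);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a present value costs one extra byte in the legacy format (two presence flags instead of one), which is the kind of detail a comment on the write method would make obvious to a reader.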

gmarouli · 2025-03-18T10:15:44Z

Because two new versions in 8.x (33f2010 & 3fea0b6) are not yet referenced in main, we removed the patch versions until the versions are synced again.

elasticsearchmachine · 2025-03-18T11:25:25Z

💔 Backport failed

Status	Branch	Result
❌	8.x	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 124593

gmarouli · 2025-03-19T10:42:21Z

💚 All backports created successfully

Status	Branch	Result
✅	8.x

Questions ?

Please refer to the Backport tool documentation

…124593) (cherry picked from commit ce04da7)

# Conflicts:
# modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/action/TransportGetDataStreamLifecycleStatsAction.java
# modules/data-streams/src/test/java/org/elasticsearch/datastreams/MetadataIndexTemplateServiceTests.java
# modules/data-streams/src/test/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleFixtures.java
# server/src/main/java/org/elasticsearch/TransportVersions.java
# server/src/main/java/org/elasticsearch/action/admin/indices/template/post/TransportSimulateIndexTemplateAction.java
# server/src/main/java/org/elasticsearch/cluster/metadata/MetadataCreateDataStreamService.java
# server/src/main/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateService.java
# server/src/main/java/org/elasticsearch/cluster/metadata/ProjectMetadata.java
# server/src/test/java/org/elasticsearch/cluster/metadata/DataStreamTests.java
# server/src/test/java/org/elasticsearch/cluster/metadata/MetadataDataStreamsServiceTests.java
# server/src/test/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateServiceTests.java

gmarouli added 3 commits March 11, 2025 21:43

Move DownsamplingRounds out of the wrapper class.

7ba1925

Introduce DataStreamLifecycle dedicated Template class and use it.

3117ab8

Remove the need for explicit nulls from the DataStreamLifecycle

bbf6507

gmarouli added >refactoring :Data Management/Data streams Data streams and their lifecycles labels Mar 11, 2025

gmarouli requested a review from jbaiera March 11, 2025 20:00

gmarouli added auto-backport Automatically create backport pull requests when merged v8.19.0 labels Mar 11, 2025

elasticsearchmachine added Team:Data Management Meta label for data/management team v9.1.0 labels Mar 11, 2025

gmarouli added 2 commits March 11, 2025 22:43

Fix effective retention bug

cefb4c9

Revert enabled flag to boolean instead of Boolean.

f4a570d

gmarouli commented Mar 11, 2025

gmarouli added 11 commits March 11, 2025 23:45

Revert enabled flag to boolean instead of Boolean (also from the template).

7c5c277

Bug fix in XContent parser of the DataStreamLifecycle (downsampling fix in progress)

ebd743f

Fix bwc serialisation errors

934d155

Merge branch 'main' into refactor-data-stream-lifecycle

67d10d1

Fix tests

f4ef99f

Use getters in methods with heavier logic to facilitate testing

21bbb31

Convert to the template DataStreamLifecycle in assertions when needed

16f48e9

Merge with main

9797a7b

Prepare backport patch to 8.19.0

f9cb602

Merge with main

7ed4c67

Remove incorrect nullable annotations

298a1cd

elasticsearchmachine added the serverless-linked Added by automation, don't add manually label Mar 12, 2025

jbaiera approved these changes Mar 18, 2025

gmarouli added 2 commits March 18, 2025 12:00

Merge with main

b955541

Remove the 8.19 patch until all previous backport versions have been added

ebc6794

gmarouli added 2 commits March 18, 2025 12:07

Make legacy reader and writer more explicit and readable

b025cb9

Merge branch 'main' into refactor-data-stream-lifecycle

4f75227

gmarouli merged commit ce04da7 into elastic:main Mar 18, 2025
17 checks passed

gmarouli deleted the refactor-data-stream-lifecycle branch March 18, 2025 11:24

elasticsearchmachine added the backport pending label Mar 18, 2025

gmarouli mentioned this pull request Mar 19, 2025

[8.x] Refactor data stream lifecycle to use the template paradigm (#124593) #125199

Open
