Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ES-10037 Persist recent write load in index metadata #125330

Conversation

PeteGillinElastic
Copy link
Member

This changes the default value for the Exponentially Weighted Moving Rate calculation used for the 'recent write load' metric in indexing stats to 5 minutes (as agreed over Slack) and persists the value in the index metadata alongside the existing write load metric.

The value is still not used in the data stream autosharding calculation, that will be yet one more PR.

There are a couple of drive-by changes in this PR:

  • It adds a comment to DataStreamAutoShardingService.computeOptimalNumberOfShards, because the nested min and max calls are quite hard to understand at a glance.
  • It changes IndexShard.indexingStats() so that, if it is called before the shard has entered the started state, it uses a timeSinceShardStartedInNanos value of zero when calling InternalIndexingStats.stats(). Previously, it would have passed the current relative time in nanos as timeSinceShardStartedInNanos (because startedRelativeTimeInNanos would be zero) which is arbitrary and incorrect (since the zero point of System.nanoTime() is arbitrary). This didn't actually matter, since InternalIndexingStats.postIndex would not increment the metrics while in recovery, so the numerator used to calculate the write load would be zero if the shard has not started, so it doesn't matter if the denominator is incorrect. However, it is good defensive coding not to rely on that, and to pass a correct value instead.

@PeteGillinElastic PeteGillinElastic added >non-issue :Data Management/Stats Statistics tracking and retrieval APIs v9.1.0 and removed v9.1.0 labels Mar 20, 2025
@PeteGillinElastic PeteGillinElastic force-pushed the ES-10037-persist-recent-write-load branch 2 times, most recently from 6891b71 to a4465e2 Compare March 20, 2025 17:12
@PeteGillinElastic PeteGillinElastic force-pushed the ES-10037-persist-recent-write-load branch from a4465e2 to 6392468 Compare March 20, 2025 17:14
@PeteGillinElastic PeteGillinElastic marked this pull request as ready for review March 20, 2025 17:14
assertThat(indexWriteLoad.getWriteLoadForShard(shardId).getAsDouble(), is(equalTo(populatedShardWriteLoads[shardId])));
assertThat(indexWriteLoad.getUptimeInMillisForShard(shardId).isPresent(), is(equalTo(true)));
assertThat(indexWriteLoad.getUptimeInMillisForShard(shardId).getAsLong(), is(equalTo(populatedShardUptimes[shardId])));
assertThat(indexWriteLoad.getWriteLoadForShard(shardId), equalTo(OptionalDouble.of(populatedShardWriteLoads[shardId])));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm combining the separate assertions on isPresent() and getAsDouble/Long() for simplicity.

} else {
assertThat(indexWriteLoad.getWriteLoadForShard(shardId).isPresent(), is(false));
assertThat(indexWriteLoad.getUptimeInMillisForShard(shardId).isPresent(), is(false));
assertThat(indexWriteLoad.getWriteLoadForShard(shardId), equalTo(OptionalDouble.empty()));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a minor thing, but here I'm asserting equality with the empty value rather than asserting that isPresent() is false, so that if it fails then the message will say what value it had, which might be helpful.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Mar 20, 2025
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I left a couple of really minor comments, but nothing major.

Comment on lines 327 to 328
* - shardsByMaxThreads = number of shards required to ensure no more than 50% utilization with max number of threads per shard
* - shardsByMaxThreads = number of shards required to ensure no more than 50% utilization with min number of threads per shard
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These both say "max" in the name, but I assume one of them should be shardsByMinThreads?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, good catch. Me and my cut-and-paste again...

@@ -59,35 +68,63 @@ public static IndexWriteLoad create(List<Double> shardsWriteLoad, List<Long> sha
throw new IllegalArgumentException("At least one shard write load and uptime should be provided, but none was provided");
}

if (shardsRecentWriteLoad != null && shardsRecentWriteLoad.size() != shardsUptimeInMillis.size()) {
assert false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a message to this failing assert? It helps narrow down the logic if someone encounters it killing a test ES.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a fair cop. I lazily cut-and-pasted. I've added messages to all 5 asserts that were missing them in this class.

assert shardWriteLoad.length == shardUptimeInMillis.length;
this.shardWriteLoad = shardWriteLoad;
this.shardUptimeInMillis = shardUptimeInMillis;
if (shardRecentWriteLoad != null) {
assert shardRecentWriteLoad.length == shardUptimeInMillis.length;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here about an assert message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above.

Comment on lines 88 to 92
int tooManyDays = BigDecimal.valueOf(Long.MAX_VALUE)
.add(BigDecimal.ONE)
.divide(BigDecimal.valueOf(24 * 60 * 60 * 1_000_000_000L), RoundingMode.UP)
.setScale(0, RoundingMode.UP)
.intValueExact();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is more complicated here rather than just hard-coding it to a large value for the test? The tests for the limits seem to straddle testing the Settings framework itself, and I'm not sure we're getting a lot of value out of them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I thought it would specifically assert that exceeding Long.MAX_VALUE nanos would be rejected. But you're right, there's not really much value in these tests. I'll revert.

Copy link
Member Author

@PeteGillinElastic PeteGillinElastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Lee, all comments done.

Comment on lines 88 to 92
int tooManyDays = BigDecimal.valueOf(Long.MAX_VALUE)
.add(BigDecimal.ONE)
.divide(BigDecimal.valueOf(24 * 60 * 60 * 1_000_000_000L), RoundingMode.UP)
.setScale(0, RoundingMode.UP)
.intValueExact();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I thought it would specifically assert that exceeding Long.MAX_VALUE nanos would be rejected. But you're right, there's not really much value in these tests. I'll revert.

@@ -59,35 +68,63 @@ public static IndexWriteLoad create(List<Double> shardsWriteLoad, List<Long> sha
throw new IllegalArgumentException("At least one shard write load and uptime should be provided, but none was provided");
}

if (shardsRecentWriteLoad != null && shardsRecentWriteLoad.size() != shardsUptimeInMillis.size()) {
assert false;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a fair cop. I lazily cut-and-pasted. I've added messages to all 5 asserts that were missing them in this class.

assert shardWriteLoad.length == shardUptimeInMillis.length;
this.shardWriteLoad = shardWriteLoad;
this.shardUptimeInMillis = shardUptimeInMillis;
if (shardRecentWriteLoad != null) {
assert shardRecentWriteLoad.length == shardUptimeInMillis.length;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above.

Comment on lines 327 to 328
* - shardsByMaxThreads = number of shards required to ensure no more than 50% utilization with max number of threads per shard
* - shardsByMaxThreads = number of shards required to ensure no more than 50% utilization with min number of threads per shard
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, good catch. Me and my cut-and-paste again...

@PeteGillinElastic PeteGillinElastic merged commit 22d8169 into elastic:main Mar 20, 2025
17 checks passed
@PeteGillinElastic PeteGillinElastic deleted the ES-10037-persist-recent-write-load branch March 20, 2025 22:45
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Mar 21, 2025
This changes the default value for the Exponentially Weighted Moving Rate calculation used for the 'recent write load' metric in indexing stats to 5 minutes (as agreed over Slack) and persists the value in the index metadata alongside the existing write load metric.

The value is still not used in the data stream autosharding calculation, that will be yet one more PR.

There are a couple of drive-by changes in this PR:

It adds a comment to DataStreamAutoShardingService.computeOptimalNumberOfShards, because the nested min and max calls are quite hard to understand at a glance.

It changes IndexShard.indexingStats() so that, if it is called before the shard has entered the started state, it uses a timeSinceShardStartedInNanos value of zero when calling InternalIndexingStats.stats(). Previously, it would have passed the current relative time in nanos as timeSinceShardStartedInNanos (because startedRelativeTimeInNanos would be zero) which is arbitrary and incorrect (since the zero point of System.nanoTime() is arbitrary). This didn't actually matter, since InternalIndexingStats.postIndex would not increment the metrics while in recovery, so the numerator used to calculate the write load would be zero if the shard has not started, so it doesn't matter if the denominator is incorrect. However, it is good defensive coding not to rely on that, and to pass a correct value instead.
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Mar 21, 2025
This changes the default value for the Exponentially Weighted Moving Rate calculation used for the 'recent write load' metric in indexing stats to 5 minutes (as agreed over Slack) and persists the value in the index metadata alongside the existing write load metric.

The value is still not used in the data stream autosharding calculation, that will be yet one more PR.

There are a couple of drive-by changes in this PR:

It adds a comment to DataStreamAutoShardingService.computeOptimalNumberOfShards, because the nested min and max calls are quite hard to understand at a glance.

It changes IndexShard.indexingStats() so that, if it is called before the shard has entered the started state, it uses a timeSinceShardStartedInNanos value of zero when calling InternalIndexingStats.stats(). Previously, it would have passed the current relative time in nanos as timeSinceShardStartedInNanos (because startedRelativeTimeInNanos would be zero) which is arbitrary and incorrect (since the zero point of System.nanoTime() is arbitrary). This didn't actually matter, since InternalIndexingStats.postIndex would not increment the metrics while in recovery, so the numerator used to calculate the write load would be zero if the shard has not started, so it doesn't matter if the denominator is incorrect. However, it is good defensive coding not to rely on that, and to pass a correct value instead.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Stats Statistics tracking and retrieval APIs >non-issue Team:Data Management Meta label for data/management team v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants