Release aggregations earlier during reduce #124520

Merged

original-brownbear
Member

@original-brownbear original-brownbear commented Mar 10, 2025

Release each hit's aggregations before moving on to the next hit, and unlink them from the shard result even earlier.
Also, perform the aggregation reduce earlier in the reduce steps to lower average heap use over time.
To that end, the reduction is no longer done in the search phase controller. This has the added benefit of removing any need for a fake aggs-reduce-context in scroll.
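
As a rough illustration of the idea described above, here is a minimal sketch of consuming and unlinking each shard result's partial aggregations while reducing. Every name in it (`EagerReleaseSketch`, `ShardResult`, `consumeAggs`, `hasAggs`) is hypothetical and is not part of the Elasticsearch code base; a plain `long[]` of bucket counts stands in for real aggregation results.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Elasticsearch classes.
final class EagerReleaseSketch {

    /** A per-shard result that holds partial bucket counts until they are consumed. */
    static final class ShardResult {
        private long[] bucketCounts; // null once the aggregations have been released

        ShardResult(long[] bucketCounts) {
            this.bucketCounts = bucketCounts;
        }

        /** Hands over the partial aggregations and unlinks them from this result. */
        long[] consumeAggs() {
            if (bucketCounts == null) {
                throw new IllegalStateException("aggregations already released");
            }
            long[] aggs = bucketCounts;
            bucketCounts = null; // drop the reference so the shard result no longer retains it
            return aggs;
        }

        boolean hasAggs() {
            return bucketCounts != null;
        }
    }

    /** Sums bucket counts across shards, releasing each shard's partial data as it goes. */
    static long[] reduce(List<ShardResult> results, int bucketCount) {
        long[] reduced = new long[bucketCount];
        for (ShardResult result : results) {
            long[] partial = result.consumeAggs();
            for (int i = 0; i < bucketCount; i++) {
                reduced[i] += partial[i];
            }
            // 'partial' is unreachable after this iteration, before the next shard is touched
        }
        return reduced;
    }
}
```

With this shape, at most one shard's partial data is referenced by the reduce loop at a time, rather than all of it staying linked until the whole reduce finishes.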

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 10, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@original-brownbear original-brownbear changed the title Release aggregations more earlier during reduce Release aggregations earlier during reduce Mar 10, 2025
}
};
return topLevelReduce(aggregations, context);
public static InternalAggregations topLevelReduce(Iterator<InternalAggregations> aggs, int count, AggregationReduceContext context) {
Member Author

This method is single-use for now, but I left it here because it could also replace the existing list-consuming version wherever that is used today, which would save some more indirection and maybe heap.
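
To make the trade-off in the comment above concrete, here is a small sketch contrasting a list-consuming reduce with an iterator-plus-count variant. It is only loosely modeled on the `topLevelReduce(Iterator<InternalAggregations> aggs, int count, AggregationReduceContext context)` signature shown in the diff; the class `IteratorReduceSketch`, the generic type, and the `combiner` parameter are assumptions made for illustration, not the actual Elasticsearch API.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical illustration; not the Elasticsearch implementation.
final class IteratorReduceSketch {

    /** List-based variant: the caller has to materialize every partial result up front. */
    static <T> T reduce(List<T> parts, BinaryOperator<T> combiner) {
        return reduce(parts.iterator(), parts.size(), combiner);
    }

    /** Iterator-based variant: partial results are pulled one at a time, 'count' times in total. */
    static <T> T reduce(Iterator<T> parts, int count, BinaryOperator<T> combiner) {
        if (count == 0) {
            return null; // nothing to reduce
        }
        T accumulated = parts.next();
        for (int i = 1; i < count; i++) {
            accumulated = combiner.apply(accumulated, parts.next());
        }
        return accumulated;
    }
}
```

A caller that already walks its shard results can pass an iterator that produces each partial result on demand, so no intermediate list of partial results has to be built first.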

Contributor

@iverase iverase left a comment

This makes sense to me.

@iverase
Contributor

iverase commented Mar 13, 2025

Although the failure in CI might need attention.

@original-brownbear
Member Author

Funny enough, I think that failure was unrelated. I'll comment on the already existing issue for it.

Thanks Ignacio!

@original-brownbear original-brownbear added the auto-backport Automatically create backport pull requests when merged label Mar 13, 2025
@original-brownbear original-brownbear merged commit 70486b4 into elastic:main Mar 13, 2025
17 checks passed
@original-brownbear original-brownbear deleted the release-aggs-earlier branch March 13, 2025 14:36
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 124520

jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Mar 13, 2025
Labels
:Analytics/Aggregations (Aggregations), auto-backport (Automatically create backport pull requests when merged), backport pending, >non-issue, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v8.19.0, v9.1.0
3 participants