Search responses with large size can cause OOMs #110962

Closed
carlosdelest opened this issue Jul 17, 2024 · 4 comments
Labels
>bug, :Search Foundations/Search, Team:Search Foundations

Comments

@carlosdelest
Member

Elasticsearch Version

7.17.x - 8.x

Installed Plugins

No response

Java Version

bundled

OS Version

N/A

Problem Description

Search responses that contain a large number of hits, or individual hits of considerable size, can cause a node to run out of memory (OOM) with the following stack trace:

elasticsearch[node_name][transport_worker][T#XXXX]
  at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
  at org.elasticsearch.common.io.stream.StreamInput.readBytesReference(I)Lorg/elasticsearch/common/bytes/BytesReference; (StreamInput.java:161)
  at org.elasticsearch.common.io.stream.StreamInput.readBytesReference()Lorg/elasticsearch/common/bytes/BytesReference; (StreamInput.java:127)
  at org.elasticsearch.search.SearchHit.<init>(Lorg/elasticsearch/common/io/stream/StreamInput;)V (SearchHit.java:150)
  at org.elasticsearch.search.SearchHits.<init>(Lorg/elasticsearch/common/io/stream/StreamInput;)V (SearchHits.java:90)
  at org.elasticsearch.search.fetch.FetchSearchResult.<init>(Lorg/elasticsearch/common/io/stream/StreamInput;)V (FetchSearchResult.java:42)
  at org.elasticsearch.search.fetch.QueryFetchSearchResult.<init>(Lorg/elasticsearch/common/io/stream/StreamInput;)V (QueryFetchSearchResult.java:28)
  at org.elasticsearch.action.search.SearchTransportService$$Lambda$6076+0x0000000801b1cc88.read(Lorg/elasticsearch/common/io/stream/StreamInput;)Ljava/lang/Object; ()
  at org.elasticsearch.action.ActionListenerResponseHandler.read(Lorg/elasticsearch/common/io/stream/StreamInput;)Lorg/elasticsearch/transport/TransportResponse; (ActionListenerResponseHandler.java:58)
  at org.elasticsearch.action.ActionListenerResponseHandler.read(Lorg/elasticsearch/common/io/stream/StreamInput;)Ljava/lang/Object; (ActionListenerResponseHandler.java:25)
  at org.elasticsearch.transport.TransportService$4.read(Lorg/elasticsearch/common/io/stream/StreamInput;)Lorg/elasticsearch/transport/TransportResponse; (TransportService.java:863)
  at org.elasticsearch.transport.TransportService$4.read(Lorg/elasticsearch/common/io/stream/StreamInput;)Ljava/lang/Object; (TransportService.java:843)
  at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.read(Lorg/elasticsearch/common/io/stream/StreamInput;)Lorg/elasticsearch/transport/TransportResponse; (TransportService.java:1462)
  at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.read(Lorg/elasticsearch/common/io/stream/StreamInput;)Ljava/lang/Object; (TransportService.java:1449)
  at org.elasticsearch.transport.InboundHandler.handleResponse(Ljava/net/InetSocketAddress;Lorg/elasticsearch/common/io/stream/StreamInput;Lorg/elasticsearch/transport/TransportResponseHandler;)V (InboundHandler.java:311)
  at org.elasticsearch.transport.InboundHandler.messageReceived(Lorg/elasticsearch/transport/TcpChannel;Lorg/elasticsearch/transport/InboundMessage;J)V (InboundHandler.java:134)

Large search responses should not OOM a node; the search should be cancelled instead.

Steps to Reproduce

This was observed in production and we don't have a reproducible script.

Logs (if relevant)

No response

@carlosdelest carlosdelest added the >bug, needs:triage, Team:Search Foundations, and :Search Foundations/Search labels Jul 17, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine elasticsearchmachine removed the needs:triage label Jul 17, 2024
@original-brownbear
Member

Just one thing to note here: the bug as seen in this stack trace has long been fixed by reading into pooled byte arrays. We do, however, have a bunch of other remaining spots where we are not yet using pooled bytes.
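
For readers unfamiliar with the term, the following is a rough sketch of what pooling byte arrays means in general. It is not the Elasticsearch implementation: the class `PagePool`, its methods, and the 16 KiB page size are all hypothetical names and values chosen for illustration. The point is only that fixed-size pages are handed out and returned for reuse, rather than a fresh array being allocated for every incoming payload.

```java
import java.util.ArrayDeque;

// Hypothetical illustration only: not an Elasticsearch class.
final class PagePool {
    // Assumed page size for the sketch; real systems choose their own.
    static final int PAGE_SIZE = 16 * 1024;

    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private int allocated; // number of pages ever created by this pool

    // Hands out a recycled page if one is available, otherwise allocates a new one.
    synchronized byte[] acquire() {
        byte[] page = free.poll();
        if (page == null) {
            page = new byte[PAGE_SIZE];
            allocated++;
        }
        return page;
    }

    // Returns a page to the pool so a later acquire() can reuse it.
    synchronized void release(byte[] page) {
        free.push(page);
    }

    synchronized int allocatedPages() {
        return allocated;
    }
}
```

A caller would acquire pages while reading a response and release them once the response has been handled, so repeated large responses reuse the same memory instead of creating new garbage each time.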

@javanna
Member

javanna commented Mar 14, 2025

This has been mitigated by #121920, which added memory tracking when loading search hits as part of the fetch phase.
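
As a minimal sketch of the general idea behind memory tracking, and not the code from #121920: bytes are reserved against a budget before hit data is loaded, and the request fails with an exception when the budget would be exceeded, instead of the JVM running out of heap. The class `HitMemoryBudget` and its methods are hypothetical names invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only: not an Elasticsearch class.
final class HitMemoryBudget {
    private final long limitBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    HitMemoryBudget(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Reserves bytes before they are allocated. Throws when the limit would be
    // exceeded, so the caller can fail or cancel the request.
    void reserve(long bytes) {
        long newUsed = usedBytes.addAndGet(bytes);
        if (newUsed > limitBytes) {
            usedBytes.addAndGet(-bytes); // roll back the rejected reservation
            throw new IllegalStateException(
                "loading hits would use [" + newUsed + "] bytes, limit is [" + limitBytes + "]");
        }
    }

    // Gives bytes back once the hit data has been sent or discarded.
    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long used() {
        return usedBytes.get();
    }
}
```

In this sketch a rejected reservation leaves the accounted usage unchanged, so other requests that fit within the budget can still proceed.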

@javanna javanna closed this as completed Mar 14, 2025
@original-brownbear
Member

This is on the coordinating node side, but I think the specific issue seen here is dealt with by #103763 in 8.13+.

4 participants