Streaming gRPC does not avoid 'message larger than max' error #3765

Closed
akevdmeer opened this issue Jun 10, 2024 · 4 comments · Fixed by #3808

Comments

@akevdmeer

When querying Tempo with (streaming) gRPC using tempo-cli --use-grpc to do a large query, tempo-cli fails:

tempo-cli: error: main.metricsQueryRangeCmd.Run(): rpc error: code = ResourceExhausted desc = grpc: received message larger than max (4560241 vs. 4194304)

A similar failure occurs when tempo-cli is given the same max message size that we previously configured on the server:

tempo-cli: error: main.metricsQueryRangeCmd.Run(): rpc error: code = Internal desc = response larger than the max (16842812 vs 16777216)
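For reference, the two limits in these errors are 4 MiB (the default receive limit in gRPC-Go) and 16 MiB. The following is a minimal sketch that only reproduces the arithmetic from the error messages; the names `defaultMaxRecv`, `raisedMax`, and `overshoot` are made up for illustration and are not part of Tempo or gRPC.

```go
package main

import "fmt"

const (
	defaultMaxRecv = 4 << 20  // 4194304 bytes, the limit in the first error
	raisedMax      = 16 << 20 // 16777216 bytes, the limit in the second error
)

// overshoot reports by how many bytes a message of the given size exceeds
// the limit, or 0 if the message fits. (Hypothetical helper.)
func overshoot(size, limit int) int {
	if size > limit {
		return size - limit
	}
	return 0
}

func main() {
	fmt.Println(overshoot(4560241, defaultMaxRecv)) // prints 365937
	fmt.Println(overshoot(16842812, raisedMax))     // prints 65596
}
```

In both cases the response is only slightly over the limit, which suggests the payload keeps growing with the amount of data returned rather than being bounded per message.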

To Reproduce
Perform a query with tempo-cli that returns a lot of data. Specifically, I tested with:
tempo-cli query api metrics localhost:3100 '{ selector } | quantile_over_time(duration, .99) by (span.http.url)' starttime endtime --use-grpc

Expected behavior
Streaming gRPC should stay under the max message size by chunking responses appropriately, so that large queries can run.

Environment:
Tempo 2.5.0 on Kubernetes 1.28

@joe-elliott
Member

While the other streaming endpoints only send a diff in each message, the metrics streaming endpoint currently sends all data in every message.

We should update the metrics streaming endpoint to only send the diff by improving this method:

https://github.com/grafana/tempo/blob/main/modules/frontend/combiner/metrics_query_range.go#L63
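To illustrate the difference between sending everything and sending only a diff, here is a minimal sketch. It is not Tempo's combiner: `seriesCombiner`, `Add`, `Diff`, and `Full` are hypothetical names, and summing values is an assumed stand-in for however the real combiner merges partial results.

```go
package main

import "fmt"

// seriesCombiner is a hypothetical stand-in for a metrics combiner. It
// accumulates one value per series and remembers which series changed
// since the last diff was taken.
type seriesCombiner struct {
	totals map[string]float64
	dirty  map[string]struct{}
}

func newSeriesCombiner() *seriesCombiner {
	return &seriesCombiner{
		totals: make(map[string]float64),
		dirty:  make(map[string]struct{}),
	}
}

// Add merges a partial result into the running total (assumed: by summing)
// and marks the series as changed.
func (c *seriesCombiner) Add(series string, v float64) {
	c.totals[series] += v
	c.dirty[series] = struct{}{}
}

// Diff returns only the series touched since the previous Diff call and
// then resets the change tracking.
func (c *seriesCombiner) Diff() map[string]float64 {
	out := make(map[string]float64, len(c.dirty))
	for s := range c.dirty {
		out[s] = c.totals[s]
	}
	c.dirty = make(map[string]struct{})
	return out
}

// Full returns every series accumulated so far, which is what an endpoint
// that resends all data in every message would transmit.
func (c *seriesCombiner) Full() map[string]float64 {
	out := make(map[string]float64, len(c.totals))
	for s, v := range c.totals {
		out[s] = v
	}
	return out
}

func main() {
	c := newSeriesCombiner()
	c.Add("/a", 1)
	c.Add("/b", 2)
	fmt.Println(len(c.Diff())) // prints 2: both series are new

	c.Add("/a", 3)
	fmt.Println(len(c.Diff())) // prints 1: only "/a" changed
	fmt.Println(len(c.Full())) // prints 2: the full result keeps growing
}
```

With this approach the message size tracks how much changed between updates rather than the total size of the result.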

@akevdmeer
Author

I see, that provokes the error easily!

But isn't it still possible to exceed the max message size for the connection even with a proper diff(), when more fresh data arrives between two GRPCCollector updates than fits in a single message?

@joe-elliott
Member

But isn't it still possible to exceed the max message size for the connection even with a proper diff(), when more fresh data arrives between two GRPCCollector updates than fits in a single message?

This is correct. We could try to do something that cuts diffs aggressively for larger datasets. It feels like it would be quite difficult to do well and would have some difficult-to-handle edge cases. I'm open to hearing a proposal, but this is currently a lower priority for us.

@akevdmeer
Author

Clear, thank you. A proper diff() for the metrics queries plus a larger gRPC max message size may solve the issue, practically speaking.
