Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exponential backoff and jitter to the retries #194

Closed
acm19 opened this issue Nov 24, 2023 · 4 comments · Fixed by #204
Closed

Add exponential backoff and jitter to the retries #194

acm19 opened this issue Nov 24, 2023 · 4 comments · Fixed by #204
Labels
enhancement New feature or request

Comments

@acm19
Copy link
Contributor

acm19 commented Nov 24, 2023

Current retries are constant which might cause snowball effect if many client get rate limited at the same time. Retries should have a exponential backoff and jitter to avoid this.

@nehaev nehaev added the enhancement New feature or request label Nov 24, 2023
@nehaev
Copy link
Contributor

nehaev commented Nov 24, 2023

Hi @acm19, and thanks again for posting this! I totally agree that exponential backoff and jitter for retries is good to have.

@acm19
Copy link
Contributor Author

acm19 commented Dec 7, 2023

Hey @nehaev, I might try to find some time to give this a try but I wonder what your thoughts are about the implementation. I see two options, either we give only exponential retries as retry mechanism (we can then add a validation for maxRetries and retryTimeoutMs so that they don't overflow long) or we give different retries options. The second option would probably involve adding a lib and a few more properties. I kind of feel the first option should be enough, but it will break backward compatibility. Other thoughts? Preferences?

@nehaev
Copy link
Contributor

nehaev commented Dec 10, 2023

Hey @acm19,

I think having only exponential backoff retries is fine.

We can introduce two more config options: minRetryTimeoutMs and maxRetryTimeoutMs. Also we can deprecate retryTimeoutMs, see the example. Looks like this deprecation would be the only compatibility break. Of course, it still requires a major version increment and starting the 1.5.x series, but I don't see any issues with that.

acm19 added a commit to acm19/loki-logback-appender that referenced this issue Jan 2, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
acm19 added a commit to acm19/loki-logback-appender that referenced this issue Jan 2, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
acm19 added a commit to acm19/loki-logback-appender that referenced this issue Jan 2, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
acm19 added a commit to acm19/loki-logback-appender that referenced this issue Jan 2, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
@acm19
Copy link
Contributor Author

acm19 commented Jan 2, 2024

Hey @nehaev, I've created a proposal. I think it's fine to leave the current config settings, I just updated the documentation. Please take a look and let me know your thoughts.

acm19 added a commit to acm19/loki-logback-appender that referenced this issue Jan 2, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
acm19 added a commit to acm19/loki-logback-appender that referenced this issue Jan 2, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
acm19 added a commit to acm19/loki-logback-appender that referenced this issue Jan 2, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
acm19 added a commit to acm19/loki-logback-appender that referenced this issue Jan 31, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
acm19 added a commit to acm19/loki-logback-appender that referenced this issue Feb 9, 2024
Adds exponential backoff by multiplying the `retryTimeoutMs` by
`2^attempt`. Adds a random jitter to spread out the retries. Although
in the initial implementation the retry function isn't available to the
user the code is written in a way that makes it easier to change that.

Refactors the tests to avoid using `Thread#sleep`, which is not only a
bad practice producing not deterministic tests but also makes them very
slow. A reduction of more that 90% of testing time is percieved for the
tests where this was changed.

Resolves: loki4j#194
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants