Reflector not watching secrets after period of time #341

Closed
InWithTheNew opened this issue Mar 22, 2023 · 11 comments
Labels
bug Something isn't working

Comments

@InWithTheNew

InWithTheNew commented Mar 22, 2023

Sorry if this has been raised before.
Running multiple small clusters, AKS 1.25.x

In 2 environments so far, the SecretWatcher seems to just stop watching. As a result, we are finding that our Let's Encrypt certs start to expire in namespaces. The ConfigMapWatcher seems to continue.

We're running kubernetes-reflector:6.1.9. We've been running this (awesome) microservice for months with no problems, then both environments stopped working within 4 days of each other.

Logs are as follows:

2023-02-13 04:13:34.991 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Requesting V1Secret resources
2023-02-13 04:13:35.034 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Auto-reflected [redacted] where permitted. Created 0 - Updated 0 - Deleted 0 - Validated 3.
2023-02-13 04:25:52.974 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:37:30.7838490. Faulted: False.
2023-02-13 04:25:52.975 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 04:55:21.486 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Session closed. Duration: 00:41:46.4948709. Faulted: False.
2023-02-13 04:55:21.486 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Requesting V1Secret resources
2023-02-13 04:55:21.518 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Auto-reflected [redacted] where permitted. Created 0 - Updated 0 - Deleted 0 - Validated 3.
2023-02-13 04:58:06.580 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:32:13.6046741. Faulted: False.
2023-02-13 04:58:06.580 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 05:26:02.397 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Session closed. Duration: 00:30:40.9110194. Faulted: False.
2023-02-13 05:26:02.397 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Requesting V1Secret resources
2023-02-13 05:26:02.452 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Auto-reflected [redacted] where permitted. Created 0 - Updated 0 - Deleted 0 - Validated 3.
2023-02-13 05:31:14.467 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:33:07.8864128. Faulted: False.
2023-02-13 05:31:14.467 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 06:08:59.948 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:37:45.4805678. Faulted: False.
2023-02-13 06:08:59.948 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 06:14:09.695 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Session closed. Duration: 00:48:07.2973451. Faulted: False.
2023-02-13 06:14:09.695 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Requesting V1Secret resources
2023-02-13 06:14:09.747 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Auto-reflected [redacted] where permitted. Created 0 - Updated 0 - Deleted 0 - Validated 3.
2023-02-13 06:41:05.303 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:32:05.3546074. Faulted: False.
2023-02-13 06:41:05.303 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 07:11:36.461 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Session closed. Duration: 00:57:26.7657591. Faulted: False.
2023-02-13 07:11:36.462 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Requesting V1Secret resources
2023-02-13 07:11:36.613 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Auto-reflected [redacted] where permitted. Created 0 - Updated 0 - Deleted 0 - Validated 3.
2023-02-13 07:39:46.942 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:58:41.6387750. Faulted: False.
2023-02-13 07:39:46.942 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 08:10:37.678 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:30:50.7360987. Faulted: False.
2023-02-13 08:10:37.678 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 08:44:00.317 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:33:22.6384749. Faulted: False.
2023-02-13 08:44:00.317 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 09:34:31.038 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:50:30.7211105. Faulted: False.
2023-02-13 09:34:31.039 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 10:22:48.996 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:48:17.9567637. Faulted: False.
2023-02-13 10:22:48.996 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 10:56:12.512 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:33:23.5159199. Faulted: False.
2023-02-13 10:56:12.513 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 11:42:25.995 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:46:13.4821249. Faulted: False.
2023-02-13 11:42:25.995 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 12:23:14.749 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:40:48.7536606. Faulted: False.
2023-02-13 12:23:14.750 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 12:57:02.601 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:33:47.8514877. Faulted: False.
2023-02-13 12:57:02.602 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2023-02-13 13:49:11.526 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:52:08.9244818. Faulted: False.

No faults in the logs, but the Core.SecretWatcher never comes back up. One was discovered today, so we're quite confident it's not going to autoheal.

We'll be updating to 7.x shortly, but we're raising this now in the hopes of some enlightenment :)
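
Because nothing is logged as a fault, the stall only shows up as one watcher going quiet while the other keeps cycling. A minimal sketch of how that could be spotted from the logs alone follows. It assumes the line format shown in the excerpt above (UTC timestamps, `ES.Kubernetes.Reflector.Core.<Component>` in parentheses); the function names and the two-hour threshold are illustrative choices, not part of Reflector.

```python
import re
from datetime import datetime, timedelta

# Assumed line format, taken from the log excerpt in this issue:
# 2023-02-13 04:13:34.991 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) ...
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \+00:00 "
    r"\[\w+\] \(ES\.Kubernetes\.Reflector\.Core\.(?P<component>\w+)\)"
)


def last_seen(log_text):
    """Return {component: timestamp of its most recent log line}."""
    seen = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)  # search() tolerates a pod-name prefix
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            name = match.group("component")
            seen[name] = max(seen.get(name, ts), ts)
    return seen


def stalled_watchers(log_text, max_silence=timedelta(hours=2)):
    """List watchers that have been silent longer than max_silence,
    measured against the newest line in the log (hypothetical threshold)."""
    seen = last_seen(log_text)
    if not seen:
        return []
    newest = max(seen.values())
    return sorted(
        name
        for name, ts in seen.items()
        if name.endswith("Watcher") and newest - ts > max_silence
    )
```

Feeding it the output of `kubectl logs` for the reflector pod would return the names of the watchers that have gone quiet, which could then drive an alert or a pod restart.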

@KrisJohnstone

KrisJohnstone commented Mar 22, 2023

#77
#337

@winromulus
Contributor

This is currently a known issue. For some reason the k8s master nodes stop sending updates for secrets but don't close the session, so the watcher keeps the stalled session running. Please use a smaller timeout for now (~15 minutes) so that it forces a watcher close.
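
The idea behind the shorter timeout can be sketched in a few lines: if the client caps how long any one watch session may last, a stream that has silently stalled is abandoned and re-opened instead of being kept forever. The sketch below is a standard-library simulation, not Reflector's implementation; `open_session`, `run_session`, and `watch` are hypothetical names, and a `queue.Queue` stands in for the watch stream.

```python
import queue
import time


def run_session(events, handle, session_timeout, clock=time.monotonic):
    """Consume one watch session from `events` (a queue.Queue).

    Returns "closed" if the server ends the stream (signalled here by None)
    and "timeout" if the client-side deadline passes first.
    """
    deadline = clock() + session_timeout
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return "timeout"
        try:
            event = events.get(timeout=remaining)
        except queue.Empty:
            # Nothing arrived before the deadline: treat the session as dead.
            return "timeout"
        if event is None:
            return "closed"
        handle(event)


def watch(open_session, handle, session_timeout, sessions):
    """Run a fixed number of sessions, re-opening after every session end."""
    outcomes = []
    for _ in range(sessions):
        events = open_session()  # hypothetical: re-list and start a new watch
        outcomes.append(run_session(events, handle, session_timeout))
    return outcomes
```

With a stalled first stream and a healthy second one, the first session ends with `"timeout"` and the second still delivers its events, which is the behaviour the shorter timeout is meant to force.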

winromulus reopened this Aug 1, 2023
winromulus self-assigned this Aug 1, 2023
winromulus added the bug (Something isn't working) label Aug 1, 2023
winromulus pinned this issue Aug 1, 2023

@rayanebel

rayanebel commented Aug 21, 2023

Hello @winromulus

We have the same issue on our GKE cluster. Reflector works perfectly and then, after 2 days, it stops syncing the secrets.

reflector-dd6c9fdf5-qcdkv reflector 2023-08-20 22:25:04.621 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-20 22:57:19.856 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:32:15.2347130. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-20 22:57:19.856 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-20 23:53:00.238 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:55:40.3812884. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-20 23:53:00.238 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 00:38:49.259 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:45:49.0215931. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 00:38:49.259 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 01:24:32.815 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:45:43.5558523. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 01:24:32.816 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 02:13:58.957 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:49:26.1411039. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 02:13:58.957 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 03:12:26.201 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:58:27.2443779. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 03:12:26.201 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 04:03:28.537 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:51:02.3355887. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 04:03:28.537 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 04:44:50.822 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:41:22.2846000. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 04:44:50.822 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 05:18:07.837 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:33:17.0153841. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 05:18:07.837 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 06:12:36.365 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:54:28.5272157. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 06:12:36.365 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 06:52:10.409 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:39:34.0442951. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 06:52:10.409 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 07:22:32.837 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:30:22.4281258. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 07:22:32.837 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 08:13:15.017 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:50:42.1795882. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 08:13:15.017 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 09:01:11.965 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:47:56.9481984. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 09:01:11.966 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 09:41:50.723 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:40:38.7571401. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 09:41:50.723 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 10:22:29.630 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:40:38.9068063. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 10:22:29.630 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 11:10:24.958 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:47:55.3279240. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 11:10:24.958 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 12:09:28.116 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:59:03.1578351. Faulted: False.
reflector-dd6c9fdf5-qcdkv reflector 2023-08-21 12:09:28.116 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources

version: v7.0.193
k8s version: 1.27
provider: Google Cloud (GKE)

@winromulus
Contributor

@rayanebel same issue as above. For some reason the Secrets watcher stops receiving notifications from k8s.
I'm still trying to find a fix for this. My assumption is that the cause is k8s, since it has a history of issues with the API not keeping connections alive.
In the meantime, please provide the k8s cluster version and also try using the Timeout setting to set the watchers to around 15-20 minutes.

@rayanebel

rayanebel commented Aug 21, 2023

@winromulus Ok, I will try to use the timeout settings. For context, we are using reflector in several kubernetes clusters hosted with different providers (AWS, GCP and Scaleway), and currently reflector stops working only on our GKE clusters.

@winromulus
Contributor

This issue seems to have been fixed with the latest version of the Kubernetes client. Please upgrade and reopen this issue if there's still a problem.

@steve-gray

@winromulus - Had this one bite us today: the process silently hung without any notice and stopped replicating data. This caused Traefik to emit expired SSL certificates, causing clients to reject them. Are there any diagnostics we can run? The secrets would change maybe at best fortnightly, so I'm not sure hiking the timeouts that high is the right play.
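
One diagnostic that does not depend on the watcher logging anything is to compare the source secret with its mirrors directly. The sketch below assumes each secret has been fetched as JSON (for example with `kubectl get secret NAME -n NAMESPACE -o json`) and parsed into a dict; the helper names are hypothetical, and only the `data` map is compared.

```python
import hashlib
import json


def data_fingerprint(secret):
    """Stable SHA-256 over a Secret's `data` map (values stay base64-encoded)."""
    payload = json.dumps(secret.get("data") or {}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def stale_mirrors(source, mirrors):
    """Return the sorted namespaces whose mirror data differs from the source."""
    want = data_fingerprint(source)
    return sorted(
        mirror["metadata"]["namespace"]
        for mirror in mirrors
        if data_fingerprint(mirror) != want
    )
```

Run on a schedule, a non-empty result would flag a hung reflector well before a certificate that changes fortnightly actually expires downstream.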

@aaronmassicotte

> @winromulus - Had this one bite us today, process silently hung without any notice and stopped replicating data. Caused Traefik to emit expired SSL certificates, causing clients to reject them. Is there any diagnostics we can do? The secrets would change maybe at best fortnightly, so I'm not sure hiking the timeouts that high is the right play.

We're using this instead: https://github.com/mittwald/kubernetes-replicator
So far, 1 year with no problems with wildcard certificates.

@winromulus
Contributor

@steve-gray can you provide more details about your environment (kube version, how it is hosted, etc.)? Also, which version of reflector are you using?
Have you done a control plane upgrade recently? Sometimes the upgrade can cause a silent socket disconnect, which results in reflector not receiving the disconnect event and hanging. Restarting the pod should solve this.

@ivababukova

We are experiencing this too. We use EKS, our version of reflector is 7.1.262, and our kubernetes version is 1.26. For the last month or two, we have noticed that once a week or once every 2 weeks the reflector stops reflecting configmaps. There are no error logs in the reflector pod -- it just stops noticing that it needs to reflect configmaps in new environments. Restarting the pod fixes the issue.

@winromulus
Contributor

@ivababukova can you provide more information about the flavor of kubernetes you're using (k8s, k3s or something else)? Also, are you self-hosting or using a cloud provider?
I need this information in order to try to reproduce the issue.


7 participants