Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.16.0 and lower than v1.17.0
What happened?
This bug report is related to #33074, where "No mapping for NAT masquerade" errors occurred in the same context. After updating from 1.15.x to 1.16.0, that issue was resolved; instead, flows now get dropped at about the same rate with the reason "CT: Map insertion failed". In #33115, a specific race condition got "fixed", and my guess is that if the port allocation now fails on reverse SNAT, Cilium does not retry and drops the flow.
How can we reproduce the issue?
Try hosting a highly loaded DNS server on a Kubernetes cluster with Cilium as the CNI. Alternatively, create and close a lot of connections quickly in some other way while the CT table is highly loaded; the issue happens "randomly" on new connections. More ideas in #33074.
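One rough way to generate that kind of connection churn is sketched below. It is an illustration only, not the setup used in this report: the helper name `churn_udp_flows`, the payload, and the target address are all assumptions. Each datagram is sent from a freshly created UDP socket, which typically gets a new ephemeral source port, so each send should appear as a new flow to connection tracking on the path.

```python
import socket


def churn_udp_flows(host, port, count, payload=b"\x00"):
    """Send one datagram from each of `count` short-lived UDP sockets.

    Returns the number of datagrams handed to the kernel. The helper and
    its parameters are hypothetical; they are not part of Cilium.
    """
    sent = 0
    for _ in range(count):
        # A fresh socket is created and closed per datagram, so the kernel
        # typically assigns a different ephemeral source port each time.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            try:
                sock.sendto(payload, (host, port))
                sent += 1
            except OSError:
                # Local send failures (e.g. unreachable network) are skipped;
                # drops further along the path are not visible from here.
                pass
    return sent


# Example call (placeholder address, replace with the DNS service under test):
# churn_udp_flows("192.0.2.53", 53, 10000)
```

The script only creates load. Whether it reproduces the drops depends on how full the CT table already is, and the drop reason still has to be observed on the Cilium side.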
Cilium Version
Client: 1.16.0 8299999 2024-07-23T22:22:14-07:00 go version go1.22.5 linux/amd64
Daemon: 1.16.0 8299999 2024-07-23T22:22:14-07:00 go version go1.22.5 linux/amd64
Kernel Version
Linux 5.14.0-427.16.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Wed May 8 17:48:14 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
v1.28.13+rke2r1
Regression
It did work differently in 1.15.x, but to the best of my knowledge, it never worked properly.
Sysdump
No response
Relevant log output
No response
Anything else?
No response
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct