
The grpc.max_reconnect_backoff_ms of client-c is too small to capture TLS verification errors #9803

Open
solotzg opened this issue Jan 21, 2025 · 0 comments


solotzg commented Jan 21, 2025

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

  • The TiFlash error logs no longer record useful information (such as: Cannot check peer: missing selected ALPN property). They only show messages like the ones below:
[2025/01/21 09:59:11.087 +00:00] [INFO] [Server.cpp:384] ["/workspace/source/tiflash/contrib/grpc/src/core/ext/filters/client_channel/subchannel.cc, line number: 945, log msg : subchannel 0x7821a7060800 {address=ipv4:10.0.128.40:10080, args=grpc.client_channel_factory=0x782c66a9d540, grpc.default_authority=db-tidb-0.db-tidb-peer.tidb1379661944646414117.svc:10080, grpc.http2_scheme=https, grpc.initial_reconnect_backoff_ms=1000, grpc.internal.channel_credentials=0x782bf05086f0, grpc.internal.security_connector=0x782bf0935130, grpc.internal.subchannel_pool=0x782c364c6aa0, grpc.max_receive_message_length=-1, grpc.max_reconnect_backoff_ms=3000, grpc.min_reconnect_backoff_ms=1000, grpc.primary_user_agent=grpc-c++/1.44.0, grpc.resource_quota=0x782c364c6b30, grpc.server_uri=dns:///db-tidb-0.db-tidb-peer.tidb1379661944646414117.svc:10080}: connect failed: {\"created\":\"@1737453551.087321880\",\"description\":\"Handshake read failed\",\"file\":\"/workspace/source/tiflash/contrib/grpc/src/core/lib/security/transport/security_handshaker.cc\",\"file_line\":457,\"referenced_errors\":[{\"created\":\"@1737453551.087299660\",\"description\":\"FD Shutdown\",\"file\":\"/workspace/source/tiflash/contrib/grpc/src/core/lib/iomgr/lockfree_event.cc\",\"file_line\":217,\"referenced_errors\":[{\"created\":\"@1737453551.087296828\",\"description\":\"Handshake timed out\",\"file\":\"/workspace/source/tiflash/contrib/grpc/src/core/lib/channel/handshaker.cc\",\"file_line\":165}]}]}"] [source=grpc] [thread_id=2762]
[2025/01/21 09:59:11.087 +00:00] [INFO] [Server.cpp:384] ["/workspace/source/tiflash/contrib/grpc/src/core/ext/filters/client_channel/subchannel.cc, line number: 884, log msg : subchannel 0x7821a7060800 {address=ipv4:10.0.128.40:10080, args=grpc.client_channel_factory=0x782c66a9d540, grpc.default_authority=db-tidb-0.db-tidb-peer.tidb1379661944646414117.svc:10080, grpc.http2_scheme=https, grpc.initial_reconnect_backoff_ms=1000, grpc.internal.channel_credentials=0x782bf05086f0, grpc.internal.security_connector=0x782bf0935130, grpc.internal.subchannel_pool=0x782c364c6aa0, grpc.max_receive_message_length=-1, grpc.max_reconnect_backoff_ms=3000, grpc.min_reconnect_backoff_ms=1000, grpc.primary_user_agent=grpc-c++/1.44.0, grpc.resource_quota=0x782c364c6b30, grpc.server_uri=dns:///db-tidb-0.db-tidb-peer.tidb1379661944646414117.svc:10080}: Retry immediately"] [source=grpc] [thread_id=2762]
[2025/01/21 09:59:11.087 +00:00] [INFO] [Server.cpp:384] ["/workspace/source/tiflash/contrib/grpc/src/core/ext/filters/client_channel/subchannel.cc, line number: 910, log msg : subchannel 0x7821a7060800 {address=ipv4:10.0.128.40:10080, args=grpc.client_channel_factory=0x782c66a9d540, grpc.default_authority=db-tidb-0.db-tidb-peer.tidb1379661944646414117.svc:10080, grpc.http2_scheme=https, grpc.initial_reconnect_backoff_ms=1000, grpc.internal.channel_credentials=0x782bf05086f0, grpc.internal.security_connector=0x782bf0935130, grpc.internal.subchannel_pool=0x782c364c6aa0, grpc.max_receive_message_length=-1, grpc.max_reconnect_backoff_ms=3000, grpc.min_reconnect_backoff_ms=1000, grpc.primary_user_agent=grpc-c++/1.44.0, grpc.resource_quota=0x782c364c6b30, grpc.server_uri=dns:///db-tidb-0.db-tidb-peer.tidb1379661944646414117.svc:10080}: failed to connect to channel, retrying"] [source=grpc] [thread_id=2762]

The default grpc.max_reconnect_backoff_ms (3000 ms) is much smaller than DEFAULT_POLL_INTERVAL_MS (5000 ms). When a TLS connection verification error occurs, the backup poller (sync mode API) cannot capture the useful error information in time before the next reconnection attempt.
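The timing mismatch described above can be sketched with a small, simplified model. This is only an illustration, not code from gRPC or TiFlash: the helper names pollIntervalFitsInBackoffWindow and shortfallMs are hypothetical, and the model assumes the backup poller may need up to one full poll interval to wake up, while the next reconnection attempt starts no later than the maximum backoff after a failure (jitter is ignored).

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in this report.
constexpr std::int64_t kMaxReconnectBackoffMs = 3000; // grpc.max_reconnect_backoff_ms
constexpr std::int64_t kBackupPollIntervalMs = 5000;  // DEFAULT_POLL_INTERVAL_MS

// Hypothetical helper (assumption, not a gRPC or TiFlash API).
// Returns true when one full poll interval fits inside the longest
// possible backoff window, i.e. the poller could wake up before the
// next reconnection attempt even in its worst case.
inline bool pollIntervalFitsInBackoffWindow(std::int64_t max_backoff_ms,
                                            std::int64_t poll_interval_ms)
{
    assert(max_backoff_ms > 0 && poll_interval_ms > 0);
    return poll_interval_ms <= max_backoff_ms;
}

// Hypothetical helper: how many milliseconds the longest backoff window
// falls short of one poll interval (0 when it does not fall short).
inline std::int64_t shortfallMs(std::int64_t max_backoff_ms,
                                std::int64_t poll_interval_ms)
{
    assert(max_backoff_ms > 0 && poll_interval_ms > 0);
    return poll_interval_ms > max_backoff_ms
        ? poll_interval_ms - max_backoff_ms
        : 0;
}
```

With the values from this report, pollIntervalFitsInBackoffWindow(kMaxReconnectBackoffMs, kBackupPollIntervalMs) is false and shortfallMs reports a 2000 ms gap. Under this model the channel can already be retrying before the poller has had a chance to surface the handshake error.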

4. What is your TiFlash version? (Required)

Versions after #3649 (>= v5.4.0)
