Search before asking
Version
Pulsar: 3.0.1.4
C++ Client: 3.4.2
Minimal reproduce step
It happens in a stress test.
What did you expect to see?
When the broker is temporarily unavailable, e.g. because the SSL handshake failed, the client should retry creating producers or consumers.
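Until the client retries internally, the expected behavior can be approximated at the application level. The following is a minimal sketch, not the Pulsar client's API: `Result` is a hypothetical stand-in for the client's result codes (only the value 5 is taken from the logs in this report), and `createWithRetry`, `maxAttempts` and `initialBackoff` are names made up for illustration. The `create` callable is assumed to wrap a blocking producer or consumer creation call.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for the client's result codes. Only the two values
// needed by this sketch are modelled; 5 is the code seen in the logs.
enum class Result { Ok = 0, ConnectError = 5 };

// Calls `create` until it returns something other than ConnectError or the
// attempt budget is used up, doubling the wait between attempts.
Result createWithRetry(const std::function<Result()>& create, int maxAttempts,
                       std::chrono::milliseconds initialBackoff) {
    Result result = Result::ConnectError;
    auto backoff = initialBackoff;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        result = create();
        if (result != Result::ConnectError) {
            return result;  // success, or an error that retrying will not fix
        }
        if (attempt < maxAttempts) {
            std::this_thread::sleep_for(backoff);
            backoff *= 2;  // simple exponential backoff
        }
    }
    return result;  // still ConnectError after maxAttempts tries
}
```

In an application, `create` would be a lambda that attempts to create the producer and returns the resulting code. The attempt limit keeps a permanently unreachable broker from blocking the caller forever.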
What did you see instead?
There are a lot of ResultConnectError errors in createProducerAsync, along with many "Handshake failed: stream truncated" and "Connection closed with ConnectError" logs.
2024-01-25T00:23:02.223Z E [<local_ip>:53420 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.223Z E [<local_ip>:53420 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.287Z E [<local_ip>:53488 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.288Z E [<local_ip>:53488 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.323Z E [<local_ip>:53538 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.323Z E [<local_ip>:53538 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.430Z E [<local_ip>:53730 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.430Z E [<local_ip>:53730 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.485Z E [<local_ip>:53798 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.485Z E [<local_ip>:53798 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.521Z E [<local_ip>:53886 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.521Z E [<local_ip>:53886 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.697Z E [<local_ip>:54094 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.698Z E [<local_ip>:54094 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.812Z E [<local_ip>:54280 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.812Z E [<local_ip>:54280 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.824Z W Error creating topic producer for <topic-1>: 5
2024-01-25T00:23:02.824Z E [<local_ip>:54350 -> <remote_ip>:6651] Connection closed with ConnectError (refCnt: 2)
2024-01-25T00:23:02.824Z E [<local_ip>:54350 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.838Z E [<local_ip>:54386 -> <remote_ip>:6651] Handshake failed: stream truncated
2024-01-25T00:23:02.824Z W Error creating topic producer for <topic-2>: 5
Error code 5 means ResultConnectError.
Anything else?
This happens because, when the handshake fails, the ClientConnection is closed with ResultConnectError (the default result). See pulsar-client-cpp/lib/ClientConnection.cc, lines 504 to 508 in d1dd08b:
void ClientConnection::handleHandshake(const ASIO_ERROR& err) {
    if (err) {
        LOG_ERROR(cnxString_ << "Handshake failed: " << err.message());
        close();
        return;
Then ProducerImpl::connectionFailed is called with ResultConnectError. If the producer has not finished being created, creation fails immediately with that Result (pulsar-client-cpp/lib/ProducerImpl.cc, line 179 in d1dd08b):
    } else if (producerCreatedPromise_.setFailed(result)) {
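One possible direction, sketched here as a simplified model rather than the actual client code, is to classify the failure before failing the creation promise: a transient connect error schedules a reconnect, and only non-retryable results fail creation. Everything below is hypothetical and introduced for illustration (`CnxResult`, `CreationPromise`, `Action`, `isRetryable`, `onConnectionFailed`); the real ProducerImpl::connectionFailed has additional branches that are not modelled.

```cpp
// Hypothetical result codes for this sketch, not the client's real enum.
enum class CnxResult { Ok, ConnectError, InvalidConfiguration };

// Minimal stand-in for a one-shot promise: only the first setFailed() wins.
class CreationPromise {
public:
    bool setFailed(CnxResult r) {
        if (complete_) return false;
        complete_ = true;
        result_ = r;
        return true;
    }
    bool isComplete() const { return complete_; }

private:
    bool complete_ = false;
    CnxResult result_ = CnxResult::Ok;
};

// What the handler decides to do after a connection failure.
enum class Action { FailCreation, ScheduleReconnect, Ignore };

// Assumption: a connect error (e.g. a failed TLS handshake) is transient.
bool isRetryable(CnxResult r) { return r == CnxResult::ConnectError; }

Action onConnectionFailed(CreationPromise& promise, CnxResult result) {
    if (isRetryable(result)) {
        // Leave the promise pending so creation can still succeed later.
        return Action::ScheduleReconnect;
    }
    // Non-retryable: fail creation, unless the promise is already complete.
    return promise.setFailed(result) ? Action::FailCreation : Action::Ignore;
}
```

With this shape, a burst of handshake failures during a stress test would leave the creation promise pending, so it could still be completed by a later successful connection or by the operation timeout. Whether ResultConnectError should always be treated as retryable is a design question for the maintainers.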
Are you willing to submit a PR?