Controlling TCP Retransmissions: Early Issue Detection to Prevent Data Loss

Oleg Tolmashov · 14m · 2024/01/23

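Introduction

In this article, I discuss a critical aspect of TCP communication: handling a server that stops responding. I focus on the specific scenario where an application only sends data over TCP and never receives an application-level response from the server.

The discussion covers TCP communication from the application's point of view, looking at both the application layer and the underlying operating system behavior. You will learn how to set effective timeouts so that data is not lost while a server instance is unresponsive. The code examples are in Ruby, but the idea is the same in any language.

The Challenge of Silent TCP Servers

Imagine an application that continuously sends data over a TCP socket. TCP is designed to guarantee packet delivery at the transport level, within the limits of the configured TCP stack, but it is worth considering what that guarantee means at the application level.

To understand this better, let's build a sample TCP server and client in Ruby. This lets us observe the communication in practice.

server.rb: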

    # server.rb
    require 'socket'
    require 'time'

    $stdout.sync = true

    puts 'starting tcp server...'
    server = TCPServer.new(1234)
    puts 'started tcp server on port 1234'

    loop do
      Thread.start(server.accept) do |client|
        puts 'new client'
        while (message = client.gets)
          puts "#{Time.now}]: #{message.chomp}"
        end
        client.close
      end
    end

And client.rb:

    require 'socket'
    require 'time'

    $stdout.sync = true

    socket = Socket.tcp('server', 1234)

    loop do
      puts "sending message to the socket at #{Time.now}"
      socket.write "Hello from client\n"
      sleep 1
    end

Let's wrap this setup in containers using this Dockerfile:

    FROM ruby:2.7

    RUN apt-get update && apt-get install -y tcpdump

    # Set the working directory in the container
    WORKDIR /usr/src/app

    # Copy the current directory contents into the container at /usr/src/app
    COPY . .

And docker-compose.yml:

    version: '3'
    services:
      server:
        build:
          context: .
          dockerfile: Dockerfile
        command: ruby server.rb
        volumes:
          - .:/usr/src/app
        ports:
          - "1234:1234"
        healthcheck:
          test: ["CMD", "sh", "-c", "nc -z localhost 1234"]
          interval: 1s
          timeout: 1s
          retries: 2
        networks:
          - net
      client:
        build:
          context: .
          dockerfile: Dockerfile
        command: ruby client.rb
        volumes:
          - .:/usr/src/app
          - ./data:/data
        depends_on:
          - server
        networks:
          - net
    networks:
      net:

Now we can run everything with docker compose up and watch in the logs how the client sends messages and the server receives them:

    $ docker compose up
    [+] Running 2/0
     ⠿ Container tcp_tests-server-1 Created 0.0s
     ⠿ Container tcp_tests-client-1 Created 0.0s
    Attaching to tcp_tests-client-1, tcp_tests-server-1
    tcp_tests-server-1 | starting tcp server...
    tcp_tests-server-1 | started tcp server on port 1234
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 08:59:08 +0000
    tcp_tests-server-1 | new client
    tcp_tests-server-1 | 2024-01-14 08:59:08 +0000]: Hello from client
    tcp_tests-server-1 | 2024-01-14 08:59:09 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 08:59:09 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 08:59:10 +0000
    tcp_tests-server-1 | 2024-01-14 08:59:10 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 08:59:11 +0000
    tcp_tests-server-1 | 2024-01-14 08:59:11 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 08:59:12 +0000
    tcp_tests-server-1 | 2024-01-14 08:59:12 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 08:59:13 +0000

Pretty easy so far, huh?

However, things get more interesting when we simulate a server failure on an active connection.

We do this with docker compose stop server:

    tcp_tests-server-1 | 2024-01-14 09:04:23 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:24 +0000
    tcp_tests-server-1 | 2024-01-14 09:04:24 +0000]: Hello from client
    tcp_tests-server-1 exited with code 1
    tcp_tests-server-1 exited with code 0
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:25 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:26 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:27 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:28 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:29 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:30 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:31 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:04:32 +0000

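The server is now offline, yet the client behaves as if the connection were still alive and keeps writing to the socket without hesitation.

This made me wonder why. Logically, the client should detect the outage within a short time, perhaps a few seconds, because TCP stops receiving acknowledgments for its packets, and that should cause the connection to be closed.

The actual result, however, differs from this expectation: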

    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:11 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:12 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:13 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:14 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:15 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:16 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:17 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:18 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:19 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:20 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:21 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:22 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-14 09:20:23 +0000
    tcp_tests-client-1 | client.rb:11:in `write': No route to host (Errno::EHOSTUNREACH)
    tcp_tests-client-1 | from client.rb:11:in `block in <main>'
    tcp_tests-client-1 | from client.rb:9:in `loop'
    tcp_tests-client-1 | from client.rb:9:in `<main>'
    tcp_tests-client-1 exited with code 1

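In fact, the client can remain unaware of the broken connection for as long as 15 minutes!

What causes this delay in detection? Let's dig deeper into the reasons.

In Depth: TCP Communication Mechanics

To cover this case fully, let's first review the basics and then examine how the client transmits data over TCP.

TCP Basics

Here is a basic diagram illustrating the TCP flow:

Once the connection is established, the transmission of each message typically involves two key steps:

The client sends the message, marked with the PSH (Push) flag.
The server confirms receipt by sending back an ACK (Acknowledgment) response.

Communication Between the Application and the Socket

Below is a simplified sequence diagram showing an application opening a TCP socket and then sending data through it:

The application performs two operations:

Opening a TCP socket
Sending data to the open socket

For example, when a TCP socket is opened with Ruby's Socket.tcp(host, port), the socket is created synchronously with the socket(2) system call, and the connection is then established with the connect(2) system call.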
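The two system calls can be made visible with Ruby's lower-level Socket API. The following is a minimal sketch of roughly what Socket.tcp does, not its actual implementation (Socket.tcp also resolves host names and tries each resolved address). The local listener on 127.0.0.1 is an assumption added only so the snippet runs without the article's server container.

```ruby
require 'socket'

# Assumption: a throwaway local listener stands in for the article's 'server' host.
listener = TCPServer.new('127.0.0.1', 0)
port = listener.addr[1]

# socket(2): create an unconnected IPv4 stream socket.
sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)

# connect(2): blocks until the TCP handshake completes or fails.
sock.connect(Socket.sockaddr_in(port, '127.0.0.1'))

peer_port = sock.remote_address.ip_port
puts "connected to 127.0.0.1:#{peer_port}"

sock.close
listener.close
```

Both calls are synchronous, so a failure to reach the server at this stage surfaces immediately as an exception from connect.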

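As for sending data, a call such as socket.write('foo') in the application mainly places the message into the socket's send buffer and then returns the number of bytes that were successfully queued. The actual transmission of that data over the network to the destination host is handled asynchronously by the TCP/IP stack.

This means that when the application writes to a socket, it is not directly involved in the network operations and may not know in real time whether the connection is still alive. The only confirmation it gets is that the message was successfully added to the TCP send buffer.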
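A small sketch makes this concrete. It assumes a local listener that accepts the TCP handshake but never reads (a stand-in for a silent server, not part of the original setup) and shows that write still returns the full byte count. The SO_SNDBUF lookup is only there to show that the send buffer has a finite size.

```ruby
require 'socket'

# Assumption: a local listener that never reads plays the role of a silent server.
listener = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', listener.addr[1])

message = "Hello from client\n"

# write returns once the bytes are queued in the kernel send buffer,
# whether or not the peer application ever reads them.
queued = client.write(message)

# Size of the kernel send buffer for this socket, in bytes.
sndbuf = client.getsockopt(Socket::SOL_SOCKET, Socket::SO_SNDBUF).int

puts "write returned #{queued} for a #{message.bytesize}-byte message"
puts "send buffer size: #{sndbuf} bytes"
```

The return value says nothing about delivery; it only reports how many bytes the kernel accepted.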

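What Happens When the TCP Server Goes Down?

Because the server no longer responds with ACKs, our TCP stack starts retransmitting the last unacknowledged packet:

The interesting part is that, by default, TCP makes 15 retransmissions with exponential backoff, which adds up to almost 15 minutes of retries!

You can check how many retries are configured on your host: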

    $ sysctl net.ipv4.tcp_retries2
    net.ipv4.tcp_retries2 = 15

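After digging into the documentation, things become clear. The ip-sysctl.txt documentation says:

The default value of 15 yields a hypothetical timeout of 924.6 seconds and is a lower bound for the effective timeout. TCP will effectively time out at the first RTO which exceeds the hypothetical timeout.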
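The 924.6-second figure can be reproduced with a few lines of arithmetic. This sketch assumes the Linux defaults of a 200 ms minimum retransmission timeout (TCP_RTO_MIN) and a 120 s cap (TCP_RTO_MAX); the helper name hypothetical_timeout is mine, not something from the kernel or the article.

```ruby
# Assumed Linux defaults: initial RTO of 200 ms, doubling on each retry, capped at 120 s.
RTO_MIN = 0.2
RTO_MAX = 120.0

# Sum the waiting intervals: one before each of the `retries` retransmissions,
# plus the final wait after the last retransmission goes unanswered.
def hypothetical_timeout(retries, rto_min: RTO_MIN, rto_max: RTO_MAX)
  (0..retries).sum { |n| [rto_min * 2**n, rto_max].min }
end

total = hypothetical_timeout(15)
puts "hypothetical timeout for tcp_retries2 = 15: #{total.round(1)} seconds"
# → hypothetical timeout for tcp_retries2 = 15: 924.6 seconds
```

The first ten intervals double from 0.2 s to 102.4 s (204.6 s in total) and the remaining six are capped at 120 s each (720 s), which gives 924.6 s. On a real connection the initial RTO depends on the measured round-trip time, so the effective timeout is usually somewhat longer.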

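During this period, the local TCP socket stays alive and keeps accepting data. Once all retries are exhausted, the socket is closed, and the application gets an error when it tries to send anything to the socket.

Why Is It Usually Not a Problem?

It is quite common for a TCP server to go down unexpectedly, or to have connectivity problems, without sending a FIN or RST TCP flag. So why does this situation so often go unnoticed?

Because in most cases the server responds with something at the application level. The HTTP protocol, for example, requires the server to respond to every request. Basically, when you have code like connection.get, it performs two main operations:

It writes the payload to the send buffer of the TCP socket. From this point on, the operating system's TCP stack is responsible for reliably delivering these packets to the remote server with TCP guarantees.
It waits for a response from the remote server in the TCP receive buffer.

Typically, applications use non-blocking reads from the receive buffer of the same TCP socket.

This approach simplifies the problem considerably, because we can easily set a timeout at the application level and close the socket if no response arrives from the server within the defined time frame.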
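A minimal sketch of such an application-level timeout, using IO.select to wait for readable data. The helper name read_reply and the 0.2-second timeout are illustrative assumptions, and the local listener that never replies stands in for an unresponsive server.

```ruby
require 'socket'

# Hypothetical helper: wait up to `timeout` seconds for a reply line.
# Returns :timeout if nothing becomes readable in time.
# Note: gets can still block if only part of a line arrives.
def read_reply(socket, timeout)
  ready = IO.select([socket], nil, nil, timeout)
  return :timeout if ready.nil?

  socket.gets
end

# Assumption: a local listener that completes the handshake but never replies.
listener = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', listener.addr[1])

client.write("ping\n")
result = read_reply(client, 0.2)

if result == :timeout
  # No application-level response: treat the connection as dead.
  client.close
  puts 'no response within 0.2s, socket closed'
end
```

With a request/response protocol, this check bounds the detection time no matter how many retransmissions the TCP stack is still willing to attempt.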

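However, when we do not expect any response from the server other than TCP acknowledgments, determining the state of the connection from the application level becomes much less straightforward.

The Impact of Long TCP Retransmissions

So far, we have established the following:

The application opens a TCP socket and periodically writes data to it.
At some point, the TCP server goes down without even sending an RST packet, and the sender's TCP stack starts retransmitting the last unacknowledged packet.
All other packets written to that socket are queued in the socket's send buffer.
By default, the TCP stack tries to retransmit the unacknowledged packet 15 times with exponential backoff, which takes about 924.6 seconds (roughly 15 minutes).

During these 15 minutes, the local TCP socket stays open and the application keeps writing data to it until the send buffer is full (its capacity is limited, often just a few megabytes). When the socket is finally marked as closed after all retransmissions, all data in the send buffer is lost.

This happens because the application is no longer responsible for the data once it has been written to the send buffer, and the operating system simply discards it.

The application can detect that the connection is broken only when the send buffer of the TCP socket is full. In that case, an attempt to write to the socket blocks the application's main thread, which gives it a chance to handle the situation.

However, the effectiveness of this detection method depends on the size of the data being sent.

For example, if the application sends only a few bytes at a time, such as metrics, it may never fill the send buffer within this 15-minute window.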
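To get a feel for how much data it takes before a write stalls, the sketch below fills the buffers of a connection whose peer never reads. It uses write_nonblock so that a full buffer shows up as :wait_writable instead of a blocked thread. The local listener, the 64 KiB chunk size, and the 256 MiB safety cap are all assumptions made for the demonstration.

```ruby
require 'socket'

# Assumption: a local listener that never reads, so kernel buffers eventually fill.
listener = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', listener.addr[1])

chunk = 'x' * 65_536
limit = 256 * 1024 * 1024 # safety cap so the loop always terminates
queued = 0
status = :not_full

while queued < limit
  # With exception: false, a full send buffer is reported as :wait_writable.
  result = client.write_nonblock(chunk, exception: false)
  if result == :wait_writable
    status = :buffer_full
    break
  end
  queued += result
end

puts "status: #{status}, bytes accepted before stalling: #{queued}"
```

The number of accepted bytes varies by system and includes the peer's receive buffer when the peer host is still up. A client sending around 18 bytes per second, as in this article, would need far longer than 15 minutes to queue even one megabyte.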

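So how can we implement a mechanism that closes the connection with an explicitly set timeout when the TCP server goes down, avoiding the 15 minutes of retransmissions and the data loss during that period?

TCP Retransmission Timeout Using a Socket Option

In private networks, a large number of retransmissions is usually unnecessary, and the TCP stack can be configured to attempt only a limited number of retries. However, this configuration applies globally to the whole node. Since multiple applications often run on the same node, changing this default can lead to unexpected side effects.

A more precise approach is to set a retransmission timeout for our socket alone with the TCP_USER_TIMEOUT socket option. With this option, the TCP stack automatically closes the socket if retransmissions do not succeed within the specified timeout, regardless of the globally configured maximum number of TCP retransmissions.

At the application level, this results in an error when trying to write to the closed socket, which makes it possible to handle the failure properly and prevent data loss.

Let's set this socket option in client.rb: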

    require 'socket'
    require 'time'

    $stdout.sync = true

    socket = Socket.tcp('server', 1234)
    # set a 5-second retransmission timeout (the value is in milliseconds)
    socket.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_USER_TIMEOUT, 5000)
    loop do
      puts "sending message to the socket at #{Time.now}"
      socket.write "Hello from client\n"
      sleep 1
    end

Also, from my observations, the TCP_USER_TIMEOUT socket option is not available on macOS.
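Since the option is platform-dependent, one way to keep the client portable is to apply it only where Ruby defines the constant, and to rescue the resulting errors around write. This is a sketch under stated assumptions: the helper names apply_user_timeout and send_or_keep, the list of rescued errors, and the local listener are mine. Keeping the failed message in an array only preserves the message whose write raised; anything already sitting in the send buffer is still lost.

```ruby
require 'socket'

# Hypothetical helper: set TCP_USER_TIMEOUT only where the platform defines it.
# Returns true if the option was applied.
def apply_user_timeout(socket, milliseconds)
  return false unless Socket.const_defined?(:TCP_USER_TIMEOUT)

  socket.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_USER_TIMEOUT, milliseconds)
  true
end

# Hypothetical helper: write a message; if the connection is already dead,
# keep the message in `unsent` so the caller can resend it elsewhere.
def send_or_keep(socket, message, unsent)
  socket.write(message)
  true
rescue Errno::ETIMEDOUT, Errno::EPIPE, Errno::ECONNRESET, Errno::EHOSTUNREACH
  unsent << message
  false
end

# Assumption: a local listener replaces the article's 'server' host.
listener = TCPServer.new('127.0.0.1', 0)
socket = TCPSocket.new('127.0.0.1', listener.addr[1])

applied = apply_user_timeout(socket, 5000)
unsent = []
sent = send_or_keep(socket, "Hello from client\n", unsent)

puts "TCP_USER_TIMEOUT applied: #{applied}, message queued: #{sent}"
```

On a platform without the option, applied is false and the client falls back to the system-wide retransmission settings.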

Now let's restart everything with docker compose up and, at some point, stop the server again with docker compose stop server:

    $ docker compose up
    [+] Running 2/0
     ⠿ Container tcp_tests-server-1 Created 0.0s
     ⠿ Container tcp_tests-client-1 Created 0.0s
    Attaching to tcp_tests-client-1, tcp_tests-server-1
    tcp_tests-server-1 | starting tcp server...
    tcp_tests-server-1 | started tcp server on port 1234
    tcp_tests-server-1 | new client
    tcp_tests-server-1 | 2024-01-20 12:37:38 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:38 +0000
    tcp_tests-server-1 | 2024-01-20 12:37:39 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:39 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:40 +0000
    tcp_tests-server-1 | 2024-01-20 12:37:40 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:41 +0000
    tcp_tests-server-1 | 2024-01-20 12:37:41 +0000]: Hello from client
    tcp_tests-server-1 | 2024-01-20 12:37:42 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:42 +0000
    tcp_tests-server-1 | 2024-01-20 12:37:43 +0000]: Hello from client
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:43 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:44 +0000
    tcp_tests-server-1 | 2024-01-20 12:37:44 +0000]: Hello from client
    tcp_tests-server-1 exited with code 1
    tcp_tests-server-1 exited with code 0
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:45 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:46 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:47 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:48 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:49 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:50 +0000
    tcp_tests-client-1 | sending message to the socket at 2024-01-20 12:37:51 +0000
    tcp_tests-client-1 | client.rb:11:in `write': Connection timed out (Errno::ETIMEDOUT)
    tcp_tests-client-1 | from client.rb:11:in `block in <main>'
    tcp_tests-client-1 | from client.rb:9:in `loop'
    tcp_tests-client-1 | from client.rb:9:in `<main>'
    tcp_tests-client-1 exited with code 1

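At ~12:37:45 I stopped the server, and we can see that the client received Errno::ETIMEDOUT after roughly 5 seconds. Great.

Let's capture a tcpdump with docker exec -it tcp_tests-client-1 tcpdump -i any tcp port 1234:

It is worth noting that the timeout actually fires a little after the 5-second mark. This is because the check for exceeding TCP_USER_TIMEOUT happens on the next retry. When the TCP/IP stack detects that the timeout has been exceeded, it marks the socket as closed, and our application receives the Errno::ETIMEDOUT error.

Also, if you use TCP keepalives, I recommend checking out this article from Cloudflare. It covers the nuances of using TCP_USER_TIMEOUT together with TCP keepalives.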

About Author

Oleg Tolmashov (@koilas)

Software Development Engineer @ Workato