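When building a cross-region server network, for example with the VLESS connections used in 《搭建全程美国 IP、无需手动设置代理的三层隧道》 (“Building a layer-3 tunnel with a US IP end to end and no manual proxy setup”), we often run into an efficiency problem: TCP's own congestion control. Congestion control is essential on the public internet, but in a tunnel that already carries an application-layer protocol (which may have its own flow control or congestion handling), the outer TCP's congestion control becomes a burden.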

Why disable TCP congestion control and Nagle in a tunnel?

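  1. The TCP-over-TCP problem: when one TCP connection (for example VLESS over TCP) carries the data of another TCP connection, you get the so-called “TCP-over-TCP” problem. The inner and the outer TCP each have their own congestion control and retransmission. When a packet is lost, the outer TCP retransmits it and shrinks its congestion window; the inner TCP only sees the resulting stall, and if the stall outlasts its retransmission timer it retransmits as well and shrinks its own window. This double handling is redundant and makes performance drop sharply, especially on high-latency, lossy international links, where the inner timer fires prematurely because of the outer TCP's delays and retransmissions and the two layers feed a vicious cycle. TCP-over-TCP also causes severe head-of-line blocking: one packet lost by the outer TCP blocks the data of every inner connection carried in it, even when those connections are completely unrelated. One user's connection problem can therefore affect other users sharing the same tunnel.
  2. Flow control already exists at the application layer: the application-layer protocol carried in the tunnel may already implement its own flow control and reliability. In that case the underlying TCP's congestion control is redundant; it only interferes with the upper protocol and limits its performance.
  3. The delay added by Nagle's algorithm: Nagle's algorithm coalesces small TCP writes into larger segments to reduce the number of small packets on the network and improve utilization. In a tunnel, however, we usually want data to pass through as quickly as possible, especially for interactive applications (such as SSH) or latency-sensitive ones. The delay Nagle introduces can hurt these applications. Disabling it (with the TCP_NODELAY option) sends small packets immediately and lowers latency.
  4. UDP's troubles on the public internet: you might ask, if TCP has so many problems, why not build the tunnel on UDP directly? Unfortunately, on the public internet, and on international links in particular, UDP is often subject to carrier QoS (quality of service) policies: it gets lower priority and is more likely to be dropped or rate-limited, which makes connections unstable. So in many cases we have to pick TCP as the tunnel transport, and then find ways to tune its behavior.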

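Therefore, for tunnel connections between servers (cross-region ones in particular), disabling the outer TCP's congestion control and Nagle's algorithm can noticeably improve the tunnel's throughput and responsiveness.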

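The solution: a script

To address this, we can use a script that disables congestion control and Nagle's algorithm only for TCP connections to a specific target IP (for example the server at the far end of the tunnel), while the system keeps its default TCP behavior for all other connections.

Click here to download the script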

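Usage: run the script on each end of the tunnel:

  1. Download the script: get the script file from the link above.
  2. Make it executable: chmod +x bypass-congestion-control.sh
  3. Run the script with the peer server's IP address or domain name: sudo ./bypass-congestion-control.sh <target_ip/target_domain>

Note that the script must be run on both ends of the tunnel, each time with the peer server's IP address or domain name. If the connection goes over a VPC private network, use the private IP; if it goes over the public internet, use the public IP.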
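To make the argument handling concrete, here is a minimal sketch of how a target could be classified as an IP address or a domain name. It mirrors the dotted-quad check found in the script source; classify_target is a hypothetical helper name, and the address and domain are placeholders.

```shell
# classify_target is a hypothetical helper, not part of the script.
# It prints "ip" for a dotted-quad argument and "domain" for anything else.
classify_target() {
    if printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'; then
        echo "ip"
    else
        echo "domain"
    fi
}

classify_target 10.0.0.5            # private (VPC) address
classify_target tunnel.example.com  # would be resolved to an IP first
```

The pattern only checks the shape of the argument, so a string such as 999.1.1.1 also counts as an IP. A domain name is resolved once, when the script runs, and the resulting address is what the rules are built on.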

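Why run the script on both ends? Tunnel traffic is bidirectional and both ends send data, while congestion control limits the sender's window; the receiver cannot make the sender go faster. When the server in China sends data to the US server, the congestion control on the China-side server governs the sending rate. When the US server sends data back, the congestion control on the US server governs it. Congestion control therefore has to be disabled on both ends.

Let's now walk through how the script works.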

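How the script works

The core idea is simple: with a handful of Linux networking techniques, TCP connections between specific IPs can be made to behave more like “raw” connections that congestion control does not restrain.

Specifically, the script combines iptables traffic marking, policy routing, and a custom kernel module to control precisely the traffic headed to a specific server. It uses Linux routing to open a “fast lane” for tunnel traffic and switches off two of TCP's big performance killers: congestion control and Nagle's algorithm.

This does not solve every TCP-over-TCP problem, but it keeps TCP compatibility while substantially improving the throughput and latency of cross-region layer-3 tunnel connections.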

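Marking the target traffic (iptables)

  • The script uses the iptables mangle table to mark (MARK) all TCP packets going to or coming from the target IP address.
  • iptables -t mangle -A OUTPUT -p tcp -d $TARGET_IP -j MARK --set-mark 1: marks outbound packets sent to the target IP.
  • iptables -t mangle -A INPUT -p tcp -s $TARGET_IP -j CONNMARK --set-mark 1: marks inbound connections coming from the target IP.
  • iptables -t mangle -A OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1: marks outbound packets that belong to an already marked connection (for example, replies on an inbound connection).
  • As a result, all TCP traffic related to the target IP carries the mark “1”.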
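The three marking rules can be previewed without touching the firewall. The sketch below only prints the commands for a given target; print_mark_rules is a hypothetical helper name, and 203.0.113.10 is a placeholder address.

```shell
# print_mark_rules is a hypothetical helper: it prints the three marking
# commands for a target instead of running them (no root needed).
print_mark_rules() {
    target_ip="$1"
    mark="${2:-1}"   # firewall mark; defaults to 1 as in the article
    # 1. Packets sent to the target.
    echo "iptables -t mangle -A OUTPUT -p tcp -d $target_ip -j MARK --set-mark $mark"
    # 2. Connections opened by the target.
    echo "iptables -t mangle -A INPUT -p tcp -s $target_ip -j CONNMARK --set-mark $mark"
    # 3. Outbound packets of connections marked in step 2.
    echo "iptables -t mangle -A OUTPUT -p tcp -m connmark --mark $mark -j MARK --set-mark $mark"
}

print_mark_rules 203.0.113.10   # placeholder address
```

Printing first makes it easy to review the rules, or to pick a different mark value if 1 is already used by other rules on the host.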

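Policy routing (ip rule, ip route)

  • The script creates a separate routing table named nocongestion (table ID 200, defined in /etc/iproute2/rt_tables).
  • ip rule add fwmark 1 table nocongestion: adds a policy routing rule so that every packet iptables marked with “1” is looked up in the nocongestion table.
  • ip route add default via $DEFAULT_GATEWAY dev $DEFAULT_INTERFACE table nocongestion: adds a default route to the nocongestion table that points at the system's default gateway. Marked traffic therefore still leaves through the normal path, but it is subject to the attributes configured on the nocongestion table's route.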
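The gateway and interface of the default route have to be read from the main table before they can be copied into nocongestion. The script source takes them from fixed field positions; the sketch below instead looks for the via and dev keywords. parse_default_route and the sample route line are illustrative assumptions, not part of the script.

```shell
# parse_default_route is a hypothetical helper. It reads `ip route` output
# on stdin and prints "<gateway> <interface>" for the first default route.
parse_default_route() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) {
            if ($i == "via") gw = $(i + 1)
            if ($i == "dev") dev = $(i + 1)
        }
        print gw, dev
        exit
    }'
}

# Made-up sample line; on a real host this would come from `ip route`.
sample="default via 192.0.2.1 dev eth0 proto dhcp metric 100"
route_info=$(echo "$sample" | parse_default_route)
gateway=${route_info%% *}
interface=${route_info##* }
echo "ip route add default via $gateway dev $interface table nocongestion"
```

The sketch assumes the default route has a via field; a route without a gateway (for example on a point-to-point link) would need separate handling.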

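Disabling congestion control (congctl none)

  • Checking availability: the script first checks whether the system already provides a TCP congestion control algorithm named none (grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control). Some systems may provide it, but stock kernels generally do not ship an algorithm with this name, so on most systems the fallback below is what runs.
  • Applying it to the routing table: if none is available, the script applies it to the default route in the nocongestion table: ip route change default ... table nocongestion congctl none. The congctl none attribute tells the kernel that connections using this route (that is, the marked traffic) should not use any congestion control algorithm and should send data as fast as possible.
  • Compiling a custom kernel module (fallback): if the system has no none algorithm, the script compiles and loads a custom Linux kernel module named tcp_none. The module implements an extremely simple congestion control logic: it pins the congestion window (cwnd) at a very large value (UINT_MAX / 2) and ignores all congestion events (such as packet loss). Compilation requires build-essential and the matching linux-headers. Once the module is loaded, the none algorithm is available and the script goes on to apply it to the routing table as in the previous step.
  • Keeping the global setting: the neat part is that only traffic that is marked and routed through the nocongestion table is affected. The system-wide default congestion control algorithm (such as cubic or bbr) stays unchanged and handles every other connection. The script's final output suggests confirming this with ss -ti, which shows the algorithm each connection uses.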
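The availability check can be made a little stricter than a plain substring search. The sketch below matches the algorithm name as a whole word in a space-separated list; has_cc is a hypothetical helper, and the sample list stands in for the content of /proc/sys/net/ipv4/tcp_available_congestion_control.

```shell
# has_cc is a hypothetical helper.
# $1: algorithm name, $2: space-separated list of available algorithms.
# Returns 0 only when the name appears as a whole word in the list.
has_cc() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

available="reno cubic bbr"   # sample list; read the /proc file on a real system
if has_cc none "$available"; then
    echo "none is available"
else
    echo "none is missing; the fallback module would be needed"
fi
```

Matching whole words avoids a false positive if some other algorithm name merely contains the string none.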

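Disabling Nagle's algorithm (TCP_NODELAY)

  • Compiling a shared library (disable_nagle.so): the script compiles a small C shared library (/usr/local/lib/disable_nagle.so). The library uses dlsym with RTLD_NEXT to intercept the C library's connect function.
  • Injecting setsockopt: inside the intercepted connect function, it first calls the original connect and then, before returning, calls setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(int)) to set TCP_NODELAY on the socket, which disables Nagle's algorithm.
  • Using it through LD_PRELOAD: to disable Nagle for an application's connections, preload the library when starting that application: LD_PRELOAD=/usr/local/lib/disable_nagle.so your_application. For example, if you use xray as the VLESS client, you can edit its systemd service file and add Environment="LD_PRELOAD=/usr/local/lib/disable_nagle.so" before ExecStart. Keep in mind that LD_PRELOAD only affects dynamically linked programs that call connect through the C library; Go programs such as xray normally issue system calls directly, so the preload may have no effect on them (Go's net package already enables TCP_NODELAY on TCP connections by default).
  • The enforce_no_congestion.sh attempt (experimental): the script also creates a background service script (enforce_no_congestion.sh) that periodically checks the iptables and routing rules. It also looks up the processes connected to the target IP and, if strace is installed, is meant to set TCP_NODELAY on their connections dynamically. This approach is hacky, depends on strace, and may be unstable or ineffective in practice; in the script as listed, this step only logs the sockets it finds. LD_PRELOAD is the recommended way to disable Nagle.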
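Instead of editing a service file in place, the environment variable can also be supplied through a systemd drop-in. The sketch below writes such a drop-in into a temporary directory so it can be inspected safely; write_preload_dropin and the unit name your_application.service are hypothetical, and on a real system the directory would be /etc/systemd/system/<unit>.service.d.

```shell
# write_preload_dropin is a hypothetical helper.
# $1: drop-in directory, e.g. /etc/systemd/system/<unit>.service.d
write_preload_dropin() {
    mkdir -p "$1" || return 1
    cat > "$1/override.conf" << 'EOF'
[Service]
Environment="LD_PRELOAD=/usr/local/lib/disable_nagle.so"
EOF
}

# Demonstration in a throwaway directory (nothing under /etc is touched).
demo_dir=$(mktemp -d)
write_preload_dropin "$demo_dir/your_application.service.d"
cat "$demo_dir/your_application.service.d/override.conf"
```

After placing a drop-in under /etc/systemd/system, systemctl daemon-reload followed by a restart of the unit is needed for the variable to take effect.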

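Persistence (systemd)

  • To keep the iptables rules and the route attribute in place and to restore them after certain network events, the script creates and enables a systemd service (enforce-no-congestion.service). The service runs enforce_no_congestion.sh in the background, which periodically checks and re-applies the necessary iptables and ip route commands. Note that the service script as listed does not reload the tcp_none module or recreate the ip rule and the nocongestion route, none of which survive a reboot, so the main script has to be run again after a restart.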

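Caveats and risks

  • Root privileges: the script must run as root because it changes network settings, loads a kernel module, and manages services.
  • Kernel module compilation: the script depends on the right build tools (build-essential, gcc, make) and on headers matching the running kernel (linux-headers-$(uname -r)). If the environment does not provide them, compilation fails.
  • Kernel module risk: loading a custom kernel module always carries some risk. This module is simple, but it can still make the system unstable or crash it. Be sure to try it in a test environment first.
  • Connection interruption: the --update-module option tries to terminate connections that are using the none algorithm, which can interrupt service. Some of the more aggressive termination methods it uses (flushing the iptables filter table, taking the network interface down) are high-risk.
  • How Nagle is disabled: the strace-based attempt in enforce_no_congestion.sh may be ineffective or unreliable. LD_PRELOAD is the recommended way.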

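Limitations of the approach

  • Redundant retransmissions in TCP-in-TCP: although this method disables congestion control, it cannot fully avoid redundant retransmissions in a TCP-in-TCP setup. When the outer TCP loses a packet, it still retransmits; meanwhile the inner TCP, if it detects a timeout, starts its own retransmission independently. This double retransmission is inherent to TCP-in-TCP. The script can only soften the performance drop by disabling congestion control; it cannot remove the protocol redundancy at its root.
  • Head-of-line blocking: this method cannot solve the head-of-line blocking inherent to TCP-in-TCP either. When the outer TCP connection loses a packet, all inner traffic on that connection is blocked until the outer TCP finishes retransmitting. Even unrelated connections affect each other because they share one TCP tunnel, and the performance drop is more pronounced at high loss rates. Only switching to a UDP-based transport (such as WireGuard) or to a newer multiplexing protocol (such as QUIC/HTTP3) solves these problems at the root.

Script source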

#!/bin/bash

# Script to disable TCP congestion control and Nagle's algorithm for a specific IP address or domain
# Automatically installs dependencies and compiles a kernel module if needed

# Check if script is run as root
if [ "$(id -u)" -ne 0 ]; then
echo "This script must be run as root" >&2
exit 1
fi

# Function to install dependencies
install_dependencies() {
echo "Installing dependencies..."
if command -v apt-get &> /dev/null; then
apt-get update
apt-get install -y build-essential linux-headers-$(uname -r) iptables dnsutils
elif command -v yum &> /dev/null; then
yum install -y kernel-devel gcc make iptables bind-utils
elif command -v dnf &> /dev/null; then
dnf install -y kernel-devel gcc make iptables bind-utils
elif command -v pacman &> /dev/null; then
pacman -Sy --noconfirm base-devel linux-headers iptables bind-tools
else
echo "Unsupported package manager. Please install build-essential, linux headers, and iptables manually."
exit 1
fi
}

# Function to resolve domain to IP
resolve_domain() {
    local domain="$1"
    local resolved_ip

    # Status messages go to stderr: the caller captures stdout as the IP.
    echo "Resolving domain $domain to IP address..." >&2

    if command -v dig &> /dev/null; then
        resolved_ip=$(dig +short "$domain" | grep -v "\.$" | head -n 1)
    elif command -v host &> /dev/null; then
        resolved_ip=$(host "$domain" | grep "has address" | head -n 1 | awk '{print $NF}')
    elif command -v nslookup &> /dev/null; then
        resolved_ip=$(nslookup "$domain" | grep -A2 "Name:" | grep "Address:" | head -n 1 | awk '{print $2}')
    else
        echo "No DNS resolution tools found. Please install dig, host, or nslookup." >&2
        exit 1
    fi

    if [ -z "$resolved_ip" ]; then
        echo "Failed to resolve domain $domain to IP address." >&2
        exit 1
    fi

    echo "Domain $domain resolved to IP: $resolved_ip" >&2
    # Only the IP itself is written to stdout.
    echo "$resolved_ip"
}

# Function to update the tcp_none module when it's in use
update_tcp_none_module() {
echo "Updating tcp_none module with unlimited cwnd..."

# Check if module is currently loaded
if lsmod | grep -q "tcp_none"; then
# Identify connections using the 'none' congestion control
active_connections=$(ss -ti | grep none)

if [ -n "$active_connections" ]; then
echo "Active connections found using tcp_none module:"
echo "$active_connections"
echo "These connections will be terminated to update the module."
read -p "Press Enter to continue or Ctrl+C to abort..." </dev/tty

# More aggressive connection termination approach

# 1. First try to kill processes
echo "Finding and terminating processes with active connections to $TARGET_IP..."
TARGET_CONNS=$(ss -tnp | grep "$TARGET_IP")
if [ -n "$TARGET_CONNS" ]; then
# Extract PIDs of processes using these connections
PIDS=$(echo "$TARGET_CONNS" | grep -oP 'pid=\K\d+' | sort -u)
if [ -n "$PIDS" ]; then
for PID in $PIDS; do
echo "Terminating process $PID"
kill -9 $PID 2>/dev/null || echo "Could not kill process $PID"
done
fi
fi

# 2. Create firewall rules to reset all connections to/from the target IP
echo "Adding firewall rules to reset connections to $TARGET_IP..."
iptables -F
iptables -A INPUT -p tcp -s $TARGET_IP -j REJECT --reject-with tcp-reset
iptables -A OUTPUT -p tcp -d $TARGET_IP -j REJECT --reject-with tcp-reset

# 3. Enable aggressive TCP TIME_WAIT connection handling
echo "Enabling aggressive TCP connection cleanup..."
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle 2>/dev/null || true
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse 2>/dev/null || true
echo 1 > /proc/sys/net/ipv4/tcp_abort_on_overflow 2>/dev/null || true
echo 10 > /proc/sys/net/ipv4/tcp_fin_timeout 2>/dev/null || true

# 4. Handle specific connections directly using ss output
echo "Directly targeting connections using 'none' congestion control..."
ss -tn state established | grep "$TARGET_IP" | awk '{print $4}' | while read local_addr; do
if [[ "$local_addr" == *:* ]]; then
local_ip=$(echo $local_addr | cut -d: -f1)
local_port=$(echo $local_addr | cut -d: -f2)
echo "Closing connection on local port $local_port"
iptables -A INPUT -p tcp --dport $local_port -j REJECT --reject-with tcp-reset
iptables -A OUTPUT -p tcp --sport $local_port -j REJECT --reject-with tcp-reset
fi
done

# 5. Wait and check again
echo "Waiting for connections to terminate..."
sleep 5

# 6. Remove the temporary rules (note: -F flushes every rule in the filter table)
iptables -F
fi

# Try to unload the module
echo "Attempting to unload tcp_none module..."
rmmod tcp_none 2>/dev/null

# If module is still loaded, try more aggressive approach
if lsmod | grep -q "tcp_none"; then
echo "Module still in use. Trying system-wide connection reset..."

# System-wide approach (DANGEROUS but effective)
echo "WARNING: This will reset ALL TCP connections on the system!"
read -p "Continue? [y/N] " confirm </dev/tty
if [[ "$confirm" == "y" || "$confirm" == "Y" ]]; then
# Reset all connections by toggling the network interface
DEFAULT_INTERFACE=$(ip route | grep default | head -n 1 | awk '{print $5}')
if [ -n "$DEFAULT_INTERFACE" ]; then
echo "Temporarily disabling interface $DEFAULT_INTERFACE..."
ip link set $DEFAULT_INTERFACE down
sleep 2
ip link set $DEFAULT_INTERFACE up
sleep 2
fi

# Try to unload again
rmmod tcp_none 2>/dev/null
fi

# Final check if module is unloaded
if lsmod | grep -q "tcp_none"; then
echo "Could not unload module. You may need to reboot the system."
echo "Continuing anyway to try compiling the new module..."
else
echo "Successfully unloaded tcp_none module."
fi
else
echo "Successfully unloaded tcp_none module."
fi
fi

# Navigate to module directory or create a new one
MODULE_DIR=$(mktemp -d)
cd "$MODULE_DIR"

# Create updated Makefile and module source
echo "Creating updated tcp_none module with unlimited cwnd..."

# Create Makefile (same as before)
cat > Makefile << 'EOF'
obj-m += tcp_none.o

# Recipes are written inline after ';' so they do not depend on a leading tab
all: ; make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean: ; make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
EOF

# Create updated tcp_none.c with unlimited cwnd
cat > tcp_none.c << 'EOF'
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <net/tcp.h>

static struct tcp_congestion_ops tcp_none;

static void tcp_none_init(struct sock *sk)
{
/* Set congestion window to maximum possible value */
struct tcp_sock *tp = tcp_sk(sk);
tp->snd_cwnd = UINT_MAX / 2; /* Use a very large value (near max of u32) */
tp->snd_cwnd_clamp = UINT_MAX / 2; /* Also set the clamp to a very high value */
}

static void tcp_none_cong_avoid(struct sock *sk, u32 ack, u32 acked)
{
/* Always keep the congestion window maximized */
struct tcp_sock *tp = tcp_sk(sk);

/* If cwnd somehow gets reduced, set it back to maximum */
if (tp->snd_cwnd < UINT_MAX / 2) {
tp->snd_cwnd = UINT_MAX / 2;
}
}

static u32 tcp_none_ssthresh(struct sock *sk)
{
/* Return maximum value */
return TCP_INFINITE_SSTHRESH;
}

static u32 tcp_none_undo_cwnd(struct sock *sk)
{
/* Return maximum value instead of current CWND */
return UINT_MAX / 2;
}

static void tcp_none_cwnd_event(struct sock *sk, enum tcp_ca_event event)
{
/* Force cwnd to stay at maximum regardless of events */
struct tcp_sock *tp = tcp_sk(sk);
tp->snd_cwnd = UINT_MAX / 2;
}

static void tcp_none_pkts_acked(struct sock *sk, const struct ack_sample *sample)
{
/* Do nothing */
}

static struct tcp_congestion_ops tcp_none = {
.init = tcp_none_init,
.ssthresh = tcp_none_ssthresh,
.cong_avoid = tcp_none_cong_avoid,
.undo_cwnd = tcp_none_undo_cwnd,
.cwnd_event = tcp_none_cwnd_event,
.pkts_acked = tcp_none_pkts_acked,

.owner = THIS_MODULE,
.name = "none",
};

static int __init tcp_none_register(void)
{
printk(KERN_INFO "TCP None Congestion Control loaded with unlimited cwnd\n");
return tcp_register_congestion_control(&tcp_none);
}

static void __exit tcp_none_unregister(void)
{
printk(KERN_INFO "TCP None Congestion Control unloaded\n");
tcp_unregister_congestion_control(&tcp_none);
}

module_init(tcp_none_register);
module_exit(tcp_none_unregister);

MODULE_AUTHOR("Kernel Module Generator");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("TCP Congestion Control with No Algorithm and Unlimited CWND");
EOF

# Compile the module
echo "Compiling updated kernel module..."
make

# Load the new module
echo "Loading updated kernel module..."
insmod tcp_none.ko

# Verify module loaded
if ! grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control; then
echo "Failed to load 'none' congestion control module."
exit 1
fi

echo "Successfully updated and loaded 'none' congestion control module with unlimited cwnd."

# Update routes to use the new module
echo "Updating routes to use new module..."
# Check if congctl is supported by ip route
if ip route help 2>&1 | grep -q "congctl"; then
ip route change default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion congctl none
else
# Fallback if congctl is not supported
echo "congctl option not supported on this system"
ip route change default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion
# Try using sysctl to set congestion control for specific interfaces
echo "Using alternative method to set congestion control"
echo "none" > /proc/sys/net/ipv4/tcp_congestion_control
fi

cd - >/dev/null
}

# Check if arguments include an update flag
if [ "$1" == "--update-module" ]; then
# Check if TARGET_IP was previously set
if [ -f /var/lib/no-congestion-target ]; then
TARGET_IP=$(cat /var/lib/no-congestion-target)
echo "Updating module for existing target: $TARGET_IP"

# Get default route interface and gateway
DEFAULT_ROUTE=$(ip route | grep default | head -n 1)
DEFAULT_INTERFACE=$(echo "$DEFAULT_ROUTE" | awk '{print $5}')
DEFAULT_GATEWAY=$(echo "$DEFAULT_ROUTE" | awk '{print $3}')

update_tcp_none_module
exit 0
else
echo "No target IP found. Please run the script with a target IP first."
exit 1
fi
fi

# Check arguments
if [ $# -ne 1 ]; then
echo "Usage: $0 <target_ip_or_domain>"
echo " or: $0 --update-module"
exit 1
fi

# Check if input is an IP address or domain
if [[ "$1" =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
# Input is an IP address
TARGET_IP="$1"
echo "Using IP address: $TARGET_IP"
else
# Input is likely a domain name
DOMAIN="$1"
# Resolve domain to IP; abort if resolution fails inside the subshell
TARGET_IP=$(resolve_domain "$DOMAIN") || exit 1
echo "Using domain $DOMAIN with resolved IP: $TARGET_IP"
fi

echo "Setting up no congestion control for connections to $TARGET_IP"

# Install dependencies
install_dependencies

# Check if 'none' congestion control is available
if ! grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control; then
echo "'none' congestion control is not available. Creating kernel module..."

# Create temporary directory for kernel module
MODULE_DIR=$(mktemp -d)
cd "$MODULE_DIR"

# Create Makefile
cat > Makefile << 'EOF'
obj-m += tcp_none.o

# Recipes are written inline after ';' so they do not depend on a leading tab
all: ; make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean: ; make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
EOF

# Create kernel module source
cat > tcp_none.c << 'EOF'
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <net/tcp.h>

static struct tcp_congestion_ops tcp_none;

static void tcp_none_init(struct sock *sk)
{
/* Set congestion window to maximum possible value */
struct tcp_sock *tp = tcp_sk(sk);
tp->snd_cwnd = UINT_MAX / 2; /* Use a very large value (near max of u32) */
tp->snd_cwnd_clamp = UINT_MAX / 2; /* Also set the clamp to a very high value */
}

static void tcp_none_cong_avoid(struct sock *sk, u32 ack, u32 acked)
{
/* Always keep the congestion window maximized */
struct tcp_sock *tp = tcp_sk(sk);

/* If cwnd somehow gets reduced, set it back to maximum */
if (tp->snd_cwnd < UINT_MAX / 2) {
tp->snd_cwnd = UINT_MAX / 2;
}
}

static u32 tcp_none_ssthresh(struct sock *sk)
{
/* Return maximum value */
return TCP_INFINITE_SSTHRESH;
}

static u32 tcp_none_undo_cwnd(struct sock *sk)
{
/* Return maximum value instead of current CWND */
return UINT_MAX / 2;
}

static void tcp_none_cwnd_event(struct sock *sk, enum tcp_ca_event event)
{
/* Force cwnd to stay at maximum regardless of events */
struct tcp_sock *tp = tcp_sk(sk);
tp->snd_cwnd = UINT_MAX / 2;
}

static void tcp_none_pkts_acked(struct sock *sk, const struct ack_sample *sample)
{
/* Do nothing */
}

static struct tcp_congestion_ops tcp_none = {
.init = tcp_none_init,
.ssthresh = tcp_none_ssthresh,
.cong_avoid = tcp_none_cong_avoid,
.undo_cwnd = tcp_none_undo_cwnd,
.cwnd_event = tcp_none_cwnd_event,
.pkts_acked = tcp_none_pkts_acked,

.owner = THIS_MODULE,
.name = "none",
};

static int __init tcp_none_register(void)
{
printk(KERN_INFO "TCP None Congestion Control loaded with unlimited cwnd\n");
return tcp_register_congestion_control(&tcp_none);
}

static void __exit tcp_none_unregister(void)
{
printk(KERN_INFO "TCP None Congestion Control unloaded\n");
tcp_unregister_congestion_control(&tcp_none);
}

module_init(tcp_none_register);
module_exit(tcp_none_unregister);

MODULE_AUTHOR("Kernel Module Generator");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("TCP Congestion Control with No Algorithm and Unlimited CWND");
EOF

# Compile the module
echo "Compiling kernel module..."
make

# Load the module
echo "Loading kernel module..."
insmod tcp_none.ko

# Verify module loaded
if ! grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control; then
echo "Failed to load 'none' congestion control module."
exit 1
fi

echo "Successfully created and loaded 'none' congestion control module."
fi

# Create iptables rule to mark packets to the target IP
echo "Setting up iptables to mark packets to $TARGET_IP..."
# Cleanup existing rules first
iptables -t mangle -D OUTPUT -p tcp -d "$TARGET_IP" -j MARK --set-mark 1 2>/dev/null || true
iptables -t mangle -D INPUT -p tcp -s "$TARGET_IP" -j CONNMARK --set-mark 1 2>/dev/null || true
iptables -t mangle -D OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1 2>/dev/null || true
iptables -t mangle -F

# Add new rules for outbound connections and responses to incoming connections
# 1. Mark outgoing packets to target IP
iptables -t mangle -A OUTPUT -p tcp -d "$TARGET_IP" -j MARK --set-mark 1
# 2. Mark incoming connections from target IP
iptables -t mangle -A INPUT -p tcp -s "$TARGET_IP" -j CONNMARK --set-mark 1
# 3. Mark outgoing packets that are part of connections marked in step 2
iptables -t mangle -A OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1

# Clean up existing routing rules and tables if they exist
ip rule del fwmark 1 table nocongestion 2>/dev/null || true
ip route flush table nocongestion 2>/dev/null || true

# Create a new routing table for marked packets
grep -q "^200 nocongestion" /etc/iproute2/rt_tables || echo "200 nocongestion" >> /etc/iproute2/rt_tables

# Get default route interface and gateway
DEFAULT_ROUTE=$(ip route | grep default | head -n 1)
DEFAULT_INTERFACE=$(echo "$DEFAULT_ROUTE" | awk '{print $5}')
DEFAULT_GATEWAY=$(echo "$DEFAULT_ROUTE" | awk '{print $3}')

# Add route to the new table
echo "Setting up routing for marked packets..."
# Force remove any existing default route in the nocongestion table
ip route del default table nocongestion 2>/dev/null || true
# Add the new route
ip route add default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion 2>/dev/null || echo "Route already exists, continuing..."

# Add rule to use the new table for marked packets
ip rule show | grep -q "fwmark 1 lookup nocongestion" || ip rule add fwmark 1 table nocongestion

# Set the default congestion control to none for all new connections to target IP
echo "Setting congestion control algorithm to none specifically for $TARGET_IP..."
if grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control; then
# Set congestion control for specific route instead of globally
ip route change default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion congctl none

# Keep the global congestion control untouched
echo "Congestion control set to 'none' only for connections to $TARGET_IP"
echo "Global congestion control remains: $(cat /proc/sys/net/ipv4/tcp_congestion_control)"
else
echo "WARNING: 'none' congestion control is not available."
echo "Using the default congestion control algorithm."
fi

# Create a C library to disable Nagle's algorithm
cat > /tmp/disable_nagle.c << 'EOF'
#define _GNU_SOURCE
#include <errno.h>
#include <dlfcn.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Override connect() so that every socket it is called on gets TCP_NODELAY */
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen) {
    /* Look up the real connect() once and cache the pointer */
    static int (*original_connect)(int, const struct sockaddr *, socklen_t);
    if (!original_connect)
        original_connect = dlsym(RTLD_NEXT, "connect");

    int result = original_connect(sockfd, addr, addrlen);
    int saved_errno = errno; /* setsockopt below may overwrite errno */

    /* Disable Nagle's algorithm; this simply fails on non-TCP sockets */
    int flag = 1;
    setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));

    /* Hand the caller the errno of connect(), e.g. EINPROGRESS */
    errno = saved_errno;
    return result;
}
EOF

# Compile the library
echo "Compiling TCP_NODELAY library..."
gcc -shared -fPIC -o /tmp/disable_nagle.so /tmp/disable_nagle.c -ldl
mv /tmp/disable_nagle.so /usr/local/lib/

# Create a script to continuously enforce congestion control
cat > /usr/local/bin/enforce_no_congestion.sh << EOF
#!/bin/bash

# Continuously enforce congestion control settings for $TARGET_IP
echo "Starting congestion control enforcement service for $TARGET_IP"

# Function to check and set congestion control
enforce_cc() {
# Verify route settings are in place
if ! ip route show table nocongestion | grep -q "congctl none"; then
echo "Resetting route congestion control to none"
ip route change default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion congctl none
fi

# Verify iptables rules are in place
if ! iptables -t mangle -C OUTPUT -p tcp -d "$TARGET_IP" -j MARK --set-mark 1 2>/dev/null; then
echo "Restoring outbound iptables mark rule"
iptables -t mangle -A OUTPUT -p tcp -d "$TARGET_IP" -j MARK --set-mark 1
fi

# Verify incoming connection mark rule
if ! iptables -t mangle -C INPUT -p tcp -s "$TARGET_IP" -j CONNMARK --set-mark 1 2>/dev/null; then
echo "Restoring incoming connection mark rule"
iptables -t mangle -A INPUT -p tcp -s "$TARGET_IP" -j CONNMARK --set-mark 1
fi

# Verify connection mark transfer rule
if ! iptables -t mangle -C OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1 2>/dev/null; then
echo "Restoring connection mark transfer rule"
iptables -t mangle -A OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1
fi

# Find active connections to target IP and disable Nagle
connections=\$(ss -tnp | grep "$TARGET_IP" | grep -v LISTEN)
if [ -n "\$connections" ]; then
echo "Found active connections to $TARGET_IP"
echo "\$connections" | while read -r conn; do
pid=\$(echo "\$conn" | sed -n 's/.*pid=\([0-9]*\).*/\1/p')
if [ -n "\$pid" ]; then
echo "Connection found with PID: \$pid"
# Intended hook for setting TCP_NODELAY via strace. As written, the loop
# below only lists the sockets owned by this PID and changes nothing.
if command -v strace >/dev/null 2>&1; then
# Find all sockets associated with this PID
for fd in /proc/\$pid/fd/*; do
if [ -S "\$fd" ]; then
echo "Found socket \$fd for PID \$pid (TCP_NODELAY not modified)"
fi
done
fi
fi
done
fi
}

# Main loop
while true; do
# Enforce route-specific congestion control
enforce_cc

# Sleep briefly
sleep 2
done
EOF
chmod +x /usr/local/bin/enforce_no_congestion.sh

# Create a service to run our script
cat > /etc/systemd/system/enforce-no-congestion.service << EOF
[Unit]
Description=Enforce no congestion control for target IP
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/enforce_no_congestion.sh
Restart=always

[Install]
WantedBy=multi-user.target
EOF

# Enable and start the service
systemctl daemon-reload
systemctl enable enforce-no-congestion
systemctl start enforce-no-congestion

# Remove the global sysctl setting
rm -f /etc/sysctl.d/99-no-congestion.conf

# Final verification steps
echo "Verifying setup:"
echo "1. Checking if 'none' congestion control is available:"
cat /proc/sys/net/ipv4/tcp_available_congestion_control
echo "2. Checking current congestion control algorithm:"
cat /proc/sys/net/ipv4/tcp_congestion_control
echo "3. Verifying iptables rules:"
echo " - Outbound packets to target IP:"
iptables -t mangle -L OUTPUT -v | grep "$TARGET_IP"
echo " - Incoming packets from target IP:"
iptables -t mangle -L INPUT -v | grep "$TARGET_IP"
echo " - Responses to incoming connections:"
iptables -t mangle -L OUTPUT -v | grep "mark match 0x1"

echo ""
echo "Configuration complete!"
echo "TCP connections to AND from $TARGET_IP will now bypass congestion control and Nagle's algorithm."
if [ -n "$DOMAIN" ]; then
echo "Domain name $DOMAIN (resolved to $TARGET_IP) has been configured."
fi
echo ""
echo "To run a specific application with Nagle's algorithm disabled, use:"
echo "LD_PRELOAD=/usr/local/lib/disable_nagle.so your_application"
echo ""
echo "To verify connections are using 'none' congestion control:"
echo "ss -ti | grep -A 5 $TARGET_IP"
echo ""
echo "NOTE: It may take a few seconds for existing connections to switch to 'none'."
echo "New connections should use 'none' immediately."

# Store the target IP for future updates
echo "$TARGET_IP" > /var/lib/no-congestion-target
if [ -n "$DOMAIN" ]; then
# Also store domain name if provided
echo "$DOMAIN" > /var/lib/no-congestion-domain
fi
