When building a cross-regional server network, such as the VLESS connection used in the article “Building a Three-Layer Tunnel with Full US IP, No Manual Proxy Setup Required,” we often run into an efficiency problem caused by TCP's own congestion control. Congestion control is crucial on the public internet, but in a tunnel whose payload is already an encapsulated application-layer protocol (which may bring its own flow control or congestion handling), the outer TCP's congestion control can become a burden.

Why Disable TCP Congestion Control and Nagle in Tunnels?

  1. TCP-over-TCP Problem: When you transmit data of one TCP connection (e.g., VLESS over TCP) inside another TCP connection, the so-called “TCP-over-TCP” problem arises. Both the inner and outer TCP have their own congestion control and retransmission mechanisms. When packet loss occurs, both layers of TCP will attempt retransmission and reduce the congestion window. This dual processing is not only redundant but can also lead to a sharp decline in performance, especially on high-latency, high-packet-loss international links. The retransmission timer of the inner TCP may trigger prematurely due to the delay and retransmission of the outer TCP, and vice versa, forming a vicious cycle. Additionally, TCP-over-TCP can cause severe Head-of-Line Blocking issues: a lost packet in the outer TCP will block all data of the inner connections it contains, even if these inner connections are completely unrelated. This means that a connection issue of one user may affect other users sharing the same tunnel.
  2. Application Layer Flow Control: The application layer protocol carried in the tunnel may already have its own flow control and reliability mechanisms. In that case the outer TCP's congestion control adds little and can interfere with the upper-layer protocol, limiting its performance.
  3. Nagle Algorithm Delay: The Nagle algorithm reduces the number of small packets on the network: while previously sent data is still unacknowledged, it holds back small writes and coalesces them into a larger segment, which improves network utilization. In tunnel scenarios, however, we usually want data to pass through the tunnel as quickly as possible, especially for interactive applications (like SSH) or applications with real-time requirements, and the delay Nagle introduces hurts them. Disabling Nagle (via the TCP_NODELAY option) sends small packets immediately and reduces latency.
  4. UDP’s Dilemma on the Public Internet: You might wonder, if TCP has so many issues, why not use UDP to establish tunnel connections directly? Unfortunately, UDP on the public internet, especially on international links, is often subject to ISP QoS (Quality of Service) policies, has lower priority, and is more likely to be dropped or throttled, leading to unstable connections. Therefore, in many cases, we have to choose TCP as the tunnel transport layer protocol, which requires us to find ways to optimize TCP’s behavior.

Therefore, for tunnel connections between servers (especially cross-regional connections), disabling the outer layer TCP’s congestion control and Nagle algorithm can significantly improve the tunnel’s throughput and response speed.

Solution: A Script

To solve this problem, we can use a script that specifically disables congestion control and the Nagle algorithm for TCP connections to a specific target IP (e.g., the tunnel’s remote server), while maintaining the system’s default TCP behavior for other connections.

Click here to download the script

Usage: Execute this script on both ends of the tunnel:

  1. Download the script: Download the script file from the link above.
  2. Grant executable permission: chmod +x bypass-congestion-control.sh
  3. Run the script, passing the IP address or domain name of the server at the opposite end: sudo ./bypass-congestion-control.sh <target_ip/target_domain>

Note that each end is given the address of the other end. If the servers are connected over a VPC intranet, use the intranet IP; if they are connected over the public internet, use the public IP.
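The script accepts either form of argument by testing it against a dotted-quad pattern and handing anything else to DNS resolution. A minimal sketch of that classification follows. The helper name is_ipv4 is hypothetical, and the octet range check is an addition that the script itself does not perform; 203.0.113.10 is a documentation address standing in for the peer.

```shell
# is_ipv4 is a hypothetical helper, not part of the original script.
# Returns 0 if $1 is a dotted-quad IPv4 address with every octet in 0-255.
is_ipv4() {
    # Reject anything that is not exactly four dot-separated digit groups
    printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' || return 1
    # Split on dots inside a subshell so the caller's IFS is left untouched
    (
        IFS=.
        for octet in $1; do
            [ "$octet" -le 255 ] || exit 1
        done
    )
}

# 203.0.113.10 is a placeholder for the opposite end of the tunnel
if is_ipv4 "203.0.113.10"; then
    echo "argument is an IP address"
else
    echo "argument will be resolved as a domain name"
fi
```

A check like this could run before invoking the script, so that a mistyped address is caught instead of being sent to DNS.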

Why run the script on both ends of the tunnel? Tunnel traffic is bidirectional, and congestion control acts on the sender: it limits the sending window, and the receiver cannot make the sender send faster. When the domestic server sends data to the US server, the domestic server's congestion control limits the rate; when the US server sends data back, the US server's congestion control does. Congestion control therefore has to be disabled on both ends.

Let’s analyze the working principle of this script.

Detailed Explanation of the Script’s Principle

The core idea of this script is simple: through a series of Linux networking tricks, we can make TCP connections between specific IPs behave more like “bare connections,” free from congestion control restrictions.

Specifically, the script combines iptables traffic marking, policy routing, and custom kernel modules to precisely control traffic to specific servers. We cleverly use Linux’s routing features to open a “fast lane” for tunnel traffic and disable TCP’s two major performance killers: congestion control and the Nagle algorithm.

Although this cannot completely solve all TCP-over-TCP issues, it significantly improves the throughput and response latency of cross-regional three-layer tunnel connections while maintaining TCP protocol compatibility.

Mark Target Traffic (iptables)

  • The script uses the mangle table of iptables to mark (MARK) all TCP packets going to and coming from the target IP address.
  • iptables -t mangle -A OUTPUT -p tcp -d $TARGET_IP -j MARK --set-mark 1: Marks outbound packets to the target IP.
  • iptables -t mangle -A INPUT -p tcp -s $TARGET_IP -j CONNMARK --set-mark 1: Marks inbound connections from the target IP.
  • iptables -t mangle -A OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1: Marks outbound packets belonging to marked connections (e.g., responses to inbound connections).
  • In this way, all TCP traffic related to the target IP is marked with “1.”
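To confirm that the three mark rules are actually installed, the same iptables -C check that the enforcement script relies on can be wrapped in a small helper. This is only a sketch: the function name check_mark_rules and the IPTABLES override variable are hypothetical, and on a real host the check needs root.

```shell
# Hypothetical helper: report which of the three mark rules are missing.
# IPTABLES may be overridden (for example with a stub); default is iptables.
check_mark_rules() {
    target_ip="$1"
    ipt="${IPTABLES:-iptables}"
    missing=0
    # -C only checks whether a matching rule exists; it changes nothing
    "$ipt" -t mangle -C OUTPUT -p tcp -d "$target_ip" -j MARK --set-mark 1 2>/dev/null \
        || { echo "missing: OUTPUT mark rule for $target_ip"; missing=1; }
    "$ipt" -t mangle -C INPUT -p tcp -s "$target_ip" -j CONNMARK --set-mark 1 2>/dev/null \
        || { echo "missing: INPUT connmark rule for $target_ip"; missing=1; }
    "$ipt" -t mangle -C OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1 2>/dev/null \
        || { echo "missing: OUTPUT connmark-to-mark rule"; missing=1; }
    return "$missing"
}

# Usage on a live system (as root), with the peer address as argument:
#   check_mark_rules 203.0.113.10 && echo "all mark rules present"
```

The function returns 0 only when all three rules exist, so it can also serve as a condition in a monitoring script.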

Policy Routing (ip rule and ip route)

  • The script creates a separate routing table named nocongestion (table ID 200, which can be defined in /etc/iproute2/rt_tables).
  • ip rule add fwmark 1 table nocongestion: Adds a policy routing rule specifying that all packets marked as “1” by iptables should query the nocongestion routing table.
  • ip route add default via $DEFAULT_GATEWAY dev $DEFAULT_INTERFACE table nocongestion: Adds a default route in the nocongestion table pointing to the system’s default gateway. Marked traffic therefore still leaves through the normal path, but it picks up any per-route attributes set on the route in the nocongestion table.
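Whether the policy rule is in place can be checked from the output of ip rule show. The sketch below reads that output on stdin. The helper name has_fwmark_rule is hypothetical, and the pattern assumes that iproute2 prints the mark in hexadecimal (for example "fwmark 0x1") and the table either by name or by its ID 200.

```shell
# Hypothetical helper: read `ip rule show` output on stdin and succeed if a
# rule sends fwmark 1 to the nocongestion table (by name or by ID 200).
# Assumption: the mark is printed as "fwmark 0x1" (decimal "1" is also accepted).
has_fwmark_rule() {
    grep -Eq 'fwmark (0x)?1 lookup (nocongestion|200)( |$)'
}

# Usage on a live system:
#   ip rule show | has_fwmark_rule && echo "policy rule present"
#   ip route show table nocongestion
```

Listing the table with ip route show table nocongestion afterwards shows the default route and any attributes attached to it.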

Disable Congestion Control (congctl none)

  • Check Availability: The script first checks if the system has a TCP congestion control algorithm named none (grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control). Stock kernels generally do not provide an algorithm by this name, so on most systems the check fails and the fallback described below runs.
  • Apply to Routing Table: If the none algorithm is available, the script applies it to the default route in the nocongestion routing table: ip route change default ... table nocongestion congctl none. The congctl none parameter asks the kernel to use the algorithm named none for connections that go through this route (i.e., marked traffic), so that data is sent as quickly as possible.
  • Compile Custom Kernel Module (Fallback): If the system does not have the none algorithm, the script automatically compiles and loads a custom Linux kernel module named tcp_none. This module implements a very simple congestion control logic: it always sets the congestion window (cwnd) to a very large value (UINT_MAX / 2) and ignores all congestion events (such as packet loss), so the sender is bounded only by the peer's receive window and the local send buffer. The compilation process requires the system to have build-essential and the corresponding linux-headers installed. Once loaded, the none algorithm becomes available, and the script continues to apply it to the routing table as in the previous step.
  • Maintain Global Settings: The cleverness of this method lies in the fact that it only affects traffic marked and routed to the nocongestion table. The system’s global default congestion control algorithm (such as cubic or bbr) remains unchanged and is used for all other network connections.
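The availability check in the script uses a plain substring match, which would also accept any algorithm whose name merely contains "none". A stricter whole-word check can be sketched as follows. The helper name cc_available and its optional second argument (a file to read instead of the real /proc path) are hypothetical.

```shell
# Hypothetical helper: succeed if algorithm $1 appears as a whole word in the
# kernel's list of available congestion control algorithms.
# $2 overrides the file to read; it defaults to the real /proc path.
cc_available() {
    name="$1"
    list="${2:-/proc/sys/net/ipv4/tcp_available_congestion_control}"
    [ -r "$list" ] || return 1
    # The file holds one space-separated line; put each name on its own line
    # and require an exact match on the whole line
    tr -s ' \t' '\n\n' < "$list" | grep -qx -- "$name"
}

# Usage on a live system:
#   cc_available none && echo "'none' is registered"
#   ss -ti dst 203.0.113.10   # shows per-connection TCP details, including the algorithm in use
```

Inspecting a live connection with ss -ti is the more direct test, since it shows which algorithm a given connection to the peer actually ended up with.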

Disable Nagle Algorithm (TCP_NODELAY)

  • Compile Shared Library (disable_nagle.so): The script compiles a small C shared library (/usr/local/lib/disable_nagle.so). This library uses dlsym and RTLD_NEXT to interpose on the C library's connect function (the wrapper around the connect system call).
  • Inject setsockopt: Inside the intercepted connect function, it first calls the original connect function and then, before returning, calls setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(int)) to set the TCP_NODELAY option for the established socket, thereby disabling the Nagle algorithm.
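  • Use via LD_PRELOAD: To disable Nagle for a specific application’s connections, you need to preload this shared library when starting the application: LD_PRELOAD=/usr/local/lib/disable_nagle.so your_application. For example, if you use xray as the VLESS client, you can modify its systemd service file by adding Environment="LD_PRELOAD=/usr/local/lib/disable_nagle.so" before ExecStart. Note that LD_PRELOAD only takes effect in dynamically linked programs that reach connect through the C library. Programs written in Go, xray included, normally issue system calls directly and are unaffected by the preload; Go's net package already enables TCP_NODELAY on TCP connections by default, so no extra step is needed for them.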
  • enforce_no_congestion.sh Attempt (Experimental): The script also creates a background service script (enforce_no_congestion.sh) that periodically checks iptables and routing rules. It also attempts to find processes connected to the target IP and tries to dynamically set TCP_NODELAY for these connections using strace (if installed). This method is rather hacky, relies on strace, and may be unstable or ineffective in practice. It is recommended to use the LD_PRELOAD method to disable Nagle.
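For a dynamically linked service managed by systemd, the preload can be added through a drop-in file instead of editing the unit itself. The following is a sketch under stated assumptions: the helper name write_nagle_dropin, the file name disable-nagle.conf, and the unit name mytunnel.service are all placeholders, not names taken from the script.

```shell
# Hypothetical helper: write a systemd drop-in that preloads the library.
# $1 is the drop-in directory, e.g. /etc/systemd/system/mytunnel.service.d
# (mytunnel.service is a placeholder unit name).
write_nagle_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/disable-nagle.conf" <<'EOF'
[Service]
Environment="LD_PRELOAD=/usr/local/lib/disable_nagle.so"
EOF
}

# On a real host, run as root and then reload systemd:
#   write_nagle_dropin /etc/systemd/system/mytunnel.service.d
#   systemctl daemon-reload && systemctl restart mytunnel.service
```

A drop-in keeps the change separate from the packaged unit file, so a package upgrade that replaces the unit does not silently remove the setting.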

Persistence (systemd)

  • To ensure that iptables rules and routing settings remain effective after a system reboot and can automatically recover after certain network events, the script creates and enables a systemd service (enforce-no-congestion.service). This service runs the enforce_no_congestion.sh script in the background, periodically checking and reapplying the necessary iptables and ip route commands.

Notes and Risks

  • Root Privileges: The script requires root privileges to run because it needs to modify network settings, load kernel modules, and manage services.
  • Kernel Module Compilation: The script depends on the system having the correct build tools (build-essential, gcc, make) and headers matching the currently running kernel version (linux-headers-$(uname -r)). If the environment is not met, the compilation will fail.
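  • Kernel Module Risks: Loading custom kernel modules always carries some risk. Although this module is simple, it may still cause system instability or crashes. Try it in a test environment before using it on a production server.
  • Connection Interruptions: The --update-module option terminates connections that use the none algorithm, which disrupts service. To do so it kills the processes connected to the target IP and runs iptables -F, which flushes every rule in the filter table, including unrelated firewall rules. As a last resort it offers to take the default network interface down and up, which is highly risky on a remote server. If ip route does not support congctl, this path also writes none to the global tcp_congestion_control setting, which affects every connection on the host.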
  • Nagle Disabling Method: The method used in enforce_no_congestion.sh to disable Nagle with strace may be ineffective or unreliable. LD_PRELOAD is a more recommended approach.
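Because the module Makefile builds against /lib/modules/<kernel release>/build, a quick pre-flight check can tell whether matching headers are installed before the script is run. This is a minimal sketch; the helper name headers_present and its optional arguments are hypothetical.

```shell
# Hypothetical pre-flight check: the module build needs the directory
# /lib/modules/<release>/build to exist (it is usually a symlink to the headers).
# $1 is the kernel release (default: uname -r); $2 overrides the modules root.
headers_present() {
    release="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    [ -d "$root/$release/build" ]
}

if headers_present; then
    echo "kernel headers found for $(uname -r)"
else
    echo "kernel headers missing for $(uname -r); install them before running the script"
fi
```

On systems that were upgraded without a reboot, the installed headers often belong to a newer kernel than the running one, which is the usual reason this check fails.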

Limitations of the Technical Solution

  • Redundant Retransmission in TCP-over-TCP: While this method disables congestion control, it cannot completely avoid the redundant retransmission problem in TCP-over-TCP scenarios. When the outer TCP experiences packet loss, it will still retransmit; at the same time, if the inner TCP detects a timeout, it will independently initiate retransmission. This dual retransmission mechanism is an inherent flaw of TCP-over-TCP. The script can only mitigate the performance degradation by disabling congestion control; it cannot remove the protocol redundancy itself.
  • Head-of-Line Blocking: This method also cannot solve the inherent head-of-line blocking problem of TCP-over-TCP. When the outer TCP connection experiences packet loss, all inner traffic passing through that connection is blocked until the outer TCP completes retransmission. Even unrelated connections therefore affect each other when they share the same TCP tunnel, and the degradation is most pronounced in high packet loss environments. Only switching to a UDP-based transport protocol (such as WireGuard) or a newer protocol with independent multiplexed streams (such as QUIC/HTTP3) can fundamentally solve these issues.

Script Source Code

#!/bin/bash

# Script to disable TCP congestion control and Nagle's algorithm for a specific IP address or domain
# Automatically installs dependencies and compiles a kernel module if needed

# Check if script is run as root
if [ "$(id -u)" -ne 0 ]; then
echo "This script must be run as root" >&2
exit 1
fi

# Function to install dependencies
install_dependencies() {
echo "Installing dependencies..."
if command -v apt-get &> /dev/null; then
apt-get update
apt-get install -y build-essential linux-headers-$(uname -r) iptables dnsutils
elif command -v yum &> /dev/null; then
yum install -y kernel-devel gcc make iptables bind-utils
elif command -v dnf &> /dev/null; then
dnf install -y kernel-devel gcc make iptables bind-utils
elif command -v pacman &> /dev/null; then
pacman -Sy --noconfirm base-devel linux-headers iptables bind-tools
else
echo "Unsupported package manager. Please install build-essential, linux headers, and iptables manually."
exit 1
fi
}

# Function to resolve domain to IP
# Status messages go to stderr so that command substitution captures only the IP
resolve_domain() {
local domain="$1"
local resolved_ip

echo "Resolving domain $domain to IP address..." >&2

if command -v dig &> /dev/null; then
resolved_ip=$(dig +short "$domain" | grep -v "\.$" | head -n 1)
elif command -v host &> /dev/null; then
resolved_ip=$(host "$domain" | grep "has address" | head -n 1 | awk '{print $NF}')
elif command -v nslookup &> /dev/null; then
resolved_ip=$(nslookup "$domain" | grep -A2 "Name:" | grep "Address:" | head -n 1 | awk '{print $2}')
else
echo "No DNS resolution tools found. Please install dig, host, or nslookup." >&2
exit 1
fi

if [ -z "$resolved_ip" ]; then
echo "Failed to resolve domain $domain to IP address." >&2
exit 1
fi

echo "Domain $domain resolved to IP: $resolved_ip" >&2
echo "$resolved_ip"
}

# Function to update the tcp_none module when it's in use
update_tcp_none_module() {
echo "Updating tcp_none module with unlimited cwnd..."

# Check if module is currently loaded
if lsmod | grep -q "tcp_none"; then
# Identify connections using the 'none' congestion control
active_connections=$(ss -ti | grep none)

if [ -n "$active_connections" ]; then
echo "Active connections found using tcp_none module:"
echo "$active_connections"
echo "These connections will be terminated to update the module."
read -p "Press Enter to continue or Ctrl+C to abort..." </dev/tty

# More aggressive connection termination approach

# 1. First try to kill processes
echo "Finding and terminating processes with active connections to $TARGET_IP..."
TARGET_CONNS=$(ss -tnp | grep "$TARGET_IP")
if [ -n "$TARGET_CONNS" ]; then
# Extract PIDs of processes using these connections
PIDS=$(echo "$TARGET_CONNS" | grep -oP 'pid=\K\d+' | sort -u)
if [ -n "$PIDS" ]; then
for PID in $PIDS; do
echo "Terminating process $PID"
kill -9 $PID 2>/dev/null || echo "Could not kill process $PID"
done
fi
fi

# 2. Create firewall rules to reset all connections to/from the target IP
# WARNING: iptables -F flushes every rule in the filter table, including unrelated firewall rules
echo "Adding firewall rules to reset connections to $TARGET_IP..."
iptables -F
iptables -A INPUT -p tcp -s $TARGET_IP -j REJECT --reject-with tcp-reset
iptables -A OUTPUT -p tcp -d $TARGET_IP -j REJECT --reject-with tcp-reset

# 3. Enable aggressive TCP TIME_WAIT connection handling
echo "Enabling aggressive TCP connection cleanup..."
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle 2>/dev/null || true
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse 2>/dev/null || true
echo 1 > /proc/sys/net/ipv4/tcp_abort_on_overflow 2>/dev/null || true
echo 10 > /proc/sys/net/ipv4/tcp_fin_timeout 2>/dev/null || true

# 4. Handle specific connections directly using ss output
echo "Directly targeting connections using 'none' congestion control..."
ss -tn state established | grep "$TARGET_IP" | awk '{print $4}' | while read local_addr; do
if [[ "$local_addr" == *:* ]]; then
local_ip=$(echo $local_addr | cut -d: -f1)
local_port=$(echo $local_addr | cut -d: -f2)
echo "Closing connection on local port $local_port"
iptables -A INPUT -p tcp --dport $local_port -j REJECT --reject-with tcp-reset
iptables -A OUTPUT -p tcp --sport $local_port -j REJECT --reject-with tcp-reset
fi
done

# 5. Wait and check again
echo "Waiting for connections to terminate..."
sleep 5

# 6. Clean up the temporary iptables rules
iptables -F
fi

# Try to unload the module
echo "Attempting to unload tcp_none module..."
rmmod tcp_none 2>/dev/null

# If module is still loaded, try more aggressive approach
if lsmod | grep -q "tcp_none"; then
echo "Module still in use. Trying system-wide connection reset..."

# System-wide approach (DANGEROUS but effective)
echo "WARNING: This will reset ALL TCP connections on the system!"
read -p "Continue? [y/N] " confirm </dev/tty
if [[ "$confirm" == "y" || "$confirm" == "Y" ]]; then
# Reset all connections by toggling the network interface
DEFAULT_INTERFACE=$(ip route | grep default | head -n 1 | awk '{print $5}')
if [ -n "$DEFAULT_INTERFACE" ]; then
echo "Temporarily disabling interface $DEFAULT_INTERFACE..."
ip link set $DEFAULT_INTERFACE down
sleep 2
ip link set $DEFAULT_INTERFACE up
sleep 2
fi

# Try to unload again
rmmod tcp_none 2>/dev/null
fi

# Final check if module is unloaded
if lsmod | grep -q "tcp_none"; then
echo "Could not unload module. You may need to reboot the system."
echo "Continuing anyway to try compiling the new module..."
else
echo "Successfully unloaded tcp_none module."
fi
else
echo "Successfully unloaded tcp_none module."
fi
fi

# Navigate to module directory or create a new one
MODULE_DIR=$(mktemp -d)
cd "$MODULE_DIR"

# Create updated Makefile and module source
echo "Creating updated tcp_none module with unlimited cwnd..."

# Create Makefile (same as before)
cat > Makefile << 'EOF'
obj-m += tcp_none.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
EOF

# Create updated tcp_none.c with unlimited cwnd
cat > tcp_none.c << 'EOF'
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <net/tcp.h>

static struct tcp_congestion_ops tcp_none;

static void tcp_none_init(struct sock *sk)
{
/* Set congestion window to maximum possible value */
struct tcp_sock *tp = tcp_sk(sk);
tp->snd_cwnd = UINT_MAX / 2; /* Use a very large value (near max of u32) */
tp->snd_cwnd_clamp = UINT_MAX / 2; /* Also set the clamp to a very high value */
}

static void tcp_none_cong_avoid(struct sock *sk, u32 ack, u32 acked)
{
/* Always keep the congestion window maximized */
struct tcp_sock *tp = tcp_sk(sk);

/* If cwnd somehow gets reduced, set it back to maximum */
if (tp->snd_cwnd < UINT_MAX / 2) {
tp->snd_cwnd = UINT_MAX / 2;
}
}

static u32 tcp_none_ssthresh(struct sock *sk)
{
/* Return maximum value */
return TCP_INFINITE_SSTHRESH;
}

static u32 tcp_none_undo_cwnd(struct sock *sk)
{
/* Return maximum value instead of current CWND */
return UINT_MAX / 2;
}

static void tcp_none_cwnd_event(struct sock *sk, enum tcp_ca_event event)
{
/* Force cwnd to stay at maximum regardless of events */
struct tcp_sock *tp = tcp_sk(sk);
tp->snd_cwnd = UINT_MAX / 2;
}

static void tcp_none_pkts_acked(struct sock *sk, const struct ack_sample *sample)
{
/* Do nothing */
}

static struct tcp_congestion_ops tcp_none = {
.init = tcp_none_init,
.ssthresh = tcp_none_ssthresh,
.cong_avoid = tcp_none_cong_avoid,
.undo_cwnd = tcp_none_undo_cwnd,
.cwnd_event = tcp_none_cwnd_event,
.pkts_acked = tcp_none_pkts_acked,

.owner = THIS_MODULE,
.name = "none",
};

static int __init tcp_none_register(void)
{
printk(KERN_INFO "TCP None Congestion Control loaded with unlimited cwnd\n");
return tcp_register_congestion_control(&tcp_none);
}

static void __exit tcp_none_unregister(void)
{
printk(KERN_INFO "TCP None Congestion Control unloaded\n");
tcp_unregister_congestion_control(&tcp_none);
}

module_init(tcp_none_register);
module_exit(tcp_none_unregister);

MODULE_AUTHOR("Kernel Module Generator");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("TCP Congestion Control with No Algorithm and Unlimited CWND");
EOF

# Compile the module
echo "Compiling updated kernel module..."
make

# Load the new module
echo "Loading updated kernel module..."
insmod tcp_none.ko

# Verify module loaded
if ! grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control; then
echo "Failed to load 'none' congestion control module."
exit 1
fi

echo "Successfully updated and loaded 'none' congestion control module with unlimited cwnd."

# Update routes to use the new module
echo "Updating routes to use new module..."
# Check if congctl is supported by ip route
if ip route help 2>&1 | grep -q "congctl"; then
ip route change default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion congctl none
else
# Fallback if congctl is not supported
echo "congctl option not supported on this system"
ip route change default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion
# Try using sysctl to set congestion control for specific interfaces
echo "Using alternative method to set congestion control"
echo "none" > /proc/sys/net/ipv4/tcp_congestion_control
fi

cd - >/dev/null
}

# Check if arguments include an update flag
if [ "$1" == "--update-module" ]; then
# Check if TARGET_IP was previously set
if [ -f /var/lib/no-congestion-target ]; then
TARGET_IP=$(cat /var/lib/no-congestion-target)
echo "Updating module for existing target: $TARGET_IP"

# Get default route interface and gateway
DEFAULT_ROUTE=$(ip route | grep default | head -n 1)
DEFAULT_INTERFACE=$(echo "$DEFAULT_ROUTE" | awk '{print $5}')
DEFAULT_GATEWAY=$(echo "$DEFAULT_ROUTE" | awk '{print $3}')

update_tcp_none_module
exit 0
else
echo "No target IP found. Please run the script with a target IP first."
exit 1
fi
fi

# Check arguments
if [ $# -ne 1 ]; then
echo "Usage: $0 <target_ip_or_domain>"
echo " or: $0 --update-module"
exit 1
fi

# Check if input is an IP address or domain
if [[ "$1" =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
# Input is an IP address
TARGET_IP="$1"
echo "Using IP address: $TARGET_IP"
else
# Input is likely a domain name
DOMAIN="$1"
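# Resolve domain to IP (an exit inside the command substitution only leaves the subshell, so check the status here)
TARGET_IP=$(resolve_domain "$DOMAIN") || exit 1
echo "Using domain $DOMAIN with resolved IP: $TARGET_IP"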
fi

echo "Setting up no congestion control for connections to $TARGET_IP"

# Install dependencies
install_dependencies

# Check if 'none' congestion control is available
if ! grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control; then
echo "'none' congestion control is not available. Creating kernel module..."

# Create temporary directory for kernel module
MODULE_DIR=$(mktemp -d)
cd "$MODULE_DIR"

# Create Makefile
cat > Makefile << 'EOF'
obj-m += tcp_none.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
EOF

# Create kernel module source
cat > tcp_none.c << 'EOF'
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <net/tcp.h>

static struct tcp_congestion_ops tcp_none;

static void tcp_none_init(struct sock *sk)
{
/* Set congestion window to maximum possible value */
struct tcp_sock *tp = tcp_sk(sk);
tp->snd_cwnd = UINT_MAX / 2; /* Use a very large value (near max of u32) */
tp->snd_cwnd_clamp = UINT_MAX / 2; /* Also set the clamp to a very high value */
}

static void tcp_none_cong_avoid(struct sock *sk, u32 ack, u32 acked)
{
/* Always keep the congestion window maximized */
struct tcp_sock *tp = tcp_sk(sk);

/* If cwnd somehow gets reduced, set it back to maximum */
if (tp->snd_cwnd < UINT_MAX / 2) {
tp->snd_cwnd = UINT_MAX / 2;
}
}

static u32 tcp_none_ssthresh(struct sock *sk)
{
/* Return maximum value */
return TCP_INFINITE_SSTHRESH;
}

static u32 tcp_none_undo_cwnd(struct sock *sk)
{
/* Return maximum value instead of current CWND */
return UINT_MAX / 2;
}

static void tcp_none_cwnd_event(struct sock *sk, enum tcp_ca_event event)
{
/* Force cwnd to stay at maximum regardless of events */
struct tcp_sock *tp = tcp_sk(sk);
tp->snd_cwnd = UINT_MAX / 2;
}

static void tcp_none_pkts_acked(struct sock *sk, const struct ack_sample *sample)
{
/* Do nothing */
}

static struct tcp_congestion_ops tcp_none = {
.init = tcp_none_init,
.ssthresh = tcp_none_ssthresh,
.cong_avoid = tcp_none_cong_avoid,
.undo_cwnd = tcp_none_undo_cwnd,
.cwnd_event = tcp_none_cwnd_event,
.pkts_acked = tcp_none_pkts_acked,

.owner = THIS_MODULE,
.name = "none",
};

static int __init tcp_none_register(void)
{
printk(KERN_INFO "TCP None Congestion Control loaded with unlimited cwnd\n");
return tcp_register_congestion_control(&tcp_none);
}

static void __exit tcp_none_unregister(void)
{
printk(KERN_INFO "TCP None Congestion Control unloaded\n");
tcp_unregister_congestion_control(&tcp_none);
}

module_init(tcp_none_register);
module_exit(tcp_none_unregister);

MODULE_AUTHOR("Kernel Module Generator");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("TCP Congestion Control with No Algorithm and Unlimited CWND");
EOF

# Compile the module
echo "Compiling kernel module..."
make

# Load the module
echo "Loading kernel module..."
insmod tcp_none.ko

# Verify module loaded
if ! grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control; then
echo "Failed to load 'none' congestion control module."
exit 1
fi

echo "Successfully created and loaded 'none' congestion control module."
fi

# Create iptables rule to mark packets to the target IP
echo "Setting up iptables to mark packets to $TARGET_IP..."
# Cleanup existing rules first
iptables -t mangle -D OUTPUT -p tcp -d "$TARGET_IP" -j MARK --set-mark 1 2>/dev/null || true
iptables -t mangle -D INPUT -p tcp -s "$TARGET_IP" -j CONNMARK --set-mark 1 2>/dev/null || true
iptables -t mangle -D OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1 2>/dev/null || true
# NOTE: this flushes ALL rules in the mangle table, not only the three removed above
iptables -t mangle -F

# Add new rules for outbound connections and responses to incoming connections
# 1. Mark outgoing packets to target IP
iptables -t mangle -A OUTPUT -p tcp -d "$TARGET_IP" -j MARK --set-mark 1
# 2. Mark incoming connections from target IP
iptables -t mangle -A INPUT -p tcp -s "$TARGET_IP" -j CONNMARK --set-mark 1
# 3. Mark outgoing packets that are part of connections marked in step 2
iptables -t mangle -A OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1

# Register the routing table name first so that the commands below can refer to it
mkdir -p /etc/iproute2
grep -q "^200 nocongestion" /etc/iproute2/rt_tables 2>/dev/null || echo "200 nocongestion" >> /etc/iproute2/rt_tables

# Clean up existing routing rules and routes in that table if they exist
ip rule del fwmark 1 table nocongestion 2>/dev/null || true
ip route flush table nocongestion 2>/dev/null || true

# Get default route interface and gateway
DEFAULT_ROUTE=$(ip route | grep default | head -n 1)
DEFAULT_INTERFACE=$(echo "$DEFAULT_ROUTE" | awk '{print $5}')
DEFAULT_GATEWAY=$(echo "$DEFAULT_ROUTE" | awk '{print $3}')

# Add route to the new table
echo "Setting up routing for marked packets..."
# Force remove any existing default route in the nocongestion table
ip route del default table nocongestion 2>/dev/null || true
# Add the new route
ip route add default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion 2>/dev/null || echo "Route already exists, continuing..."

# Add rule to use the new table for marked packets
# (ip rule prints the mark in hex, e.g. "fwmark 0x1", so match both spellings)
ip rule show | grep -Eq "fwmark (0x)?1 lookup (nocongestion|200)" || ip rule add fwmark 1 table nocongestion

# Set the default congestion control to none for all new connections to target IP
echo "Setting congestion control algorithm to none specifically for $TARGET_IP..."
if grep -q "none" /proc/sys/net/ipv4/tcp_available_congestion_control; then
# Set congestion control for specific route instead of globally
ip route change default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion congctl none

# Keep the global congestion control untouched
echo "Congestion control set to 'none' only for connections to $TARGET_IP"
echo "Global congestion control remains: $(cat /proc/sys/net/ipv4/tcp_congestion_control)"
else
echo "WARNING: 'none' congestion control is not available."
echo "Using the default congestion control algorithm."
fi

# Create a C library to disable Nagle's algorithm
cat > /tmp/disable_nagle.c << 'EOF'
#define _GNU_SOURCE
#include <stdio.h>
#include <errno.h>
#include <dlfcn.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

// Override connect function to disable Nagle's algorithm
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen) {
// Look up the real connect() once and cache the pointer
static int (*original_connect)(int, const struct sockaddr *, socklen_t);
if (!original_connect)
original_connect = dlsym(RTLD_NEXT, "connect");

int result = original_connect(sockfd, addr, addrlen);
// Save errno: setsockopt fails on non-TCP sockets and would otherwise
// overwrite the value the caller expects from connect (e.g. EINPROGRESS)
int saved_errno = errno;

// Disable Nagle's algorithm by setting TCP_NODELAY
int flag = 1;
setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));

errno = saved_errno;
return result;
}
EOF

# Compile the library
echo "Compiling TCP_NODELAY library..."
gcc -shared -fPIC -o /tmp/disable_nagle.so /tmp/disable_nagle.c -ldl
mv /tmp/disable_nagle.so /usr/local/lib/

# Create a script to continuously enforce congestion control
cat > /usr/local/bin/enforce_no_congestion.sh << EOF
#!/bin/bash

# Continuously enforce congestion control settings for $TARGET_IP
echo "Starting congestion control enforcement service for $TARGET_IP"

# Function to check and set congestion control
enforce_cc() {
    # Verify route settings are in place
    if ! ip route show table nocongestion | grep -q "congctl none"; then
        echo "Resetting route congestion control to none"
        ip route change default via "$DEFAULT_GATEWAY" dev "$DEFAULT_INTERFACE" table nocongestion congctl none
    fi

    # Verify iptables rules are in place
    if ! iptables -t mangle -C OUTPUT -p tcp -d "$TARGET_IP" -j MARK --set-mark 1 2>/dev/null; then
        echo "Restoring outbound iptables mark rule"
        iptables -t mangle -A OUTPUT -p tcp -d "$TARGET_IP" -j MARK --set-mark 1
    fi

    # Verify incoming connection mark rule
    if ! iptables -t mangle -C INPUT -p tcp -s "$TARGET_IP" -j CONNMARK --set-mark 1 2>/dev/null; then
        echo "Restoring incoming connection mark rule"
        iptables -t mangle -A INPUT -p tcp -s "$TARGET_IP" -j CONNMARK --set-mark 1
    fi

    # Verify connection mark transfer rule
    if ! iptables -t mangle -C OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1 2>/dev/null; then
        echo "Restoring connection mark transfer rule"
        iptables -t mangle -A OUTPUT -p tcp -m connmark --mark 1 -j MARK --set-mark 1
    fi

    # Report active connections to the target IP. TCP_NODELAY can only be set
    # from inside the owning process, so this loop only logs what it finds.
    # To actually disable Nagle, start the application with
    # LD_PRELOAD=/usr/local/lib/disable_nagle.so
    connections=\$(ss -tnp | grep "$TARGET_IP" | grep -v LISTEN)
    if [ -n "\$connections" ]; then
        echo "Found active connections to $TARGET_IP"
        echo "\$connections" | while read -r conn; do
            pid=\$(echo "\$conn" | sed -n 's/.*pid=\([0-9]*\).*/\1/p')
            if [ -n "\$pid" ]; then
                echo "Connection found with PID: \$pid"
                # List the sockets owned by this PID
                for fd in /proc/\$pid/fd/*; do
                    if [ -S "\$fd" ]; then
                        echo "Socket \$fd belongs to PID \$pid (TCP_NODELAY not changed here)"
                    fi
                done
            fi
        done
    fi
}

# Main loop
while true; do
    # Enforce route-specific congestion control
    enforce_cc

    # Sleep briefly
    sleep 2
done
EOF
chmod +x /usr/local/bin/enforce_no_congestion.sh

# Create a service to run our script
cat > /etc/systemd/system/enforce-no-congestion.service << EOF
[Unit]
Description=Enforce no congestion control for target IP
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/enforce_no_congestion.sh
Restart=always

[Install]
WantedBy=multi-user.target
EOF

# Enable and start the service
systemctl daemon-reload
systemctl enable enforce-no-congestion
systemctl start enforce-no-congestion

# Remove the global sysctl setting
rm -f /etc/sysctl.d/99-no-congestion.conf

# Final verification steps
echo "Verifying setup:"
echo "1. Checking if 'none' congestion control is available:"
cat /proc/sys/net/ipv4/tcp_available_congestion_control
echo "2. Checking current congestion control algorithm:"
cat /proc/sys/net/ipv4/tcp_congestion_control
echo "3. Verifying iptables rules:"
echo " - Outbound packets to target IP:"
# -n keeps addresses numeric; without it iptables may print a hostname and the grep misses
iptables -t mangle -n -L OUTPUT -v | grep "$TARGET_IP"
echo " - Incoming packets from target IP:"
iptables -t mangle -n -L INPUT -v | grep "$TARGET_IP"
echo " - Responses to incoming connections:"
iptables -t mangle -L OUTPUT -v | grep "mark match 0x1"

echo ""
echo "Configuration complete!"
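echo "TCP connections to and from $TARGET_IP are now marked and routed through the 'nocongestion' table."
echo "Nagle's algorithm is disabled only for applications started with the LD_PRELOAD library shown below."
if [ -n "$DOMAIN" ]; then
    echo "Domain name $DOMAIN (resolved to $TARGET_IP) has been configured."
fi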
echo ""
echo "To run a specific application with Nagle's algorithm disabled, use:"
echo "LD_PRELOAD=/usr/local/lib/disable_nagle.so your_application"
echo ""
echo "To verify connections are using 'none' congestion control:"
echo "ss -ti | grep -A 5 $TARGET_IP"
echo ""
echo "NOTE: The route's congestion control setting is applied when a connection is opened."
echo "Restart the tunnel so that existing connections are re-established; new connections should use 'none' immediately."

# Store the target IP for future updates
echo "$TARGET_IP" > /var/lib/no-congestion-target
if [ -n "$DOMAIN" ]; then
    # Also store domain name if provided
    echo "$DOMAIN" > /var/lib/no-congestion-domain
fi
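
Two lookups in the script above depend on the exact text that iproute2 prints: the default route is read by field position, and the policy rule is found with grep. The following is a minimal sketch of both lookups written as filters over that text, so they can be tried on sample output without root. The function names parse_default_route and has_fwmark_rule are hypothetical and not part of the original script, and the sketch assumes the usual "ip route" and "ip rule show" output formats.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and read command output from stdin.

# Print "<gateway> <interface>" for the first default route on stdin.
# Values are taken from the words after "via" and "dev" rather than from
# fixed field positions; "-" is printed when a value is absent
# (for example a point-to-point route with no gateway).
parse_default_route() {
    awk '$1 == "default" {
        gw = "-"; dev = "-"
        for (i = 2; i < NF; i++) {
            if ($i == "via") gw = $(i + 1)
            if ($i == "dev") dev = $(i + 1)
        }
        print gw, dev
        exit
    }'
}

# Succeed if stdin (output of "ip rule show") already has a rule sending the
# given fwmark to the given table. iproute2 prints the mark in hex, such as
# "fwmark 0x1", so the decimal mark is converted before matching.
# Arguments: $1 = mark (decimal), $2 = table name.
has_fwmark_rule() {
    grep -Eq "fwmark $(printf '0x%x' "$1")(/0x[0-9a-f]+)? lookup $2( |\$)"
}
```

In the script these would be fed from the live commands, for example "ip route show default | parse_default_route" and "ip rule show | has_fwmark_rule 1 nocongestion || ip rule add fwmark 1 table nocongestion". Keeping the parsing separate from the commands makes it easy to check against output captured from another machine.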
