Page 2 | Bojie Li

2025-07-18

Setting Up an IKEv2 Tunnel Without Installing a Client to Bypass Cursor's Regional Restrictions

(Thanks to Koutian Wu for thoroughly debugging and deploying, and for pointing out several technical issues in the original article, which have been corrected in this version)

As access to tools like Cursor and Claude Code becomes restricted in China, traditional HTTP/SOCKS proxies can no longer meet daily needs. These tools not only impose regional restrictions on the server side but may also employ multi-layered techniques to detect the user’s true geographical location (currently only partially implemented, but may be upgraded in the future):

Basic IP Database Matching: Traditional GeoIP database queries
Timezone Consistency Check: Obtaining the client’s timezone via JavaScript and cross-verifying with the IP’s geographical location
DNS Resolution Check: Using Geo DNS resolution results to check the real location
WebRTC IP Leak Detection: Obtaining the user’s real IP address via WebRTC
CloudFlare Source Address Retrieval: Obtaining the real source address through CloudFlare’s HTTP headers

Most current HTTP/SOCKS proxies can only handle basic detection methods, while more complex multi-dimensional detection often leaves them powerless. A three-layer tunnel, working at the network layer, can more thoroughly hide the user’s real network environment.

Besides bypassing geographical restrictions, a three-layer tunnel is also suitable for the following scenarios:

Server Access Control: Avoid exposing the SSH access port of company servers on the public internet
Development and Testing Environment: Avoid exposing the company’s test servers, internal APIs, etc., on the public internet
Secure Network Environment: Ensure communication security in untrusted public WiFi environments

While solutions like WireGuard and OpenVPN are stable and efficient, they require installing dedicated clients, which can be cumbersome in multi-device usage scenarios.

IKEv2, as a modern VPN standard, not only offers excellent performance and stability but, more importantly, is natively integrated into mainstream operating systems like macOS, Windows, iOS, and Android, eliminating the need to install any third-party clients.

This article will build on the architecture idea from “Skillfully Using Hong Kong as a Relay to Build a Smooth and Stable China-US Three-Layer Tunnel“ to construct a China -> Hong Kong -> USA IKEv2 tunnel three-hop solution.

2025-07-15

Solving LLM Constrained Sampling Interview Questions with Vibe Coding

This is an interview question from our company.

Some say our Vibe Coding programming questions are too difficult, but actually, our company’s 2-hour Vibe Coding interview questions basically don’t require you to write code yourself. Just input the question into the prompt, continuously interact with the LLM to propose requirements and improvement directions, and the AI will complete it for you.

Why is it called Vibe Coding? It’s about minimizing direct code writing. The division of labor between humans and AI becomes very clear: humans are responsible for direction control, problem definition, and result review, while AI is responsible for specific implementation. Something like Claude Code is an extreme example, where humans are not allowed to touch the code, only the LLM can.

Below, I will demonstrate how Vibe Coding works through the complete experience of this interview question. This entire exploration process was not smooth sailing; the AI’s initial solution had serious flaws. It was through my continuous review and direction correction that we finally arrived at a usable solution. This is not only about solving a technical problem but also a deep exploration of the future software development model.

It is worth mentioning that this article itself was also automatically generated by Gemini 2.5 Pro in Cursor based on my work log (including all my conversations with AI and the evolution of the code). From the moment I first posed the question to Cursor, to completing the final usable program, and then generating this illustrated blog post, the entire process took only 1.5 hours.

The Challenge: LLM Constrained Sampling

A software for learning English needs to ensure that all words output by its built-in LLM must be within a 3000-word vocabulary.

Requirements:

Use the Constrained Sampling method of large language models (LLM) to modify the token sampling algorithm in the inference framework (such as transformers) to ensure that all content output by the LLM is within this given 3000-word vocabulary.
Of course, punctuation, spaces, line breaks, etc., are allowed, but special characters, Chinese, French, emojis, etc., are not allowed.
Case transformations of words in the vocabulary are considered valid words, for example, if the word apple is in the vocabulary, then apple, Apple, APPLE are all considered valid outputs.
The 3000-word vocabulary can be any common English word list found online.
The performance of the constrained sampling algorithm should be as good as possible.

2025-07-12

Skillfully Using Hong Kong as a Transit Point to Build a Smooth and Stable Three-Layer Tunnel Between China and the US

In the previous article, “Building a Three-Layer Tunnel with Full US IP and No Manual Proxy Settings,” we addressed many network issues encountered when accessing global services through the architecture of Domestic Server -> US Server. However, a new performance bottleneck has gradually emerged: the public connection between the domestic server and the US server experiences high latency and severe packet loss during peak hours.

This results in issues like SSH operation lag, online meeting disconnections, and API request timeouts, even when using a tunnel. The root cause lies in the international internet link between China and the US, which is like a highway during holidays—congestion is the norm.

Faced with this problem, a counterintuitive solution emerges: If the direct route is blocked, would taking a detour be faster?

title: The Translated Work “Illustrated DeepSeek Technology” by Meng Jiaying and Me is About to be Released
date: 2025-06-30 10:00

2025-07-10

Zhongguancun Artificial Intelligence Academy & UCAS 2025 Summer AI Agent Practical Topics

The AI Agent Hackathon at UCAS in February 2025 was very successful, so I will host two AI Agent practical topics again from July 27 to 30 at Zhongguancun Artificial Intelligence Academy and from July 31 to August 4 at UCAS.

Many thanks to Professor Zheng Shuxin, Vice Dean of Zhongguancun Artificial Intelligence Academy, and Professor Liu Junming of UCAS for inviting me to host these two AI Agent practical activities.

All the topics of this AI Agent practice will take you deep into the cutting-edge technology of building the next generation of AI Agents. You will have the opportunity to practice:

Multimodal models and thinking model applications: Build the “brain” of the agent with industry-leading multimodal models and thinking models such as Gemini 2.5 Pro and Claude 4 Sonnet.
Real-time voice interaction: Integrate VAD, ASR, LLM, and TTS technology stacks to create real-time voice agents capable of streaming conversations.
Autonomous operation of graphical interfaces: Develop agents that can stably operate GUIs such as browsers to complete complex real-world tasks.
Advanced Agent Architecture: Explore advanced architectures such as “fast and slow thinking,” “thinking while listening,” and multi-agent collaboration to give agents the ability to respond in real-time and think deeply.
Learning from experience: Build agents that can learn from experience, allowing them to become more proficient in repetitive tasks.
Identifying authoritative information sources: Enable agents to accurately identify and adopt high-credibility information such as official documents and academic papers from vast amounts of information.
Autonomous tool invocation and creation: Allow agents not only to use existing tools but also to autonomously learn and create new tools to solve open-ended problems.

Suggestions on AI-assisted programming: In this AI Agent practice, we encourage everyone to use AI-assisted programming, which means “developing agents with agents.” We recommend using Cursor for Vibe Coding, and here are some suggestions:

Documentation first, code later: Let Cursor write the design document first. Your role is to provide improvement suggestions for the AI-generated design document and iterate with the AI until satisfied. Then, let Cursor write the code according to the final design document. During coding, always keep the design document in the agent’s context as a reference.
Choose the right model: Do not use Cursor’s “auto” mode; be sure to choose a model with thinking ability (with a brain icon next to it), such as Claude 4 Sonnet.
Test-driven: Be sure to have AI write and execute test cases for its code to ensure code quality.

Feel free to form teams and choose any of the following topics to start your creative journey!

2025-07-10

UCAS 2025 Spring AI Agent Practical Course

The AI Agent Practical Course is a hands-on course conducted by Professor Liu Junming from UCAS and myself. The first session in 2024 had over 50 participants, and the second session in 2025 had over 100 participants. The 2025 Spring AI Agent Practical Course took place in early February 2025 in Beijing.

Course Directory:

Project 1: Interactive Novel
Project 2: Voice Werewolf
Project 3: Intelligence Gathering Expert
Project 4: Paper Video Explanation
Project 5: Multimodal AI Assistant

Who are we?

Pine AI is dedicated to using AI to help users handle everyday chores and disputes.

In the United States, calling customer service is often a hassle. You might have to wait half an hour, then spend a long time talking with an agent. If the agent isn’t willing to help, you may be transferred to another department. By the end, a single call can take one to two hours. Many people don’t have that much time to haggle with customer service and end up taking the loss. Others find phone communication difficult due to limited spoken English.

Pine AI is building eloquent, knowledgeable, and exceptionally memory-capable AI Agents that can automate this entire process—making calls, sending emails, and using a computer—to handle tasks like a human secretary.

This is absolutely not as simple as slapping a prompt on a SOTA model. We are looking for exceptional people to tackle this world-class challenge with us.

2025-06-12

Effective Agents: Real-Time Interaction with the Environment, Learning from Experience

[This article is based on my invited talk at the A2M Internet Architecture and AI Technology Summit, Turing Large Model Technology Session.]

Download PDF: “Effective Agents: Real-Time Interaction with the Environment, Learning from Experience”

Hello everyone, welcome to the A2M Summit. Today, I will be sharing on the topic “Effective Agents: Real-Time Interaction with the Environment, Learning from Experience.”

Let me first introduce myself. I am the co-founder and chief scientist of Pine AI.

Currently, our business at Pine AI is to help users handle daily chores and disputes by making phone calls through AI. In the U.S., calling customer service is often a hassle. For instance, you might have to wait for half an hour and then spend a long time communicating with the service representative. If the representative is unwilling to help, you might be transferred to another department. So, the entire process can sometimes take one or two hours. Many people don’t have the time to argue with customer service, and sometimes they just have to accept the loss. Additionally, some people struggle with English, making phone communication difficult. Pine can automate this entire process through AI.

Making today’s AI capable of handling tasks end-to-end is extremely challenging; it’s not as simple as applying a SOTA model with a prompt. Most AI products only provide users with information, like generating a research report, but the actual task still requires the user to contact customer service.

To enable AI Agents to complete tasks end-to-end is indeed difficult. Today, we will discuss some of the core technical challenges and how Pine AI addresses these issues.

#!/usr/bin/env python3
# standalone_smart_dns.py - 轻量级 DNS 转发服务器，具有防污染能力

import socket
import threading
import ipaddress
import os
import time
import requests
import json
from dnslib import DNSRecord, DNSHeader, RR, QTYPE, A, AAAA

# 配置项
LISTEN_IP = "127.0.0.1"
LISTEN_PORT = 53
CHINA_DNS = "114.114.114.114"
FOREIGN_DNS = "1.1.1.1"
CHINA_IP_LIST_URL = "https://raw.githubusercontent.com/mayaxcn/china-ip-list/master/chnroute.txt"
CHINA_IPV6_LIST_URL = "https://raw.githubusercontent.com/mayaxcn/china-ip-list/master/chnroute_v6.txt"
CHINA_IP_CACHE_FILE = "/tmp/.china_ip_list.txt"
CHINA_IPV6_CACHE_FILE = "/tmp/.china_ipv6_list.txt"
CACHE_DURATION = 7 * 24 * 60 * 60  # 7 天秒数
MAX_PACKET_SIZE = 1024
TIMEOUT = 5

# 全局变量，存储 IP 网络
cn_networks = []
cn_networks_v6 = []

def get_china_ip_list(is_ipv6=False):
    """获取或更新中国 IP 列表"""
    url = CHINA_IPV6_LIST_URL if is_ipv6 else CHINA_IP_LIST_URL
    cache_file = CHINA_IPV6_CACHE_FILE if is_ipv6 else CHINA_IP_CACHE_FILE
    
    # 检查缓存是否存在且未过期
    if os.path.exists(cache_file):
        file_age = time.time() - os.path.getmtime(cache_file)
        if file_age < CACHE_DURATION:
            with open(cache_file, 'r') as f:
                return [line.strip() for line in f if line.strip() and not line.startswith('#')]
    
    # 下载新列表
    try:
        response = requests.get(url)
        if response.status_code == 200:
            ip_list = [line.strip() for line in response.text.split('\n') 
                      if line.strip() and not line.startswith('#')]
            
            # 保存到缓存
            with open(cache_file, 'w') as f:
                f.write('\n'.join(ip_list))
            
            return ip_list
        else:
            # 如果下载失败但缓存存在，使用缓存
            if os.path.exists(cache_file):
                with open(cache_file, 'r') as f:
                    return [line.strip() for line in f if line.strip() and not line.startswith('#')]
            else:
                return []
    except Exception as e:
        print(f"获取中国 IP 列表出错: {e}")
        if os.path.exists(cache_file):
            with open(cache_file, 'r') as f:
                return [line.strip() for line in f if line.strip() and not line.startswith('#')]
        return []

def is_cn_ip(ip):
    """检查 IP 是否在中国网络列表中"""
    global cn_networks, cn_networks_v6
    
    try:
        ip_obj = ipaddress.ip_address(ip)
        
        if ip_obj.version == 4:
            # 首次运行时加载中国 IPv4 列表
            if not cn_networks:
                china_ip_list = get_china_ip_list(is_ipv6=False)
                cn_networks = [ipaddress.ip_network(net) for net in china_ip_list]
            
            # 检查 IP 是否在任何网络中
            for net in cn_networks:
                if ip_obj in net:
                    return True
        elif ip_obj.version == 6:
            # 首次运行时加载中国 IPv6 列表
            if not cn_networks_v6:
                china_ipv6_list = get_china_ip_list(is_ipv6=True)
                cn_networks_v6 = [ipaddress.ip_network(net) for net in china_ipv6_list]
            
            # 检查 IP 是否在任何网络中
            for net in cn_networks_v6:
                if ip_obj in net:
                    return True
        
        return False
    except:
        return False

def forward_dns_request(domain, dns_server, record_type):
    """将 DNS 请求转发到指定的 DNS 服务器"""
    try:
        # 正确使用 dnslib 创建 DNS 请求
        from dnslib import DNSQuestion
        record = DNSRecord()
        record.add_question(DNSQuestion(domain, getattr(QTYPE, record_type)))
        packet = record.pack()
        
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.settimeout(TIMEOUT)
        
        s.sendto(packet, (dns_server, 53))
        reply_packet, _ = s.recvfrom(MAX_PACKET_SIZE)
        
        reply = DNSRecord.parse(reply_packet)
        return reply
    except Exception as e:
        print(f"查询 {dns_server} 关于 {domain} ({record_type}) 出错: {e}")
        return None

def handle_dns_request(data, client_addr, sock):
    """处理 DNS 请求并返回智能选择的响应"""
    try:
        request = DNSRecord.parse(data)
        qname = str(request.q.qname)
        qtype = QTYPE[request.q.qtype]
        
        print(f"查询: {qname} ({qtype}) 来自 {client_addr[0]}")
        
        # 仅对 A 和 AAAA 记录使用智能逻辑
        if qtype in ('A', 'AAAA'):
            # 查询两个 DNS 服务器
            cn_reply = forward_dns_request(qname, CHINA_DNS, qtype)
            foreign_reply = forward_dns_request(qname, FOREIGN_DNS, qtype)
            
            # 提取回复中的 IP
            cn_ips = []
            if cn_reply and cn_reply.rr:
                cn_ips = [str(rr.rdata) for rr in cn_reply.rr if rr.rtype == getattr(QTYPE, qtype)]
            
            foreign_ips = []
            if foreign_reply and foreign_reply.rr:
                foreign_ips = [str(rr.rdata) for rr in foreign_reply.rr if rr.rtype == getattr(QTYPE, qtype)]
            
            # 检查任何 CN IP 是否实际在中国
            cn_ips_in_china = [ip for ip in cn_ips if is_cn_ip(ip)]
            
            # 智能选择逻辑
            if cn_ips_in_china:
                # 中国 DNS 返回了中国 IP，使用 CN 结果
                print(f"为 {qname} ({qtype}) 使用中国 DNS")
                response = cn_reply
            elif foreign_ips:
                # 中国 DNS 没有返回中国 IP，但国外 DNS 有结果
                # 可能是 DNS 污染，使用国外结果
                print(f"为 {qname} ({qtype}) 使用国外 DNS")
                response = foreign_reply
            elif cn_ips:
                # 默认使用中国 DNS 结果（如果有）
                print(f"默认为 {qname} ({qtype}) 使用中国 DNS")
                response = cn_reply
            else:
                # 如果以上都失败，使用国外 DNS 或空响应
                print(f"默认为 {qname} ({qtype}) 使用国外 DNS")
                response = foreign_reply if foreign_reply else cn_reply
        else:
            # 对于非 A/AAAA 记录，直接转发到国外 DNS
            response = forward_dns_request(qname, FOREIGN_DNS, qtype)
            if not response:
                response = forward_dns_request(qname, CHINA_DNS, qtype)
        
        # 如果有响应，发送它
        if response:
            # 保持原始请求 ID
            response.header.id = request.header.id
            sock.sendto(response.pack(), client_addr)
        else:
            # 创建空响应
            response = DNSRecord(DNSHeader(id=request.header.id, qr=1, aa=0, ra=1), q=request.q)
            sock.sendto(response.pack(), client_addr)
            
    except Exception as e:
        print(f"处理请求出错: {e}")

def run_dns_server():
    """运行 DNS 服务器"""
    # 尝试 UDP 和 TCP 套接字
    try:
        # 用于 DNS 的 UDP 套接字
        udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        udp_sock.bind((LISTEN_IP, LISTEN_PORT))
        print(f"DNS 服务器监听 {LISTEN_IP}:{LISTEN_PORT} (UDP)")
        
        while True:
            data, addr = udp_sock.recvfrom(MAX_PACKET_SIZE)
            threading.Thread(target=handle_dns_request, args=(data, addr, udp_sock)).start()
            
    except Exception as e:
        print(f"服务器错误: {e}")
    finally:
        if 'udp_sock' in locals():
            udp_sock.close()

if __name__ == "__main__":
    # 加载库
    try:
        import dnslib
    except ImportError:
        print("安装必要的包...")
        import pip
        pip.main(['install', 'dnslib', 'requests'])
        import dnslib
    
    print("启动智能 DNS 服务器...")
    run_dns_server()

3. 配置 Windows 使用本地 DNS

打开“控制面板” -> “网络和共享中心” -> “更改适配器设置”
右键点击当前使用的网络连接，选择“属性”
双击“Internet 协议版本 4 (TCP/IPv4)”
选择“使用下面的 DNS 服务器地址”，并输入 127.0.0.1

4. 启动 DNS 服务器

在命令提示符中运行：

1	python C:\SmartDNS\standalone_smart_dns.py

结论

通过本文介绍的方案，用户可以在本地搭建一个轻量级的智能 DNS 分流系统，有效解决 DNS 污染问题，并根据网络环境自动选择最佳的解析结果。该方案不仅适用于 macOS，也可以在 Windows 系统上实现，提供了灵活的跨平台解决方案。

3. Create a Windows Service

To run a Python script as a Windows service, we need to use the NSSM (Non-Sucking Service Manager) tool:

Download NSSM
Extract it to a suitable directory, such as C:\SmartDNS\nssm
Run the command prompt as an administrator, then execute:
1
2
cd C:\SmartDNS\nssm\win64
nssm.exe install SmartDNS
In the configuration window that appears, set:
- Path: C:\Windows\System32\python.exe
- Startup directory: C:\SmartDNS
- Arguments: C:\SmartDNS\standalone_smart_dns.py
- Set the service display name and description in the “Details” tab
- Choose “Local System account” in the “Log on” tab
- Click “Install service”

Note: In Windows, port 53 is a privileged port, and the Python script needs to run with administrator privileges to bind to this port. The above service setup using “Local System account” can solve this issue. If you still encounter permission issues, consider changing the listening port in the script to a non-privileged port (such as 5353), and then use firewall rules to forward requests from UDP port 53 to 5353.

Start the service:
1
sc start SmartDNS

4. Configure the System to Use Local DNS

Open Control Panel > Network and Internet > Network and Sharing Center
Click on the current active network connection
Click “Properties”
Select “Internet Protocol Version 4 (TCP/IPv4)” and click “Properties”
Choose “Use the following DNS server addresses”
Set the preferred DNS server to “127.0.0.1”
Click “OK” to save the settings

5. Verification Test

Execute in the command prompt:

1 2	nslookup baidu.com 127.0.0.1 nslookup google.com 127.0.0.1

If the settings are correct, the domain names should resolve correctly.

6. Restore Default Settings

If you need to restore the default settings:

Stop the DNS Service:
1
2
sc stop SmartDNS
sc delete SmartDNS
Restore Default DNS Settings:
- Open Control Panel > Network and Internet > Network and Sharing Center
- Click on the current active network connection
- Click “Properties”
- Select “Internet Protocol Version 4 (TCP/IPv4)” and click “Properties”
- Choose “Obtain DNS server address automatically”
- Click “OK” to save the settings

Conclusion

The combination of local anti-pollution DNS and Layer 3 Tunnel provides us with an elegant solution that avoids DNS pollution issues while ensuring the best speed for accessing both domestic and international websites. This solution is particularly suitable for users who need to access both domestic and international resources simultaneously.

When you use both local anti-pollution DNS and a Layer 3 tunnel (configured with regional routing), you will gain the following advantages:

Pollution-free Resolution: All domain names can obtain the correct IP address
Efficient Access:
- DNS queries for domestic websites go directly through the local network, obtaining the IP most suitable for your network environment
- DNS queries for international websites go through the tunnel, avoiding pollution and obtaining CDN IPs close to the tunnel exit
- Direct connection for domestic websites, fast speed
- Tunnel connection for international websites, stable and reliable
Fully Automatic Traffic Splitting: The system automatically determines which route to take, without the need to manually switch DNS or proxies

2025-04-27

My Translation "Illustrated Large Models - Principles and Practice of Generative AI" is Now Available

[Thank you to all the readers who sent in over 50 corrections! The readers are really meticulous, finding so many errors, and I am very grateful for the corrections!]

My translation “Illustrated Large Models - Principles and Practice of Generative AI” (Hands-On Large Language Models) was released in May 2025. You can search for “Illustrated Large Models” on platforms like JD.com and Taobao.

Praise for the Book (Chinese Edition)

Many thanks to Yuan Jinhui, founder of Silicon Flow, Zhou Lidong, director of Microsoft Research Asia, Lin Junyang, head of Alibaba Qwen Algorithm, Li Guohao, founder of CAMEL-AI.org community, and Zhong Tai, founder of AgentUniverse, for their wholehearted recommendations!

Translator’s Preface

The development of large models is rapid, as the saying goes, “AI advances in a day, while the world takes a year.” Many people find themselves lost in the flourishing garden of models, unsure of which model to use for their application scenarios and unable to predict the development direction of models in the coming year, often feeling anxious. In fact, almost all large models today are based on the Transformer architecture, with all changes stemming from the same origin.

This book, “Illustrated Large Models,” is an excellent resource to help you systematically understand the basic principles and capability boundaries of Transformers and large models. When Turing Company approached me to translate this book, I immediately agreed upon seeing the author’s name, as it was Jay Alammar’s blog post “The Illustrated Transformer” that truly helped me understand Transformers (Chapter 3 of this book is an expansion of that blog post). Although there are countless books and articles explaining large models on the market today, the exquisite illustrations and the depth of explanation in this book are rare. The book starts with tokens and embeddings, not limited to generative models, but also includes representation models that many overlook. Additionally, the book covers practical content such as text classification, text clustering, prompt engineering, RAG, and model fine-tuning.

I am very honored to be the translator of this book, working with editor Liu Meiying to bring this book to Chinese readers.

Taking some time to read this book and systematically understanding the basic principles and capability boundaries of Transformers and large models is like having a map and compass on an adventure journey through large models. This way, we won’t worry about newly released models rendering long-term engineering accumulation useless overnight, and we can develop products for future models. Once the model capabilities are ready, the product can scale up immediately.

I hope this book can become a sightseeing bus in the garden of large models, allowing more people to see the full view of large models. Thus, the ever-expanding capability boundaries of large models become a visual feast rather than a monster devouring everything; we have the opportunity to stand at the forefront of AI, realize more dreams, and gain more freedom.

RSS

Bojie Li