2023-01-29
Five Years of PhD at MSRA (Part 2) - Leading My First SOSP Paper

Long read warning: Part two of the “Five Years of PhD at MSRA” series, about 13,000 words, to be continued…

KV-Direct, published at SOSP 2017, was my second first-author paper. Since Dr. Tan had guided me step by step through the first one, the SIGCOMM paper ClickNP, KV-Direct was also the first paper I led on my own.

What to Do After SIGCOMM

After we submitted the SIGCOMM paper, Dr. Tan said that I would need to come up with the direction of the next project on my own.

Compiler or Application?

We were well aware that ClickNP still had many issues; in particular, its support for compiler optimizations was too simple. We hoped to improve the compiler’s reliability from a programming-languages perspective. At the same time, we used ClickNP as a common platform for network research within our group to incubate more research ideas.

Naturally, I explored two directions: one was to extend ClickNP to make it easier to program and more efficient; the other was to use the ClickNP platform to develop new types of network functions that accelerate various kinds of middleware in the network. At that time, we were exploring many kinds of middleware in parallel, such as encryption/decryption, machine learning, message queues, layer 7 (HTTP) load balancers, and key-value stores, all of which could be accelerated with FPGAs.

To improve the programmability of ClickNP, I started looking for talented students from my university to intern at MSRA. Yi Li had been interested in programming languages and formal methods as an undergraduate, and he was the first student I recommended for an MSRA internship. He arrived at the start of the spring semester, while he was also completing his undergraduate thesis. He proposed several key optimizations for the ClickNP system, added some syntax to simplify programming, and fixed some awkward syntax.

However, because of the workload, we did not overhaul the compilation framework. It still used simple syntax-directed translation, with neither a professional compiler framework like clang nor an intermediate language. As a result, each new compilation optimization we added felt rather ad hoc.

After running into various strange issues with OpenCL, I had the idea of building a high-level synthesis (HLS) tool myself that would generate Verilog directly from OpenCL. The idea was simple: for applications in the networking domain, what we do is unroll all the loops in a piece of C code into one large block of combinational logic. Inserting registers at appropriate positions turns it into a pipeline with extremely high throughput, capable of processing one input every clock cycle. If the code accesses global state, the resulting loop-carried dependencies limit how many registers can be placed on the dependency path, which in turn caps the clock frequency.
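
The frequency limit described above can be illustrated with a minimal Python sketch. It is a toy timing model under stated assumptions, not how ClickNP or any real HLS tool works: the stateless logic is assumed to split evenly into pipeline stages, the read-modify-write path on global state is assumed to have to finish within one cycle so that one input can be accepted per cycle, and all delay numbers and function names are hypothetical.

```python
def split_evenly(total_delay_ns, stages):
    """Assume the stateless logic can be cut into equally slow stages."""
    return [total_delay_ns / stages] * stages


def max_clock_mhz(stage_delays_ns, feedback_delay_ns):
    """Highest clock (MHz) that still accepts one input per cycle.

    stage_delays_ns: combinational delay of each stateless stage.
    feedback_delay_ns: delay of the path that reads and updates global
    state; in this model it cannot be split by extra registers.
    """
    critical_ns = max(max(stage_delays_ns), feedback_delay_ns)
    return 1000.0 / critical_ns


# Hypothetical design: 40 ns of stateless logic, 4 ns state-update path.
STATELESS_NS = 40.0
FEEDBACK_NS = 4.0

for stages in (1, 4, 10, 20):
    mhz = max_clock_mhz(split_evenly(STATELESS_NS, stages), FEEDBACK_NS)
    print(f"{stages:2d} stages -> {mhz:.0f} MHz")
```

In this model, adding registers raises the clock only until the stateless stages become as fast as the state-update path; beyond that point the dependency path alone sets the limit.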

However, Dr. Tan disagreed with my idea of building an HLS tool ourselves, because we were not professional FPGA researchers. Such work lacked innovation; it was more about filling the “gaps” in existing HLS tools, an engineering problem that would be difficult to publish at top-tier venues in either the FPGA or the networking field.

Because of frequent issues with programming the FPGA cards, I ended up plugging and unplugging FPGA cards in the server room every day, sometimes debugging on-site. So, just as in my undergraduate days in the Youth Class College’s server room, I often spent hours in the server room, enduring the cold air and noise of over 80 decibels.


2023-01-23
Five Years of PhD at MSRA (Part I): From Novice to First SIGCOMM Paper

Long article warning: The first in the “Five Years of PhD at MSRA” series, about 12,000 words, to be continued…

On July 31, 2021, at the ACM Turing Conference in China, I was standing on the podium waiting for the ACM China Outstanding Doctoral Dissertation Award. I didn’t expect that the person who came up to present the award to me was President Bao, and my legs involuntarily trembled a bit. This was the only time I had seen President Bao up close. President Bao happily said that seeing one of us from USTC among the award winners shows that USTC can also cultivate masters. He hoped that we could become masters in the future, serve our motherland, and return to our alma mater.

The host of the award ceremony, Professor Liu Yunhao, asked us to state the title of our doctoral dissertation and name our advisors. I blurted out: “High-Performance Data Center Systems Based on Programmable Network Cards”; my advisors are Professor Chen Enhong from USTC and Dr. Zhang Lintao from Microsoft, and I would like to give special thanks to Dr. Tan Kun from Huawei. I can clearly remember the title of my doctoral dissertation; it is posted on my homepage. At the company, people often send me private messages asking whether I am the author of a certain paper. I shyly say, yes…

Many people may think that I was the kind of PhD student who focused solely on studying, but my PhD life was actually much more interesting than many people imagine, truly embodying the MSRA (Microsoft Research Asia) motto “Work hard, play harder”.

Research Novice

Joint Training

MSRA (Microsoft Research Asia) has joint PhD training programs with many universities in China. Among them, the joint training program with USTC has been ongoing for many years. In the second semester of my junior year, MSRA interviewed dozens of candidates at our school, selected about a dozen students for summer internships and a year-long internship in their senior year, and after the summer internship, about 7 students were confirmed to become joint training PhDs. These joint training PhDs will complete their first year of master’s and doctoral courses at USTC, and the next four years will be spent on academic research at MSRA in Beijing, finally obtaining a PhD degree from USTC.

MSRA selects joint training PhDs by the so-called “three goods”: good at math, good at programming, and good attitude. This rule is said to have been set by the former head of MSRA, Dr. Shen Xiangyang. Because I spent all day tinkering with various Linux network services in the Youth Class College computer room and the LUG activity room during my undergraduate studies, I did not study much, and naturally my grades were not very good. My GPA was only 3.4 (out of 4.3), and I even failed Calculus II. The interviewer asked me why my math grades were so poor. Probably because I had won awards in programming competitions (NOI) in high school, and my resume listed many network service projects I had worked on at LUG, I was admitted to the joint training PhD program, to my surprise. The other students admitted to the program had GPAs of at least 3.7, and most were top students with 3.8 or above.


2023-01-23
Interview Program of the Primary School Hua Cup Gold Medal

I found a VCD in a pile of old papers; it was given to me by the Shijiazhuang TV station in 2004. After restoration and transcoding, the interview program “Superstar Li Bojie - Remembering the Hua Luogeng Gold Cup Gold Medalist”, broadcast 19 years ago, finally sees the light of day again.

From this 13 and a half minute video, you can see how fat I was back then :) The sports question that was publicly revealed starts at 11:25 in the video :)


2023-01-22
Wedding Invitation @ Cui Ping Shan Hotel, Shijiazhuang

Wedding Invitation

Time: May 1, 2023, 10:58

Location: Cui Ping Shan Hotel, Hebei

Transportation Information: Cui Ping Shan Hotel is located at No. 1 Yingbin Road, Luquan District, Shijiazhuang City, Hebei Province.

  • As Cui Ping Shan Hotel is located in the western suburbs and is not accessible by subway, public transportation is inconvenient. It is recommended to take a taxi.
  • High-speed rail:
    • By car: The nearest route from Shijiazhuang High-speed Rail Station is 16 kilometers, and the elevated route is 22 kilometers. It takes about 35 minutes by car without traffic.
    • Public transportation: You can take bus 320 / air 320 directly (need to walk 1.3 kilometers), which takes 1 hour and 20 minutes; or take subway line 3 to subway line 1 to tourist bus 5, which takes 1 hour and 10 minutes.
    • The taxi queue at Shijiazhuang High-speed Rail Station is very long after 22:00. If you arrive late, it is recommended to contact us in advance for pick-up.
  • Airplane:
    • By car: It is 53 kilometers from Shijiazhuang Zhengding International Airport, and it takes about 50 minutes by car without traffic.
    • Public transportation: From Zhengding Airport, you can take Airport Bus Line 1 (one bus per hour) to Subway Line 1 to Tourist Bus 5, which takes 2 hours and 10 minutes.
    • It is inconvenient to take a taxi at Zhengding Airport at night. If you arrive late, it is recommended to contact us in advance for pick-up.
  • As the wedding officially begins at 10:58, it is recommended to arrive in Shijiazhuang on April 30. Those departing from Beijing who are short on time can also consider taking the early high-speed rail on May 1 (5 departures from 06:26 to 08:34).

Accommodation Information:

  • Try to arrange to stay in Building 6 and Building 9 of Cui Ping Shan Hotel, Hebei, where rooms have been reserved. If there are special circumstances, we will arrange nearby hotels.
  • Breakfast is expected to be in Building 6, from 7:00 to 10:00. Bridesmaids, groomsmen, and staff need to leave early and will not have time for breakfast, so a simple meal will be arranged in Buildings 6 and 9.
  • The distance between Building 6 and Building 9 is 560 meters, and it takes 8 minutes to walk.


2022-12-13
Sensing and Intuition, Thinking and Feeling

There’s a classic joke where a student chose a course called “Choices and Future”, only to find out in the classroom that it was about “Options and Futures”, because their English names are both Options and Futures. A few days ago, the hotel where I had a meeting was right across from the Shanghai Futures Exchange, which made me think of a question: What are our judgments and choices about the future based on?

Recently, I read two books, “Gifts Differing” and “How NASA Builds Teams”, and found that the answer reflects differences in how people think. Sensing versus iNtuition and Thinking versus Feeling are the two most critical dimensions of difference.

Before the main text, you might want to think about the differences in the characters of Sun Wukong, Zhu Bajie, Tang Monk, and Sha Monk in “Journey to the West”, and how they work together as a team?


2022-12-12
The New Golden Age of Computer Networks

Thanks to Professor Xu Chenren and Professor Huang Qun for the invitation, I am very honored to have given a guest lecture for the Computer Network course at Peking University on December 12, 2022.

Abstract: Data center networks, wide area networks, and wireless networks provide the communication cornerstone for the intelligent world of Internet of Things.

Data center networks have traditionally been designed for easily parallelizable web services. But today, AI, big data, HPC are all large-scale heterogeneous parallel computing systems, which have high requirements for communication performance. The heavy software stack causes huge overhead, which requires the communication semantics of data center networks to evolve from byte streams to memory semantics including message semantics, synchronous and asynchronous remote memory access, RPC, and to achieve extreme latency and bandwidth with a combination of software and hardware. In the future, we hope to treat the data center as a computer, on the one hand, to achieve peer-to-peer direct access between heterogeneous computing and storage devices, making the data center interconnection as high-performance as the internal bus of the host; on the other hand, to make distributed system programming as convenient as single-machine programming through Serverless.

Applications such as large-scale live streaming, short video on demand, and real-time audio and video communication pose new challenges to the stability of wide area network transmission. Internet giants have built their own global acceleration networks and designed new transport protocols such as QUIC to achieve a high-quality user experience. In addition, because energy is cheap in the western part of our country, “East Data, West Computing” has become a national strategy. Through Regionless scheduling, we can achieve a “nationwide integrated large data center”.

Seamless collaboration among intelligent terminals such as mobile phones, PCs, wearable devices, smart homes, and smart cars, as well as industrial Internet applications such as 5G to B, all require stable low latency and high bandwidth. This calls for wireless protocol stack optimization, and even wireless memory semantics to support Gbps-level bandwidth. In addition, the “distributed super terminal” programming framework of HarmonyOS can enable tighter distributed collaboration and seamless flow of data and services.

Download Slides PDF (Updated on 2022-12-15)

Download Slides PPTX (Updated on 2022-12-15)

Full text of the speech:


2022-12-10
First Experience with ChatGPT

Recently, everyone has been playing with ChatGPT, which is really impressive. Although it’s not omnipotent, it’s the first AI dialogue system that doesn’t feel like an artificial idiot to me. It handles difficult problems such as resolving references and remembering context very well. Especially on programming problems, it is sometimes more useful than StackOverflow. If my candidates performed like this, I would definitely prioritize hiring them.

The main shortcomings of ChatGPT currently are:

  1. The knowledge base is not updated often enough and its coverage is not comprehensive. It cannot answer questions about recent events or more obscure knowledge. One suggestion is to combine it with a search engine or knowledge graph: first use the prompt to search for some results, and then use NLP methods to integrate the search results. It is said that some research teams are already working in this direction.
  2. It lacks logical reasoning ability. It easily gets slightly complex logic wrong, and it answers with a straight face even when it is wrong. How to solve arbitrarily complex logical problems is a big challenge. It’s even harder to recognize answers that seem correct but are actually absurd.
  3. Currently, it only supports text and does not support multimodal. Now you can let ChatGPT generate prompts, and then input them into DALL-E to generate images. In the future, generative models that support multimodal input and multimodal output will make human-computer interaction more natural and may become the next generation of human-computer interaction paradigm.
  4. The cost of a single answer is currently high, requiring several cents, significantly higher than the cost of a Google search. If the cost can be reduced through algorithm or hardware improvements, or if new business models can be created by combining with recommendations and advertisements, there will be room for commercial profit.

This year can be called the “first year” of AI-generated content. A few months ago, we were all shocked by diffusion-based image generation (DALL-E 2) in the CV field, and now ChatGPT has set a new SOTA for NLP. DALL-E 2 and ChatGPT both come from OpenAI, and the financial backer behind OpenAI is Microsoft, so this can be considered an important game that Microsoft has won back in the AI field. In previous years, it was always Google DeepMind’s Alpha series that stole the limelight, from Go to proteins and matrix calculations.

An intelligent assistant that can communicate naturally with people is a scene in countless science fiction movies, and it is also a vision that major companies set 20 years ago. Today, we finally see the dawn of it becoming reality. Intelligent assistants may give birth to the next trillion-dollar industry, just as mobile internet overturned PC internet and video overturned text, becoming a new paradigm of human-computer interaction and profoundly changing how people work and live.

Below are some examples I tried in ChatGPT:


2022-12-10
What is hindering domestic teams from researching products like ChatGPT?

Firstly, it is the scale of business. For geographical and cultural reasons, most domestic companies have difficulty going global and mainly serve the domestic market, which is much smaller than the European and American markets. The same is true for public clouds: the revenue and market value of AWS, Azure, and Google Cloud in the European and American markets are higher than those of Alibaba, Tencent, and Huawei Cloud in China. Since development cost can largely be amortized over a larger market, American companies can pay developers a higher average salary than Chinese companies can and so hire relatively more top talent; they can also generate more profit to support relatively long-term research, as at OpenAI, DeepMind, and Microsoft Research. Breakthrough innovations like ChatGPT rarely come from product departments with intense development rhythms; they usually come from research departments without much short-term pressure to monetize.


2022-09-03
Marriage Certificate Photos @Fengtai District Civil Affairs Bureau, Beijing

Text content to be supplemented, let’s put out a few photos first~

Click here to see the marriage certificate photos


2022-07-27
Introduction to the Business of Computer Network & Protocol Laboratory & Distributed and Parallel Software Laboratory

Computer Network & Protocol Laboratory

Huawei’s Computer Network & Protocol Laboratory is part of the Distributed and Parallel Software Laboratory, under the Central Software Institute of Huawei’s 2012 Laboratories, with locations in Beijing, Shanghai, Hangzhou, Shenzhen, and Tel Aviv, Israel.

Vision: Stay rooted in building foundations, and lead the future of distributed communication through innovation

Positioning: Huawei’s software engine in the field of computer network and protocol technology, covering theoretical breakthroughs, technological inventions, technological innovations, and quality delivery. Standing at the forefront of this technical field, we research and break through world-class technical problems in computing native networks and wide-area network deterministic communication, build an industry-leading full stack of distributed communication, and work with ICT, terminal, cloud, intelligent car and other main product teams to build differentiated communication competitiveness, gradually grow the industrial ecosystem, and help business success.

Team: A high-level innovation team that works as a mixed “special forces” unit of top industry experts, “genius youth” recruits, PhDs, and engineers, together with overseas teams. Its research results are significant: since 2018, 5 papers have been accepted by SIGCOMM, the top global conference on network communication, and key technologies have been selected into Huawei’s top 10 inventions for three consecutive sessions.
