Bojie Li
2023-01-23
Long article warning: The first in the “Five Years of PhD at MSRA” series, about 12,000 words, to be continued…
On July 31, 2021, at the ACM Turing Conference in China, I was standing on the podium waiting for the ACM China Outstanding Doctoral Dissertation Award. I didn’t expect that the person who came up to present the award to me was President Bao, and my legs involuntarily trembled a bit. This was the only time I had seen President Bao up close. President Bao happily said that seeing one of us from USTC among the award winners shows that USTC can also cultivate masters. He hoped that we could become masters in the future, serve our motherland, and return to our alma mater.
The host of the award ceremony, Professor Liu Yunhao, asked us to state the titles of our doctoral dissertations and name our advisors. I blurted out: “High-Performance Data Center Systems Based on Programmable Network Cards”; my advisors are Professor Chen Enhong from USTC and Dr. Zhang Lintao from Microsoft, and I would like to give special thanks to Dr. Tan Kun from Huawei. I remember the title of my dissertation clearly because it is posted on my homepage. At the company, people often send me private messages asking if I am the author of a certain paper. I shyly say, yes…
Many people may think that I was the kind of PhD student who is solely focused on studying, but my PhD life was actually much more interesting than many people imagine, truly embodying the MSRA (Microsoft Research Asia) motto “Work hard, play harder”.
Research Novice
Joint Training
MSRA (Microsoft Research Asia) has joint PhD training programs with many universities in China. Among them, the joint training program with USTC has been ongoing for many years. In the second semester of my junior year, MSRA interviewed dozens of candidates at our school, selected about a dozen students for summer internships and a year-long internship in their senior year, and after the summer internship, about 7 students were confirmed to become joint training PhDs. These joint training PhDs will complete their first year of master’s and doctoral courses at USTC, and the next four years will be spent on academic research at MSRA in Beijing, finally obtaining a PhD degree from USTC.
The requirements for MSRA to select joint training PhDs are the so-called “three good” students: good at math, good at programming, and good attitude. This rule is said to have been set by the former dean, Dr. Shen Xiangyang. Because I spent all day tinkering with various Linux network services in the Youth Class College computer room and LUG activity room during my undergraduate studies, I didn’t study very well, and naturally my grades weren’t very good. My GPA was only 3.4 (out of 4.3), and I even failed Calculus II. The interviewer asked me at the time why my math grades were so poor. Probably because I had won awards in programming competitions (NOI) in high school, and my resume had many network service projects I worked on at LUG, I was surprisingly admitted to the joint training PhD program. The GPAs of other students admitted to the joint training program were at least 3.7, and most of them were top students with 3.8 or above.
2023-01-23
I found a VCD in a pile of old papers, which was given to me by the Shijiazhuang TV station in 2004. After restoration and transcoding, the interview program “Superstar Li Bojie - Remembering the Hua Luogeng Gold Cup Gold Medalist”, broadcast 19 years ago, finally sees the light of day again.
From this 13 and a half minute video, you can see how fat I was back then :) The sports question that was publicly revealed starts at 11:25 in the video :)
2023-01-22
Time: May 1, 2023, 10:58
Location: Cui Ping Shan Hotel, Hebei
Transportation Information: Cui Ping Shan Hotel is located at No. 1 Yingbin Road, Luquan District, Shijiazhuang City, Hebei Province.
- As Cui Ping Shan Hotel is located in the western suburbs and is not accessible by subway, public transportation is inconvenient. It is recommended to take a taxi.
- High-speed rail:
  - By car: The nearest route from Shijiazhuang High-speed Rail Station is 16 kilometers, and the elevated route is 22 kilometers. It takes about 35 minutes by car without traffic.
  - Public transportation: You can take bus 320 / air 320 directly (need to walk 1.3 kilometers), which takes 1 hour and 20 minutes; or take subway line 3 to subway line 1 to tourist bus 5, which takes 1 hour and 10 minutes.
  - The taxi queue at Shijiazhuang High-speed Rail Station is very long after 22:00. If you arrive late, it is recommended to contact us in advance for pick-up.
- Airplane:
  - By car: It is 53 kilometers from Shijiazhuang Zhengding International Airport, and it takes about 50 minutes by car without traffic.
  - Public transportation: From Zhengding Airport, you can take Airport Bus Line 1 (one bus per hour) to Subway Line 1 to Tourist Bus 5, which takes 2 hours and 10 minutes.
  - It is inconvenient to take a taxi at Zhengding Airport at night. If you arrive late, it is recommended to contact us in advance for pick-up.
- As the wedding officially begins at 10:58, it is recommended to arrive in Shijiazhuang on April 30. Those departing from Beijing who are short on time can also consider taking the early high-speed rail on May 1 (5 departures from 06:26 to 08:34).
Accommodation Information:
- Try to arrange to stay in Building 6 and Building 9 of Cui Ping Shan Hotel, Hebei, where rooms have been reserved. If there are special circumstances, we will arrange nearby hotels.
- Breakfast is expected to be in Building 6, from 7:00 to 10:00. Bridesmaids, groomsmen, and staff need to leave early and will not have time for breakfast, so a simple meal will be arranged in Buildings 6 and 9.
- The distance between Building 6 and Building 9 is 560 meters, and it takes 8 minutes to walk.
2022-12-13
There’s a classic joke in which a student signs up for a course thinking it is about “Choices and the Future”, only to find out in the classroom that it is a finance course on options and futures contracts, because the course’s English name, “Options and Futures”, can be read either way. A few days ago, the hotel where I had a meeting was right across from the Shanghai Futures Exchange, which made me think of a question: What are our judgments and choices about the future based on?
Recently, I read two books, “Gifts Differing” and “How NASA Builds Teams”, and found that this reflects the differences in people’s ways of thinking. Sensing and iNtuition, Thinking and Feeling are the two most critical differences.
Before the main text, you might want to think about the differences in the characters of Sun Wukong, Zhu Bajie, Tang Monk, and Sha Monk in “Journey to the West”, and how they work together as a team?
2022-12-12
Thanks to Professor Xu Chenren and Professor Huang Qun for the invitation, I am very honored to have given a guest lecture for the Computer Network course at Peking University on December 12, 2022.
Abstract: Data center networks, wide area networks, and wireless networks provide the communication cornerstone for the intelligent world of Internet of Things.
Data center networks have traditionally been designed for easily parallelizable web services. But today, AI, big data, HPC are all large-scale heterogeneous parallel computing systems, which have high requirements for communication performance. The heavy software stack causes huge overhead, which requires the communication semantics of data center networks to evolve from byte streams to memory semantics including message semantics, synchronous and asynchronous remote memory access, RPC, and to achieve extreme latency and bandwidth with a combination of software and hardware. In the future, we hope to treat the data center as a computer, on the one hand, to achieve peer-to-peer direct access between heterogeneous computing and storage devices, making the data center interconnection as high-performance as the internal bus of the host; on the other hand, to make distributed system programming as convenient as single-machine programming through Serverless.
Applications such as large-scale live streaming, short video on demand, and real-time audio and video communication pose new challenges to the stability of wide area network transmission. Internet giants have built their own global acceleration networks and designed new transport protocols such as QUIC to achieve a high-quality user experience. In addition, because energy costs are low in the western part of our country, “East Data, West Computing” (processing data from the east in data centers in the west) has become a national strategy. Through Regionless scheduling, we can achieve a “nationwide integrated large data center”.
Seamless collaboration of intelligent terminals such as mobile phones, PCs, wearable devices, smart homes, smart cars, and industrial Internet applications such as 5G to B all require stable low latency and high bandwidth, which requires wireless protocol stack optimization, and even wireless memory semantics to support Gbps-level bandwidth. In addition, through the “distributed super terminal” programming framework of HarmonyOS, more closely distributed collaboration can be enabled to achieve seamless data and service flow.
Download Slides PDF (Updated on 2022-12-15)
Download Slides PPTX (Updated on 2022-12-15)
Full text of the speech:
2022-12-10
Recently, everyone has been playing with ChatGPT, which is really impressive. Although it’s not omnipotent, it’s the first AI dialogue system that doesn’t feel like an artificial idiot to me. It handles difficult problems such as resolving references and remembering context very well. Especially on programming problems, it is sometimes more useful than StackOverflow. If my candidates performed like this, I would definitely prioritize hiring them.
The main shortcomings of ChatGPT currently are:
- The knowledge base is not updated enough and the coverage is not comprehensive. It cannot answer recent events or more obscure knowledge. It is suggested to combine it with a search engine or knowledge graph, first use the prompt word to search for some results, and then use NLP methods to integrate the search results. It is said that some research teams are already working in this direction.
- Lack of logical reasoning ability. It easily gets slightly complex logic wrong, and it answers with a straight face even when it is wrong. How to solve arbitrarily complex logical problems is a big challenge. It’s even harder to recognize answers that seem correct but are actually absurd.
- Currently, it only supports text and does not support multimodal input or output. For now, you can let ChatGPT generate prompts and then feed them into DALL-E to generate images. In the future, generative models that support multimodal input and multimodal output will make human-computer interaction more natural and may become the next generation of human-computer interaction paradigm.
- The cost of a single answer is currently high, requiring several cents, significantly higher than the cost of a Google search. If the cost can be reduced through algorithm or hardware improvements, or if new business models can be created by combining with recommendations and advertisements, there will be room for commercial profit.
This year can be called the “first year” of AI-generated content. A few months ago, we were all shocked by diffusion-based image generation (DALL-E 2, Stable Diffusion) in the CV field, and now ChatGPT has set a new SOTA for NLP. DALL-E 2 and ChatGPT were both built by OpenAI, and the financial backer behind OpenAI is Microsoft, so this can be considered an important round that Microsoft has won back in the AI field. In previous years, it was always Google DeepMind’s Alpha series that stole the limelight, from Go to proteins and matrix calculations.
The intelligent assistant that can communicate naturally with people is a scene in countless science fiction movies, and it is also a vision set by major companies 20 years ago. Today, we finally see the dawn of becoming a reality. Intelligent assistants may give birth to the next trillion-dollar industry, just like mobile internet has overturned PC internet and video has overturned text, becoming a new paradigm of human-computer interaction and profoundly changing human work and life.
Below are some examples I tried in ChatGPT:
2022-12-10
The first factor is the scale of business. For geographical and cultural reasons, most domestic companies find it difficult to go global and operate mainly in the domestic market, which is much smaller than the European and American markets. The same is true for public clouds: the revenue and market value of AWS, Azure, and Google Cloud in the European and American markets are higher than those of Alibaba, Tencent, and Huawei Cloud in China. Since development costs can largely be shared across a larger market, American companies can pay developers a higher average salary and therefore hire relatively more excellent talent; they can also generate more profit to support relatively long-term research, such as OpenAI, DeepMind, and Microsoft Research. Breakthrough innovations like ChatGPT rarely come from product departments with intense development rhythms; they usually come from research departments without much short-term pressure to monetize.
2022-09-03
Text content to be supplemented, let’s put out a few photos first~
2022-07-27
Computer Network & Protocol Laboratory
Huawei’s Computer Network & Protocol Laboratory is a subsidiary of the Distributed and Parallel Software Laboratory of the Central Software Institute of the 2012 Laboratory, with locations in Beijing, Shanghai, Hangzhou, Shenzhen, and Tel Aviv, Israel.
Vision: Rooted in laying the foundation stone, innovation leads the future of distributed communication
Positioning: Huawei’s software engine in the field of computer network and protocol technology, covering theoretical breakthroughs, technological inventions, technological innovations, and quality delivery. Standing at the forefront of this technical field, we research and break through world-class technical problems in computing native networks and wide-area network deterministic communication, build an industry-leading full stack of distributed communication, and work with ICT, terminal, cloud, intelligent car and other main product teams to build differentiated communication competitiveness, gradually grow the industrial ecosystem, and help business success.
Team: A high-level innovation team that brings together top industry experts, genius youngsters, PhDs, and engineers, working as a mixed special force alongside overseas teams. The technical research results are significant: since 2018, 5 papers have been accepted by SIGCOMM, the top global network communication conference, and key technologies have been selected into Huawei’s top 10 inventions for three consecutive sessions.
2022-07-22
Let’s update and preview the cities I have (or will) visit here!
(July 22, 2022) Due to the pandemic, I haven’t moved for the past 4 months, missing a wedding on May 1st and another on May 20th. We haven’t seen each other for 4 months.
Now, everywhere requires a travel history of the past 14 days, but I can provide my travel history for the past 10 years! Similar to the logic of the travel history card, short-term stops are not counted, but transfers are generally included. As of December 13, 2022, the travel history card has been phased out.
From 2012 to July 2022, I visited a total of 42 cities and traveled 259 times (traveling from city A to city B counts as one trip, so a round trip that returns to city A counts as two trips). In 2019 alone, I traveled 63 times. I was quite shocked when I got this statistic. Although I traveled a lot for business in the past three years and visited 12 cities in Japan in 2019, I didn’t expect it to be this many. Why so many business trips? My first major project was in Hangzhou, so I spent most of my time there from June 2019 to May 2020. Currently, I lead three teams in Hangzhou, Shanghai, and Israel, but not in Beijing. Additionally, as an architect, I often need to attend seminars. It seems like God thinks I’m suited for long-distance relationships.
During my Ph.D. joint training at USTC and Microsoft Research Asia (internship in senior year 2013 ~ Ph.D. graduation in 2019), I frequently traveled between Hefei and Beijing, which was already quite a lot for a student. Unexpectedly, the number of business trips increased after starting work, with my travel frequency surpassing 97% of users on Flight Manager in 2021. The remaining 3% must be frequent flyers. I regret being too frugal during my Ph.D., often reluctant to spend money on tickets, resulting in long separations from my girlfriend. Another regret is that I often felt too lazy to write summaries after trips. My memory isn’t great, so after a long time, I can only rely on photos and chat records to recall.
As of November 2023, I have visited 71 cities and traveled 380 times, an increase of 29 cities compared to a year ago, mainly from our honeymoon trip to Xinjiang after the wedding and my three-month trip to the United States. The term “city” is somewhat difficult to define. In the U.S., it might be more reasonable to correspond to counties. If I record all trips across counties, the number of “trips” in the U.S. would increase significantly (traveling within the Bay Area and between Los Angeles and Irvine shouldn’t be considered trips), and I might not remember them all.
Of course, I don’t have the authority to obtain base station connection data from telecom operators. Travel data is collected from booking records, business trip records, etc. Some tickets were not booked by me, so some trips might be missing. For example, I could not trace from booking records when I returned to Hefei for my undergraduate graduation after finishing my internship at MSRA in June 2014. The Beijing to Hefei date was instead confirmed from my internship certificate, which covers July 9, 2013, to May 30, 2014.
If you find any errors, please contact me for corrections.
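To make the counting rule above concrete, here is a minimal Python sketch that derives the number of distinct cities and the number of trips from a list of stays. It assumes each stay is a (start date, end date, city) row listed newest first, as in the table below, and that every change of city between consecutive stays counts as one trip. The names `STAYS` and `summarize` are made up for illustration; this is not the script actually used to produce the statistics.

```python
# Sample rows copied from the top of the 2024 table (newest first).
STAYS = [
    ("2024-10-08", "2024-10-08", "Beijing"),
    ("2024-10-06", "2024-10-08", "Hangzhou"),
    ("2024-10-05", "2024-10-06", "Taiyuan"),
    ("2024-10-04", "2024-10-05", "Lan County"),
    ("2024-10-03", "2024-10-04", "Taiyuan"),
    ("2024-09-30", "2024-10-03", "Shijiazhuang"),
]


def summarize(stays):
    """Return (distinct_cities, trips) for stays listed newest first."""
    # Assumption: the list order is reverse chronological, so reversing it
    # gives the order in which the cities were actually visited.
    cities = [city for _start, _end, city in reversed(stays)]
    # Every move from one city to a different one counts as one trip,
    # so A -> B -> A is two trips.
    trips = sum(1 for prev, cur in zip(cities, cities[1:]) if prev != cur)
    return len(set(cities)), trips


if __name__ == "__main__":
    distinct, trips = summarize(STAYS)
    print(f"{distinct} cities, {trips} trips")
```

For these six rows the route is Shijiazhuang, Taiyuan, Lan County, Taiyuan, Hangzhou, Beijing, which gives 5 distinct cities and 5 trips; Taiyuan is counted once as a city even though it appears twice in the route.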
My Footprints
2024
Start Date | End Date | City |
---|---|---|
2024-10-08 | 2024-10-08 | Beijing |
2024-10-06 | 2024-10-08 | Hangzhou |
2024-10-05 | 2024-10-06 | Taiyuan |
2024-10-04 | 2024-10-05 | Lan County |
2024-10-03 | 2024-10-04 | Taiyuan |
2024-09-30 | 2024-10-03 | Shijiazhuang |
2024-09-25 | 2024-09-30 | Beijing |
2024-09-24 | 2024-09-25 | Hefei |
2024-09-22 | 2024-09-24 | Beijing |
2024-09-19 | 2024-09-22 | Hangzhou |
2024-09-15 | 2024-09-19 | Xi’an |
2024-08-18 | 2024-09-14 | Beijing |
2024-08-18 | 2024-08-18 | Changsha |
2024-08-17 | 2024-08-18 | Kuala Lumpur |
2024-08-14 | 2024-08-17 | Singapore |
2024-08-13 | 2024-08-14 | Malacca |
2024-08-10 | 2024-08-12 | Kuala Lumpur |
2024-08-10 | 2024-08-10 | Shenzhen |
2024-07-21 | 2024-08-10 | Beijing |
2024-07-20 | 2024-07-21 | Lan County |
2024-07-19 | 2024-07-20 | Taiyuan |
2024-07-07 | 2024-07-19 | Beijing |
2024-07-05 | 2024-07-07 | Hefei |
2024-06-10 | 2024-07-05 | Beijing |
2024-06-09 | 2024-06-10 | Taiyuan |
2024-06-08 | 2024-06-09 | Lan County |
2024-06-07 | 2024-06-08 | Taiyuan |
2024-06-01 | 2024-06-07 | Beijing |
2024-06-01 | 2024-06-01 | Miyun |
2024-06-01 | 2024-06-01 | Huairou |
2024-05-06 | 2024-06-01 | Beijing |
2024-05-04 | 2024-05-06 | Taiyuan |
2024-05-03 | 2024-05-04 | Gujiao |
2024-05-03 | 2024-05-03 | Taiyuan |
2024-05-02 | 2024-05-03 | Datong |
2024-05-02 | 2024-05-02 | Ying County |
2024-05-01 | 2024-05-02 | Datong |
2024-04-21 | 2024-05-01 | Beijing |
2024-04-16 | 2024-04-21 | Dubai |
2024-04-06 | 2024-04-16 | Beijing |
2024-04-04 | 2024-04-06 | Wuhan |
2024-03-29 | 2024-04-04 | Beijing |
2024-03-28 | 2024-03-28 | San Francisco |
2024-03-17 | 2024-03-28 | Los Angeles |
2024-02-22 | 2024-03-17 | Beijing |
2024-02-22 | 2024-02-22 | Hong Kong |
2024-02-19 | 2024-02-22 | Singapore |
2024-02-19 | 2024-02-19 | Xiamen |
2024-02-18 | 2024-02-19 | Beijing |
2024-02-16 | 2024-02-18 | Taiyuan |
2024-02-15 | 2024-02-16 | Lan County |
2024-02-13 | 2024-02-15 | Taiyuan |
2024-02-13 | 2024-02-13 | Gujiao |
2024-02-12 | 2024-02-13 | Taiyuan |
2024-02-08 | 2024-02-12 | Shijiazhuang |
2024-01-01 | 2024-02-08 | Beijing |