2023-09-12
PLDI '21 Talk Transcription: AKG: Automatic Kernel Generation for Neural Processing Units using Polyhedral Transformations

Jie Zhao, Bojie Li, Wang Nie, Zhen Geng, Renwei Zhang, Xiong Gao, Bin Cheng, Chen Wu, Yun Cheng, Zheng Li, Peng Di, Kun Zhang, Xuefeng Jin. AKG: Automatic Kernel Generation for Neural Processing Units using Polyhedral Transformations. 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI’21). Virtual, Canada, June 20-25, 2021. pp.1233-1248. [Paper PDF] [Slides by Jie Zhao]

Read More

2023-09-12
SIGCOMM '19 Talk Transcription for SocksDirect: Datacenter Sockets can be Fast and Compatible

Large models are really amazing. This SIGCOMM 2019 talk was delivered completely off-script, as the video shows: I am standing in the middle of the stage, not looking at speaker notes. My English wasn't great at the time, I often stuttered, and the audio recording even has an echo that makes it a bit hard even for me to listen to. I didn't expect a large model to transcribe such poor speech almost perfectly.

The recognition method is described here. Because the screen captured in this video is not clear enough, I replaced the images extracted from the video with images exported from the original PPT. You can try for yourself how well off-the-shelf speech-recognition software handles the audio in this video; the ones I've tried, including Google Speech-to-Text and Whisper, are basically unusable on it.

SocksDirect: Datacenter Sockets can be Fast and Compatible. [PDF] [Slides] [Video]
Bojie Li, Tianyi Cui, Zibo Wang, Wei Bai, Lintao Zhang.
Proceedings of the 2019 SIGCOMM Conference (SIGCOMM’19).

Read More

2023-09-12
SIGCOMM '21 Talk Transcription for 1Pipe: Scalable Total Order Communication in Data Center Networks

Bojie Li, Gefei Zuo, Wei Bai, and Lintao Zhang. 1Pipe: Scalable Total Order Communication in Data Center Networks. SIGCOMM ‘21. [Paper PDF] [Slides with audio (25 min)] [Slides with audio (12 min)]

Read More

2023-09-12
Release of English Version of the Blog

To make it easier for international readers to follow my blog, I used GPT-4 to automatically translate the content of this site into English:

Automatically translated English version

Chinese main site

Read More

2023-09-10
A100/H100 too expensive, why not use 4090?

(Long text warning: this article is about 16000 words)

This is a good question. To start with the conclusion: using the 4090 to train large models is not feasible, but using it for inference/serving is not only feasible, its cost performance can even be slightly higher than the H100's. If the 4090 is optimized to the extreme, its cost performance can reach twice that of the H100.

In fact, the biggest differences between the H100/A100 and the 4090 lie in communication and memory; the gap in compute is not large.

                          H100            A100        4090
Tensor FP16 compute       989 Tflops      312 Tflops  330 Tflops
Tensor FP32 compute       495 Tflops      156 Tflops  83 Tflops
Memory capacity           80 GB           80 GB       24 GB
Memory bandwidth          3.35 TB/s       2 TB/s      1 TB/s
Communication bandwidth   900 GB/s        900 GB/s    64 GB/s
Communication latency     ~1 us           ~1 us       ~10 us
Price                     $30000~$40000   $15000      $1600
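As a back-of-envelope check, the table's dense FP16 numbers and prices give raw compute per dollar (a sketch only; the H100 price is taken as $35000, the midpoint of the quoted range, and this raw ratio deliberately ignores the memory and communication limits that dominate in training):

```python
# Back-of-envelope: dense FP16 Tflops per dollar, using the table's numbers.
specs = {
    "H100": (989, 35000),  # (dense FP16 Tflops, assumed price in USD)
    "A100": (312, 15000),
    "4090": (330, 1600),
}
for name, (tflops, price) in specs.items():
    print(f"{name}: {tflops / price * 1000:.1f} Tflops per $1000")
```

By raw compute per dollar the 4090 wins by far more than 2x; the rest of the article explains why memory and communication shrink that advantage in practice.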

NVIDIA's spec sheets contain a lot of padding. For example, H100 FP16 compute is listed as 1979 Tflops, but that figure includes sparsity; dense compute is only half that. The official marketing touts the 4090's Tensor Core compute as 1321 Tflops, but that is int8; FP16 is only 330 Tflops. The first version of this article used the wrong data for both the H100 and the 4090, and the conclusion was wildly off.

The H100's price actually carries a markup of more than 10x over cost. In 2016, when I was at MSRA, I witnessed Microsoft deploying FPGAs on every server, driving FPGA prices into the ground, and even becoming an important force behind Intel's acquisition of the supplier Altera. In 2017 I mined cryptocurrency myself, so I knew which graphics card was the most cost-effective. Later, at Huawei, I was a core participant in software development for the Kunpeng and Ascend ecosystems. So I have a rough idea of what a chip costs.

Xia Core, the chief architect of Kunpeng, wrote a well-known article, "Talking about the broken ass of the Nvidia Empire", which analyzes the cost of the H100 well:

Breaking down its cost: the SXM board costs no more than $300, and the packaging substrate plus CoWoS is another roughly $300. The big logic die in the middle looks the most expensive :) It is a 4nm, 814mm² die; a 12-inch TSMC wafer yields roughly 60 dies of that size, and Nvidia handles partial-good dies very well (it almost never sells full-good ones), so about 50 of those 60 are usable. Nvidia is a big customer and gets wafers from TSMC for about $15,000, so this expensive die costs only about $300. That leaves only the HBM. The DRAM market is currently so weak it is nearly dying; even HBM3 is basically selling at a loss, at about $15/GB, so 80 GB of capacity costs $1,200.
TSMC once told a story: the Taiwanese work hard and save money to build fabs, and such an advanced 4nm process sells for only $15,000 a wafer, yet a certain customer turns one wafer into $1,500,000 ($30,000 × 50) of product. That is quite galling. Do you understand what I mean?
As I said at the beginning, under the business rules of this world, selling something that costs $2,000 for $30,000, with only one company doing it, and at high volume, is illogical. A golden goose like this needs an aircraft carrier to guard it.
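The quoted arithmetic can be checked line by line; all figures below come from the quote itself, and the total lands near the "$2000 cost" it cites:

```python
# Reproducing the quoted H100 bill-of-materials estimate.
wafer_price = 15000                      # ~$ per 4nm 12-inch wafer (big-customer price)
usable_dies = 50                         # ~60 candidate dies, ~50 usable (partial-good)
logic_die = wafer_price / usable_dies    # ~= $300 per die
hbm = 80 * 15                            # 80 GB of HBM3 at ~$15/GB = $1200
sxm_board = 300
substrate_and_cowos = 300
total = logic_die + hbm + sxm_board + substrate_and_cowos
print(f"Estimated BOM: ${total:.0f}")    # ~= $2100, vs a ~$30000 selling price
```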

It is said that Microsoft and OpenAI have booked half of the 2024 H100 production capacity. Do you think they will play the traditional art of price negotiation, as they did with Altera? Will they really spend $40,000 × 500,000 = $20 billion to buy cards?

Let’s analyze the cost of 4090 again, the 5nm 609mm2 Die, the cost is about $250. GDDR6X, 24 GB, calculated at $10 per 1 GB, $240. Let’s count PCIe Gen4, this cheap thing, as $100. Packaging and fans, count it as $300. The total cost is at most $900, this thing sells for $1600, it is considered a conscience price, because the research and development cost is also money, not to mention that most of NVIDIA’s R&D personnel are in Silicon Valley, where the average salary of programmers is the highest in the world.

It can be said that the H100 is like a house in a Chinese first-tier city: the concrete and steel themselves are not worth much, and the price is driven entirely by supply and demand. I have been living in LA for two weeks. The house the company rents has four times the usable area of my house in Beijing but costs only 30% more, and it comes with a small courtyard; per unit area, it is about one third the price of the Beijing house. When I chat with locals, they are all surprised: your average income is so much lower than LA's, how can you afford a house in Beijing?

The question is: if the 4090 is such a good deal, why is everyone scrambling to buy the H100, causing it to sell out? Why does the H100 even have to be banned from sale to China, with a cut-down H800 made in its place?

Read More

2023-09-08
APNet'23 Talk Transcription for FastWake: Revisiting Host Network Stack for Interrupt-mode RDMA

Although most people prefer watching videos, I prefer reading text because text facilitates non-linear searching, allows for quick skimming, and is convenient for reviewing previous content at any time.

Recently, I have converted some of my conference talk videos into text, such as ClickNP, KV-Direct, and The New Golden Age of Computer Networks series. Today, I am releasing FastWake from APNet 2023. For the ClickNP and KV-Direct presentations, I would write the script in the speaker notes of the PPT and read it on the spot. This year, even the slides were rushed to completion the day before the conference; there was no time to write notes, let alone do a full rehearsal. I just went on stage and spoke.

Now, with large models, it is not difficult to convert lecture videos into slides plus a text script. In fact, I have always wanted to build such a plugin for online conferences.

  1. Extract key frames from the video to form a list of slide images. If a frame differs from the previous one by more than a threshold, assume the slide has changed. The open-source tool video2pdf can do this.
  2. OCR each image into text. Since it is all printed characters, recognition accuracy is very high; Tesseract can do it.
  3. Extract the audio for the interval each slide stays on screen and feed it to a speech-to-text model; I use OpenAI's open-source Whisper.
  4. (This last step is crucial.) Have a large language model (such as GPT-4) correct the speech-to-text transcript, using the OCR'd text of the current slide and of the title slide as reference.
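The slide-change heuristic in step 1 can be sketched in a few lines (a simplified illustration, not video2pdf's actual algorithm; frames are assumed to be 2-D grids of grayscale pixel values):

```python
def detect_slide_changes(frames, pixel_delta=25, change_fraction=0.1):
    """Return indices of frames where a new slide likely starts.

    frames: list of 2-D lists of grayscale pixel values (all the same shape).
    A frame starts a new slide when more than `change_fraction` of its
    pixels differ from the previous frame by more than `pixel_delta`.
    """
    keyframes = [0]  # the first frame always starts a slide
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        total = len(cur) * len(cur[0])
        changed = sum(
            abs(cur[r][c] - prev[r][c]) > pixel_delta
            for r in range(len(cur))
            for c in range(len(cur[0]))
        )
        if changed / total > change_fraction:
            keyframes.append(i)
    return keyframes

# tiny demo: three identical dark frames, then a bright one
dark = [[0] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
print(detect_slide_changes([dark, dark, dark, bright]))  # -> [0, 3]
```

The thresholds here are arbitrary; in practice they would be tuned so that cursor movement or video playback inside a slide does not count as a slide change.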

Current speech-to-text models are not very accurate at recognizing proper nouns and names, but many of those proper nouns appear on the slide itself, and the title slide pins down the talk's title and field. With the slide content as reference, a large language model can therefore correct most of the proper-noun errors. Without the slide content, GPT-4 is needed to fix most of the proper nouns; with it, LLaMA-2-70b-chat is enough. The language model can also smooth out colloquial expressions in the speech, making the transcript more rigorous and readable.
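The correction step might use a prompt along these lines (a hypothetical template; the post does not give the exact wording used):

```python
def build_correction_prompt(title_ocr, slide_ocr, asr_text):
    """Assemble an LLM prompt that corrects an ASR transcript using slide text."""
    return (
        "You are correcting the speech-to-text transcript of a conference talk.\n\n"
        f"Talk title and field (OCR of the title slide):\n{title_ocr}\n\n"
        f"Text on the current slide (OCR; contains the talk's proper nouns):\n{slide_ocr}\n\n"
        f"Raw transcript for this slide:\n{asr_text}\n\n"
        "Fix misrecognized proper nouns using the slide text, smooth out "
        "colloquial phrasing, and return only the corrected transcript."
    )

prompt = build_correction_prompt(
    "FastWake: Revisiting Host Network Stack for Interrupt-mode RDMA",
    "FastWake design: kernel-bypass wake-up",
    "today i will talk about fast wake, our interrupt mode r d m a work",
)
print(prompt)
```

Feeding the slide text alongside the raw transcript is what lets a smaller model like LLaMA-2-70b-chat match GPT-4 on proper-noun correction.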

The text script below is entirely auto-generated; apart from a few names, nothing has been changed. Some minor errors remain, but they are all harmless. The video2pdf, Tesseract, Whisper, and LLaMA-2-70b-chat models used in the process all run on my own Mac laptop; no internet connection is required at any point.

Read More

2023-09-06
The Story of Collecting Large Model Training Corpus

Starting in July, I spent a month on my own collecting over 200 TB of large-model training corpus, spending 200,000 RMB on traffic and cloud storage fees. Just like the recently released Mate 60 Pro, it was truly a case of "the apes on both banks cry without cease, yet the light boat has passed ten thousand mountains".

What’s in the 200 TB Corpus

  • Z-library e-books, 22.43 million volumes, totaling 31 TB
  • Libgen library e-books, 3.78 million volumes, totaling 33 TB
  • Scimag academic papers and journal articles, 87.6 million items, totaling 77 TB
  • Various Chinese corpora, totaling 4 TB, including:
    • Complete set of primary, middle, and high school textbooks, 35 GB
    • Over 10,000 university textbooks and professional books, 142 GB
    • Collections of dozens of classic newspapers and magazines such as “People’s Daily”, “Reference News”, “Sanlian Life Weekly”, “Global Science”, “Reader”, “China National Geographic”, totaling 1 TB
    • Baidu Encyclopedia with 12 million entries, 20 GB
    • Ancient books, local county annals 1.6 TB
    • Various recommended book lists, English-Chinese bilingual world classics, translations of Chinese classics, etc., over 20,000 books, about 300 GB
    • Various dictionaries 100 GB
    • Various Chinese novels about 100 GB
  • Various datasets:
    • RedPajama dataset, an open-source replica of the LLaMA dataset, 2.8 TB
    • MNBVC dataset, 1 TB
    • CommonCrawl May-June 2023 version of WET plain text data, compressed to 8.6 TB
    • Historical Whois data for almost all domain names worldwide (3 billion entries), 2.5 TB
    • TheStack dataset, source code of well-known open-source projects on GitHub, 3 TB
    • The-Eye dataset, a collection of many AI training datasets, 15 TB
    • AmazonReviews dataset, 55 GB

Why did I collect so many books? Many of them are PDFs made of scanned images and require OCR before they can serve as training corpus for a text model. I had two considerations:

  1. Corpus quality matters more than quantity. Baidu Tieba may have more posts than there are books, but Tieba posts can only train a large model into a jokester, not for serious work; to master knowledge, you still need to learn systematically from books and literature.
  2. In the future, multimodal large models will become mainstream. Vision contains a lot of important information about the human world. The current text large models only use text for training, which actually loses a lot of information. Future multimodal large models can directly learn multimodal knowledge containing images and text from PDF books.

Whois Domain Registration History Dataset

Today, I put one of the more interesting datasets to use: with GPT-4 writing the code for me, I spent 3 hours building a query website offering 3 billion historical Whois records for domains worldwide: whois.os.ai.

For example, search for Microsoft and you will see that there are actually many microsoft.* domains; it takes a while to load them all. You can also search for your own domain. Most domains that have ever existed are in this database, and most newly registered domains can be queried in the system the next day.

This dataset originated from my course project for the Advanced Software Engineering course at MSRA in 2013~2014. At that time, I built a website, soip.net (you can still find traces of its registration history on whois.os.ai), obtained the .com and .net DNS zone files from Verisign (these gTLD zone files can now be obtained through ICANN), then slowly crawled the Whois data for all of these tens of millions of domains (the number of .com domains now exceeds 100 million), and also crawled the IP address each domain resolved to.

This formed linked data of domains, IPs, and Whois registration information: given an IP, you can reverse-look-up which domains are hosted on that machine; given registration details, you can reverse-look-up which domains a person has registered. At the time, Whois privacy protection was not yet common, and a registrant's real name, address, email, and phone number were all publicly visible through Whois. There were already companies offering such services back then, so I built the website only for the course project and did not keep operating it.
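The two reverse lookups can be illustrated with a toy in-memory version of the linkage (made-up records purely for illustration; the real site queries a database):

```python
# Toy in-memory model of the domain / IP / Whois registrant linkage.
records = [
    {"domain": "example.com", "ip": "93.184.216.34", "registrant": "alice@example.com"},
    {"domain": "example.net", "ip": "93.184.216.34", "registrant": "bob@example.org"},
    {"domain": "sample.org",  "ip": "203.0.113.7",   "registrant": "alice@example.com"},
]

def domains_on_ip(ip):
    """Reverse lookup: which domains are hosted on this machine?"""
    return sorted(r["domain"] for r in records if r["ip"] == ip)

def domains_by_registrant(email):
    """Reverse lookup: which domains did this person register?"""
    return sorted(r["domain"] for r in records if r["registrant"] == email)

print(domains_on_ip("93.184.216.34"))              # -> ['example.com', 'example.net']
print(domains_by_registrant("alice@example.com"))  # -> ['example.com', 'sample.org']
```

At billions of records, these scans become secondary indexes on the IP and registrant fields, but the queries are the same joins.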

But I think the history of Whois registration information is of high value: like the Internet Archive's Wayback Machine, it records one facet of Internet history. So I kept maintaining it, later adding more gTLD and ccTLD data sources. Of course, a hobby project like mine cannot reach 100% coverage, unlike companies such as WhoisXMLAPI that provide Whois history professionally.

Ten years on, the Whois dataset covers more than 700 million domains and close to 3 billion historical records; only 200-odd million of those domains are still active, and over 400 million have vanished into the dust of history. Most of these domains were bought by "domain farmers" for investment or collection and never used to build real websites. Some people who don't understand the technology assume that if they tell no one about a newly registered domain, no one will know of it, but that is not the case: for most top-level domains, the daily increments of registration and DNS data are public, and anyone with the right partnership can obtain them. With a domain dataset, you can also crawl many websites that search engines never indexed.

If I had written this query website from scratch, it would have taken at least 2 days; with GPT-4, it took 3 hours, and the front end is prettier than anything I could have built. GPT-4 wrote essentially all of the source code: the front end, the Flask backend, and the script that imports the CSV data into MongoDB (though importing the data itself took a day or two). The front end is a single file and so is the backend, totaling just over 500 lines of code. Whenever the code had a problem, I had GPT-4 fix it. I was just a product manager supplying requirements, without writing a single line of code.

Data Collection and Purchasing Data

I have also been in contact with some companies that sell data. Cleaned data is actually quite expensive, far more than the cost of collecting it yourself. But some data is hard to crawl on your own: the Tianya Forum no longer exists today, it is hard to enumerate all articles on WeChat public accounts, and some industry data is simply not public.

For sites like Zhihu, however, there is no need to buy data. Zhihu now has hundreds of millions of questions and billions of answers; at data companies' prices, buying it would cost who knows how much. The ability to crawl data yourself is therefore very important.

Data cleaning is also crucial. I have seen large language models whose answers still contain strings like "expand all", "previous page", and "next page", which shows their training data was not properly cleaned.
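A first-pass filter for that kind of navigation residue can be very simple (a sketch with hypothetical patterns; real cleaning pipelines need many more rules):

```python
import re

# Hypothetical page-navigation boilerplate that leaks into scraped corpora,
# following the examples above.
BOILERPLATE = re.compile(r"(expand all|previous page|next page|load more)",
                         re.IGNORECASE)

def clean_lines(text):
    """Drop lines that are pure navigation chrome; keep real content."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not BOILERPLATE.fullmatch(stripped):
            kept.append(line)
    return "\n".join(kept)

sample = "A useful answer.\nExpand all\nNext page\nMore useful text."
print(clean_lines(sample))  # drops the two navigation lines
```

Using `fullmatch` on the stripped line rather than a substring search avoids deleting real sentences that merely mention these phrases.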

I just used my spare time to do some preliminary data collection and cleaning, and I will share any new progress with everyone in the future.

Read More

2023-08-27
10 Soul-Searching Questions for AI Large Model Startups

  1. To build or not to build a foundational large model?
  2. To B or to C? Domestic or overseas?
  3. RMB capital or USD capital?
  4. Is AI Native application a mobile internet-level opportunity?
  5. Is your vision AGI?
  6. Can the problem of large models talking nonsense be solved?
  7. How does the large model infra profit?
  8. Where is your moat?
  9. Can your business model scale?
  10. How to deal with the regulation and legal responsibility of large models?

Below are my views on these 10 soul-searching questions.

Read More

2023-08-24
Tsinghua's Link Genius Boy: When Top Workers Start Their Own Business

Original video by Bilibili uploader "Bao Bao Ba 2022"

Backup of the video on this site (25:58, 121 MB)

The following is a text transcript produced by AI speech recognition:

Read More

2023-08-17
Speeches at Our Wedding by Various Guests

May 1, 2023, Shijiazhuang

  • Speech by Tan Bo
  • Speech by Mentor Lin Tao
  • Speech by Professor Tan Haisheng
  • Wedding vows of the groom, Li Bojie
  • Wedding vows of the bride, Meng Jiaying
  • Speech by the father of the groom
  • Speech by the father of the bride
  • Speech by the parents of the bride at the name-changing ceremony
  • Speech by the bride at the name-changing ceremony
  • Speech by the parents of the groom at the name-changing ceremony
Read More
RSS