Page 4 | Bojie Li

2024-12-28

Interview with Huawei "Genius Youth" Li Bojie (Part 1): Giving Up a Million-Dollar Salary to Start a Business, the Persistence and Reinvention of a USTC Alumnus

This article is reposted from the WeChat public account of Woke Advanced Alliance: “Dialogue | Interview with Huawei ‘Genius Youth’ Li Bojie (Part 1): Giving Up a Million-Dollar Salary to Start a Business, the Persistence and Reinvention of a USTC Alumnus”

Long article warning: this article contains 10,016 words, with an estimated reading time of 27 minutes.

“Dialogue” is a series of in-depth interviews launched by the Woke Advanced Alliance. We invite and interview outstanding USTC alumni who experienced setbacks, tasted failure, and achieved success during their university life at USTC. Through in-depth conversations, we showcase their life journeys and personal choices, in the hope that their experiences can illuminate more paths for future USTC students.

In this issue of the Dialogue column, we invited Senior Brother Li Bojie (personal homepage: 01.me), a USTC 1000 alumnus, USTC-MSRA joint PhD, one of the first Huawei “Genius Youth” awardees, AI entrepreneur, and co-founder of the USTC course evaluation community. He was an assistant scientist and deputy chief expert at Huawei’s Computer Network and Protocol Laboratory. He has published multiple papers at top conferences such as SIGCOMM, SOSP, NSDI, and ATC, and has received the ACM China Outstanding Doctoral Dissertation Award and the “Microsoft Scholar” scholarship.

This article is original by Woke Advanced Alliance. Do not repost without permission.

Interview, Editing | Feng Wenjun, Chen Lei

Proofreading | Zhao Guohua

Theme Summary

Learning and Practice Experience During University

How to View Mathematical Foundations

Development History of the Course Evaluation Community

How to Transition to AI Research

Academic Planning and Career Choices

Misconceptions and Suggestions for Choosing a PhD

2024-12-21

OpenAI o3: The Dawn of AGI and ASI

This article was first published as a Zhihu answer to “What do you think of OpenAI’s latest o3 model? How powerful is it?”

When o1 first came out, many people argued that it had not yet reached AGI (Artificial General Intelligence). The programming and mathematical capabilities demonstrated by o3 not only meet the threshold for AGI but even touch the edge of ASI (Artificial Superintelligence).

o3 further validates the value of RL and test-time scaling. Now that high-quality pre-training data is nearly exhausted and model capabilities have hit a “wall,” it offers a path to keep improving model intelligence and to solve harder problems through post-training and longer inference time.

Many have seen the specific performance metrics of o3, so I won’t repeat them. Here’s a summary:

o3 defeated 99.9% of programmers in Codeforces programming competitions, ranking 175th among 168,076 programmers. Even the authors of o3 couldn’t beat it.
o3 also shows significant improvement over o1 in meeting real-world programming needs. In the SWE-Bench software development test, the previously released o1-preview scored 41.3%, while o3 scored 71.7%. This means o3 can directly meet about 70% of real-world needs and pass the unit tests, leaving only 30% of the work for human programmers, and AI can significantly speed up that remaining work as well.
It scored 96.7% on the AIME 2024 math test, equivalent to missing only one question on the American Invitational Mathematics Examination.
In the GPQA Diamond test for PhD-level scientific questions, it exceeded o1 by 10 percentage points, while o1 was already at the average level of human PhD students.
In graphical logic reasoning ARC-AGI, after fine-tuning, o3 reached 87.5%, surpassing the human average (85%).
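
The headline figures above can be sanity-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the list, plus one assumption that is not stated in the text: that the AIME 2024 score is taken over 30 questions (two 15-question papers), which is what "missing one question" at 96.7% would imply. The helper name `percent` is made up for illustration.

```python
# Sanity-check the o3 figures quoted above (numbers copied from the list).

def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Codeforces: ranked 175th among 168,076 programmers,
# so everyone ranked 176th or lower is behind o3.
rank, total = 175, 168_076
beaten_pct = percent(total - rank, total)

# AIME 2024: ASSUMPTION -- 30 questions in total, one missed.
aime_questions = 30
aime_pct = percent(aime_questions - 1, aime_questions)

# SWE-Bench: o1-preview 41.3% vs. o3 71.7%, difference in percentage points.
swe_gain = round(71.7 - 41.3, 1)

print(f"Codeforces: ahead of {beaten_pct}% of ranked programmers")
print(f"AIME 2024: {aime_pct}% with one miss out of {aime_questions}")
print(f"SWE-Bench gain: {swe_gain} percentage points")
```

Under these assumptions the rounded results agree with the 99.9% and 96.7% figures quoted in the list.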

2024-11-16

Zhihu Academic Bar Talk: What Moment Made You Feel Like the World Had a Bug?

On the evening of November 15, 2024, at the Zhihu Academic Bar, I, along with prominent figures like Kai-Fu Lee, Zhiyuan Liu, and Guohao Dai, participated in an open mic sharing session.

Question:

“Vulnerabilities & Bugs—What moment made you feel like the world had a bug?”

On Zhihu, there are several highly upvoted questions about bugs, such as “What moment made you feel like the world had a bug?” and “What are some bugs that left you dumbfounded?”

However, it’s not scary when the world has a bug; what’s scary is when AI discovers a bug.

Recently, AI may have discovered a major real-world security vulnerability for the first time: a vulnerability in SQLite was found by Google’s AI agent and was fixed before it caused any damage. With further evolution, could AI permanently prevent global blue screen incidents such as the Microsoft one? This possibility is exciting.

Answer:

2024-11-01

Making Friends with Foundational Model Companies—Six Forks Podcast

Original podcast content: Six Forks Podcast “R&D Positions Must Embrace AI, Then Spend the Remaining Time Doing Experiments—A Conversation with Huawei’s First Batch of Genius Youth Li Bojie”

The following content is approximately 35,000 words, organized by the author using AI based on the podcast content. Thanks to Hunter Leslie for the wonderful interview and post-production; the 2-hour session was a blast and was recorded without any retakes. Thanks also to AI for letting me organize 30,000 words of content in an afternoon and supplement it with previously written materials.

Core Points Summary:

AI scenarios from sci-fi works like “Her” and “Black Mirror” have already been realized or are close to realization; turning sci-fi into reality will undoubtedly have immense value.
Model capabilities are rapidly increasing, and small AI companies should make friends with foundational model companies rather than embellishing or wrapping models.
The success rate of “20% projects” is relatively high; start with interest projects based on daily work and life needs during spare time, and if there is a generalized need, expand into commercial projects for a higher success rate.
Many performance issues in AI applications are not model problems but should be solved with system optimization based on first principles.
A lot of work in the AI industry has not been published or open-sourced, creating a huge information gap.
The information gap in modern society is enormous; AI interacting more with users can understand everyone’s knowledge boundaries, greatly improving recommendation efficiency and helping to bridge the information gap.
OpenAI o1’s strong reasoning ability is crucial for the reliability of model applications in serious scenarios.
For most users’ daily life needs, the most capable models are already sufficient; the focus is on reducing costs. AGI might be very expensive, mainly used to solve the most important problems in human science.
Limited energy and chip manufacturing capabilities are major challenges for AGI.
Startups need to recruit people with solid computer science knowledge, strong learning ability, and strong self-drive.
AI-assisted programming can significantly enhance programmers’ work efficiency, freeing up time for exploring “20% projects” or achieving a better work-life balance.
After AI improves efficiency, it will bring more demand, turning more needs into reality, and even independent developers can complete work that previously required a team.
A person’s career is composed of a series of projects, and it’s important that each project has an impact. Different projects are suitable for different approaches, including startups, small and beautiful companies, communities, academic projects, etc.

Full Text:

2024-10-24

Live Sharing on Byte MarsCode 1024 Code Night

Q: What is the one product you most want to share from the past year?

A: I have mentioned a saying before: “One day in AI is like a year in the human world.” There have been many exciting products in the past year. If I had to choose one, I would pick OpenAI o1, which, simply put, taught AI to think. This thinking is most evident in mathematics and programming. We shouldn’t understand mathematics and programming narrowly, because they are the biggest challenges for current large models in commercial applications.

In mathematics, most large models currently can’t calculate accurately; for example, they fail to tell which of 3.8 and 3.11 is larger. This low accuracy makes them unreliable in serious scenarios such as booking a flight or calculating expenses. What if they make a mistake? Once models can calculate accurately, they can be used in many serious scenarios.
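
The 3.8 versus 3.11 comparison is a handy illustration because the answer depends on how the two strings are read. The minimal sketch below contrasts a decimal reading with a dotted version-number reading. The version-number reading is an illustrative assumption used to show the ambiguity, not a claim about what happens inside any particular model, and both function names are made up for this example.

```python
from decimal import Decimal

def larger_as_decimal(a: str, b: str) -> str:
    """Return the larger of two strings when read as decimal numbers."""
    return a if Decimal(a) > Decimal(b) else b

def larger_as_version(a: str, b: str) -> str:
    """Return the larger of two strings when read as dotted version numbers."""
    def key(s: str) -> tuple:
        return tuple(int(part) for part in s.split("."))
    return a if key(a) > key(b) else b

print(larger_as_decimal("3.8", "3.11"))  # 3.8  (3.80 > 3.11)
print(larger_as_version("3.8", "3.11"))  # 3.11 (minor version 11 > 8)
```

In a flight-booking or expense scenario only the decimal reading is acceptable, which is why reliable arithmetic matters there.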

Programming isn’t just for programmers. We’ve observed an important trend in AI applications: the generated content is no longer just text but multimodal content combining images and text, or even interactive mini-games or mini-programs. Examples include Claude Artifacts, OpenAI Canvas, Google NotebookLM generating podcasts, and Perplexity generating illustrated wikis. Such content is essentially a piece of code generated by a large model and then dynamically rendered, so this kind of multimodal content tests the programming ability of large models.

2024-10-20

Thoughts Inspired by "Xiaomi's Entrepreneurial Thinking"

Before starting my business, my wife bought me “Xiaomi’s Entrepreneurial Thinking,” but I never read it. Recently, I had some time to go through it and found it very rewarding. I used to dislike such books, thinking these experiences were processed and beautified, and some advice might not be applicable. However, after having personal entrepreneurial experience, reading books by industry leaders makes a lot of sense.

The essence of “Xiaomi’s Entrepreneurial Thinking” is in Chapter Six, “The Seven-Word Formula for the Internet”: Focus, Extreme, Reputation, Speed (seven characters in the original Chinese).

The development approach of MIUI fully embodies the “Focus, Extreme, Reputation, Speed” seven-word formula for the internet:

Focus: Initially, only four functions were developed (phone, SMS, contacts, and desktop), with extreme restraint.
Extreme: With customizable lock screens and themes, it could simulate any phone, pursuing an extreme experience.
Reputation: The entire company communicated with users on forums and made friends with them. MIUI was very popular on the XDA forum and became a hit abroad; Xiaomi’s internationalization began with MIUI.
Speed: Weekly iterations, adopting an internet development model.

Focus

Focus is the most important of the seven-word formula for the internet and applies to all companies and products.

Companies Need Focus

Lei Jun shared the story of his first failed venture. He was technically strong and had completed four years of credits by his sophomore year. In his junior year, he wrote the antivirus software “Immunity 90,” which sold for a million yuan, a significant amount in the 1990s. So, in his senior year, he founded the Tricolor Company with two tech experts, Li Ruxiong and Wang Quanguo (both of whom are very successful now), but this venture quickly ended in failure.

2024-10-08

Why the Nobel Prize in Physics Was Awarded to AI

(This article was first published in a Zhihu answer to “Why was the 2024 Nobel Prize in Physics awarded to machine learning in artificial neural networks?”)

Some people joked that many physicists hadn’t heard of the two people who won this year’s Nobel Prize in Physics…

The Connection Between Artificial Neural Networks and Statistical Physics Is Not Accidental

In early July, when I returned to my alma mater for the 10th anniversary of my undergraduate graduation, I chatted with some classmates who are into mathematics and physics about AI. I was surprised to find that many fundamental concepts in AI today originate from statistical physics, such as diffusion models and emergence.

@SIY.Z also explained to me the statistical physics foundations behind many classic AI algorithms, such as the significant achievement of the two Nobel laureates, the Restricted Boltzmann Machine (RBM).

This connection is not accidental because statistical physics studies the behavior of systems composed of a large number of particles, just as artificial neural networks are systems composed of a large number of neurons. The early development of artificial neural networks clearly reveals this connection:

Hopfield Network

In 1982, Hopfield, while studying the principles of human memory, aimed to create a mathematical model to explain and simulate how neural networks store and reconstruct information, especially how neurons in the brain form memories through interconnections.

Specifically, the purpose of this research was to construct a CAM (Content-Addressable Memory) that supports “semantic fuzzy matching,” where multiple pieces of data are stored during the storage phase, and during the reconstruction phase, a partially lost or modified piece of data is input to find the original data that matches it best.

The Hopfield network borrows from the physics of atomic spin, which allows each atom to be viewed as a small magnet. This is why the Hopfield network and subsequent artificial neural networks resemble the Ising model in statistical physics, the model that explains why matter can be ferromagnetic.
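
The storage and reconstruction phases described above can be sketched in a few lines. This is a minimal textbook-style formulation under stated assumptions, not necessarily the exact model of the 1982 paper: units take the values +1 or -1 (like spins), weights come from a Hebbian rule, and units are updated one at a time. All function names are made up for illustration.

```python
# Minimal Hopfield-style content-addressable memory (illustrative sketch).

def train(patterns):
    """Hebbian rule: W[i][j] = sum over patterns of p[i] * p[j], zero diagonal."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights

def energy(weights, state):
    """Ising-style energy: E = -1/2 * sum_ij W[i][j] * s[i] * s[j]."""
    n = len(state)
    total = sum(weights[i][j] * state[i] * state[j]
                for i in range(n) for j in range(n))
    return -total / 2

def recall(weights, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            new = 1 if field > 0 else -1 if field < 0 else state[i]
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state

# Storage phase: two patterns of eight "spins" each.
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train(stored)

# Reconstruction phase: the first pattern with its first spin flipped.
corrupted = [-1, 1, 1, 1, -1, -1, -1, -1]
restored = recall(weights, corrupted)

print(restored == stored[0])
print(energy(weights, corrupted), energy(weights, restored))
```

In this example the corrupted input settles back onto the first stored pattern, and the energy drops as it does so, which mirrors how spins in the Ising model relax toward a low-energy configuration.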

2024-10-02

2024 Yunqi Conference: Foundational Models, Applications, and Two Bitter Lessons in Computing Power

On September 20-21, I was invited to attend the 2024 Yunqi Conference. I spent nearly two days exploring all three exhibition halls and engaged with almost every booth that piqued my interest.

Hall 1: Breakthroughs and Challenges in Foundational Models
Hall 2: Computing Power and Cloud Native, the Core Architecture Supporting AI
Hall 3: Application Implementation, AI Empowering Various Industries

My previous research focused on the computing infrastructure and cloud native topics of Hall 2. Now I mainly work on AI applications, so I am also very familiar with the content of Hall 1 and Hall 3. After two days of discussions, I really felt that I had covered the entire Yunqi Conference.

After the conference, I spoke into a recorder for over two hours, and then had AI organize this nearly 30,000-word article. I couldn’t finish organizing it by September 22, and with my busy work schedule, I took some time during the National Day holiday to edit it with AI, spending about 9 hours in total, including the recording. In the past, without AI, it was unimaginable to write 30,000 words in 9 hours.

Outline of the full text:

Hall 1 (Foundational Models): The Primary Driving Force of AI
- Video Generation: From Single Generation to Breakthroughs in Diverse Scenarios
  - From Text-to-Video to Multi-Modal Input Generation
  - Motion Reference Generation: From Static Images to Dynamic Videos
  - Digital Human Technology Based on Lip Sync and Video Generation
- Speech Recognition and Synthesis
  - Speech Recognition Technology
  - Speech Synthesis Technology
  - Music Synthesis Technology
  - Future Directions: Multi-Modal End-to-End Models
- Agent Technology
- Inference Technology: The Technological Driving Force Behind a Hundredfold Cost Reduction
Hall 3 (Applications): AI Moving from Demo to Various Industries
- AI-Generated Design: A New Paradigm of Generative AI
  - PPT Generation (Tongyi Qianwen)
  - Chat Assistant with Rich Text and Images (Kimi’s Mermaid Diagram)
  - Displaying Generated Content in Image Form (New Interpretation of Chinese)
  - Design Draft Generation (Motiff)
  - Application Prototype Generation (Anthropic Claude)
- Intelligent Consumer Electronics: High Expectations, Slow Progress
- AI-Assisted Operations: From Hotspot Information Push to Fan Interaction
- Disruptive Applications of AI in Education: From Personalized to Contextual Learning
Hall 2 (Computing Infrastructure): The Computing Power Foundation of AI
- CXL Architecture: Efficient Integration of Cloud Resources
- Cloud Computing and High-Density Servers: Optimization of Computing Power Clusters
- Cloud Native and Serverless
- Confidential Computing: Data Security and Trust Transfer in the AI Era
Conclusion: Two Bitter Lessons in Foundational Models, Computing Power, and Applications
- The Three Exhibition Halls of the Yunqi Conference Reflect Two Bitter Lessons
- Lesson One: Foundational Models are Key to AI Applications
- Lesson Two: Computing Power is Key to Foundational Models

2024-09-18

Why American Tech Giants Don't Need 996

Why don’t American internet companies need 996, and yet have higher per capita output?

Many people simply attribute it to a culture of “involution” and insufficient enforcement of the eight-hour workday, but I don’t think these are the main reasons. Many companies with overseas operations don’t impose 996 on their overseas teams and don’t even require them to clock in, yet their domestic teams still work 996. Why is that?

As a programmer who has some understanding of both domestic and American companies, I believe the main reasons are as follows:

Higher customer unit price for American companies
Lower time requirements for manual services from American customers
Higher code quality of junior programmers in American companies
Lower management costs for American companies
Better use of tools and SaaS services by American companies
Clearer goals and boundaries for American companies
A few 007 heroes in American companies carrying the load

Higher Customer Unit Price for American Companies

A person with similar abilities, working the same amount of time, is likely to generate more revenue and profit in an American company than in a Chinese company. The reason lies in the customer unit price.

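2024-09-14

Cursor: Writing 800 Lines of Code in 2 Hours to Develop an AI Course Selection Assistant

Switching from IDE and Vim to Cursor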

Previously, I used JetBrains series IDEs (PyCharm, CLion) for larger projects and vim for smaller ones. The most annoying part of developing larger projects is writing boilerplate code. Most of the time is not spent thinking about the design of functionalities or algorithms but on boilerplate code.

Cursor is an AI-assisted programming IDE similar to GitHub Copilot, with an interface quite similar to VS Code. When Cursor was first open-sourced in 2023, I started using it, but it wasn’t particularly useful due to the limitations of the foundational models at that time. After GPT-4o was released in May this year, I started using Cursor again and found it more convenient than asking code questions in ChatGPT. First, there is no need to switch windows back and forth; second, Cursor has the code context, which makes queries more efficient.

In the past three months, with the more powerful coding capabilities of Claude 3.5 Sonnet, I have completely switched from PyCharm and Vim to Cursor because Cursor’s development efficiency is much higher than PyCharm with AI completion features, doubling the overall development efficiency. My GitHub has also easily stayed all green in the past three months.

Figure: My GitHub has stayed all green in the past three months

Cursor can help quickly get started with new languages and frameworks

Cursor is not only useful for improving development efficiency but also for quickly familiarizing ourselves with new programming languages, frameworks, and tech stacks. For example, writing backends in Go, frontends in React, and smart contracts in Solidity were all new to me, but with AI-assisted programming, these are not difficult. If I had such powerful AI when I was in school, I could have learned many more programming skills.
