2023-08-27
10 Soul-Searching Questions for AI Large Model Startups

  1. To build or not to build a foundational large model?
  2. To B or to C? Domestic or overseas?
  3. RMB capital or USD capital?
  4. Are AI-native applications a mobile-internet-level opportunity?
  5. Is your vision AGI?
  6. Can the problem of large models talking nonsense be solved?
  7. How does large-model infrastructure make money?
  8. Where is your moat?
  9. Can your business model scale?
  10. How should startups handle regulation and legal liability for large models?

Below are my views on these 10 soul-searching questions.

Read More

2023-08-24
Tsinghua's Link Genius Boy: When Top Workers Start Their Own Business

Original video by Bilibili uploader “Bao Bao Ba 2022”

Backup of the video on this site (25:58, 121 MB)

The following is a transcript generated by AI speech recognition:

Read More

2023-08-17
Speeches at Our Wedding by Various Guests

May 1, 2023, Shijiazhuang

  • Speech by Tan Bo
  • Speech by Mentor Lin Tao
  • Speech by Professor Tan Haisheng
  • Wedding vows of the groom, Li Bojie
  • Wedding vows of the bride, Meng Jiaying
  • Speech by the father of the groom
  • Speech by the father of the bride
  • Speech by the parents of the bride at the name-changing ceremony
  • Speech by the bride at the name-changing ceremony
  • Speech by the parents of the groom at the name-changing ceremony
Read More

2023-08-15
Our Wedding Videos and Photos

May 1, 2023, Shijiazhuang

Photos

Click here to view the online album of wedding photos (110 edited photos)

Trailer

(00:31, 73 MB, 19 Mbps)

Highlight Edit

(04:47, 216 MB, 6 Mbps)

Full Documentary

(01:30:24, 3.35 GB, 5 Mbps)

Read More

2023-08-13
Five Years of PhD at MSRA (Part 3): Underground Mining Server Room and Digital Ex Project

The third in the “Five Years of PhD at MSRA” series, to be continued…

Underground Mining Server Room

In the basement of an ordinary residential building in Wanliu, Beijing, through a heavy air-raid shelter iron door, and then through a dark alley where you can’t see your fingers without turning on the light, is my underground mining warehouse.

The basement next door is home to many workers struggling to get by in Beijing. The smallest room there costs only a thousand yuan a month. More than a dozen strangers share a bathroom and a washroom, and the public sinks and washing machines are all rusty. At the end of the alley is a 30-square-meter hall with a ventilation opening that lets in a little light from the outside world. I rented this hall and a small room next to it as a mining server room.

I built the infrastructure of the underground mining server room myself. It ran 6 1080Ti water-cooled mining machines, oil-cooled mining machines, multiple 6-card 1060 mining machines, multiple 9-card dedicated mining machines, and various ASIC mining machines for Bitcoin and Litecoin, worth 300,000 RMB. It also hosted my most covert personal project - the Digital Ex Project.

Read More

2023-08-13
Preview of AI Operating System os.ai

The concept of an AI operating system has been proposed by many people. Traditional AI operating systems may be more about infrastructure (infra), essentially managing hardware; the AI operating system we propose is about managing large models.

Today, I registered the domain os.ai and put up a temporary placeholder page that briefly introduces the AI operating system we are building.

The AI operating system is a bridge between large language models and applications. Our professional team is committed to providing low-cost solutions, building highly predictable and controllable generative AI infrastructure, supporting the generation of text, images, videos, 3D metaverses, and generative agents.

Why do we need an AI operating system? Current large models face many challenges in cost, predictability, multimodality, evaluation and testing, and more. We believe that improving the model itself is not enough; more importantly, the model needs to be closely co-designed with data and systems.

Low Cost

Currently, it costs $10 to use GPT-4 to read a paper, and $95 to generate a 7.5-minute video with Runway ML.
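
To put the two figures above in unit terms, here is a minimal sketch. The Runway rate is derived only from the numbers quoted above. The per-token price passed to `implied_tokens` is a hypothetical placeholder rather than a quoted list price, and both helper names are invented for illustration.

```python
def cost_per_second(total_usd: float, minutes: float) -> float:
    """Average cost per second of generated video."""
    return total_usd / (minutes * 60)


def implied_tokens(total_usd: float, usd_per_1k_tokens: float) -> float:
    """Number of tokens a budget buys at a given price per 1K tokens."""
    return total_usd / usd_per_1k_tokens * 1000


# Figures quoted in the text: $95 for a 7.5-minute video.
runway_rate = cost_per_second(95, 7.5)

# ASSUMPTION: $0.06 per 1K tokens is a placeholder price, not taken from the text.
paper_tokens = implied_tokens(10, 0.06)

print(f"Runway: ${runway_rate:.3f} per second of video")
print(f"$10 buys about {paper_tokens:,.0f} tokens at the assumed price")
```

At the assumed price, the $10 figure would correspond to well over a hundred thousand tokens, which suggests reading a paper involves several passes over the text rather than a single prompt.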

As experts in AI infrastructure, we provide low-cost generative AI services by building our own state-of-the-art AI data center composed of GPUs, and co-optimizing models, data, and underlying hardware architecture.

Predictability

  • Reduce hallucinations at the model level
  • Sandbox
  • System/user permission isolation (to avoid command injection)
  • Fact-checking
  • Reliable execution of long process tasks
  • Integration of industry private datasets and databases

Multimodality

Low-cost pipelines for creating text, images, 3D metaverses, and personalized generative assistants, with highly controllable generation details.

  • Text → Image/Video/3D Model
  • Text + Image → Image/Video/3D Model
  • Text + Video → Video/3D Model
  • Text/Image/Video → Personalized Generative Assistant

Model Evaluation

Automatically evaluate, test, and select large language models in an open environment at high throughput. This enables a marketplace for large language models and a metaverse built by generative assistants.

The AI operating system is still a preliminary concept, and many of its technologies are still under research. Follow os.ai, and let’s look forward to the arrival of the large-model AI operating system.

Read More

2023-08-07
How to Prevent Screen Photography, File Uploads, and Other Leaks with Technical Measures

(This article was first published on Zhihu)

Companies dealing with confidential information usually divide their areas into low, medium, and high confidentiality zones:

  • Low confidentiality zone: For image streams, video streams, and information streams, it has a certain leak detection and traceability capability;
  • Medium confidentiality zone: For image streams, video streams, and information streams, it has a certain ability to prevent leaks in advance and detect them, and a strong ability to trace leaks afterwards;
  • High confidentiality zone: For image streams, video streams, and information streams, it has a strong ability to prevent leaks in advance.

The high confidentiality zone is the simplest: it is physically isolated, with security equipment at the entrance, and electronic devices such as mobile phones and USB drives may not be brought in.

The medium and low confidentiality zones are more difficult because the office computers inside can access the Internet, and mobile phones can also be brought into the office. The following discusses how to maintain information security from the dimensions of leak prevention, leak detection, and leak tracing. Leak prevention refers to preventing data from leaking out, leak detection is the ability to discover and report when data leakage may occur, and leak tracing is the ability to trace who leaked the data when the data has already leaked.

Read More

2023-08-05
Should AI Clusters Use RoCEv2 or Infiniband?

(This article was first published on Zhihu)

Most major internet companies are deploying RDMA technology, with the main scenarios being storage and AI/HPC, divided into two technical routes, RoCEv2 and Infiniband.

RoCEv2 is RDMA over Converged Ethernet version 2, which runs the RDMA protocol over UDP/IP on a traditional data center Ethernet network. Infiniband (IB) has a longer history: its specification dates back to around 2000, and HPC (high-performance computing) clusters have used IB widely ever since.

The current leader in RDMA network cards is Mellanox, now part of NVIDIA. You could say that RoCEv2 is the community edition of RDMA and Infiniband is the enterprise edition. The advantage of the community edition is openness: many things are configurable. That is also its disadvantage, because only network experts can handle it. Moreover, a large-scale RoCEv2 cluster is not something a single network expert can handle alone; it takes a team to solve PFC storms and all kinds of strange problems with network cards and switches. Of course, a small cluster with only a few machines, one switch, and network cards of the same model will basically not run into any problems with RoCEv2.

The RDMA community is very small, and almost everyone in it has some academic background. If you have never heard of the problems above, you are better off simply using IB: it costs a little more, but it is simple and easy. I have heard that some AI companies think buying A100/H100 GPUs is all it takes. They can’t even tell the SXM version from the PCIe version, don’t know that large-scale training requires IB network cards and switches, and assume a regular 10G network will do. Such companies should find an AI cluster solution vendor to put together the IB network cards, switches, and network topology. Don’t try to show off, and don’t touch RoCEv2 just to save money.
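
To see why a regular 10G network falls short for large-scale training, here is a rough back-of-the-envelope sketch. It assumes a bandwidth-only model of ring all-reduce, in which each node sends 2(N-1)/N times the gradient size per synchronization, and it ignores latency, protocol overhead, and overlap with computation. The model size, node count, and 200 Gbps link speed are hypothetical values chosen for illustration.

```python
def ring_allreduce_seconds(payload_bytes: float, num_nodes: int, link_gbps: float) -> float:
    """Bandwidth-only estimate of one ring all-reduce.

    Each node sends (and receives) 2 * (N - 1) / N * payload bytes.
    """
    traffic_bytes = 2 * (num_nodes - 1) / num_nodes * payload_bytes
    link_bytes_per_second = link_gbps * 1e9 / 8
    return traffic_bytes / link_bytes_per_second


# ASSUMPTIONS: 7 billion parameters, 2 bytes per gradient (fp16), 8 nodes.
gradient_bytes = 7e9 * 2
nodes = 8

t_10g = ring_allreduce_seconds(gradient_bytes, nodes, link_gbps=10)
t_200g = ring_allreduce_seconds(gradient_bytes, nodes, link_gbps=200)

print(f"10 Gbps:  {t_10g:.1f} s per gradient synchronization")
print(f"200 Gbps: {t_200g:.2f} s per gradient synchronization")
```

Under these assumptions, every training step would wait on the order of tens of seconds for gradient synchronization over 10G, so the GPUs would sit idle most of the time.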

Most of OpenAI’s GPU clusters currently use Infiniband, and some small and medium-sized AI companies now use IB as well. Most newly built GPU clusters at large companies use RoCEv2, because these companies need to support tens of thousands of GPUs, a scale that IB cannot reach, and cost matters a great deal at that scale. Some large companies have already started developing their own network cards. Another reason is that large companies have professional network teams, and something as closed as IB is hard to optimize: how else would these network experts tune performance and write their slide decks?

Read More

2023-08-05
Is Cache Coherency Necessary for Load/Store?

(This article was first published on Zhihu)

Cache Coherency (CC) can be divided into two scenarios:

  1. CC between the CPU and device within the host
  2. CC across hosts

CC between the CPU and device within the host

I believe that CC between the CPU and device within the host is very necessary. When I was interning at Microsoft in 2017, I used an FPGA to expose a block of memory through a PCIe BAR. I was able to run a Linux system out of this BAR space, but a boot process that should have taken only 3 seconds took 30 minutes, 600 times slower than with host memory. This is because PCIe does not support CC, so the CPU can only access device memory as uncacheable, and every memory access has to travel over PCIe to the FPGA, which is extremely inefficient.

Therefore, PCIe BAR space today can only be used for the CPU to issue MMIO commands to the device, and data transfer must be carried out through device DMA. As a result, both NVMe disks and RDMA network cards must follow the complex doorbell-WQE/command-DMA process, as shown in the figure below.
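
The doorbell-WQE-DMA flow can be sketched as a toy model. This is a simplified illustration, not the actual NVMe or RDMA interface: `WQE`, `ToyDevice`, and `ring_doorbell` are hypothetical names, and real devices add completion queues, memory registration, and ordering rules that are omitted here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WQE:
    """Work queue element: tells the device where the payload sits in host memory."""
    host_addr: int
    length: int


class ToyDevice:
    """Hypothetical device model; not the NVMe or RDMA verbs interface."""

    def __init__(self, host_memory: bytearray):
        self.host_memory = host_memory
        self.work_queue = deque()  # stands in for a queue kept in host memory
        self.mmio_writes = 0
        self.received = []

    def ring_doorbell(self) -> None:
        # The doorbell is the only MMIO write the CPU performs.
        self.mmio_writes += 1
        while self.work_queue:
            wqe = self.work_queue.popleft()
            # The device pulls the payload from host memory itself (DMA read).
            start, end = wqe.host_addr, wqe.host_addr + wqe.length
            self.received.append(bytes(self.host_memory[start:end]))


host_memory = bytearray(64)
device = ToyDevice(host_memory)

payload = b"hello"
host_memory[8:8 + len(payload)] = payload                        # 1. CPU writes data to host memory
device.work_queue.append(WQE(host_addr=8, length=len(payload)))  # 2. CPU posts a WQE
device.ring_doorbell()                                           # 3. CPU issues one MMIO write

print(device.received, device.mmio_writes)
```

The point of the indirection is that the CPU touches device space only once per batch, while the bulk data moves through DMA against cacheable host memory.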

Read More

2023-07-04
Enabling New Domain 01.me

In November 2012, my blog was born on USTC Blog. In May 2013, it got its own domain, bojieli.com. In January 2015, the blog moved to a new domain, ring0.me. Ring 0 is the highest privilege level in the x86 architecture, which signifies my relentless pursuit of low-level system technology.

Today, I registered the premium domain 01.me. 0 and 1 are the only two digits in binary. I chose this domain in the hope of devoting myself to the cause of AGI (Artificial General Intelligence) and making a small contribution to silicon-based life built on 0 and 1.

The domain 01.me also has some investment value: 01.org is the official website of Intel Open Source, 01.ai is the official website of Li Kaifu’s AI startup Zero One Wanwu, and 01.com sold for a high price of $1,820,000 in 2017 (of course, the value of .me and .com cannot be mentioned in the same breath).

For the convenience of sharing articles on WeChat and other domestic platforms, this website also has two domains filed domestically, bojieli.com and boj.life. Once the registry’s 60-day protection period for newly registered domains is over, I may consider moving 01.me to a domestic registrar for filing.

Read More