MSRA Interview Summary
Note: The following content is purely from memory. As it has been over a month, accuracy is not guaranteed. Since no confidentiality agreement was signed before the interview, this article reveals quite a lot of details. If there is any inappropriate content, please contact bojieli AT gmail.com.
Originally, I wanted to secure a research position and work hard in a startup team. I didn’t think about going to MSRA… Therefore, it wasn’t until a few days before the application deadline that I learned about this news from my class teacher. The teacher said there was a “Microsoft Scholar” award, which was worth 5000 yuan. I thought with so many scholarships, I definitely wanted it. So I rushed to write a two-page Resume with MS Word one night, went to the academic secretary’s office to get a transcript the next day, and submitted the application one day after the deadline. It was only later that I found out that this “Microsoft Scholar” was tied to the “experimental class” to some extent, but I haven’t received any news about that scholarship yet.
According to the email recipient list, about 67 people signed up for the Microsoft experimental class. The interview was in the afternoon. About a week before the interview, a teacher from the West District contacted me, saying that there was a researcher named Ma Yi, who was very good at image processing and wanted to recruit a student from the Youth Academy. He said that my project experience and hands-on ability were good, but my math grades were not good, and he was worried about whether I could do it. He could give me a chance for a separate interview. He said he didn’t care about your overall GPA, what he cared more about were the grades of math courses and professional courses. He asked me to look at one of Dr. Ma Yi’s representative works, not only to understand it but also to grasp its essence. When I came back and looked at it, it was a general algorithm that decomposes a matrix into the sum of a sparse matrix and a low-rank matrix, which can be used for image denoising, video surveillance, etc. I only understood so much, and I couldn’t understand the complex derivation of the matrices in the middle at all. I thought this opportunity should be left to students majoring in mathematics.
Knowing that the interview was in the afternoon, I stayed up late the day before the interview, and my phone was on silent. When I got up at noon, I found that there was a missed call from that teacher in the morning. When I called back, I found out that I had missed the opportunity for a separate interview… In the afternoon, braving the rain, I went to the Science and Technology Experimental Building in the West District for the interview. The interview was one-on-one in a single room, similar to being invited for tea. There was no fixed time limit, and you could leave as soon as things were clear.
The first interview was with Professor Zhang from the Systems and Networks group (who is also my current mentor). First was self-introduction, then he started asking about my poor math grades (all because I didn’t brush up on Jimi Multivich at the time). I wrote quite a few extracurricular projects and course projects on my resume, and he asked me about the most challenging one among these projects. I thought, how challenging could these things be? So I casually talked about the attempts to construct a PHP Sandbox during the development process of blog.ustc.edu.cn, and the parts that I now feel are unreasonable. (This unreasonable part has not been changed yet, so everyone often encounters problems when installing plugins) I felt that it was not enough, so I added the several optimizations I tried when doing the real-time disk file system course experiment (well, this code was written by Guo Jiahua, so it’s not my credit), and concluded the “truth”: when doing a system, you need to find the bottleneck for optimization, and it’s troublesome and ineffective if you choose the wrong part.
Then Professor Zhang talked to me about what it feels like to do research, what it’s like to do a PhD at MSRA, and so on. Then he asked me which professional course I did well in, and I could only recall operating systems and compiler principles (I didn’t prepare for the interview at all). He said operating systems, good, talk about Linux’s memory management mechanism. “Linux Kernel Source Code Reading” was a course I took a year ago… Fortunately, I had read an article on “how to write a ramdisk” a few days ago, so I first talked about user address space and kernel address space, temporary memory mapping and permanent memory mapping of high-end memory, dynamic memory allocation’s buddy system and slab algorithm. The interviewer then asked me about the paging mechanism, and I roughly talked about what I learned in computer composition principles. He asked a very strange question, is paging managed by hardware or the operating system? I think it’s the hardware querying the page table, and the operating system maintaining (modifying) the page table. He asked me when the operating system needs to query the page table? Besides page faults and modifying the page table, I couldn’t think of any other situations, and the interviewer was noncommittal about my answer.
Professor Zhang brought paper and pen and asked me to write a bcopy to implement memcpy’s function. I first wrote the simplest version, and then immediately changed it to a 4-byte group version. He said there was a bug in my program, and then I realized that there would be a problem when the beginning and end were not multiples of 4 (i.e., the memory was not aligned), so I added special handling at the beginning and end. He said my program would segmentation fault, and I was puzzled. In the end, he told me that if the target address was not aligned, the 4-byte copy would become an unaligned access. I have written programs with unaligned access on 32-bit x86, and there was no segmentation fault. He said you go back and try it, it will definitely segmentation fault. The interview ended like this… If I had said that the course I did well in was computer networks instead of operating systems, I probably wouldn’t have struggled so much to answer…
Voiceover: After all the interviews were over, I did an experiment on a 32-bit Pentium 4 in the LUG activity room. 4-byte unaligned reads and writes will not segmentation fault, and the result is correct. Guo Jiahua said that even if the bus does not allow unaligned access, the CPU does not do special processing, and what should appear is a bus fault, not a segmentation fault. But because LUG had to post posters, I didn’t go back to discuss with the interviewer.
After the first interview, I waited for five minutes, and the staff called me into the second interviewer’s room. The interviewer was Professor Tong, who works on computer graphics. He looked at my resume and was interested in the Robogame blind reader and freeshell. When talking about freeshell, I mentioned that OpenVZ can implement hot migration, and he asked me about the principle of hot migration. I had just read a book on cloud computing and answered it. He asked me how I came up with it, and I said this was not my idea, it was from the book, and OpenVZ was also an existing tool, I just called its API. His conclusion was, “Regardless of the results of these projects, just the fact that you actively do them outside of class can explain a lot of problems.”
Professor Tong then asked me whether I prefer doing research that is more engineering-oriented or theory-oriented (I don’t remember the exact question). At the time, I answered that I prefer results obtained through theoretical derivation and systematic methods, and I don’t really like results that are “cobbled together” by adjusting parameters. I also mentioned Dr. Ma Yi’s thesis. Then he said he had nothing more to ask, so I asked him about 3D reconstruction (but I forgot the algorithm he mentioned as soon as I left, so it was a wasted question). This interview was relatively quick, ending in 20 minutes, at least half the length of the first interview.
A few days later, Researcher Xie, who works on data mining (and is a fellow alumnus of our college), emailed me to request an interview, which was arranged at the High Performance Computing Center and conducted via Skype video call. I only found out when I got there that the interview location was Network Center 402, which is where LUG has its weekly gatherings. The staff member was Zhao Cong, who worked with me and Chen Zhang on the Linux Software Store (what a coincidence). Back to the main topic. After my self-introduction and a listening test, Researcher Xie seized on my poor math grades. Then he started asking about the experiment I did with KDDcup 2009 data for the data mining course listed on my resume. Since I hadn’t touched data mining since that course, I had to bluff my way through. He must have realized, so he asked me to switch to a recent project, so I went through my blog.
Researcher Xie asked me what the previous two interviewers do, and I told him truthfully. He then asked, do you prefer doing system research or theoretical research? It seems similar to the question asked by Researcher Tong. He said it seems I prefer doing systems. The next question was tough: if you had to choose between systems, computer graphics, and data mining, which would you prefer? Personally, I prefer artificial intelligence, and I also work on intelligent robots in the lab at school, so I said I prefer the intelligence direction, and systems are also good. Graphics is all math, which I guess I can’t handle. He said my thoughts were contradictory, liking system research, but data mining is theoretical research. The 20-minute fixed-time interview ended in this contradiction.
When I came out, I happened to meet a girl who was there for an interview at the elevator door. She was very nervous, worried about being asked different questions from her last interview, so I reassured her that the content was pretty much the same, just like a face-to-face interview. It seems that nervousness can affect performance. I didn’t plan to come to USTC in high school, so I was very relaxed during the interview, and the result was quite good. This time I didn’t plan to go to Microsoft either. Maybe not preparing is the best preparation…
The last interview was a phone interview conducted by a Taiwanese from Professor Zhang’s lab. I sat in front of the computer, writing code live on collabedit. I thought this professor was interesting, letting me write code face-to-face, and even arranged a special coding interview, while other interviewers didn’t let me write a single line of code. The interview questions were thought up on the spot, and while I was writing the program, he was writing the same program.
The first question was to implement binary tree traversal without recursion and with O(1) space complexity, where each node of the binary tree has a parent pointer. Because I didn’t think it through at first, I wrote and revised, revised and wrote, and the phone even got disconnected once. It took me an hour to barely hand it in.
The second question was to find the number of black connected components in an n*n black and white dot matrix (adjacent up, down, left, and right are considered connected). I thought the floodfill method was too simple, so I wrote a greedy algorithm that only scans each point once. Scan in row and column order, when the current square is black, if both the left and top are black, copy if the numbers are the same, if the numbers are different, copy the left and increase the merge counter; if the left is black or the top is black, copy its number; otherwise, start a new number. The final number of connected components is (number of assigned numbers - merge counter). It took me several days to realize that this algorithm is wrong. The problem is that when both the left and top are black and the numbers are different, it’s not okay to copy either the left or the top number. If you change the number to a union-find set, merge the two numbers in the above situation, and find the number in the union-find set when judging if the numbers are equal, it seems there is no problem.
These two simple questions, just over 100 lines of code in total, took 1 hour and 45 minutes… It’s been many years since I did OI, and I didn’t participate in ACM, I’m getting old… That night, the interviewer emailed me (in English, of course) and said: I can understand your basic ideas, but there are many bugs in your program, especially the second question, but don’t worry, these two questions are not easy to solve in a short time.
Somehow, I got through the MSRA interview. I think if I had met a few ACM bulls and a few GPA bulls, I definitely wouldn’t have been selected. The HR department confirmed my check-in time with me. At the time, I was on my way to LUD (Linux User Dinner), and I didn’t hear clearly. She told me it was “onboarding”, I thought I wasn’t going to work at Microsoft, how did it become “onboarding”? Later I found out that our experimental class is the same as Microsoft interns.
Wow… the summary is over 3000 words. The 18 of us in the experimental class are about to start a new journey, and I also wish the juniors and sisters to be calm and composed in the MSRA interview next year, and show their personality and style!