2013-10-11
The Blunder of 70,000 Text Messages

Are blunders like this far from us? Not really. A while ago, the LUG server malfunctioned and mistakenly sent out 70,000 text messages, draining the balance of the school’s text message platform. I only found out when a teacher from the network center called me.

The trouble started with the service monitoring script. It reads site information from the database, periodically visits the monitored sites, and sends a text message alert to the site owner when it detects a problem. When the script fails to connect to the database, it also sends an alert to me. Originally it did not try to reconnect, so it sent only one alert, but the monitoring service could not resume on its own after the database was restored. That bug surfaced during a blog outage, so the script was changed to reconnect automatically. The alert logic, however, was left untouched, so as long as the database stayed unreachable, it kept sending alerts.
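
One way to keep automatic reconnection without repeating the alert is to notify only when the database state changes from reachable to unreachable. The following is a minimal shell sketch of that idea, not the script that actually ran on the LUG server; `should_alert` and the `up`/`down` state strings are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: decide whether an alert should be sent.
# $1 = previous database state, $2 = current state ("up" or "down").
# Returns 0 (alert) only on the up -> down transition, so a database that
# stays down across many reconnect attempts triggers a single message.
should_alert() {
    [ "$1" = "up" ] && [ "$2" = "down" ]
}

# Simulate a monitoring loop in which the database is down for three checks.
prev=up
alerts=0
for cur in up down down down up; do
    if should_alert "$prev" "$cur"; then
        alerts=$((alerts + 1))
    fi
    prev=$cur
done
echo "alerts sent: $alerts"
```

In this simulated run the three consecutive failures produce one alert instead of three.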

To prevent text message bombing, outgoing messages had to pass my “risk control”, which limits the number of messages sent to each phone number within 24 hours. The check queries the text message log table in the database for the number of messages sent to that number in the last 24 hours. When the database is down, the query yields NULL, which PHP implicitly converts to 0, so the check concludes that the limit has not been exceeded and lets the message through. The school’s text message gateway has no “risk control” of its own either, so a flood of text messages poured into the operator’s network.
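
The underlying problem is that the check fails open: a missing count is treated as zero. A minimal shell sketch of a check that fails closed instead is shown below. It is an illustration only, not the PHP code from the incident; `allow_sms` and the limit of 10 are hypothetical.

```shell
#!/bin/sh
# Hypothetical risk-control check that fails closed.
# $1 = count returned by the log query (may be empty or "NULL" on failure)
# $2 = maximum number of messages allowed per phone number in 24 hours
allow_sms() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, NULL or non-numeric: refuse to send
    esac
    [ "$1" -lt "$2" ]
}

# Try a normal count, a count at the limit, and two failed-query results.
for count in 3 10 "" NULL; do
    if allow_sms "$count" 10; then
        echo "count='$count': send"
    else
        echo "count='$count': block"
    fi
done
```

With this shape, a failed query blocks the message rather than letting it through, so a database outage cannot silently disable the limit.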

2013-10-07
VirtualStore in Win Vista/7/8

Anyone who has used Windows Vista/7/8 may have run into this: after modifying a file on the C drive with a 32-bit program (such as Cygwin), Windows Explorer still shows the version from before the modification! Does the file system present different views to different programs? Exactly. Ever since Vista introduced UAC and VirtualStore, you cannot trust changes made by 32-bit programs on the C drive.

Since Windows Vista introduced stronger security mechanisms, some important system locations are no longer writable by everyone. These include the C drive root directory, Program Files, Program Files (x86), Windows, and the registry’s HKEY_LOCAL_MACHINE. But some old applications still assume these locations are writable, and if the system API simply returned access denied, these programs could not run.

Therefore, Vista provides VirtualStore. When a 32-bit program running without administrator privileges writes to these directories, the modified or added files are copied to that user’s VirtualStore. From then on, a 32-bit program running as that user sees the VirtualStore copy at that path and knows nothing about any changes made to the file at the original path.

2013-10-03
Make OpenVPN Not Use VPN by Default

Some LUG VPN users want to use the VPN only for certain specific IPs, while OpenVPN by default sends all traffic through the VPN. Perhaps my search skills are too poor, but I could not find a reliable answer on Google. Impatient readers can go straight to my solution:

$ echo "script-security 2" >>/etc/openvpn/client.conf
$ echo "up /usr/local/bin/remove-ovpn-defroute" >>/etc/openvpn/client.conf

$ cat /usr/local/bin/remove-ovpn-defroute
#!/bin/sh
(
sleep 2 # wait for OpenVPN to finish adding its routes
ip route del 0.0.0.0/1 dev tun0
ip route del 128.0.0.0/1 dev tun0
) &
exit 0

2013-10-01
How Internet Videos are Delivered to Every Household

A few days ago at a student gathering, someone raised a question: Why is it that many people can watch TV without lag, but watching live video on the internet lags? Television is broadcast (essentially the same as radio), while internet video is delivered via point-to-point IP networks. For each additional viewer, the server has to send an additional set of data.
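
To make the difference concrete, the arithmetic can be sketched in a few lines of shell. The viewer count and bitrate below are hypothetical figures chosen only for illustration, and `unicast_mbps` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical figures: server upload bandwidth for unicast vs. broadcast.
# $1 = number of viewers, $2 = stream bitrate in Mbit/s
unicast_mbps() {
    echo $(( $1 * $2 ))
}

viewers=10000
bitrate=2
# Unicast: one copy of the stream per viewer.
echo "unicast:   $(unicast_mbps "$viewers" "$bitrate") Mbit/s"
# Broadcast: a single copy, no matter how many people tune in.
echo "broadcast: $bitrate Mbit/s"
```

Under these assumed numbers the unicast server has to push 20000 Mbit/s, while a broadcast transmitter sends the same 2 Mbit/s regardless of audience size.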

So how exactly is internet video delivered to every household? I’ve stolen some popular science knowledge from the top academic conference in the communications field, SIGCOMM 2013, to share with everyone.

2013-09-29
Algorithm Problems are not the Only Standard to Evaluate Programmers

Many companies have started their interview process recently. A few days ago, a good friend of mine applied for a job at an internet company known as “Engineer’s Paradise”, but unfortunately failed in the first round of interviews. The reason might be that he spent an hour on a tree algorithm problem. I think it’s too rigid to judge a person’s ability based on a single algorithm problem. Of course, such companies may have too many applicants and don’t have time to carefully evaluate each one.

Tonight, a classmate took me to visit Tsinghua University, and we took a detour to Tsinghua Science Park, so I suppose I can say I have been to the door of this company that countless programmers aspire to. On the way back, I thought about the possible drawbacks of interviews based purely on algorithm problems. On coolshell, I found articles with views similar to mine: “Why I Oppose Pure Algorithm Interview Questions”, “How I Hire Programmers”, and “Further Discussion on ‘How I Hire Programmers’”. So I plucked up the courage to share my simple & naive views with you all. Criticism of all kinds is welcome.

2013-09-15
Reading Report of “Seven Databases in Seven Weeks”

I just finished my makeup exam for the database course. Posting my reading report here to share (and hopefully pass~).

“Seven Databases in Seven Weeks” was published in 2012. It introduces seven of the most popular open-source databases at the time, including a relational database (PostgreSQL), key-value databases (Riak, Redis), a column-oriented database (HBase), document-oriented databases (MongoDB, CouchDB), and a graph database (Neo4j). Except for PostgreSQL, the other six are collectively called NoSQL, meaning they do not use the relational model and do not use SQL as their query language.

The book follows the same format as “Seven Languages in Seven Weeks”: one chapter per database, each chapter divided into three sections called Day 1, Day 2, and Day 3. Unlike official database documentation, this book does not simply introduce each technology, but explores the core concepts of each one, helping readers understand the strengths and weaknesses of each database and which database to use for which kinds of requirements.

2013-09-06
A Slash Triggers a Bloodbath

Note: If you are not familiar with mirrors, please read “How is the USTC Open Source Software Mirror Made?” first.

Trouble Starts with iSCSI

The story begins on June 26, 2013. Mirrors has a disk array attached directly by a network cable over the iSCSI protocol, with an XFS file system on it. Around 14:00 on June 26, stephen reported on the mailing list that mirrors was down. According to syslog, the iSCSI connection timed out at 13:58, so access to sdg failed and a large number of I/O operations hung. This in turn hung nginx, and mirrors could not be reached over HTTP. A few minutes later the I/O operations timed out and nginx returned to normal, but the sources stored on the disk array remained unavailable.

2013-09-05
My Year in LUG

Today, Guangyu said that he wanted a summary of last year’s work, so this article came into being. The main work of LUG is divided into activities and network services.

Activities

Let’s review what happened in the past year (http://lug.ustc.edu.cn/wiki/lug/events)

Recruitment

Together with other clubs, we set up stalls in the second week of school at the east and west campus activity centers. One lesson we picked up from the others: make some display boards and roll-up banners to increase visibility.

Although recruitment is hard work, it is fun to chat with students from various departments, and handing out flyers gives you a taste of being turned down again and again. A bit of gossip: the current LUG president’s sister was found at that time~

2013-09-04
From IP Networks to Content Networks

Is IP Enough?

Ever since middle school computer classes, we have been taught the so-called “OSI seven-layer model” of computer networks, and I remember memorizing a lot of concepts back then. Those rotten textbooks have ruined many computer geniuses. In fact, the model is not hard to understand. The list below uses the common five-layer simplification, which folds the OSI session and presentation layers into the application layer (those who have studied computer networks can skip this):

  1. Physical Layer: This is the medium for signal transmission, such as optical fiber, twisted pair (the network cable we commonly use), air (Wi-Fi)… Each medium requires its own encoding and modulation methods to convert data into electromagnetic waves for transmission.
  2. Data Link Layer: Let’s use an analogy. When speaking, you might accidentally say something wrong or hear something wrong, so you need a mechanism to correct errors and ask the other party to repeat (checksum, retransmission); when several people want to speak, you need a way to arbitrate who speaks first and who speaks later (channel allocation, carrier listening); a person needs to signal before and after speaking, so that others know he has finished speaking (framing).
  3. Network Layer: This was the most contested part in the early days of computer networks. Traditional telecom giants believed that a portion of the bandwidth should be reserved along the path between the two endpoints, establishing a “virtual circuit” for their communication. However, during the Cold War, the U.S. Department of Defense required that the network keep working even if several lines in the middle were destroyed. Therefore, the “packet switching” scheme was finally adopted, which splits the data into small pieces that are packaged and delivered separately. It works like mailing a letter: to deliver it to a distant machine, you write the address on the envelope, and the address should let the postman know which way to go to reach the next post office (using an ID number as the address, for example, is a bad idea). The IP protocol is the de facto standard network layer protocol, and everyone is familiar with IP addresses.
  4. Transport Layer: The most important application of computer networks in the early days was to establish a “connection” between two computers: remote login, remote printing, remote file access… The transport layer abstracts the concept of connection based on network layer data packets. The main difference between this “connection” and “virtual circuit” is that the “virtual circuit” reserves a certain bandwidth, while the “connection” is best-effort delivery, without any guarantee of bandwidth. Since most of the traffic on the Internet is bursty, packet switching improves resource utilization compared to virtual circuits. In fact, history often repeats itself. Nowadays, in data centers, due to predictable and controllable traffic, we are returning to the centrally controlled bandwidth reservation scheme.
  5. Application Layer: Little needs to be said here. HTTP (which the Web is built on), FTP, and BitTorrent are all application layer protocols.

2013-09-01
How is the USTC Open Source Software Mirror Made?

Update (2014-09-29): Because the mirrors configuration file contained some inappropriate content, it is no longer public, and some links in this article are now dead. I’m very sorry.

The USTC open source software mirror (mirrors.ustc.edu.cn) suffered a disk failure and has not fully recovered since the failure in July, and stephen, tux and I (boj) are all away from school, so it is time to start over. This time the rebuild will be carried out entirely by students on campus, which is also a chance for them to practice their skills. Here I will briefly explain what the open source software mirror consists of and how to build it. Since sourceforge is still waiting for us to synchronize, we hope to restore basic services within three days and rebuild the entire system, including synchronization, within a week.

WTF?

The so-called open source software mirror synchronizes some GNU/Linux distributions and well-known open source software repositories from their official sites. By modifying a configuration file, users can switch to a nearby mirror, which speeds up downloads and reduces the load on the official sites.
