The Blunder of 70,000 Text Messages
Is the blunder far from us? Not really. A while ago, the LUG server malfunctioned and mistakenly sent out 70,000 text messages, depleting the balance of the school’s text message platform. It was not until the teacher from the network center called me that I found out.
The trouble started with the service monitoring script.
To prevent text message bombing, the messages sent out had to go through my “risk control”,
It may be due to the automatic blocking of duplicate text messages by my phone or the operator,
What are the pitfalls of this blunder?
The alert text message for an automatic database reconnection should be sent only once, when the status changes, and not continuously for as long as the database is down.
When the text message log query fails, the risk control module should not assume that 0 text messages have been sent; it should assume the worst case and reject the send request.
Automatic database reconnection was a hasty addition, a lesson learned from the blog malfunction, but the modified code was never tested, and the first database failure it met after being deployed to the LUG server caused this blunder.
Alerts should not be sent by text message alone, but preferably by text message and email together, so that if something goes wrong with the text messages, the emails still get through and this kind of blunder is discovered as early as possible. I did not receive those duplicate text messages, but I would have received the duplicate emails.
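The first two pitfalls can be sketched in a few lines of Python. This is only a minimal illustration, not the actual monitoring script: the `AlertGate` class, the `count_recent` callback (standing in for the text message log query) and the quota of 5 are all hypothetical names and values chosen for the example.

```python
from typing import Callable


class AlertGate:
    """Decides whether an alert may be sent (hypothetical helper)."""

    def __init__(self, count_recent: Callable[[], int],
                 max_per_window: int = 5, initial_status: str = "up"):
        # count_recent queries the text message log and may raise,
        # for example when the database holding the log is down.
        self.count_recent = count_recent
        self.max_per_window = max_per_window
        self.last_status = initial_status

    def status_changed(self, status: str) -> bool:
        """Edge trigger: True only when the status differs from the last one."""
        changed = status != self.last_status
        self.last_status = status
        return changed

    def within_quota(self) -> bool:
        """Fail closed: a failed log query counts as an exhausted quota."""
        try:
            sent = self.count_recent()
        except Exception:
            return False  # worst case assumed, request rejected
        return sent < self.max_per_window

    def should_send(self, status: str) -> bool:
        return self.status_changed(status) and self.within_quota()


def broken_log() -> int:
    raise ConnectionError("database is down")


# Healthy log: only the two status changes produce an alert.
gate = AlertGate(count_recent=lambda: 0)
print([gate.should_send(s) for s in ["up", "down", "down", "down", "up"]])
# prints [False, True, False, False, True]

# Unreadable log: the request is rejected even though the status changed.
print(AlertGate(count_recent=broken_log).should_send("down"))  # prints False
```

Failing closed means a real alert can be lost while the log is unreadable, which is one more reason to have a second channel such as email.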
Let’s review the pitfalls of Everbright’s “blunder” incident on August 16:
The strategic investment department was not included in risk management.
In the ETF arbitrage module of the order generation system, the “re-place order” function (used to resubmit orders for stocks that had not traded) was designed incorrectly: where it should have called the “buy individual stock” function, it called the “buy ETF basket of stocks” function.
The order execution system mistakenly defaulted the purchase price of market orders to “0”, so the system could not correctly check whether a market order exceeded the account credit limit.
The “re-place order” function had never been used in real trading, so its serious program error had gone undiscovered.
It took a long time for Everbright to confirm where the problem was after receiving the notice from the Shanghai Stock Exchange.
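The price-of-“0” pitfall is easy to reproduce: any order value computed with a price of 0 is 0, which passes every limit. The sketch below is a hypothetical illustration, not Everbright’s code; the function name `order_within_credit` and the `worst_case_price` parameter (for example, the day’s upper price limit) are assumptions made for the example.

```python
from typing import Optional


def order_within_credit(quantity: int, limit_price: Optional[float],
                        worst_case_price: float, credit_limit: float) -> bool:
    """Check an order's value against the account credit limit.

    A market order has no limit price (limit_price is None), so the check
    falls back to a conservative worst-case price instead of defaulting to 0.
    """
    price = limit_price if limit_price is not None else worst_case_price
    if quantity <= 0 or price <= 0:
        # A non-positive price would make every order look free.
        raise ValueError("quantity and price must be positive")
    return quantity * price <= credit_limit


# Market order for 1,000,000 shares, worst-case price 11.0, limit 5,000,000:
print(order_within_credit(1_000_000, None, 11.0, 5_000_000.0))  # prints False

# With the price defaulted to 0, the naive check 1_000_000 * 0 <= 5_000_000
# would have passed; here the same input is rejected outright.
try:
    order_within_credit(1_000_000, 0.0, 11.0, 5_000_000.0)
except ValueError as exc:
    print("rejected:", exc)
```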
One is an automatic monitoring system, and the other is an automatic trading system, yet the causes of the two problems are strikingly similar. Automatic systems bring us convenience, but they also hide huge risks. This malfunction sounded an alarm for me, and I hope everyone will take it as a warning.