On February 6th, 2024, the Solana network experienced another “long-awaited” outage; the previous one occurred around February 25th, 2023. According to Matthew Sigel, Head of Digital Asset Research at VanEck, the outage was caused by a failure in the BPF (Berkeley Packet Filter) loader, the mechanism for deploying, upgrading, and executing programs on Solana.
This may be related to an earlier SIMD proposal that added an interceptor to prevent the use of metadata in BPF, since that metadata was no longer needed. The interceptor came with the 0093 upgrade, which contained a bug. The bug was discovered and a fix was created on the test network, but the fix had not yet been deployed. It is speculated that someone manually triggered the bug, causing the outage.
Solana’s outages have drawn criticism from the community in the past. Although the network has been relatively stable over the past year, it has experienced a number of outages and network freezes, summarized below:
1. On February 6th, 2024, there was a failure in the BPF loader, and the outage lasted for 4 hours and 46 minutes.
2. On February 25th, 2023, the Solana mainnet had performance issues and was unable to process user transactions. Solana later released an improved network upgrade plan, including improving the Solana upgrade process, establishing a response team, and improving the restart process.
3. Around October 1st, 2022, the network crashed due to node configuration errors.
4. Around August 3rd, 2022, a large-scale theft hit Solana wallets; it was ultimately traced to a vulnerability involving a centralized Sentry server.
5. Around June 1st, 2022, there was a network restart due to a durable nonce vulnerability in transactions, causing an interruption of approximately 4.5 hours.
6. Around May 1st, 2022, the minting of a new NFT project triggered a flood of bot transactions, causing mainnet nodes to lose consensus and halting block production for up to 7 hours.
7. Around January 21st, 2022, amid significant market volatility, a flood of arbitrage bot transactions overwhelmed the network, causing severe load and a 30-hour interruption, although the official classification at the time was “degraded performance.” The Solana community later updated the mainnet to version 1.8.14 in an attempt to improve network conditions.
8. Around September 14th, 2021, during the IDO of Grape Protocol (a decentralized social networking protocol) on the Raydium platform, many users sent a large number of transactions through scripts, causing “memory overflow” and node crashes and leaving the entire network unable to produce blocks for 17 hours.
9. Around September 3rd, 2021, the network experienced instability and degraded performance for approximately 1 hour.
10. Around May 4th, 2021, there was a decrease in network performance, causing a large number of transactions to fail to be executed.
Looking back at these events, a surge in transactions has been the main cause of network disruptions, which may be related to Solana’s design. According to Wu Zhixiwei, Director of the Border Intelligence Research Institute, Solana treats consensus messages as a special type of transaction message passed between validator nodes, so when a large volume of messages clogs the network, consensus messages cannot be transmitted properly and consensus cannot be reached.
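The congestion effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Solana’s actual networking code: the function name `delivered_votes`, the single bounded queue, and the worst-case arrival order (all spam first) are hypothetical simplifications chosen for illustration.

```python
from collections import deque


def delivered_votes(num_votes: int, num_spam: int, capacity: int) -> int:
    """Count consensus votes that fit into a bounded queue shared with other traffic.

    Assumption (hypothetical): spam arrives before the votes, and any message
    that arrives while the queue is full is simply dropped.
    """
    queue = deque()
    arrivals = ["spam"] * num_spam + ["vote"] * num_votes
    for msg in arrivals:
        if len(queue) < capacity:
            queue.append(msg)  # accepted
        # otherwise the message is dropped
    return sum(1 for msg in queue if msg == "vote")


# Light load: every vote gets through.
print(delivered_votes(num_votes=10, num_spam=5, capacity=100))
# Heavy spam: the queue fills before any vote arrives.
print(delivered_votes(num_votes=10, num_spam=500, capacity=100))
```

In this simplified model, votes compete with ordinary traffic for the same capacity, so a large enough burst of spam crowds them out entirely.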
In addition, some of Solana’s features have been exploited in targeted ways, leading to network outages. For example, when transactions take write-locks on many important addresses, the concurrent transaction processor is forced to execute them sequentially rather than concurrently, greatly reducing message processing capacity. Likewise, nodes retain information about possible forks in order to handle forks, which can lead to memory overflow.
Faced with recurring performance degradation or outages caused by surges in junk transactions, Solana co-founder Anatoly Yakovenko has acknowledged the issue and said that “actual flow control” would be introduced to address it. As for outages caused by factors such as transaction nonces or node configuration errors, Solana officials have promptly released patched versions for node upgrades.
This year’s outage, coming roughly a year after the previous one, may be both good and bad news, but it serves as a warning, especially as the Solana ecosystem grows in popularity. Network stability remains an essential aspect that requires significant attention.
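The write-lock effect can be shown with a minimal sketch. It assumes a simplified greedy scheduler in which transactions can share a parallel batch only if their write-locked accounts do not overlap. The function `schedule_batches` and the account names are hypothetical and do not represent Solana’s real scheduler.

```python
from typing import List, Set, Tuple

# A hypothetical transaction: (name, set of accounts it write-locks).
Tx = Tuple[str, Set[str]]


def schedule_batches(txs: List[Tx]) -> List[List[str]]:
    """Greedily pack transactions into batches with disjoint write-lock sets.

    Transactions in one batch could run in parallel; batches run one after another.
    """
    batches: List[List[str]] = []
    locked: List[Set[str]] = []  # accounts write-locked by each batch
    for name, accounts in txs:
        for i, held in enumerate(locked):
            if held.isdisjoint(accounts):
                batches[i].append(name)
                held.update(accounts)
                break
        else:
            # Conflicts with every existing batch, so start a new one.
            batches.append([name])
            locked.append(set(accounts))
    return batches


independent = [(f"tx{i}", {f"acct{i}"}) for i in range(4)]
contended = [(f"tx{i}", {"hot_account"}) for i in range(4)]

print(len(schedule_batches(independent)))  # number of sequential steps needed
print(len(schedule_batches(contended)))
```

With independent accounts, all four transactions fit in a single parallel batch. When every transaction write-locks the same account, each needs its own batch, so execution degrades to fully sequential processing.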