Data availability is the process that allows block producers to publish all transaction data in a block to the network and enables verifiers to download it. The role of data availability in Layer 2 development has caused controversy, leading Ethereum Foundation researcher Dankrad Feist to tweet that a solution that does not use Ethereum for data availability is not considered Layer 2. This article aims to explore what data availability is, the challenges it poses in Layer 2, and the controversy surrounding its implementation.
In simple terms, data availability refers to the act of block producers releasing all transaction data in a block to the network, allowing verifiers to download it. If a block producer releases the complete data and enables verifiers to download it, the data is considered available. However, if some data is withheld, making it impossible for verifiers to access the complete data, it is considered unavailable.
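The definition above can be sketched as a simple check. This is a minimal illustration, not how any particular chain implements it: the `commitment` function (a SHA-256 hash over length-prefixed chunks) and the `is_available` helper are hypothetical names chosen for this example, and real networks use more elaborate commitments than a flat hash.

```python
import hashlib
from typing import Optional, Sequence


def commitment(chunks: Sequence[bytes]) -> str:
    """Hypothetical commitment: SHA-256 over length-prefixed data chunks."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(len(chunk).to_bytes(4, "big"))  # length prefix avoids ambiguous joins
        h.update(chunk)
    return h.hexdigest()


def is_available(header_commitment: str,
                 downloaded: Sequence[Optional[bytes]]) -> bool:
    """Data is available only if every chunk could be downloaded
    and the chunks match what the block producer committed to."""
    if any(chunk is None for chunk in downloaded):
        return False  # at least one chunk was withheld
    return commitment(downloaded) == header_commitment
```

For instance, a verifier that receives `[b"tx1", None, b"tx3"]` for a block committing to three chunks would treat the block's data as unavailable, because one chunk was withheld.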
It is important to differentiate between data availability and data retrievability. Data availability concerns the stage when a block has been produced but has not yet been added to the blockchain through consensus. It is not about historical data; it is about whether newly generated data has been published so that it can pass through consensus. Data retrievability, on the other hand, concerns the stage after consensus, when data has been permanently stored on the blockchain and historical data can be retrieved. In Ethereum, nodes that store all historical data are called archive nodes.
Therefore, the data availability layer is the place where L2 releases transaction data. Currently, mainstream L2 solutions use Ethereum as the data availability layer.
The secure operation of the verification mechanism in L2 depends on data availability in two ways, according to the type of proof the L2 solution uses. With fraud proofs, if a sequencer does not publish the complete data needed to reconstruct the L2 state, challengers cannot initiate valid challenges. With validity proofs, as in a ZK Rollup, the proof itself can be verified without the underlying data, but the data is still necessary for the solution as a whole to function: without data from which the state can be reconstructed, users cannot verify their balances and may lose access to their assets.
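The dependence of fraud proofs on published data can be sketched as follows. This is a toy model under stated assumptions, not the mechanism of any real rollup: the state is a dictionary of balances, the state root is a SHA-256 hash of its JSON encoding, and `state_root`, `apply_batch`, and `check_batch` are hypothetical names introduced only for illustration.

```python
import hashlib
import json
from typing import Dict, List, Optional, Tuple

Tx = Tuple[str, str, int]  # (sender, receiver, amount), a toy transaction format


def state_root(balances: Dict[str, int]) -> str:
    """Toy state root: hash of the canonically encoded balances."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_batch(balances: Dict[str, int], txs: List[Tx]) -> Dict[str, int]:
    """Re-execute a batch; transfers that would overdraw the sender are skipped."""
    new = dict(balances)
    for sender, receiver, amount in txs:
        if new.get(sender, 0) >= amount:
            new[sender] = new.get(sender, 0) - amount
            new[receiver] = new.get(receiver, 0) + amount
    return new


def check_batch(pre_state: Dict[str, int],
                published_txs: Optional[List[Tx]],
                claimed_root: str) -> str:
    """What a challenger can conclude about a sequencer's claimed state root."""
    if published_txs is None:
        # Without the transaction data there is nothing to re-execute.
        return "cannot challenge: data withheld"
    if state_root(apply_batch(pre_state, published_txs)) == claimed_root:
        return "no fraud"
    return "challenge: claimed root is wrong"
```

The point of the sketch is the first branch: when the sequencer withholds the batch, the challenger cannot tell an honest root from a fraudulent one, so the fraud proof mechanism stops working.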
To ensure secure verification, current L2 sequencers generally publish L2 state data and transaction data on Ethereum, relying on Ethereum for settlement and data availability. This means that the data availability layer in practice is where L2 releases transaction data, and Ethereum serves as the settlement and consensus layer.
However, this approach incurs significant costs, because every batch of L2 data must be paid for at Ethereum's prices. There are two main ways to reduce these costs: lowering the cost of publishing data on Ethereum, or moving data availability, like transaction execution before it, off Ethereum altogether. The latter means not using Ethereum as the data availability layer.
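To give a rough sense of where the cost comes from, the sketch below estimates the gas charged for publishing a batch as transaction calldata. It assumes the per-byte calldata schedule of EIP-2028 (4 gas per zero byte, 16 gas per non-zero byte) and ignores the base transaction cost and any later pricing changes; the function names and the gas price used in the usage note are placeholders, not real measurements.

```python
ZERO_BYTE_GAS = 4      # assumed: EIP-2028 price per zero byte of calldata
NONZERO_BYTE_GAS = 16  # assumed: EIP-2028 price per non-zero byte of calldata


def calldata_gas(data: bytes) -> int:
    """Gas charged for the calldata bytes alone (no base transaction cost)."""
    return sum(ZERO_BYTE_GAS if b == 0 else NONZERO_BYTE_GAS for b in data)


def calldata_cost_eth(data: bytes, gas_price_gwei: float) -> float:
    """Convert the calldata gas into ETH at a given gas price (1 gwei = 1e-9 ETH)."""
    return calldata_gas(data) * gas_price_gwei * 1e-9
```

With a hypothetical gas price of 20 gwei, `calldata_cost_eth(batch, 20)` scales linearly with the size of `batch`, which is why compressing the data or moving it elsewhere both reduce what an L2 pays.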
Where L2 solutions should obtain data availability is a matter of controversy. Modular blockchains aim to decouple a blockchain's core functions and scale its performance by combining different specialized networks. While there is still debate about how the layers of a modular blockchain should be divided, the generally accepted classification includes the execution layer, settlement layer, consensus layer, and data availability layer. Currently, L2 solutions separate the execution layer from Ethereum, but the other three layers still rely on Ethereum. However, due to cost considerations, many L2 solutions are preparing to decouple the data availability layer from Ethereum as well and use Ethereum only for settlement and consensus.
Interestingly, the Ethereum community seems reluctant to allow L2 solutions to obtain data availability from other sources. Dankrad Feist, a researcher at the Ethereum Foundation, tweeted that a solution that does not use Ethereum for data availability is not considered a Rollup or L2. L2BEAT, in its definition of L2, also states that any scaling solution that does not publish its data on L1 is not considered L2, as there is no guarantee that operators will make the data available.
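The four-layer classification and the strict definition described above can be expressed as a small data model. This is only an illustrative sketch: `ModularStack`, `counts_as_l2_strict`, and the provider label "external-da" are hypothetical names, and the check encodes the position attributed to Feist and L2BEAT in this article rather than any official specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModularStack:
    """Which network provides each of the four modular layers."""
    execution: str
    settlement: str
    consensus: str
    data_availability: str


def counts_as_l2_strict(stack: ModularStack, l1: str = "ethereum") -> bool:
    """Under the strict view, a solution is an L2 only if it publishes its data on L1."""
    return stack.data_availability == l1


# A rollup that keeps data on Ethereum versus one that moves it elsewhere.
rollup = ModularStack("l2", "ethereum", "ethereum", "ethereum")
offchain_da = ModularStack("l2", "ethereum", "ethereum", "external-da")
```

Here `counts_as_l2_strict(rollup)` is true while `counts_as_l2_strict(offchain_da)` is false, which is exactly the distinction the controversy is about.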
Despite the controversy, projects related to the data availability layer continue to thrive. In the next article on data availability, the author will provide a detailed introduction to the main data availability solutions and related projects in the market.