This article introduces the basic ideas behind BitVM, Bitcoin Script, and Segregated Witness. It was written by Nickqiao, Faust, and Shew Wang of Geek Web3, with the Bitlayer research team as consultants.
Abstract
Text
MATT and Commitment: The Basic Idea of BitVM
What is Bitcoin Script
How Bitcoin Script is Triggered
Segregated Witness and Witness
Recently, Delphi Digital released a research report titled “The Dawn of Bitcoin Programmability: Paving the Way for Rollups,” which systematically outlined core concepts related to Bitcoin Rollup, such as BitVM’s toolkit, OP_CAT and Covenant restrictions, the Bitcoin ecosystem DA layer, bridges, and the four major BitVM-based Bitcoin Layer2 projects – Bitlayer, Citrea, Yona, and Bob.
While the report roughly presents the landscape of Bitcoin Layer2 technologies, it lacks detailed descriptions, making it difficult for people to understand. Geek Web3 has conducted an in-depth exploration based on the Delphi report, attempting to help more people understand technologies like BitVM systematically.
We will launch a series of columns called “Getting Closer to BTC” in collaboration with the Bitlayer research team and the BitVM Chinese community, focusing on popularizing BitVM, OP_CAT, and Bitcoin cross-chain bridge topics, aiming to demystify Bitcoin Layer2 technologies for more enthusiasts.
A few months ago, Robin Linus, the founder of ZeroSync, released an article titled “BitVM: Compute Anything on Bitcoin,” formally introducing the concept of BitVM and driving the progress of Bitcoin Layer2 technologies. This can be considered one of the most revolutionary innovations in the Bitcoin ecosystem, igniting the entire Bitcoin Layer2 ecosystem and attracting projects like Bitlayer, Citrea, BOB, and others, bringing vitality to the market.
Subsequently, more researchers joined in improving BitVM, launching different iterative versions such as BitVM1, BitVM2, BitVMX, and BitSNARK. The general situation is as shown in the following figure:
Robin Linus’s original BitVM whitepaper, first proposed last year, describes a scheme built on virtual logic gate circuits; this version is referred to as BitVM0.
In subsequent talks and interviews, Robin Linus informally introduced a BitVM scheme based on a virtual CPU (referred to as BitVM1). Similar to Cannon, Optimism’s fraud-proof system, it uses Bitcoin script to emulate the effect of a general-purpose CPU running off-chain.
Robin Linus also proposed BitVM2, a permissionless, non-interactive fraud-proof protocol.
Members of Rootstock Labs and Fairgate Labs released the BitVMX whitepaper. Like BitVM1, it aims to use Bitcoin script to emulate the effect of a general-purpose CPU running off-chain.
The BitVM developer ecosystem is taking shape, and the surrounding tooling is visibly improving with each iteration. Compared to last year, the BitVM ecosystem has gone from a “castle in the air” to something “faintly visible,” attracting more developers and VCs into the Bitcoin ecosystem.
However, for most people, understanding BitVM and Bitcoin Layer2 related technical terms is not easy, as it requires a systematic understanding of the surrounding basic knowledge, especially Bitcoin scripts and Taproot background knowledge. The existing reference materials online are either too lengthy or lack thorough explanations, leaving people somewhat puzzled. We are committed to addressing these issues, striving to use clear language as much as possible to help more people understand peripheral knowledge of Bitcoin Layer2 and establish a systematic understanding of the BitVM system.
First and foremost, the basic concept behind BitVM is MATT, which stands for Merkleize All The Things. It refers to using a Merkle tree as the data structure that represents the execution of a complex program, so that fraud proofs can be verified natively on Bitcoin.
Although MATT can express a complex program and its data processing traces, it does not directly publish this data on the BTC chain because the overall scale of this data is very large. The MATT solution only stores data in a Merkle tree off-chain, releasing only the summary (Merkle Root) of the Merkle tree to the chain. This Merkle tree mainly includes three core contents:
Smart contract script code
Data required by the contract
Traces left during contract execution (changes recorded in memory, CPU registers, etc., when smart contracts run in virtual machines like EVM)
In the MATT solution, only the small-sized Merkle Root is stored on-chain, while the complete data set contained in the Merkle Tree is stored off-chain, utilizing an idea called “commitment.” Here is an explanation of what a “commitment” is.
A commitment is similar to a concise statement: it can be understood as a “fingerprint” obtained by compressing a large amount of data. Generally, whoever publishes a “commitment” on-chain is claiming that certain data stored off-chain is accurate and correct, and the commitment is the short statement that this off-chain data corresponds to.
At times, the hash of data can serve as a “commitment” to the data itself. Other commitment schemes include KZG commitment or Merkle Tree, among others. In common fraud-proof protocols used in Layer2, data publishers will publish the complete data set off-chain and the commitment of the data set on-chain. If someone discovers invalid data in the off-chain data set, they can challenge the commitment of the data on-chain.
Through commitments, Layer2 can compress a significant amount of data processing, only publishing their “commitments” on the Bitcoin chain. Of course, it is also necessary to ensure that the complete off-chain data set can be observed by external parties.
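To make the idea of a Merkle root as a commitment concrete, here is a minimal sketch. It assumes SHA-256 as the hash function and duplicates the last node on odd-sized levels; the leaf contents (`dataset`) are hypothetical stand-ins for program code, input data, and execution traces, not the encoding any BitVM implementation actually uses.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    """Compress a list of byte strings into one 32-byte commitment."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Hypothetical off-chain data set: code, input data, execution trace.
dataset = [
    b"code: x = x + 1",
    b"input: x = 41",
    b"trace: step 0, x = 41",
    b"trace: step 1, x = 42",
]
commitment = merkle_root(dataset)  # only these 32 bytes would go on-chain

# Changing any single leaf changes the root, so the root binds the whole set.
tampered = list(dataset)
tampered[3] = b"trace: step 1, x = 43"
tampered_commitment = merkle_root(tampered)
```

However large the off-chain data set grows, the on-chain footprint stays at one hash, which is why the scheme still requires the full data set to be observable off-chain.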
Currently, several major BitVM solutions like BitVM0, BitVM1, BitVM2, and BitVMX all adopt similar abstract structures:
1. Program decomposition and commitment: A complex program is first compiled down into a large number of basic opcodes, and the traces these opcodes generate during actual execution are recorded (simply put, a Trace is the record of the state changes in the CPU and memory while the program runs). All of this data, including the Trace and the opcodes, is then organized into a dataset, and a commitment to that dataset is generated. The commitment scheme can take various forms, such as Merkle trees, PIOPs (polynomial interactive oracle proofs, used in various ZK algorithms), or hash functions.
2. Asset pledging and pre-signing: Data publishers and validators need to lock a certain amount of assets on-chain through pre-signing, with specific conditions. These conditions will be triggered based on possible future scenarios. If a data publisher behaves maliciously, validators can submit evidence to claim the data publisher’s assets.
3. Data and commitment publication: Data publishers release commitments on-chain and publish the complete data set off-chain. Validators retrieve the data set and check for any errors. Each part of the off-chain data set is related to the commitment on-chain.
4. Challenge and punishment: Once a validator discovers that the data provided by the data publisher is incorrect, they bring that part of the data on-chain for direct verification (the data must first be sliced finely enough). This is the logic of fraud proofs. If verification shows that the data publisher did provide invalid data off-chain, the assets they locked are forfeited to the validator.
In summary, data publisher Alice discloses all traces of Layer2 transaction execution off-chain and publishes the corresponding commitment on-chain. To prove that a certain part of the data is incorrect, you first need to prove to a Bitcoin node that this data is bound to the on-chain commitment, that is, that it was indeed publicly disclosed by Alice, and then have the Bitcoin node confirm that the data is indeed incorrect.
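The two-part challenge described here (first show the disputed data belongs to the commitment, then show it is wrong) can be sketched as follows. This is an illustration under stated assumptions, not how any BitVM variant is implemented: the "program" is a hypothetical `step` function that adds 1, the trace encoding in `leaf` is invented, and the check runs in Python, whereas a real system must express it in Bitcoin Script.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf(i: int, state: int) -> bytes:
    """Hypothetical encoding of one trace entry."""
    return f"step {i}: x = {state}".encode()

def build_levels(leaves):
    """Return every level of the Merkle tree, leaves first, root last."""
    levels = [[h(l) for l in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur.append(cur[-1])  # pad odd levels by duplicating the last node
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    """Collect (sibling hash, node_is_left) pairs from leaf up to the root."""
    path = []
    for level in levels[:-1]:
        path.append((level[index ^ 1], index % 2 == 0))
        index //= 2
    return path

def verify(root, leaf_bytes, path):
    node = h(leaf_bytes)
    for sibling, node_is_left in path:
        node = h(node + sibling) if node_is_left else h(sibling + node)
    return node == root

def step(x: int) -> int:
    """Hypothetical program: each step adds 1."""
    return x + 1

def check_fraud(root, i, pre, post, proof_pre, proof_post) -> bool:
    # 1. Both trace entries must be covered by Alice's commitment.
    if not (verify(root, leaf(i, pre), proof_pre)
            and verify(root, leaf(i + 1, post), proof_post)):
        return False  # challenge rejected: Alice never committed to this data
    # 2. Re-execute only the disputed step and compare.
    return step(pre) != post

alice_trace = [0, 1, 2, 4, 5]  # the transition 2 -> 4 is invalid
levels = build_levels([leaf(i, s) for i, s in enumerate(alice_trace)])
root = levels[-1][0]
fraud_found = check_fraud(root, 2, 2, 4, prove(levels, 2), prove(levels, 3))
```

Note that the verifier never sees the full trace: it only needs the root, two leaves, and their Merkle paths, which is what keeps the on-chain part small.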
We now have a general understanding of the overall idea of BitVM; all BitVM variants essentially follow the paradigm above. Next, let us look at some important technologies used in this process, starting with the most basic ones: Bitcoin script, Taproot, and pre-signing.
Bitcoin is harder to understand than Ethereum in this respect: even a simple transfer involves a series of concepts, including UTXO (Unspent Transaction Output), locking script (ScriptPubKey), and unlocking script (ScriptSig). Let’s explain these key concepts first.
The “storage location” of UTXO data
It is important to note that Bitcoin and Ethereum are fundamentally different. Ethereum provides two types of accounts, contract accounts and EOA accounts, to store data. Asset balances are recorded as numbers under contract accounts or EOA accounts, all kept in a database called the “world state,” which makes it easy to locate the relevant data when transferring. Bitcoin has no “world state” design; asset data is stored in past blocks as unspent transaction outputs (UTXOs), each recorded in the output of the transaction that created it.
If you want to unlock a certain UTXO, you need to specify which transaction the UTXO information is in and provide the ID of that transaction (i.e., its hash) for Bitcoin nodes to search in the historical records. To check the balance of a Bitcoin address, you need to traverse all blocks from the beginning to find the unspent UTXO associated with that address. When using a Bitcoin wallet, you can quickly check the balance of a Bitcoin address, often because the wallet service has indexed all addresses by scanning blocks, making it easy for us to query.
When generating a transaction to transfer your UTXO to someone else, you need to mark the location of that UTXO in the Bitcoin transaction history based on the hash/ID of the transaction to which it belongs.
An interesting fact is that the results of Bitcoin transactions are computed off-chain. When a user generates a transaction on their local device, they must create all inputs and outputs themselves, essentially completing the output results of the transaction. Transactions are broadcast to the Bitcoin network and verified by nodes before being added to the chain. This “off-chain computation – on-chain verification” model is completely different from Ethereum, where only transaction input parameters are provided, and the transaction results are computed and output by Ethereum nodes.
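The UTXO model described in the last few paragraphs can be sketched in a few lines. This is a simplified illustration: the class and field names (`OutPoint`, `TxOut`, `Tx`, `lock`) are hypothetical, the locking script is reduced to a plain string, and script and signature checks are left out entirely. It shows that a spender points at an earlier output by transaction ID and index, that the spender fixes all outputs in advance, and that the node only verifies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OutPoint:
    txid: str   # ID (hash) of the transaction that created the output
    index: int  # position of the output inside that transaction

@dataclass(frozen=True)
class TxOut:
    value: int  # amount in satoshis
    lock: str   # stand-in for the locking script

@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: tuple   # tuple of OutPoint
    outputs: tuple  # tuple of TxOut

def apply_tx(utxos: dict, tx: Tx) -> None:
    """Node-side verification only; the sender already chose every output."""
    if len(set(tx.inputs)) != len(tx.inputs):
        raise ValueError("duplicate input")
    if any(op not in utxos for op in tx.inputs):
        raise ValueError("input missing or already spent")
    total_in = sum(utxos[op].value for op in tx.inputs)
    total_out = sum(out.value for out in tx.outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    for op in tx.inputs:
        del utxos[op]  # spent outputs leave the UTXO set
    for i, out in enumerate(tx.outputs):
        utxos[OutPoint(tx.txid, i)] = out

def balance(utxos: dict, lock: str) -> int:
    """What a wallet or explorer does: scan and sum matching UTXOs."""
    return sum(out.value for out in utxos.values() if out.lock == lock)

utxos = {OutPoint("tx0", 0): TxOut(50_000, "alice")}
tx1 = Tx("tx1", (OutPoint("tx0", 0),),
         (TxOut(30_000, "bob"), TxOut(19_000, "alice")))  # 1,000 sat left as fee
apply_tx(utxos, tx1)
```

There is no per-address balance stored anywhere in this sketch; `balance` has to scan, which mirrors why wallets and explorers maintain their own index.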
Additionally, a UTXO’s locking script can be customized. You can make a UTXO “unlockable by the owner of a certain Bitcoin address,” in which case the owner of that address must provide a digital signature and public key (P2PKH). With the Pay-to-Script-Hash (P2SH) transaction type, you place a script hash in the UTXO’s locking script; anyone who submits the script that matches this hash, and satisfies the conditions defined in that script, can unlock the UTXO. The Taproot scripts that BitVM relies on use features similar to P2SH.
Here we use P2PKH as an example to introduce how Bitcoin scripts are triggered. Understanding this mechanism is essential for grasping more complex concepts like Taproot and BitVM. P2PKH, short for “Pay to Public Key Hash,” places a public key hash in the locking script of a UTXO; to unlock the UTXO, the spender must submit the public key corresponding to that hash. This matches the everyday notion of a Bitcoin transfer.
In summary, under the P2PKH scheme, the transaction initiator provides an unlocking script containing a public key and a digital signature. The public key must match the public key hash specified in the UTXO locking script, and the digital signature of the transaction must be correct, for the UTXO to be unlocked.
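The P2PKH flow can be traced with a toy stack machine. The opcode names below are real Bitcoin Script opcodes, but everything else is a simplification made for this sketch: `hash160` uses a truncated double SHA-256 instead of the real RIPEMD160(SHA256(x)), `toy_sign` and `toy_checksig` are insecure placeholders for ECDSA, and the two scripts are simply concatenated before execution.

```python
import hashlib

def hash160(data: bytes) -> bytes:
    # Stand-in only: real Bitcoin uses RIPEMD160(SHA256(data)).
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:20]

def toy_sign(pubkey: bytes, tx_digest: bytes) -> bytes:
    # Placeholder, NOT a signature scheme: anyone who knows pubkey can forge it.
    return hashlib.sha256(b"toy-sig" + pubkey + tx_digest).digest()

def toy_checksig(sig: bytes, pubkey: bytes, tx_digest: bytes) -> bool:
    return sig == toy_sign(pubkey, tx_digest)

def run_script(script: list, tx_digest: bytes) -> bool:
    """Execute unlocking + locking script on one shared stack."""
    stack = []
    try:
        for op in script:
            if isinstance(op, bytes):      # data push
                stack.append(op)
            elif op == "OP_DUP":
                stack.append(stack[-1])
            elif op == "OP_HASH160":
                stack.append(hash160(stack.pop()))
            elif op == "OP_EQUALVERIFY":
                if stack.pop() != stack.pop():
                    return False
            elif op == "OP_CHECKSIG":
                pubkey, sig = stack.pop(), stack.pop()
                ok = toy_checksig(sig, pubkey, tx_digest)
                stack.append(b"\x01" if ok else b"")
            else:
                raise ValueError(f"unsupported opcode: {op}")
    except IndexError:                     # stack underflow
        return False
    return bool(stack) and stack[-1] != b""

pubkey = b"hypothetical-public-key"
tx_digest = hashlib.sha256(b"hypothetical transaction data").digest()

# Locking script, set when the UTXO is created.
locking_script = ["OP_DUP", "OP_HASH160", hash160(pubkey),
                  "OP_EQUALVERIFY", "OP_CHECKSIG"]
# Unlocking script, supplied by the spender: signature first, then public key.
unlocking_script = [toy_sign(pubkey, tx_digest), pubkey]

spend_ok = run_script(unlocking_script + locking_script, tx_digest)
```

The unlocking script only pushes data; all the conditions live in the locking script, which is why the UTXO creator decides how it can later be spent.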
Bitcoin supports various transaction types beyond Pay to Public Key and Pay to Public Key Hash, including P2SH (Pay to Script Hash); which one applies depends on how the locking script of the UTXO was customized when it was created.
It is important to note that under the P2SH scheme, the locking script can preset a Script Hash, and the unlocking script must fully submit the content corresponding to the Script Hash. By executing this script, Bitcoin nodes can implement multi-signature wallet functionalities on the Bitcoin chain.
Under the P2SH scheme, creators of UTXOs need to ensure that the parties intending to unlock the UTXOs are aware of the content corresponding to the Script Hash. By both parties understanding this script content, more complex business logic beyond multi-signature functionalities can be achieved.
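The two-stage P2SH check (first match the hash, then execute the revealed script) can be sketched as below. This is a toy under stated assumptions: `script_hash` uses truncated SHA-256 in place of HASH160, the script is a space-separated text string rather than real script bytes, and the hypothetical redeem script is an arithmetic puzzle standing in for something like a multi-signature condition.

```python
import hashlib

def script_hash(script: bytes) -> bytes:
    # Stand-in for HASH160 (RIPEMD160 of SHA256) used by real P2SH.
    return hashlib.sha256(script).digest()[:20]

def eval_redeem(script: bytes, args: list) -> bool:
    """Tiny evaluator: integers are pushed, two opcodes are supported."""
    stack = list(args)
    try:
        for tok in script.decode().split():
            if tok == "OP_ADD":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif tok == "OP_EQUAL":
                b, a = stack.pop(), stack.pop()
                stack.append(int(a == b))
            else:
                stack.append(int(tok))
    except (IndexError, ValueError):
        return False
    return bool(stack) and stack[-1] != 0

def spend_p2sh(committed_hash: bytes, redeem_script: bytes, args: list) -> bool:
    # Stage 1: the revealed script must match the hash in the locking script.
    if script_hash(redeem_script) != committed_hash:
        return False
    # Stage 2: the revealed script is executed against the supplied arguments.
    return eval_redeem(redeem_script, args)

# UTXO creator: only the hash goes into the locking script.
redeem_script = b"OP_ADD 10 OP_EQUAL"   # "the two arguments must sum to 10"
committed_hash = script_hash(redeem_script)

# Spender: must know the full script and provide satisfying arguments.
ok = spend_p2sh(committed_hash, redeem_script, [3, 7])
```

Until the UTXO is spent, the chain only sees `committed_hash`, which is why the spender has to learn the script content from the creator through some other channel.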
It is worth mentioning that the Bitcoin chain (blocks) does not directly record the association between UTXOs and addresses. Instead, it registers which public key hash / script hash can unlock a UTXO. However, based on the public key hash / script hash, the corresponding address (the seemingly random string displayed in wallet interfaces) can be calculated quickly.
The reason we can see the Bitcoin balance under a specific address in block explorers and wallet interfaces is because these platforms interpret this data. They scan all blocks and calculate the corresponding “address” based on the public key hash / script hash declared in the locking script, then display the Bitcoin balance under that address.
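For legacy address types, the calculation that turns a public key hash or script hash into a displayed address is Base58Check encoding: a version byte, the 20-byte hash, and a 4-byte double-SHA-256 checksum, written in a 58-character alphabet. The sketch below implements that encoding; the hash value itself is a made-up placeholder, and SegWit addresses use a different encoding (Bech32) that is not shown here.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def checksum(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]

def base58check_encode(version: int, payload: bytes) -> str:
    data = bytes([version]) + payload
    data += checksum(data)
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, r = divmod(n, 58)
        out = ALPHABET[r] + out
    pad = len(data) - len(data.lstrip(b"\x00"))  # leading zero bytes become "1"
    return "1" * pad + out

def base58check_decode(addr: str):
    n = 0
    for ch in addr:
        n = n * 58 + ALPHABET.index(ch)
    pad = len(addr) - len(addr.lstrip("1"))
    data = b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")
    if checksum(data[:-4]) != data[-4:]:
        raise ValueError("bad checksum")
    return data[0], data[1:-4]  # (version byte, 20-byte hash)

# Placeholder 20-byte value standing in for a real public key hash.
pubkey_hash = hashlib.sha256(b"hypothetical public key").digest()[:20]

p2pkh_address = base58check_encode(0x00, pubkey_hash)  # version 0x00: P2PKH
p2sh_address = base58check_encode(0x05, pubkey_hash)   # version 0x05: P2SH
```

Because the mapping is deterministic and reversible, an explorer can go from the hash in a locking script to the address string and back without any extra on-chain data.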
Once we understand the concept of P2SH, we are closer to Taproot, a technology that BitVM relies on. However, before delving into Taproot, it is crucial to understand a key concept: Witness and Segregated Witness.
Looking back at unlocking scripts, locking scripts, and the UTXO unlocking process, a problem emerges: the digital signature of a transaction is contained in the unlocking script, yet the signature cannot cover the unlocking script itself (the data being signed cannot include the signature). The digital signature therefore covers only the main part of the transaction data, not the transaction data in its entirety.
As a result, even if an intermediary tampers with the unlocking script of a transaction, the verification result will not be affected. For instance, Bitcoin nodes or mining pools can insert additional data into the unlocking script of a transaction without affecting the verification process, causing subtle changes in the transaction data, and altering the calculated transaction hash / transaction ID. This is known as the transaction malleability issue.
The downside of this issue is that when multiple transactions need to be sequentially initiated with dependencies (e.g., Transaction 3 references the output of Transaction 2, which in turn references the output of Transaction 1), later transactions will need to reference the IDs (hashes) of the previous ones. Any intermediary, such as mining pools or Bitcoin nodes, can slightly adjust the content of the unlocking script, causing inconsistencies between the on-chain hash of the transaction and the user’s expectations, rendering pre-established sequences of transactions invalid.
In reality, in scenarios like DLC bridges and the BitVM2 solution, there is a need to construct batches of transactions with sequential dependencies. Therefore, the mentioned scenario of pre-established sequences with dependencies is quite common.
Simply put, the transaction malleability issue arises because the ID/hash of a transaction includes the data from the unlocking script during calculation. Intermediaries like Bitcoin nodes can slightly adjust the content of the unlocking script, leading to discrepancies between the calculated transaction ID and what the user expects. This is a historical burden left from Bitcoin’s early design considerations.
The Segregated Witness (SegWit) upgrade was later introduced to completely decouple the transaction ID from the unlocking script data. UTXO locking scripts that follow the SegWit rules begin with an “OP_0” opcode as a marker, and the corresponding unlocking data is no longer called ScriptSig but Witness.
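The difference between the two ID calculations can be illustrated with a toy model. The serialization below is heavily simplified (real transactions also carry version, locktime, amounts, and so on) and the byte strings are placeholders; the only point is which fields feed the hash. In the legacy form the unlocking script is hashed into the ID; in the SegWit form the witness is left out.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def legacy_txid(prev_txid: bytes, script_sig: bytes, outputs: bytes) -> bytes:
    # Legacy: the unlocking script is part of the hashed data.
    return dsha256(prev_txid + script_sig + outputs)

def segwit_txid(prev_txid: bytes, witness: bytes, outputs: bytes) -> bytes:
    # SegWit: the witness is deliberately excluded from the ID calculation.
    return dsha256(prev_txid + outputs)

prev = bytes(32)                                   # placeholder parent ID
outputs = b"pay 1000 sat to hypothetical recipient"
unlock = b"<signature> <public key>"               # placeholder unlocking data
malleated = b"<extra data push> " + unlock         # altered by an intermediary

# Legacy: Transaction 2 was pre-built to spend the expected ID of Transaction 1.
expected_id = legacy_txid(prev, unlock, outputs)
confirmed_id = legacy_txid(prev, malleated, outputs)
legacy_child_valid = expected_id == confirmed_id

# SegWit: the same tampering leaves the ID, and the dependent chain, intact.
segwit_expected = segwit_txid(prev, unlock, outputs)
segwit_confirmed = segwit_txid(prev, malleated, outputs)
segwit_child_valid = segwit_expected == segwit_confirmed
```

Here `legacy_child_valid` ends up false and `segwit_child_valid` true, which is the property that pre-signed, dependent transaction sequences rely on.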
Following the SegWit rules resolves the transaction malleability issue: intermediaries can no longer change a transaction’s ID by altering its unlocking data. P2WSH (Pay to Witness Script Hash), the SegWit counterpart of the P2SH scheme discussed earlier, works in essentially the same way. You preset a script hash in the UTXO locking script, and whoever submits the unlocking data must provide the script content corresponding to that hash so that it can be executed on-chain.
However, if the script content to be implemented is extensive, containing a lot of code, conventional methods may not be sufficient to submit the complete script to the Bitcoin chain (each block has size limitations). This is where Taproot comes in, streamlining the on-chain script content and forming the basis for the complex solutions built on BitVM.