Despite the currently minimal risk associated with duplicate transactions, this is an interesting and peculiar bug worth considering. This article originates from BitMEX Research and has been compiled by ForesightNews.
The Bitcoin blockchain contains two pairs of identical transactions, one pair sandwiched between the other, all dating from mid-November 2010. Duplicate transactions cause confusion, and Bitcoin developers have tackled them in several ways over the years. The issue is still not fully resolved, and the next potential duplicate transaction could arise in 2046. Although the risk posed by duplicate transactions is currently minimal, it is an interesting and peculiar bug worth pondering.
Overview
A normal Bitcoin transaction spends at least one output of a previous transaction by referencing that transaction's ID (TXID). An unspent output can only be spent once; if it could be spent twice, double spending would be possible and Bitcoin would be worthless. Even so, Bitcoin contains two pairs of identical transactions. This is possible because a coinbase transaction has no real transaction inputs; it creates newly minted coins instead. Two different coinbase transactions can therefore send the same amount to the same address and be constructed in exactly the same way, which makes them identical. Because the transactions are identical, their TXIDs match as well, since a TXID is a hash digest of the transaction data. The only other way for a TXID to be duplicated is a hash collision, which is considered impractical for a cryptographically secure hash function. No SHA256 collision has ever been found, in Bitcoin or anywhere else.
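To make the "same data, same TXID" point concrete, here is a minimal sketch. It assumes the standard convention that a TXID is the double SHA256 of the serialized transaction, displayed in reversed byte order. The byte strings are hypothetical placeholders, not real serialized transactions.

```python
import hashlib

def txid(raw_tx: bytes) -> str:
    """Double SHA256 of the serialized transaction, shown in reversed byte order."""
    digest = hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()
    return digest[::-1].hex()

# Hypothetical placeholder bytes standing in for serialized coinbase transactions.
coinbase_a = b"coinbase: pay 50 BTC to address X"
coinbase_b = b"coinbase: pay 50 BTC to address X"  # built identically to coinbase_a
coinbase_c = b"coinbase: pay 50 BTC to address Y"  # differs in a single detail

identical_ids = txid(coinbase_a) == txid(coinbase_b)   # same bytes, same TXID
different_ids = txid(coinbase_a) != txid(coinbase_c)   # any change alters the TXID
```

Nothing in the hash depends on which block a transaction appears in, which is why two identically built coinbase transactions in different blocks end up with the same TXID.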
The two pairs of duplicate transactions occurred close together, between 08:37 UTC on November 14, 2010, and 00:38 UTC on November 15, 2010, a span of approximately 16 hours. The first pair is sandwiched between the two transactions of the second pair. We classify d5d2….8599 as the first duplicate transaction because it was the first to become a duplicate, although, curiously, it first appeared on the blockchain after the other duplicate transaction, e3bf….b468.
Details of Duplicate Transactions
In the images below, two screenshots from the mempool.space block explorer illustrate the occurrence of the first duplicate transaction in two different blocks.
Interestingly, when entering the relevant URLs in a web browser, the mempool.space block explorer defaults to showing the earlier block for d5d2….8599 and the later block for e3bf….b468. Blockstream.info and Btcscan.org exhibit the same behavior as mempool.space. On the other hand, according to our basic tests, Blockchain.com and Blockchair.com behave differently, always showing the latest version of the duplicate transaction when the URL is entered in the browser.
Of the four relevant blocks, only one (block 91,812) contains another transaction besides the coinbase. That transaction merged outputs of 1 BTC and 19 BTC into a single 20 BTC output.
Can These Outputs Be Spent?
Two transactions sharing one TXID create a referencing problem for any later transaction that tries to spend them. Each duplicate transaction is worth 50 BTC, so the duplicates total 4 x 50 BTC = 200 BTC or, depending on the interpretation, 2 x 50 BTC = 100 BTC. In a sense, 100 BTC does not actually exist.
As of today, all 200 BTC remain unspent. To our knowledge (we may be mistaken here), anyone holding the private keys associated with these outputs could spend the bitcoin. However, once an output is spent, its UTXO is removed from the database, so the duplicate 50 BTC would become unspendable and lost, meaning only 100 BTC might be recoverable. Whether spent coins would be treated as coming from the earlier block or the more recent one may be undefined or indeterminate.
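The arithmetic above can be illustrated with a toy model. This is a simplified sketch, not how Bitcoin Core stores its database: it assumes the UTXO set behaves like a map keyed by (TXID, output index), so a second transaction with the same TXID overwrites the first entry. The keys are the abbreviated labels used in this article, not full TXIDs.

```python
# Toy UTXO set: (txid label, output index) -> amount in BTC.
utxo_set = {}

def add_output(txid, vout, amount_btc):
    # With no duplicate check, a repeated key silently overwrites the old entry.
    utxo_set[(txid, vout)] = amount_btc

def spend_output(txid, vout):
    # Spending removes the single entry stored under this key.
    return utxo_set.pop((txid, vout))

labels = ["d5d2...8599", "e3bf...b468"]  # abbreviated labels, as in the article
minted_btc = 0
for label in labels:
    for _ in range(2):          # each coinbase transaction appears in two blocks
        add_output(label, 0, 50)
        minted_btc += 50

spendable_btc = sum(utxo_set.values())
lost_btc = minted_btc - spendable_btc
```

In this model 200 BTC are minted but only two entries survive, so 100 BTC is spendable and 100 BTC is effectively lost, matching the two interpretations given above.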
Someone could also have spent all the bitcoin before creating the duplicate transaction and then created the duplicate outputs, establishing new entries in the unspent outputs database. The result would be not only duplicate transactions but duplicate transactions whose outputs had already been spent. In that scenario, further duplicate transactions could be created when the new outputs were spent, forming a kind of duplication chain. The order of events would matter: the outputs must always be spent before the duplicate is created, otherwise bitcoin could be lost forever. These new duplicate transactions would not be coinbase transactions but “normal” transactions. Fortunately, this has never happened.
The Problem of Duplicate Transactions
Duplicate transactions are clearly problematic. They confuse wallets and block explorers and obscure where bitcoin came from. They also open the door to attacks. For instance, you could pay someone twice using two duplicate transactions. When the recipient later tries to spend the funds, they may find that only half can be recovered. This could be used against a trading platform in an attempt to bankrupt it, at no loss to the attacker, who can withdraw funds immediately after depositing.
Prohibition of Transactions Using Duplicate TXIDs
To mitigate the issue of duplicate transactions, in February 2012 Bitcoin developer Pieter Wuille proposed the BIP30 soft fork, which prohibits a transaction from reusing an existing TXID unless the outputs of the earlier transaction have been spent. This soft fork applies to all blocks after March 15, 2012.
In September 2012, Bitcoin developer Greg Maxwell modified this rule so that the BIP30 check applies to all blocks, not just those after March 15, 2012, with the exception of the two pairs of duplicate transactions described earlier in this article. This fixed some DoS vulnerabilities. Technically, this was also a soft fork, but because the rule change only affected blocks more than six months old, it carried none of the risks usually associated with protocol rule changes.
The BIP30 check is computationally expensive. Nodes must examine every transaction output in a new block and verify that it does not already exist in the UTXO set. This may be why Wuille chose to check only unspent outputs; checking against all outputs ever created would increase the cost further and prevent pruning.
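A rough sketch of the kind of check described here follows. It assumes the same toy UTXO model keyed by (TXID, output index); the function name and data layout are hypothetical and do not reflect Bitcoin Core's actual code.

```python
def violates_bip30(block_txs, utxo_set):
    """Return True if any output in the block would overwrite an unspent output.

    block_txs: list of (txid, output_count) pairs for the new block.
    utxo_set:  dict keyed by (txid, output index) holding unspent outputs only.
    """
    for txid, output_count in block_txs:
        for vout in range(output_count):
            if (txid, vout) in utxo_set:
                return True  # an unspent output with this TXID already exists
    return False

# Hypothetical example data.
existing_utxos = {("aaaa", 0): 50}
duplicate_block = [("aaaa", 1)]   # reuses a TXID that still has an unspent output
fresh_block = [("bbbb", 2)]       # TXID not present in the UTXO set

rejects_duplicate = violates_bip30(duplicate_block, existing_utxos)
accepts_fresh = not violates_bip30(fresh_block, existing_utxos)
```

Because only unspent outputs are consulted, a TXID whose outputs have all been spent passes the check, which matches the exception in the rule. The cost comes from performing one lookup per output of every transaction in every new block.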
BIP34
In July 2012, Bitcoin developer Gavin Andresen proposed the BIP34 soft fork, which activated in March 2013. This protocol change requires coinbase transactions to include the block height, and it also introduced block versioning. The block height is added as the first item in the coinbase transaction's scriptSig. The first byte of the scriptSig gives the number of bytes used to encode the block height, and the following bytes hold the block height itself. For roughly the first 160 years (2^23 / (144 blocks per day * 365 days per year)), the first byte should be 0x03. This is why the coinbase scriptSig (in hex) always starts with 03 today. This soft fork seemingly resolved the duplicate transaction issue entirely; from then on, all transactions should be unique.
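The encoding and the 160-year figure can be checked with a short sketch. It assumes the height is serialized as a minimal little-endian script number with a sign bit, preceded by a one-byte length; the function name is hypothetical, and very small heights (which real software encodes differently) are ignored.

```python
def encode_height(height: int) -> bytes:
    """Length byte followed by the height as a minimal little-endian script number."""
    if height <= 0:
        raise ValueError("sketch only handles positive heights")
    body = bytearray()
    n = height
    while n:
        body.append(n & 0xFF)
        n >>= 8
    if body[-1] & 0x80:
        # The top bit is the sign bit, so a padding byte keeps the value positive.
        body.append(0x00)
    return bytes([len(body)]) + bytes(body)

example = encode_height(1_983_702).hex()        # height discussed later in the article
last_three_byte_height = 2**23 - 1              # largest height that fits in 3 bytes
years_with_0x03 = 2**23 / (144 * 365)           # about 159.6 years of blocks
```

Under these assumptions, heights up to 2^23 - 1 need three bytes, so the scriptSig starts with 0x03; at height 2^23 the sign bit forces a fourth byte and the prefix becomes 0x04.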
With BIP34 in place, in November 2015 Bitcoin developer Alex Morcos submitted a pull request to the Bitcoin Core software repository under which nodes would stop performing the BIP30 check. After all, since BIP34 had resolved the issue, the costly check no longer seemed necessary. Although nobody knew it at the time, this was technically a hard fork for a few very rare blocks in the future. The potential hard fork now appears insignificant, as very few people run node software from before November 2015. At forkmonitor.info, we run Bitcoin Core 0.10.3, released in October 2015. That version predates the hard fork, so the client still follows the old rule and performs the costly BIP30 check.
Block 1,983,702 Issue
It turns out that some coinbase transactions in blocks mined before BIP34 activated have scriptSigs whose first bytes happen to match the encoding of a block height that will be valid in the future. So while BIP34 resolved the issue in nearly all cases, it was not a complete fix. In 2018, Bitcoin developer John Newbery published a complete list of these potential duplicates, as shown in the table below.
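One way to picture this coincidence is to read the start of an old scriptSig as if it were a BIP34 height. The sketch below assumes the first byte is a direct push length and the pushed bytes are a little-endian number; the function name is hypothetical and the example bytes are illustrative, not copied from the chain.

```python
def implied_height(script_sig: bytes):
    """Height a BIP34-style reader would parse from the first push, or None."""
    if not script_sig:
        return None
    push_len = script_sig[0]
    if not 1 <= push_len <= 0x4B or len(script_sig) < 1 + push_len:
        return None  # not a direct data push, or truncated
    payload = script_sig[1:1 + push_len]
    if payload[-1] & 0x80:
        return None  # sign bit set: would decode as a negative number
    return int.from_bytes(payload, "little")

# Illustrative pre-BIP34 scriptSig: arbitrary miner data that happens to begin
# with a 3-byte push decoding to a height far in the future.
old_script_sig = bytes.fromhex("03d6441e") + b"arbitrary extra data"
future_height = implied_height(old_script_sig)
```

A miner at that future height could, in principle, rebuild the old coinbase transaction byte for byte and still satisfy the BIP34 height rule, which is what makes these blocks potential duplicates.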
* Note: These blocks were produced in 2012 and 2017, and their coinbase transactions were not duplicated. Block 209,921 (only 79 blocks away from the first halving) cannot be duplicated, as BIP30 was enforced during this period.
Source: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd
Number of Potential Duplicate Coinbase Transactions by Year
Source: https://gist.github.com/jnewbery/df0a98f3d2fea52e487001bf2b9ef1fd
Thus, the next block in which a duplicate transaction is possible is 1,983,702, which will be produced around January 2046. The coinbase transaction of block 164,384, produced in January 2012, sent 170 BTC to seven different output addresses. A miner wanting to carry out this attack in 2046 would therefore not only need the luck to find this block but would also have to burn just under 170 BTC in fees, for a total cost of slightly more than 170 BTC once the opportunity cost of the 0.09765625 BTC block subsidy is included.
At the current Bitcoin price of $88,500, this would cost over $15 million. Who owns the seven addresses from the 2012 coinbase transaction is unknown, and the keys are likely lost. All seven outputs of this coinbase transaction have already been used, three of them in the same transaction. We suspect these funds may be related to the Pirate40 Ponzi scheme, but this is merely our speculation. The attack therefore seems not only costly but also nearly useless to the attacker. It would be a large sum to spend just to fork nodes off the network that run software from before November 2015, which by then will be 31 years old.
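The cost estimate can be reproduced with a few lines of arithmetic. This sketch assumes the standard subsidy schedule (50 BTC, halved every 210,000 blocks) and treats the article's 170 BTC and $88,500 figures as exact inputs, although both are approximate.

```python
SATS_PER_BTC = 100_000_000
HALVING_INTERVAL = 210_000  # blocks between subsidy halvings

def block_subsidy_sats(height: int) -> int:
    """Block subsidy in satoshis under the standard halving schedule."""
    halvings = height // HALVING_INTERVAL
    return (50 * SATS_PER_BTC) >> halvings

attack_height = 1_983_702
coinbase_total_sats = 170 * SATS_PER_BTC   # amount paid by the 2012 coinbase (article figure)
price_usd = 88_500                         # price quoted in the article

subsidy_sats = block_subsidy_sats(attack_height)
fees_to_burn_sats = coinbase_total_sats - subsidy_sats
fees_to_burn_btc = fees_to_burn_sats / SATS_PER_BTC
cost_usd = coinbase_total_sats / SATS_PER_BTC * price_usd
```

With these inputs the subsidy is 0.09765625 BTC, the fees to burn come to about 169.9 BTC, and the full 170 BTC is worth a little over $15 million at the quoted price.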
The next block whose coinbase transaction is vulnerable to duplication is 169,985, from March 2012. This coinbase transaction paid out only just over 50 BTC, far less than 170 BTC. Of course, 50 BTC was the subsidy at the time, and when this coinbase transaction becomes vulnerable to duplication in 2078, the subsidy will be much lower. To exploit it, a miner would need to burn approximately 50 BTC in fees, which cannot be recovered since they must be sent to the old 2012 outputs. No one knows what the price of Bitcoin will be in 2078, but the cost of this attack may also be prohibitive. This issue is therefore probably not a major risk for Bitcoin, but it remains a concern.
Since the SegWit upgrade in 2017, a coinbase transaction can also contain a witness commitment covering all transactions in the block. Blocks from before BIP34 contain no witness commitment. To generate a duplicate coinbase transaction, a miner would therefore have to exclude every transaction that spends a SegWit output from the block, which further increases the opportunity cost of the attack, because the block could not include many of the other fee-paying transactions.
Conclusion
Given the difficulty and cost of duplicating a transaction, and how rarely the opportunity arises, this duplicate transaction vulnerability does not appear to be a major security issue for Bitcoin. Still, given the time scales involved and the novelty of duplicate transactions, it is an interesting topic to consider. Developers have spent considerable time on this issue over the years, and 2046 may linger in some developers’ minds as a deadline for resolving it. There are many ways to fix this bug, and a fix may require a soft fork. One potential fix is to make SegWit commitments mandatory.