Balrogs and OP_CATs: The Bridge to Khaza-doom?

An exploration into whether Bitcoin will actually be entirely destroyed if we reintroduce the OP_CAT opcode (due to MEV), to the detriment of all of humankind

Eric Wall
15 min read · Jun 7, 2024

The ability for an opcode to bring harmful MEV to Bitcoin depends on the extent 
to which it empowers L1 miners with both the ability and natural role to
order transactions in expressive, MEV-rich execution environments/contexts.

In Tolkien’s Lord of the Rings, Gandalf is attempting to take the fellowship over the mountain pass of Caradhras. However, the weather conditions are harsh, and no matter how the group slaves away against nature, their efforts seem futile.

You remember the scene from Peter Jackson’s film adaptation, right?

Why does Gandalf avoid Moria so? What is it that he fears that the rest of the fellowship are blissfully unaware of?

Of course, Gandalf is no fool to seek to avoid passing through the Mines of Moria (also known as Khazad-dûm).

If you read The Silmarillion, you learn that Balrogs belong to the second-most supreme order of deities in the Tolkien universe, and that they hid deep under the Earth after Morgoth was defeated at the end of the First Age.

Balrogs are like the MEV of Middle Earth. They are what you uncover if you dig too deep into the expressivity of blockchains. If you get too greedy digging for riches, they will fuck you and everything around you up.

The Dangers of MEV

Let me make a few concessions. There are some of us out there who are p̶r̶e̶t̶e̶n̶d̶i̶n̶g̶ imagining that MEV isn’t a big deal—that phrases like MEVil as coined by Matt Corallo are simply meant to scare you away from daring to explore solutions outside of the Lightning Network.

I contest those who refuse to acknowledge the dangers of MEV. One of the advantages with Bitcoin lagging Ethereum is that as bitcoiners, we have the privilege of learning from the fortunes of Ethereum as well as its misfortunes. As we become ready to advance beyond Lightning, we have a shot at doing it right.

In order to do it right, we absolutely must not handwave away the problems of Ethereum. We must inspect them closely.

Look at this shit

Source: Decentralization of Ethereum’s Builder Market

Do you know what that is? That is the current Ethereum block-building pipeline. It is completely fucked up.

Do you see the part that says “Builders” through which all transactions flow? These are the entities that construct the blocks that validators will select from and publish, in the advanced block construction market of current-day Ethereum. Let’s double-click on that part.

Graph from Data Always (the best blockchain data analyst I know).

Ethereum is highly expressive and thus rich in L1 MEV. As a result, in Ethereum, despite having pretty good validator distribution, 85% of all blocks are built by just three entities: Beaver, Titan and Rsync. Let’s zoom in on that just a little bit more.

Source: censorship.pics.

Beaver and Rsync are block builders that engage in censorship. They censor OFAC-non-compliant transactions from the blocks they build, and as such, over 60% of the blocks in Ethereum are currently OFAC-censoring (even though the underlying PoS validators mostly aren’t). Quite the issue for a network that has made so many tradeoffs to be censorship-resistant!

Now, while Bitcoin certainly has its own current issues when it comes to block building decentralization, you cannot convince me that expressivity is without peril in blockchains. Of course it is fraught with peril.

Before leaving this section I’ll note that Ethereum people are currently trying to find ways to address this issue at the protocol level using inclusion lists. It’s not worked out yet, and in Bitcoin we don’t have the luxury of pulling new tricks out of the hat with this kind of frequency each time Balrogs appear in our quarters. We must move slower and do better.

It’s the MEV that arises from reordering transactions inside a block that is the issue, dummy

Intrablock MEV. A miner earning more revenue from rearranging (or removing/inserting) individual transactions inside a specific block.

When you expand the scripting language of a blockchain like Bitcoin, transactions become “richer” in the things they may express.

As transactions gain more context and have more possible results, the act of manipulating their order may dramatically shift the profitability for miners. When miners engage in this activity to extract profit, there are two types of possible negative externalities:

  1. Applications built on-chain leak value to miners, i.e. users using a blockchain to trade get filled at worse prices
  2. Block building becomes a more competitive, specialized process, leading to centralization in the block production pipeline

Here, we focus on the second, more important issue. It is important to understand that this can happen to Bitcoin without the introduction of new opcodes too.
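To make intrablock MEV a bit more concrete, here is a minimal sketch of how ordering alone changes outcomes. It assumes a hypothetical constant-product pool with no fees, which is not something Bitcoin Script offers today. The names `Pool`, `fill_without_reordering` and `fill_with_sandwich` are made up for illustration.

```cpp
#include <cassert>

// Hypothetical constant-product pool (btc * usd stays constant, no fees).
struct Pool {
    double btc;  // BTC reserve
    double usd;  // dollar-token reserve

    // Pay usd_in into the pool; returns the BTC paid out.
    double buy_btc(double usd_in) {
        assert(usd_in > 0.0);
        const double k = btc * usd;
        const double btc_out = btc - k / (usd + usd_in);
        usd += usd_in;
        btc -= btc_out;
        return btc_out;
    }

    // Pay btc_in into the pool; returns the dollars paid out.
    double sell_btc(double btc_in) {
        assert(btc_in > 0.0);
        const double k = btc * usd;
        const double usd_out = usd - k / (btc + btc_in);
        btc += btc_in;
        usd -= usd_out;
        return usd_out;
    }
};

struct SandwichResult {
    double user_btc;             // what the user ends up receiving
    double attacker_profit_usd;  // what the block producer pockets
};

// The user's buy, included on its own.
double fill_without_reordering(Pool pool, double user_usd) {
    return pool.buy_btc(user_usd);
}

// The same buy, with the block producer's own trades placed before and after it.
SandwichResult fill_with_sandwich(Pool pool, double user_usd, double attacker_usd) {
    const double attacker_btc = pool.buy_btc(attacker_usd);        // inserted in front
    const double user_btc = pool.buy_btc(user_usd);                // user fills at a worse price
    const double attacker_usd_back = pool.sell_btc(attacker_btc);  // inserted behind
    return {user_btc, attacker_usd_back - attacker_usd};
}
```

With a pool of 100 BTC against 1,000,000 dollars and a user buying with 100,000 dollars, the user gets roughly 9.09 BTC when left alone and roughly 7.58 BTC when sandwiched by an equally sized trade, while the block producer walks away with about 18,000 dollars. Whoever is best at finding these orderings builds the most valuable block, which is where the centralization pressure comes from.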

Any meta-layer protocol can introduce new meaning to transactions that occur on the Bitcoin blockchain. Ordinals do this by assigning value to imaginarily tracked satoshis, and this has caused some incentives for mining pools to be ordinals-aware (i.e. more specialized).

To imagine the worst possible case of this, you can imagine running a whole blockchain inside another blockchain—a sovereign rollup—where the transactions look nonsensical to regular nodes, but are fully-fledged EVM transactions to nodes that run the special sovereign rollup software.

Indeed, Trustless Computer, Chainway, Alpen Labs and Arch Network are projects that have flirted/are flirting with the idea of running sovereign rollups on Bitcoin.

Click play on the video.

Take note that transactions that simply pay more do not cause the same type of centralization pressure in the block production process. This is what most ordinal-related transactions do during most of the fee spikes you’ve seen recently. They simply pay more because the ordinals protocol gives users urgency to transact.

But as long as those transactions go into the public mempool, miners still just pick the highest fee-paying transactions to go into blocks. Nothing structurally changes.
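As a rough sketch of what "just pick the highest fee-paying transactions" means, here is a greedy block template builder. It is a simplification under my own assumptions: `MempoolTx` and `build_template` are hypothetical names, and real block assembly also has to account for chains of dependent transactions, which this ignores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mempool entry; the field names are illustrative.
struct MempoolTx {
    int id;
    std::uint64_t fee_sats;
    std::uint64_t weight;  // weight units
};

// Greedy template: highest feerate first, until the weight budget is used up.
// Ignores ancestor/descendant packages, which real block assembly handles.
std::vector<int> build_template(std::vector<MempoolTx> mempool, std::uint64_t max_weight) {
    assert(max_weight > 0);
    std::sort(mempool.begin(), mempool.end(),
              [](const MempoolTx& a, const MempoolTx& b) {
                  // Compare fee/weight ratios by cross-multiplying to avoid division.
                  return a.fee_sats * b.weight > b.fee_sats * a.weight;
              });
    std::vector<int> chosen;
    std::uint64_t used = 0;
    for (const MempoolTx& tx : mempool) {
        if (used + tx.weight <= max_weight) {
            chosen.push_back(tx.id);
            used += tx.weight;
        }
    }
    return chosen;
}
```

The point is that any miner can run this with public information and get the same answer. Nothing in it rewards private order flow or clever searching, which is why higher fees on their own do not change the structure of block production.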

Where the issue gets troublesome isn’t when transactions pay higher fees, but when mining carefully crafted blocks using specialized software leads to superior outcomes for miners, such that miners will flock to specific mining pools, or such that mining pools will flock around specialized block builders.

Okay, that was a lot of stuff to digest. Let’s take a breather.

Do you understand what we’re discussing or where we’re going with this? We’re discussing whether OP_CAT turns the Bitcoin blockchain into a more MEV-rich context to order transactions in, leading to centralization pressures in block production.

We’re going to leave this for now, since it is a lot to grasp, and return to it later. Now let’s talk about cats.

What the hell is OP_CAT (BIP 420) anyway?

The ability to concatenate elements on the stack—the next softfork in Bitcoin?

You may have noticed recently that there’s been a shift in developer consensus from being against the reintroduction of OP_CAT to being for. Just watch this clip from the bitcoin++ conference from last month.

Click play.

Here’s how I explain OP_CAT in the simplest possible terms:

Imagine that the Bitcoin scripting language is a very simple calculator that can only do very simple things to items on a stack. To understand how simple it is, it can add (+) and subtract (-), but there’s not even a multiplication (×) button on it (OP_MUL was deactivated alongside OP_CAT in 2010 by Satoshi).

The OP_CAT button (opcode) smooshes things together. If you press CAT on the numbers 56 and 32, you get 5632. If you press CAT on the strings “SAT” and “OSHI” you get “SATOSHI”.

That’s all OP_CAT does… technically. It’s just 8 lines of code.

case OP_CAT:
{
    if (stack.size() < 2)
        return set_error(serror, SCRIPT_ERR_INVALID_STACK_OPERATION);
    valtype& vch1 = stacktop(-2);
    valtype& vch2 = stacktop(-1);
    if (vch1.size() + vch2.size() > MAX_SCRIPT_ELEMENT_SIZE)
        return set_error(serror, SCRIPT_ERR_PUSH_SIZE);
    vch1.insert(vch1.end(), vch2.begin(), vch2.end());
    stack.pop_back();
}
break;
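
The snippet above lives inside the script interpreter's big switch statement, so it won't compile on its own. If you want to poke at the behavior, here is a standalone toy version. The names `Element`, `op_cat` and `bytes` are mine and not Bitcoin Core's, and the 520-byte cap is there to mirror MAX_SCRIPT_ELEMENT_SIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Element = std::vector<unsigned char>;

// Mirrors MAX_SCRIPT_ELEMENT_SIZE: no stack item may exceed 520 bytes.
constexpr std::size_t kMaxElementSize = 520;

// Toy OP_CAT: replaces the top two stack items with their concatenation.
// Returns false (leaving the stack untouched) where the real opcode would fail.
bool op_cat(std::vector<Element>& stack) {
    if (stack.size() < 2) return false;
    Element& below = stack[stack.size() - 2];
    const Element& top = stack.back();
    if (below.size() + top.size() > kMaxElementSize) return false;
    below.insert(below.end(), top.begin(), top.end());
    stack.pop_back();
    return true;
}

// Helper for readable examples: turns text into a stack element.
Element bytes(const std::string& text) {
    return Element(text.begin(), text.end());
}
```

Push `bytes("SAT")` and `bytes("OSHI")`, call `op_cat`, and the stack holds the single element `bytes("SATOSHI")`. Two 300-byte elements are rejected, because the result would be over the size cap.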

And to be clear, when I tell people in the Ethereum ecosystem that in Bitcoin Land we’re deliberating over whether we can safely bring back this function to Bitcoin, it blows their minds.

The unique simplicity yet counter-intuitive complexity of CAT

“Here, with CAT, we’re not changing the code architecture. It is just so unbelievably simple. And for people who are maybe tired, or weary, or wary of how technically complicated softforks can get—CAT, you look at CAT, and you’re like ‘wow, there’s nothing to bikeshed on and there’s nothing to be scared of’. CAT is great. People like CAT. There are so many reasons to like CAT.”

—Andrew Poelstra, Director of Research at Blockstream, Co-inventor of Taproot (source)

To provide some interesting context to this quote, Andrew Poelstra is the person who discovered that you can create covenants with just OP_CAT and no other changes to Bitcoin. Covenants are neat little things that allow us to create state machines inside of Bitcoin where coins must be spent in various, predefined “paths”.

So how does that happen, if all that OP_CAT does is concatenate strings, numbers?

The secret behind this is that “just concatenating elements” is a powerful feature because of how well it maps to a bunch of things in cryptography. Concatenation and hashing is all you need to construct a Merkle tree (and to verify a Merkle root or a Merkle branch), for example.

How a Merkle tree is constructed. Concatenation and hashing. Simple yet powerful.
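
Here is a small sketch of that idea: building a Merkle root and checking a Merkle branch with nothing but concatenation and a hash. To keep it short, the hash is a non-cryptographic stand-in (64-bit FNV-1a), where Bitcoin would use SHA-256. All the names (`toy_hash`, `parent`, `merkle_root`, `Step`, `verify_branch`) are illustrative, and duplicating the last node on odd levels is an assumption borrowed from how Bitcoin's block Merkle tree does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in hash (64-bit FNV-1a, hex-encoded). NOT cryptographic.
std::string toy_hash(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    const char* digits = "0123456789abcdef";
    std::string out(16, '0');
    for (int i = 15; i >= 0; --i) {
        out[i] = digits[h & 0xf];
        h >>= 4;
    }
    return out;
}

// A parent node is hash(left || right): one concatenation, one hash.
std::string parent(const std::string& left, const std::string& right) {
    return toy_hash(left + right);
}

// Root over the given leaves; an odd level pairs its last node with itself.
std::string merkle_root(std::vector<std::string> level) {
    assert(!level.empty());
    for (std::string& leaf : level) leaf = toy_hash(leaf);
    while (level.size() > 1) {
        if (level.size() % 2 == 1) level.push_back(level.back());
        std::vector<std::string> next;
        for (std::size_t i = 0; i < level.size(); i += 2)
            next.push_back(parent(level[i], level[i + 1]));
        level = next;
    }
    return level[0];
}

// One step of a Merkle branch: the sibling hash and which side it sits on.
struct Step {
    std::string sibling;
    bool sibling_on_left;
};

// Walk from the leaf up to the root, concatenating and hashing at each step.
bool verify_branch(const std::string& leaf, const std::vector<Step>& branch,
                   const std::string& root) {
    std::string node = toy_hash(leaf);
    for (const Step& s : branch)
        node = s.sibling_on_left ? parent(s.sibling, node) : parent(node, s.sibling);
    return node == root;
}
```

Note that `verify_branch` only needs the leaf, one sibling hash per level, and the root. That is the part a script with concatenation and a hash opcode could plausibly do: check that some piece of data belongs to a much larger committed set without ever seeing the whole set.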

The hashes of Bitcoin transactions when we’re creating Schnorr signatures (Taproot) are actually built up in a similar way:

Concatenation and hashing. It’s how you produce the message that you’re signing when making Taproot (Schnorr) signatures! It basically means we can check the value of each of these fields individually when validating a transaction (=covenants!).

Slide from the CatVM children’s book. Young Rijndael realizes the full extent of things that can be achieved through concatenation (OP_CAT) alone.

So, it turns out that OP_CAT allows you to pull out and “inspect” individual parts of a transaction such as its inputs, its outputs, their destination addresses, or the bitcoin amounts involved. This is known as “introspection”, and it’s the checking of these conditions that allows us to create finite state machines with Bitcoin Script and UTXOs.
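
A very loose sketch of the introspection idea, under heavy assumptions: the spender hands over the transaction fields as separate pieces, the script glues them together and hashes them, and only accepts if the result matches the digest that was actually signed. Once that holds, the script can trust the individual pieces and put conditions on them. The field names, `rebuild_digest` and `covenant_allows` are made up, the hash is a non-cryptographic stand-in, the real Taproot message has a different layout, and the step that ties `signed_digest` to the real transaction (a signature check in the actual construction) is assumed rather than shown.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, heavily simplified set of transaction fields.
struct TxFields {
    std::string version;
    std::string inputs;
    std::string output_script;
    std::string output_amount;
    std::string locktime;
};

// Stand-in hash (64-bit FNV-1a). NOT cryptographic; a real script would use SHA-256.
std::uint64_t toy_hash(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// What CAT makes possible: field || field || ... and then hash the result.
std::uint64_t rebuild_digest(const TxFields& f) {
    return toy_hash(f.version + f.inputs + f.output_script + f.output_amount + f.locktime);
}

// Example covenant: "this coin may only be sent to required_script".
// signed_digest is assumed to be bound to the real transaction elsewhere.
bool covenant_allows(const TxFields& claimed, std::uint64_t signed_digest,
                     const std::string& required_script) {
    // If the claimed fields don't hash to the signed digest, the spender is lying about them.
    if (rebuild_digest(claimed) != signed_digest) return false;
    // The fields are now trustworthy, so we can constrain any of them.
    return claimed.output_script == required_script;
}
```

The design point is the order of the two checks. The equality on `output_script` is only meaningful because the concatenate-and-hash step has already pinned the claimed fields to the transaction being signed.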

What’s this? A finite state machine? Involving an optimistic withdrawal pattern? Yes, finite state machines are the building block underneath powerful L2 constructions powering inventions like rollups and plasmas. See how things start to get complicated pretty quick?
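
If "finite state machine" sounds abstract, here is one way an optimistic withdrawal could be modeled. This is a sketch of the general pattern and not the exact machine from any specific proposal: the states, events and the `step` function are hypothetical names I picked for illustration.

```cpp
#include <cassert>

// Hypothetical states of a coin locked in an L2 bridge.
enum class State { Deposited, WithdrawalPending, Withdrawn };

// Hypothetical events that can move the coin between states.
enum class Event { RequestWithdrawal, FraudProof, ChallengeWindowExpired };

// Applies one event. Returns true and updates the state if the transition is
// allowed; returns false and leaves the state untouched otherwise.
bool step(State& state, Event event) {
    switch (state) {
    case State::Deposited:
        if (event == Event::RequestWithdrawal) {
            state = State::WithdrawalPending;  // opens the challenge window
            return true;
        }
        break;
    case State::WithdrawalPending:
        if (event == Event::FraudProof) {
            state = State::Deposited;  // invalid claim, funds stay locked
            return true;
        }
        if (event == Event::ChallengeWindowExpired) {
            state = State::Withdrawn;  // nobody objected, funds leave
            return true;
        }
        break;
    case State::Withdrawn:
        break;  // terminal state
    }
    return false;
}
```

In a covenant-based version, each state would be a UTXO whose script only permits spends into the allowed next states, so the rules are enforced by Bitcoin full nodes and not by a trusted signer.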

The Danger of OP_CAT, explained

Covenants are powerful little primitives, and there used to be a time when Poelstra believed they could be dangerous. Here’s another story from Poelstra:

The future of your children rests on one man’s inability to grant himself beer through cryptographic shenanigans.

After many years of being skeptical, Poelstra decided that it doesn’t matter that covenants allow you to constrain where funds are going in a Bitcoin transaction because basic multisig policies can do the exact same thing.

Indeed, a selected multisig signer can arbitrarily require that a spend transaction must adhere to a specific shape or follow along some specific path in a state machine, just like with covenants! So how could covenants be dangerous, if the danger is already here?

What’s missing from that analysis

Just because you add “multisig policies” to something and emulate state machines that way, it does not mean that this ability will give rise to a flourishing ecosystem of vaults and L2 constructions on top of Bitcoin.

Even though a multisig signer could technically constrain transactions in the same way a covenant could, and could simulate the exact same finite state machine the covenant would, it is exactly because covenants do not require trusted third parties to enforce this that they set the stage for new ecosystems to evolve on Bitcoin.

If that’s too difficult to think about, let’s take a more simple example. One of the most pernicious things you can build on the baselayer of any blockchain is an onchain DEX/AMM.

DEXes produce 50% of the MEV on the Ethereum blockchain due to the natural arbitrage that exists between CEX ⇔ DEXes.

Why do people use DEXes? Because they can’t freeze or steal your funds and they don’t require KYC to use. It is trivial to understand that a “multisig signer” wouldn’t be sufficient to birth a vibrant DEX on top of Bitcoin because a multisig signer cannot provide either of the properties that users are interested in. No incentive for users, no volume. No volume, no MEV. No MEV, no centralizing pressures in block production.

We must analyze the second- and third-order effects these constructions give rise to in their trustless forms.

Two-way pegs into expressive realms

Now, it is somewhat hypothesized that OP_CAT would not enable DEXes on the Bitcoin L1 (the primary theory for this is that there’s no native token standard on top of Bitcoin that’s recognized by Bitcoin Script).

While token protocols like BRC20s and Runes exist, Bitcoin Script does not have any ability to verify these assets, their amounts, and cannot codify outcomes based on them. While you could hypothetically imagine such protocols emerging, the predominant view is that they’d be impractical to run on the Bitcoin baselayer regardless.

What the conversation is really about is the L2s that OP_CAT might give rise to. And this is where things get really interesting.

As mentioned, the key thing OP_CAT does is it allows you to build covenants (with some degree of transaction introspection), and it allows you to verify Merkle branches. These two things, when combined, allow you to create permissionless bridges to fully expressive L2s on Bitcoin.

How this happens is detailed in a children’s book, CatVM — There And Back Again, which basically combines the ideas of CAT-based covenants from Andrew Poelstra and Merkleize All The Things (MATT) by Salvatore Ingala.

The end result is that you’re able to build constructions very similar to Plasma or Optimistic Rollups, just from the addition of those 8 lines of code that make up OP_CAT. In these L2s, creating a DEX would be trivial.

Even STARK verifiers (the underlying technology of ZK-Rollups like Starknet) could be arranged from these primitives. While often thought of as “moon math”, the key concepts underlying STARKs are Merkle trees, hashing, and basic arithmetic.

Another page from the CatVM children’s book.

With all this potential to build competitive second-layers on top of Bitcoin primed for the modern era, it now becomes crucial to understand L2 ⇒ L1 MEV leakage. Would an expressive L2 on Bitcoin, where lots of MEV is generated, leak MEV down to the baselayer and impact Bitcoin mining decentralization?

Are we beginning to tread down the same trouble-ridden path that drivechain proponents have trodden when considering the MEV effects of OP_CAT? Let’s explore.

The Safety of OP_CAT, explained

One of the unfortunate properties of drivechains (and sidechains like Rootstock) is that Bitcoin L1 full nodes cannot easily check them for validity. Instead, blocks are merge-mined with Bitcoin and all bridged funds are placed in a hashrate escrow that miners control.

It is assumed that drivechains derive some of the security of Bitcoin in this way. I contend that the issue for drivechains is that since they cannot be checked for validity by L1 full nodes, they do everything they can to look similar to Bitcoin in other ways (by merge-mining their blocks with Bitcoin and handing over custody of the funds to miners, mirroring as much from Bitcoin as they can).

Drivechain detractors say that this bridges over the problems of the MEV generated on the drivechain to the mainchain. Its proponents argue that when you play out this economic reality in full, miners won’t be involved in ordering the drivechain transactions at the economic limit, for much the same reasons that Ethereum validators are detached from block building in the economic reality of the world today (in a process known as PBS).

You remember the problem statement, right? The issue of Bitcoin MEV only arises to the extent that we enable new ways and incentives for miners to control transaction ordering at the precision of individual transactions in MEV-rich environments.

You see how this breeds the ground for a problematic discussion, right? You want to argue that the Bitcoin mining incentives secure the drivechain ordering in one direction →, but that none of the negative externalities (such as MEV) flow back in the other direction ←. It’s a difficult needle to thread, argument-wise.

It’s about the sequencing, dummy

It may surprise you to learn that Ethereum rollups today, for all their flaws, do not bridge MEV back to the Ethereum mainchain, even though they generate lots of MEV.

Plenty of DEX volume on Ethereum exists on L2s! More than 40% of the total.

The key insight is that it’s not really the bridge that matters. Drivechain bridge, sidechain bridge, optimistic rollup bridge, ZK-rollup bridge, it doesn’t actually matter!

Ethereum L2 MEV is isolated from the Ethereum L1 because all Ethereum rollups are sequenced by permissioned sequencers. The reason that rollups can be sequenced by a single sequencer or a federation of sequencers and still be safe is because rollup validity can be determined by the L1 nodes.

It’s actually quite important that whatever new L2s you introduce to Bitcoin, they’re sufficiently powerful such that they can safely be sequenced by permissioned sequencers (distinct from Bitcoin L1 miners), while still allowing users to unilaterally exit with their funds. OP_CAT provides this power.

Bitcoin’s unreliable block times make it inadhesive to MEV

In this article so far, we’ve established that MEV only arises to the extent that it makes economic sense to build something, not merely to the extent that something can be built.

Andrew Poelstra argued that any covenant can be emulated by a multisig policy—yet evidently, we don’t have DEXes on Bitcoin today using such multisig covenants (since they don’t benefit users).

In the same fashion, it would only make sense to have Bitcoin’s L1 miners sequence individual transactions of Bitcoin L2s if it benefitted users in some way. On the contrary, users benefit massively from permissioned sequencers from a UX perspective because they can provide fast, reliable soft-confirmations, rather than the slow and unpredictable ~10 minute block times that Bitcoin miners provide.
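
To put a rough number on "unpredictable", here is a back-of-the-envelope sketch. It assumes block discovery behaves like a memoryless process with a 10-minute average (a common textbook model that ignores hashrate changes and difficulty adjustment). The function name `prob_wait_exceeds` is mine.

```cpp
#include <cassert>
#include <cmath>

// Probability that the next block takes longer than `minutes` to arrive,
// assuming exponentially distributed intervals with the given mean.
double prob_wait_exceeds(double minutes, double mean_minutes = 10.0) {
    assert(minutes >= 0.0 && mean_minutes > 0.0);
    return std::exp(-minutes / mean_minutes);
}
```

Under that model, `prob_wait_exceeds(10)` is about 0.37, `prob_wait_exceeds(30)` is about 0.05 and `prob_wait_exceeds(60)` is about 0.0025. So more than a third of the time you wait longer than the "10 minute" average, and roughly one block in twenty takes over half an hour. That is before counting any time spent waiting in the mempool, and nowhere near what a DEX user would tolerate.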

There are further economic realities to consider here. Any successful L2 will require a lot of technical infrastructure. There is very little incentive to develop and maintain such infrastructure, only to leave all fee revenue to Bitcoin miners while providing a subpar UX for users.

Based rollups

It may interest you to learn that in Ethereum, certain researchers are advocating for rollups to be sequenced by Ethereum L1 validators (“based” rollups) in order to address the liquidity fragmentation issues that Ethereum L2s are currently facing.

The logic goes that if you use a unified sequencer set for rollups, it becomes possible for one DEX on one rollup to tap into the liquidity of another DEX on another rollup in the same sequencer confirmation. Since the Ethereum PoS baselayer has a 12-second confirmation time, some could see it as a fitting system for shared sequencing.

If that were to happen, it could bring back rollup MEV to the baselayer of Ethereum.

It is interesting to note that Bitcoin is virtually immune to this kind of design choice. The idea that Bitcoin L2s would get together and assign sequencing rights to Bitcoin L1 miners simply doesn’t make sense: it would degrade UX latency to delays of several hours at times, completely breaking applications like DEXes (where the user needs to know that a trade has landed and the price hasn’t slipped away from them).

But wait Eric, I don’t even want to have these conversations! Can’t I just not have expressive Bitcoin L2s and not have OP_CAT?

The decision of adding OP_CAT to Bitcoin stems from the rough consensus we already have of adding covenants to Bitcoin. OP_CAT is simply the frontrunner among several possible covenant proposal designs.

Adam Back, in reference to OP_CAT, in Bloomberg.

OP_CAT is generally favored for the simplicity of its design at a code level, and the infinitesimal risk that it would introduce bugs in the codebase.

With any covenants proposal, you increase the functionality of Bitcoin. As we’ve seen with Ordinals, sovereign rollups and even Rootstock (none of which require a softfork, by the way, and all of which already exist today), it is not possible to spare the Bitcoin baselayer from the introduction of MEV.

It is therefore important that when we expand the functionality of Bitcoin, we allow for safe avenues for the MEV to migrate to, where users can enjoy superior expressiveness and cheaper costs without endangering the baselayer.

The best way to disincentivize people from trying to build ugly DEXes onchain with a set of rudimentary opcodes, or to have boundless meta-layer protocols emerge inside the Bitcoin blockchain itself, is to allow scalable, expressive environments to be built on top of Bitcoin that are sequenced in a way that is safely distinct from the Bitcoin L1 mining process.

Conclusion

In Tolkien’s books, the Bridge to Khazad-dûm was an intentionally narrow bridge with no guardrails on either side. It was an ancient defense system against any enemy who might attempt to cross it since walkers on the bridge could only cross it in single-file.

The dwarves who built Moria understood that it’s not necessarily about how powerful or how many your enemies are, but how your enemies are sequenced (single-file!).

It turns out being able to safely validate the L2 at the level of L1 full nodes opens up the design space for a variety of safe L2 sequencing options (including permissioned, Balrog-resistant ones).

OP_CAT is a simple but powerful opcode. It is so powerful that it gives Bitcoin L1 full nodes the ability to verify the validity of L2 withdrawals, rather than hitching its wagon to miner incentives and poisoning them with harmful, centralizing MEV.

Expressive economic activity can be secured by a blockchain like Bitcoin without drastic MEV consequences for the base-chain as long as that environment is safely sequenced by non-miners. OP_CAT enables covenants but it also has the power to insulate the L1 from MEV effects by virtue of enabling sufficiently powerful L2s.
