Thread
This is a post in appreciation of the Ethereum PoS protocol, in the context of recently raised questions. @VitalikButerin @dannyryan @drakefjustin @dankrad @casparschwa @zmanian @barnabemonnot

While some people are asking why the Ethereum chain re-orged (answer here:

), others are raising a more fundamental question: why is the Ethereum chain designed to be re-orgable?
A brief summary of the Ethereum PoS protocol: it runs like a longest-chain protocol (more specifically, the GHOST protocol) with a finalizing BFT gadget (the Casper protocol) that activates every F blocks; thus F (= 32 in practice) is the period of finalization.
There are some technical nuances: LMD, proposer boost, no-weights-for-equivocation. Some details here: arxiv.org/abs/2003.03052
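To make the GHOST part concrete, here is a minimal sketch of a "heaviest subtree" fork choice driven by each validator's latest vote. It is a toy, not the Ethereum spec: every validator is assumed to carry equal weight, proposer boost and equivocation discounting are left out, and the names `ghost_head`, `parents` and `latest_votes` are hypothetical.

```python
from collections import defaultdict


def ghost_head(parents, latest_votes, root):
    """Return the head chosen by a toy latest-message GHOST rule.

    parents:      dict mapping block -> parent block (None for the root)
    latest_votes: dict mapping validator -> the block it voted for most recently
    root:         the block to start the walk from (e.g. the last finalized block)
    Assumption: one unit of weight per validator.
    """
    # Build the child lists of the block tree.
    children = defaultdict(list)
    for block, parent in parents.items():
        if parent is not None:
            children[parent].append(block)

    # A vote for a block also counts for every ancestor of that block.
    weight = defaultdict(int)
    for voted_block in latest_votes.values():
        block = voted_block
        while block is not None:
            weight[block] += 1
            block = parents[block]

    # Walk down from the root, always stepping into the heaviest subtree.
    # Ties are broken by block name, an arbitrary but deterministic choice.
    head = root
    while children[head]:
        head = max(children[head], key=lambda b: (weight[b], b))
    return head


# A tree where the longest chain (G-A-A2-A3) is not the heaviest subtree.
parents = {"G": None, "A": "G", "A2": "A", "A3": "A2",
           "B": "G", "B1": "B", "B2": "B"}
latest_votes = {"v1": "A3", "v2": "A3", "v3": "B1", "v4": "B2", "v5": "B2"}
print(ghost_head(parents, latest_votes, "G"))  # B2
```

In this example the subtree under B collects three of the five votes, so the rule picks B2 even though the A branch is longer.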
Why would the Ethereum PoS protocol be designed as a complex amalgam of two different protocols rather than choosing a clean BFT implementation like Tendermint (go Cosmos!) or a secure longest-chain protocol like Ouroboros (invented at Cardano)?
One answer is that it is due to the large validator group needed by Ethereum. We do not want 100,000 signatures aggregated every block. I think this answer is temporary and can be solved with better signature ZK-aggregation schemes, so we dig deeper.
There is a more fundamental answer that arises from an unsolvable tradeoff in blockchains: a protocol can either be finalizing (i.e., confirm a block safely even under asynchronous network conditions) or be dynamically available (i.e., make progress under uncertain participation).
A clear merit of finalization is accountability, i.e., if anyone produces a conflicting finalized block, it is possible to prove which nodes participated in the malicious conflict, and those nodes can then be slashed.
More on accountability here (including why Algorand, a quorum-based BFT protocol, is not accountable!): arxiv.org/abs/2010.06785
The merit of dynamic availability may be less obvious but we have a great historical incident now to explain it. Consider the crypto-mining ban in China: more than 50% of mining power was immobilized.
Since Bitcoin is dynamically available, the chain marched on making blocks through the ban without flinching. This is a massively under-appreciated feat.
Imagine a crypto-staking ban in a prominent country: if Ethereum ran on a finalizing protocol like Tendermint, it could stall completely as validators go offline. While it is possible to socially coordinate a recovery by kicking out such validators, this is contentious.
A chain stalling for weeks to months on proclamation of a ban is enough FUD to kill many a chain. Thus dynamic availability is a highly desirable property.
That no protocol can be both dynamically available and finalizing was proven sharply by A. Lewis-Pye and Tim Roughgarden ( arxiv.org/abs/2101.07095 ). Therefore, it seems that the portmanteau protocol Gasper, and thus Ethereum PoS, was attempting the impossible.
Three years back, I discussed this problem with my collaborator Prof. David Tse (Stanford): Ethereum was trying to have its cake and eat it too by building two different confirmation rules inside the same protocol.
One confirmation rule respects the tip of the longest chain; the other, more conservative rule respects only the finalized block.
While Gasper is a complex practical protocol with too many features, and hence difficult to reason about mathematically, we asked a basic question: can we construct protocols that have two distinct ledgers achieving these distinct properties?
One ledger is dynamically available and another ledger is safe under asynchrony; and the two match under full participation and synchrony.
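A minimal sketch of what "two ledgers in one protocol" looks like from a client's point of view. It assumes a simplified model in which the finalized ledger is always a prefix of the available ledger; `TwoLedgerView` and its methods are hypothetical names for illustration, not part of any client.

```python
from dataclasses import dataclass, field


@dataclass
class TwoLedgerView:
    """Toy model: one chain, read through two confirmation rules."""
    chain: list = field(default_factory=list)  # blocks on the currently preferred chain
    finalized_height: int = 0                  # number of blocks covered by finality

    def available_ledger(self):
        # Aggressive rule: follow the chain all the way to its tip.
        return list(self.chain)

    def finalized_ledger(self):
        # Conservative rule: only the finalized prefix counts.
        return self.chain[:self.finalized_height]

    def extend(self, block):
        self.chain.append(block)

    def finalize(self, height):
        # Finality only moves forward and never beyond the current tip.
        if height < self.finalized_height or height > len(self.chain):
            raise ValueError("invalid finalization height")
        self.finalized_height = height

    def reorg(self, fork_height, new_blocks):
        # The tip may be replaced, but never below the finalized prefix.
        if fork_height < self.finalized_height:
            raise ValueError("cannot re-org finalized blocks")
        self.chain = self.chain[:fork_height] + list(new_blocks)


view = TwoLedgerView()
for name in ["b0", "b1", "b2", "b3", "b4"]:
    view.extend(name)
view.finalize(3)
view.reorg(3, ["c3", "c4", "c5"])  # allowed: only unfinalized blocks change
print(view.finalized_ledger())     # ['b0', 'b1', 'b2']
print(view.available_ledger())     # ['b0', 'b1', 'b2', 'c3', 'c4', 'c5']
```

A user who reads the available ledger accepts that its tip can be re-orged; a user who reads the finalized ledger waits longer but never sees a confirmed block change in this model.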
This initiated a scientific quest to identify sharply under what conditions it is possible to have a single protocol with two ledgers. We had two sub-groups racing to solve this problem, when David Tse’s group produced this absolute masterpiece: arxiv.org/abs/2009.04987
We built on it with our own contribution on achieving both properties - the checkpointed longest-chain: arxiv.org/abs/2010.13711
Gasper belonged to this family of protocols that have two confirmation rules, one of which is dynamically available and another is finalizing, i.e., safe under asynchrony.
To understand the merits of Ethereum’s PoS protocol, let us dive into a thought experiment where a large fraction of PoS stakers go offline. Now the dynamically available portion of the Ethereum ledger, i.e., GHOST, continues to produce blocks.
Furthermore, Ethereum PoS has an inactivity leak: the ability to slowly leak stake from validators that are absent (this can be either slashing funds or simply removing their voting power from the validator set).
This means that the dynamically available part continues to build a chain and when the absent validators are significantly leaked (~ weeks), the remaining validators have a supermajority and can start producing finalized blocks.
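A back-of-the-envelope sketch of this recovery, under loud assumptions: the offline stake shrinks by a constant fraction each epoch (the real Ethereum leak follows a different schedule), the online stake is left untouched, and finalization resumes once the online validators hold at least 2/3 of the remaining stake. The function name and all parameters are hypothetical.

```python
def epochs_until_supermajority(online_stake, offline_stake, leak_rate_per_epoch,
                               threshold=2 / 3, max_epochs=1_000_000):
    """Count epochs of leaking until the online share reaches the threshold.

    Toy model: each epoch the offline stake is multiplied by
    (1 - leak_rate_per_epoch); the online stake is unchanged.
    Returns (epochs, remaining_offline_stake).
    """
    if online_stake <= 0 or offline_stake < 0:
        raise ValueError("need positive online stake and non-negative offline stake")
    if not 0 < leak_rate_per_epoch < 1:
        raise ValueError("leak rate must be strictly between 0 and 1")

    epochs = 0
    while online_stake / (online_stake + offline_stake) < threshold:
        if epochs >= max_epochs:
            raise RuntimeError("supermajority not reached within max_epochs")
        offline_stake *= 1 - leak_rate_per_epoch
        epochs += 1
    return epochs, offline_stake


# Half of the stake goes offline; the absent half leaks 1% per epoch (made-up rate).
epochs, remaining = epochs_until_supermajority(50.0, 50.0, 0.01)
print(epochs, round(remaining, 2))
```

With half the stake offline, the online half regains a 2/3 supermajority once the absent stake has dropped to half of its original size, since 50 / (50 + 25) = 2/3.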
This self-healing property, i.e., the return to finalization without user intervention or subjectivity, is indeed extremely useful. Thus, Ethereum PoS is designed to be robust to insanely adversarial events and still thrive. Let us take a moment to appreciate this protocol.
Before we conclude, there is a caveat: if the attack is not simply to turn off validators but to make them behave maliciously (i.e., pretend to have gone offline and then make conflicting blocks), then there can be a contention that can only be resolved via a soft fork.
It is much easier to force nodes to shut down than to force nodes to hand over keys to run arbitrary malicious software; hence Ethereum is optimized to survive the former attack algorithmically, but it can still survive the latter via subjective intervention (a soft fork).
Finally, since there are complex tradeoffs in protocol design, it is not possible to appreciate the strengths of any given protocol without deep consideration. Ethereum PoS has been designed to be extremely resilient and to thrive even in a highly adversarial environment.
When you have dual confirmation rules, like in ETH, how does one set the confirmation rule?