Thread

This, but for cloud migrations. The quarterly "cloud is cheaper than on-prem when you account for all on-prem costs and everyone who thinks otherwise is clueless" thread is happening again.

Every time, people with serious on-prem experience will point out that's false.

These threads always get rebutted because cloud advocates don't mention that a common reason cloud comes out ahead is organizational dysfunction that prevents the company from running on-prem hardware effectively, and the thread creator incorrectly assumes that dysfunction is universal.

It's understandable that people would think that given how many companies are dysfunctional. E.g., at a typical car company you've heard of, before they moved to the cloud (and I have enough visibility into various car company tech operations that it's fair to say typical here), their operations would go down for hours a day every weekday because they didn't have enough servers to handle the load (and, BTW, at the time, the number of servers necessary was small).

They tried to get more servers but the request was rejected on the grounds that they had enough servers to handle average load, so additional servers would be a waste of money.

The people who wanted their systems to work were able to bypass the IT provisioning process by moving to the cloud, so this is a cloud success story, but it's an odd kind of success story.
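To make the "enough servers for average load" reasoning concrete, here's a rough sketch of why sizing to the average still produces outages. Every number here (the hourly load profile, the 100 requests/sec per server) is made up for illustration, and the function names are hypothetical; none of it comes from the car company.

```python
import math

# Hypothetical weekday load in requests/sec, one entry per hour:
# quiet overnight, busy for 8 hours, quiet again in the evening.
HOURLY_LOAD = [200] * 8 + [1500] * 8 + [200] * 8
RPS_PER_SERVER = 100  # assumed capacity of a single server


def servers_needed(load_rps: float, rps_per_server: float) -> int:
    """Smallest number of servers whose combined capacity covers the load."""
    return math.ceil(load_rps / rps_per_server)


def overloaded_hours(hourly_load: list[float], servers: int, rps_per_server: float) -> int:
    """Number of hours in which the load exceeds total capacity."""
    capacity = servers * rps_per_server
    return sum(1 for load in hourly_load if load > capacity)


average_load = sum(HOURLY_LOAD) / len(HOURLY_LOAD)
avg_servers = servers_needed(average_load, RPS_PER_SERVER)
peak_servers = servers_needed(max(HOURLY_LOAD), RPS_PER_SERVER)

print(f"sized for average: {avg_servers} servers, "
      f"{overloaded_hours(HOURLY_LOAD, avg_servers, RPS_PER_SERVER)} overloaded hours/day")
print(f"sized for peak:    {peak_servers} servers, "
      f"{overloaded_hours(HOURLY_LOAD, peak_servers, RPS_PER_SERVER)} overloaded hours/day")
```

With this made-up profile, the fleet sized for average load is over capacity for the entire busy period every day, which is the same shape as "down for hours a day every weekday".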

BTW, this isn't a particularly dysfunctional car company on the tech side. For a particularly dysfunctional example, see www.patreon.com/posts/73915260.

On the flip side, people who haven't seen a competent org at work are often incredulous at what a competent org can accomplish, e.g.,

a common response to

is that it's some kind of weird anti-cloud fabrication since no one could operate that much hardware with 2 people splitting 1 FTE of work, but if you talk to competent operators, it's really not that extraordinary.

I actually had a conversation with one of the two IT people involved in that operation, who was blown away by the high confidence replies that what he did was impossible.

He's now at a different company with a team of 4 people that manages servers across 3 datacenters, with significantly fewer servers per person because operating machines in multiple locations is relatively high overhead, but still more cores than most $10B companies need.

And it turns out that many businesses don't need >= 3 regions, e.g., Twitter ran on 2 DCs basically forever, Stripe ran its VMs out of a single AWS AZ until it had basically caught Twitter in terms of valuation, etc., which I mention because another common response is that you can't have a lean on-prem team because you need datacenters all over the world.

Sure, some do, but most don't.

is an underrated point in general. I recently upgraded from my old phone (5 year old iPhone 8) to the new iPhone, so I have the fastest phone money can buy.

On very fast internet, most apps take ~3 seconds before they load enough state to be usable.

I know quite a few people who do this kind of optimization work and the fixes to drive "time to first X", fast response, etc. for apps tend to be straightforward compared to spinning up a bunch of regions.

Companies almost always underinvest in simple high-ROI performance work.

Same goes for reliability. Another response was that cloud is good because it's easy to go multi-AZ. That response came from someone whose company famously had 2 9s of uptime (amortized 3.6 days downtime/yr).

You don't need multiple AZs when you have 2 9s of uptime.
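For reference, the nines-to-downtime arithmetic is easy to check. This is just a back-of-the-envelope sketch assuming a 365-day year and downtime spread evenly over it; the function name is made up for illustration.

```python
def downtime_days_per_year(nines: int, days_per_year: float = 365.0) -> float:
    """Expected downtime in days per year at an availability of `nines` nines.

    2 nines means 99% uptime, i.e. 1% of the year unavailable.
    """
    unavailability = 10.0 ** (-nines)
    return days_per_year * unavailability


for n in (2, 3, 4, 5):
    days = downtime_days_per_year(n)
    print(f"{n} nines: {days:.4f} days/yr ({days * 24:.2f} hours/yr)")
```

At 2 nines that works out to about 3.65 days a year, versus under 9 hours a year at 3 nines, so there's a lot of room to improve before AZ-level failures become the dominant source of downtime.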

IIRC, dominion.isotropic.org originally used some kind of fancy cloud-based solution and then switched to running on a single machine in the guy's apartment and had better uptime than most companies with fancy multi-AZ and multi-region setups.

Mentions

Patrick McKenzie @PatrickMcKenzie · Oct 29, 2022

My somewhat contrarian take on this excellent thread: if your software offering papers over inadequacies in human systems which persist even though they’re obviously value destructive, then your software offering is *extremely valuable.*

Thread by Dan Luu
