Scalable tools

Distributed, multi-node systems like mongo/rabbitmq/kubernetes/microservices are often grouped together and called “scalable”. This suggests that these kinds of tools can be used at scale, as opposed to the rest, which do not scale as well.

This suggestion is wrong:

these tools do not guarantee scalability, and
single-node systems can scale fine, most (like 99.9%) of the time.

So, what these multi-node “scalable” tools really do is allow potentially higher ceilings. And you probably won’t be anywhere close to your ceilings unless you are one of the biggest companies in the world, and even then you likely only reach the ceiling in certain hotspots.

Funny: “MongoDB is Web Scale”

Better ways to scale

Your algorithms and data structures matter 100x more than your tools.

Set up some benchmarks and know the limitations of different operations. For common latencies, I find these back-of-the-envelope calculations useful.
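As a rough illustration of both points above (benchmark first, and data structures matter), here is a minimal sketch using Python's standard `timeit` module. The collection size and the `bench` helper are assumptions made for the example, and the absolute numbers will vary by machine.

```python
import timeit

N = 10_000
items_list = list(range(N))
items_set = set(items_list)
missing = -1  # not present, so the list has to scan every element


def bench(container, number=200):
    """Return the best observed time per membership test, in seconds."""
    timer = timeit.Timer(lambda: missing in container)
    return min(timer.repeat(repeat=5, number=number)) / number


list_time = bench(items_list)
set_time = bench(items_set)

print(f"list lookup: {list_time * 1e6:.2f} us")
print(f"set lookup:  {set_time * 1e6:.2f} us")
print(f"ratio:       {list_time / set_time:.0f}x")
```

Same tool, same machine, same data: only the data structure changed. A benchmark like this takes minutes to write and tells you where the time actually goes before you reach for new infrastructure.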

Look around a bit to understand why your more boring, not-so-scalable tools are slow. They might have features or options that address your bottlenecks, e.g.:

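SQLite has a WAL (write-ahead logging) mode that often improves write performance and lets reads proceed while a write is in progress
Postgres has materialized views, which store the result of a query so that expensive joins are not recomputed on every read (at the cost of refreshing the view when the data changes)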
SQL has indexes which you should be using appropriately - Use the Index, Luke
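Two of the items above can be tried with nothing but Python's built-in `sqlite3` module. The sketch below is only an illustration: the `users` table, the index name, and the `plan` helper are made up for the example. It turns on WAL mode and then checks the query plan before and after adding an index.

```python
import os
import sqlite3
import tempfile

# WAL needs a file-backed database; in-memory databases cannot use it.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")
conn = sqlite3.connect(db_path)

# The pragma returns the journal mode that is now in effect.
journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# Hypothetical table, used only for this example.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)
conn.commit()

query = "SELECT name FROM users WHERE email = ?"
args = ("user500@example.com",)


def plan(sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(query, args)  # full table scan: no index on email yet
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query, args)   # should now search via idx_users_email

print(journal_mode)
print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, but the change from a scan to an index search is the part to look for.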

Use caching to ease the load on your main database.
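A minimal sketch of what that can look like, assuming an in-process read-through cache with a time-to-live. `TTLCache` and `load_user_from_db` are hypothetical names for this example; the loader stands in for a real database query.

```python
import time


class TTLCache:
    """Tiny read-through cache: serve from memory, call the loader on a miss."""

    def __init__(self, loader, ttl_seconds=60.0):
        self._loader = loader
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database round trip
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


db_calls = []  # records every time the "database" is actually hit


def load_user_from_db(user_id):
    """Hypothetical stand-in for a query against the main database."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"User {user_id}"}


cache = TTLCache(load_user_from_db, ttl_seconds=60.0)
for _ in range(1000):
    user = cache.get(42)

print(len(db_calls))  # 1: the other 999 reads were served from memory
```

The trade-off is staleness: a cached value can be up to `ttl_seconds` old, so the TTL should match how fresh the data needs to be.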

Run multiple servers, one per client / namespace / region / etc. AKA Single tenancy. See DHH - Multi-tenancy is what’s hard about scaling services.
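To make the idea concrete, here is a scaled-down sketch that uses one SQLite file per tenant instead of one server per tenant. The isolation principle is the same, but everything here (`DATA_DIR`, `connect_for_tenant`, the `notes` table, the tenant names) is an assumption made for illustration.

```python
import os
import sqlite3
import tempfile

# Hypothetical data directory; a real deployment would choose its own location.
DATA_DIR = tempfile.mkdtemp()


def connect_for_tenant(tenant_id):
    """Open (and lazily create) the database that belongs to one tenant only."""
    if not tenant_id.isalnum():
        # Reject anything that could escape DATA_DIR through the file name.
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    conn = sqlite3.connect(os.path.join(DATA_DIR, f"{tenant_id}.db"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
    )
    return conn


acme = connect_for_tenant("acme")
acme.execute("INSERT INTO notes (body) VALUES (?)", ("hello from acme",))
acme.commit()

globex = connect_for_tenant("globex")
acme_count = acme.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
globex_count = globex.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

print(acme_count, globex_count)  # 1 0: tenants never share a table
```

Each tenant's data set stays small, so no single database has to handle the combined load, and one tenant's heavy usage does not slow down the others.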

Tennis and trade-offs

As a recent beginner dabbling in tennis, I was in the market for a racket. I bought one for $30, not $200. It could’ve been better, but for a non-professional like me, the marginal improvement is not worth paying nearly 7x as much.

The trade-off between the two tennis rackets is simple: the pricier one costs more, but is probably better quality in every way.

The list of trade-offs for technology with higher ceilings is longer, and it is often not taken into consideration at all:

complexity
latency
consistency - distributed systems often have to sacrifice immediate consistency (as they achieve higher throughput via eventual consistency)
operating costs
learning curve

Ya ain’t gonna need it (YAGNI)

There are tons of mantras (YAGNI, Premature Optimization) that emphasize focusing on the now/soon rather than focusing on “what might be in the future”.

But there are simply too many variables to predict as you think about orders of magnitude.

“I’ve built three distributed job systems at this point. A handy rule of thumb which I have promoted for years is ‘build for 10x your current scale.’” - random hn dude