The Math of Scalability

Avishai Ish-Shalom (@nukemberg)

Math???

Define "scalability"

The relation between

  • Resources
  • Processing time
  • Problem size / Work

\[ S( R, T, W ) \]

Batch

\[ T = S(R) \mid W = const \]


Interactive

\[ W = S(R) \mid T = const \]

Scalability chart

Lies, damn lies and statistics

Someone will win the lottery
but
it won't be you

The law of truly large numbers

Once in a million events happen all the time

The birthday paradox

How many people should be in a room for
P[shared birthday] > 0.5?

Volume scales faster than surface

Connections \( \propto \mathcal{O} (n^2) \)

Subgroups \( \propto \mathcal{O} (2^n) \)

Emergent behavior

When do grains of sand become a heap?

Let's play a game

  1. Choose a number between 1 and 5, call that X
  2. Wait until you hear hand clapping
  3. Clap your hands X times
  4. Wait X seconds
  5. Go back to #2

When do re-mirrors become a storm?

Emergent behavior

  • Aggregate impact
  • Interactions of elements dominate
  • Non-linear emergence

Emergence of state

  • Interactions are state
  • Super linear scaling
  • Propagation time increases with scale

All large systems are essentially stateful

The Universal Scalability Law

The Universal Scalability Law

\[ X(N) = \frac {\gamma N} {1 + \alpha (N - 1) + \beta N (N - 1) } \]

  • 𝛼 - Contention; queueing for shared resource
  • 𝛽 - Consistency; Coordination between processes
  • 𝛾 - Relative scale parameter

$ \alpha $ - Contention

  • Waiting for shared resource
  • Queueing
  • Limited by shared resource

$ \beta $ - Consistency

  • Coordination between processes
  • Processes wait for each other
  • Limited by any process

What about latency?

How do we scale things?

By warping space and time!

Space warp

Time warp

What do these have in common?

  • Eearthquakes magnitudes
  • Armed conflicts
  • Population size of cities
  • Frequency of letters in text
  • Capital distribution
  • Social media activity

Normal distributions aren't

  • "Normal" distributions are rare IRL
  • "Normal" distributions assume independent events
  • Many things have feedback loops
  • Feedback loops result in power laws

Power laws to the people!

  • Self similarity
  • Pareto principle
  • Fat tails

Hot spots

All shards are equal, but some are more equal than others

Queue theory crash course

\( W \propto \frac { \rho } {1 - \rho} \)

Variance

\( W \propto \frac { \rho } {1 - \rho} \frac {C_s^2 + C_a^2} {2} \)

#FailAtScale

Component failure

independent → linear scaling

Interaction failure

dependent → super linear scaling

#FailAtScale

  • Statistical failures
  • Latency grows → timeouts
  • Failure demand (retries)

Go forth and scale

  • Lower the variance, raise the mean
  • Avoid coordination
  • Warp time and space
  • Reduce statistical failures

Quality is key to Scaling

"Quality" → less rework, uniformity

What have we learned?

  • Math helps us think
  • Models reveal scaling challenges

QED