Distributed Systems Basics

Module 1 of Blockchain Fundamentals


What Is a Distributed System?

A distributed system is a collection of independent computers that appear to users as a single coherent system.

Examples:

  • The internet itself
  • Google Search (millions of servers)
  • Netflix (globally distributed)
  • Blockchain networks (thousands of nodes)

"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." — Leslie Lamport


Why Distributed Systems Matter for Blockchain

Blockchain is fundamentally a distributed system problem:

  • How do thousands of computers agree on one truth?
  • How do we handle computers that fail or lie?
  • How do we maintain consistency without a central authority?

Understanding distributed systems is essential to understanding blockchain.


Core Challenges

1. Partial Failures

A single machine either works or it doesn't. In a distributed system:

  • Some nodes fail while others work
  • Failures are often silent (no response)
  • Hard to distinguish slow from dead

Node A: Working ✓
Node B: Crashed ✗
Node C: Working ✓
Node D: Slow (is it dead?)
Node E: Working ✓
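
The ambiguity around Node D can be made concrete with a timeout-based failure detector. The following is a minimal sketch, not a production design: the `HeartbeatMonitor` class, its 5-second timeout, and the timestamps are all assumptions chosen for illustration. Note that the monitor can only ever say "suspected"; it has no way to tell a crashed node from a slow one.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical detector: suspects a node after `timeout` seconds of silence."""
    timeout: float
    last_seen: dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def status(self, node: str, now: float) -> str:
        last = self.last_seen.get(node)
        if last is None or now - last > self.timeout:
            # Crashed, slow, or partitioned: silence looks the same in all cases.
            return "suspected"
        return "alive"


monitor = HeartbeatMonitor(timeout=5.0)

# At t=0 every node except B (crashed) sends a heartbeat.
for name in ["A", "C", "D", "E"]:
    monitor.record_heartbeat(name, now=0.0)

# At t=4 the healthy nodes report again; D is slow and stays silent.
for name in ["A", "C", "E"]:
    monitor.record_heartbeat(name, now=4.0)

statuses = {name: monitor.status(name, now=6.0) for name in "ABCDE"}
print(statuses)
# → {'A': 'alive', 'B': 'suspected', 'C': 'alive', 'D': 'suspected', 'E': 'alive'}
```

B and D end up with the same status even though only one of them has actually failed. A longer timeout reduces false suspicions but delays detection of real crashes.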

2. Unreliable Networks

Networks are not reliable. Messages can be:

  • Lost: Packet never arrives
  • Delayed: Arrives much later
  • Duplicated: Arrives multiple times
  • Reordered: Arrives out of sequence

When a reply does not arrive, you cannot tell the difference between:

  • A crashed node
  • A very slow node
  • A network partition

3. No Global Clock

Each computer has its own clock. Clocks drift:

  • Hardware clocks drift on the order of 50 ppm (50 microseconds per second)
  • Over a day: ~4.3 seconds of drift (86,400 s × 50 µs)
  • Network latency adds uncertainty

Consequence: "What happened first?" is surprisingly hard to answer.
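
The drift figures above are simple arithmetic, sketched here so the numbers can be checked. The 50 ppm rate comes from the text; the helper name `drift_seconds` is an assumption made for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def drift_seconds(elapsed_seconds: float, ppm: float = 50.0) -> float:
    """Drift accumulated over `elapsed_seconds` at a constant rate of `ppm`."""
    return elapsed_seconds * ppm / 1_000_000


per_day = drift_seconds(SECONDS_PER_DAY)
print(f"{per_day:.2f} s of drift per day")  # → 4.32 s of drift per day

# Two clocks drifting in opposite directions can disagree by twice that.
print(f"{2 * per_day:.2f} s worst-case disagreement")  # → 8.64 s worst-case disagreement
```

Two events a few seconds apart on different machines can therefore carry timestamps in the wrong order, which is why timestamps alone cannot settle "what happened first".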

4. Byzantine Failures

Nodes might not just fail — they might actively lie or attack.

Failure Type        Behavior
Crash failure       Node stops responding
Omission failure    Node drops some messages
Byzantine failure   Node sends arbitrary/malicious data

Blockchain must handle Byzantine failures (the hardest kind).


The CAP Theorem

In a distributed system, you can have at most 2 of 3:

        Consistency
           /\
          /  \
         /    \
        /      \
       /   ??   \
      /          \
     /____________\
Availability    Partition
                Tolerance

Definitions

  • Consistency (C): All nodes see the same data at the same time
  • Availability (A): Every request to a working node gets a non-error response
  • Partition Tolerance (P): System works despite network splits

The Reality

Network partitions WILL happen, so partition tolerance is not optional. During a partition you must choose:

  • CP: Consistent but may be unavailable (traditional databases)
  • AP: Available but may be inconsistent (many web services)
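
The CP/AP choice can be illustrated with a toy replica that is cut off from its peers. This is only a sketch: the `Replica` class, its `mode` flag, and the `partitioned` attribute are hypothetical and do not model any real database.

```python
class Replica:
    """Hypothetical replica that holds one value and may lose contact with peers."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = "v1"          # last value this replica heard about
        self.partitioned = False   # True while peers are unreachable

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Cannot confirm the value is current, so refuse to answer.
            raise RuntimeError("unavailable during partition")
        # AP: always answer, even though the value may be stale.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True

print(ap.read())  # → v1 (possibly stale)
try:
    cp.read()
except RuntimeError as err:
    print(err)  # → unavailable during partition
```

The CP replica gives up availability to avoid returning stale data; the AP replica does the opposite.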

Where Does Blockchain Fit?

Bitcoin chooses eventual consistency with partition tolerance:

  • During partition: chains may diverge
  • After partition heals: the longest chain (strictly, the one with the most accumulated proof-of-work) wins
  • Consistency is probabilistic, not immediate
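
The way diverged chains are reconciled after a partition can be sketched as a fork-choice rule. The `Block` type, the integer `work` field, and the function names below are illustrative assumptions, not Bitcoin's actual data structures; the sketch compares chains by total work, which equals chain length when every block carries the same work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    block_id: str
    work: int  # hypothetical unit of proof-of-work carried by this block


def total_work(chain: list[Block]) -> int:
    return sum(block.work for block in chain)


def choose_chain(chains: list[list[Block]]) -> list[Block]:
    """Pick the candidate chain with the most accumulated work."""
    return max(chains, key=total_work)


# Both sides of the partition share history up to block b1, then diverge.
shared = [Block("genesis", 1), Block("b1", 1)]
side_a = shared + [Block("a2", 1), Block("a3", 1)]
side_b = shared + [Block("b2", 1)]

winner = choose_chain([side_a, side_b])
print([block.block_id for block in winner])  # → ['genesis', 'b1', 'a2', 'a3']
```

Once the partition heals, nodes that were on side B drop block b2 and adopt side A's chain. This is why a recent transaction is only probabilistically final.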

Consensus

Consensus = Getting all nodes to agree on a single value.

Why It's Hard

The Two Generals Problem:

General A                    General B
    |                            |
    |---"Attack at dawn"-------->|
    |                            |
    |<---"Confirmed"-------------|
    |                            |
    |---"Got confirmation"------>|
    |                            |
    ...continues forever...

Neither general can be certain the other will attack, because the sender of the most recent message never knows whether it arrived. No finite number of messages can fix this.

The Byzantine Generals Problem

Even worse: some generals might be traitors who send conflicting messages.

The Result (Lamport, Shostak, and Pease, 1982):

  • To tolerate f Byzantine nodes, you need at least 3f+1 total nodes
  • Requires that more than 2/3 of nodes are honest
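
The 3f+1 bound is easy to turn into a quick calculation. The helper names below are assumptions made for this sketch; the arithmetic follows directly from the bound.

```python
def min_nodes(f: int) -> int:
    """Smallest network size that can tolerate f Byzantine nodes."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3


print(min_nodes(1))               # → 4
print(max_byzantine_faults(3))    # → 0
print(max_byzantine_faults(4))    # → 1
print(max_byzantine_faults(100))  # → 33
```

Three nodes cannot tolerate even one traitor, and a 100-node network can tolerate at most 33.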

Practical Solutions

Algorithm    Type          Speed    Byzantine Tolerant
Paxos        Crash fault   Fast     No
Raft         Crash fault   Fast     No
PBFT         Byzantine     Slow     Yes
Nakamoto     Byzantine     Slow     Yes
Tendermint   Byzantine     Medium   Yes

Replication Strategies

Primary-Backup

One leader handles writes, replicates to followers.

  • Simple
  • Leader is a single point of failure until a follower takes over
  • Not Byzantine tolerant

State Machine Replication

All nodes execute same commands in same order.

  • Consistent state
  • Requires consensus on ordering
  • Foundation of blockchain
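
Why ordering matters can be shown with a toy account state machine. This is a sketch under assumed names (`apply_command`, `replay`) and a made-up rule that overdrafts are rejected; it is not how any particular blockchain encodes transactions.

```python
def apply_command(balance: int, command: tuple[str, int]) -> int:
    """Deterministic transition: the same state and command always give the same result."""
    op, amount = command
    if op == "deposit":
        return balance + amount
    if op == "withdraw":
        # Assumed rule: a withdrawal that would overdraw is rejected.
        return balance - amount if balance >= amount else balance
    raise ValueError(f"unknown command: {op}")


def replay(log: list[tuple[str, int]]) -> int:
    balance = 0
    for command in log:
        balance = apply_command(balance, command)
    return balance


log = [("deposit", 100), ("withdraw", 30), ("deposit", 5)]

# Three replicas executing the same log independently reach the same state.
print([replay(log) for _ in range(3)])  # → [75, 75, 75]

# Same commands, different order: the withdrawal is rejected and state diverges.
reordered = [("withdraw", 30), ("deposit", 100), ("deposit", 5)]
print(replay(reordered))  # → 105
```

The replicas never exchange state, only the ordered log. Agreeing on that order is the job of consensus.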

Blockchain Approach

  • Block producers propose state transitions
  • Network reaches consensus on which blocks to accept
  • State machine replication without fixed leader

Timing Models

How you model time affects what's possible:

Synchronous

  • Known upper bound on message delay
  • Known upper bound on processing time
  • Easier to design for, but unrealistic in practice

Asynchronous

  • No timing guarantees
  • Messages can be delayed arbitrarily
  • FLP Impossibility: No deterministic protocol can guarantee consensus if even one node may crash

Partially Synchronous

  • Asynchronous, but eventually becomes synchronous
  • Realistic model for the internet
  • Blockchain operates here

Key Distributed Systems Concepts for Blockchain

1. Eventual Consistency

Nodes may temporarily disagree, but will eventually converge.

2. Idempotency

Operations can be safely repeated (important when messages duplicate).
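
One common way to get idempotency is to tag each message with a unique ID and ignore IDs already processed. The sketch below assumes a hypothetical `Account` class and message IDs chosen by the sender.

```python
class Account:
    """Hypothetical receiver that applies each message ID at most once."""

    def __init__(self) -> None:
        self.balance = 0
        self.seen_ids: set[str] = set()

    def handle_deposit(self, msg_id: str, amount: int) -> bool:
        if msg_id in self.seen_ids:
            return False  # duplicate delivery: safely ignored
        self.seen_ids.add(msg_id)
        self.balance += amount
        return True


account = Account()
account.handle_deposit("m1", 10)
account.handle_deposit("m1", 10)  # the network delivered m1 twice
account.handle_deposit("m2", 5)

print(account.balance)  # → 15
```

Without the ID check the duplicate would have credited the account twice.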

3. Atomic Broadcast

All nodes receive messages in the same order (what blockchain provides).
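
The effect of atomic broadcast can be sketched with a receiver that holds messages back until they can be delivered in sequence. This example assumes sequence numbers have already been agreed on elsewhere (in a blockchain, that is what consensus provides); the `OrderedReceiver` class is hypothetical.

```python
class OrderedReceiver:
    """Hypothetical receiver that delivers messages strictly by sequence number."""

    def __init__(self) -> None:
        self.next_seq = 0
        self.pending: dict[int, str] = {}
        self.delivered: list[str] = []

    def receive(self, seq: int, payload: str) -> None:
        if seq < self.next_seq:
            return  # already delivered: ignore the duplicate
        self.pending[seq] = payload
        # Deliver every message that is now contiguous with what came before.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


node_x, node_y = OrderedReceiver(), OrderedReceiver()

# The network hands the same three messages to each node in a different order.
for seq, payload in [(0, "tx-a"), (1, "tx-b"), (2, "tx-c")]:
    node_x.receive(seq, payload)
for seq, payload in [(2, "tx-c"), (0, "tx-a"), (1, "tx-b")]:
    node_y.receive(seq, payload)

print(node_x.delivered)  # → ['tx-a', 'tx-b', 'tx-c']
print(node_y.delivered)  # → ['tx-a', 'tx-b', 'tx-c']
```

Arrival order differs, but delivery order is identical on both nodes.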

4. State Machine Replication

All nodes maintain identical state by processing identical inputs.


Key Takeaways

  1. Distributed systems are hard — failures and timing are unpredictable
  2. CAP theorem forces tradeoffs — blockchain chooses eventual consistency
  3. Byzantine fault tolerance is expensive — but necessary for trustless systems
  4. Consensus is the core problem — blockchain's main innovation
  5. Timing matters — blockchain works in partial synchrony model
  6. State machine replication — the foundation of blockchain architecture

Questions to Consider

  1. Why can't traditional databases solve blockchain's problem?
  2. What happens if more than 1/3 of nodes are malicious?
  3. How does Bitcoin handle network partitions?
  4. Why is "eventual consistency" acceptable for money?