Distributed Systems Basics

Module 1 of Blockchain Fundamentals

What Is a Distributed System?

A distributed system is a collection of independent computers that appear to users as a single coherent system.

Examples:

The internet itself
Google Search (millions of servers)
Netflix (globally distributed)
Blockchain networks (thousands of nodes)

"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." — Leslie Lamport

Why Distributed Systems Matter for Blockchain

Blockchain is fundamentally a distributed system problem:

How do thousands of computers agree on one truth?
How do we handle computers that fail or lie?
How do we maintain consistency without a central authority?

Understanding distributed systems is essential to understanding blockchain.

Core Challenges

1. Partial Failures

In centralized systems, it either works or it doesn't. In distributed systems:

Some nodes fail while others work
Failures are often silent (no response)
Hard to distinguish slow from dead

Node A: Working ✓
Node B: Crashed ✗
Node C: Working ✓
Node D: Slow (is it dead?)
Node E: Working ✓

2. Unreliable Networks

Networks are not reliable. Messages can be:

Lost: Packet never arrives
Delayed: Arrives much later
Duplicated: Arrives multiple times
Reordered: Arrives out of sequence

You cannot tell the difference between:

A crashed node
A very slow node
A network partition

3. No Global Clock

Each computer has its own clock. Clocks drift:

CPU clocks drift ~50ppm (50 microseconds per second)
Over a day: ~4 seconds of drift
Network latency adds uncertainty

Consequence: "What happened first?" is surprisingly hard to answer.

4. Byzantine Failures

Nodes might not just fail — they might actively lie or attack.

Failure Type	Behavior
Crash failure	Node stops responding
Omission failure	Node drops some messages
Byzantine failure	Node sends arbitrary/malicious data

Blockchain must handle Byzantine failures (the hardest kind).

The CAP Theorem

In a distributed system, you can have at most 2 of 3:

        Consistency
           /\
          /  \
         /    \
        /      \
       /   ??   \
      /          \
     /____________\
Availability    Partition
                Tolerance

Definitions

Consistency (C): All nodes see the same data at the same time
Availability (A): Every request gets a response
Partition Tolerance (P): System works despite network splits

The Reality

Network partitions WILL happen. So you must choose:

CP: Consistent but may be unavailable (traditional databases)
AP: Available but may be inconsistent (many web services)

Where Does Blockchain Fit?

Bitcoin chooses eventual consistency with partition tolerance:

During partition: chains may diverge
After partition heals: longest chain wins
Consistency is probabilistic, not immediate

Consensus

Consensus = Getting all nodes to agree on a single value.

Why It's Hard

The Two Generals Problem:

General A                    General B
    |                            |
    |---"Attack at dawn"-------->|
    |                            |
    |<---"Confirmed"-------------|
    |                            |
    |---"Got confirmation"------>|
    |                            |
    ...continues forever...

Neither general can be certain the other will attack. No number of messages can fix this.

The Byzantine Generals Problem

Even worse: some generals might be traitors who send conflicting messages.

The Result (Lamport, 1982):

With f Byzantine nodes, you need 3f+1 total nodes
Requires 2/3 honest majority

Practical Solutions

Algorithm	Type	Speed	Byzantine Tolerant
Paxos	Crash fault	Fast	No
Raft	Crash fault	Fast	No
PBFT	Byzantine	Slow	Yes
Nakamoto	Byzantine	Slow	Yes
Tendermint	Byzantine	Medium	Yes

Replication Strategies

Primary-Backup

One leader handles writes, replicates to followers.

Simple
Single point of failure
Not Byzantine tolerant

State Machine Replication

All nodes execute same commands in same order.

Consistent state
Requires consensus on ordering
Foundation of blockchain

Blockchain Approach

Block producers propose state transitions
Network reaches consensus on which blocks to accept
State machine replication without fixed leader

Timing Models

How you model time affects what's possible:

Synchronous

Known upper bound on message delay
Known upper bound on processing time
Easier to design, unrealistic in practice

Asynchronous

No timing guarantees
Messages can be delayed arbitrarily
FLP Impossibility: Cannot guarantee consensus with even one crash failure

Partially Synchronous

Asynchronous, but eventually becomes synchronous
Realistic model for internet
Blockchain operates here

Key Distributed Systems Concepts for Blockchain

1. Eventual Consistency

Nodes may temporarily disagree, but will eventually converge.

2. Idempotency

Operations can be safely repeated (important when messages duplicate).

3. Atomic Broadcast

All nodes receive messages in the same order (what blockchain provides).

4. State Machine Replication

All nodes maintain identical state by processing identical inputs.

Key Takeaways

Distributed systems are hard — failures and timing are unpredictable
CAP theorem forces tradeoffs — blockchain chooses eventual consistency
Byzantine fault tolerance is expensive — but necessary for trustless systems
Consensus is the core problem — blockchain's main innovation
Timing matters — blockchain works in partial synchrony model
State machine replication — the foundation of blockchain architecture

Questions to Consider

Why can't traditional databases solve blockchain's problem?
What happens if more than 1/3 of nodes are malicious?
How does Bitcoin handle network partitions?
Why is "eventual consistency" acceptable for money?