
Overview of Distributed Systems & Networking

Distributed Systems

A distributed system is a collection of independent machines that appear to users as a single coherent system. These machines communicate and coordinate over a network to share resources, improve efficiency, and provide fault tolerance.

It is important to distinguish between applications that run on distributed systems (e.g., cloud-based software) and the infrastructure that makes distributed systems possible (networking, protocols, storage). Applications usually interact with the system as if it were a single machine, due to abstractions that hide the complexity.

Distributed vs Decentralized

  • Distributed system: Tasks and resources are spread across multiple servers to increase efficiency and reliability.
  • Decentralized system: Systems are spread across multiple locations by design (e.g., Air Traffic Control systems at each airport, communicating).

Distributed and decentralized are not mutually exclusive. For example, Air Traffic Control systems are both distributed (multiple servers provide redundancy and backups) and decentralized (servers are physically located in different regions).

Goals of Distributed Systems

  • Resource Sharing – share hardware, software, and data.
  • Distribution Transparency – hide system complexity:
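    • Access transparency: Resources are accessed the same way regardless of which machine holds them or how that machine represents data.
    • Location transparency: Users cannot tell where a resource is physically located, and do not need to.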
    • Replication transparency: Multiple copies exist but are invisible to the user.
    • Failure transparency: The system keeps running despite failures.

Core Concepts

  • Availability: The system is ready for use and working correctly at a given moment.
  • Reliability: The system keeps working without failure over a period of time.
  • Safety: Avoid catastrophic errors (e.g., incorrect results).
  • Recoverability: Ability to recover from failure.
  • Maintainability: Easy to update and repair.
  • Security: Protecting confidentiality, integrity, and availability (CIA triad).

Fault Tolerance

Failures are inevitable in distributed systems, so systems must detect, handle, and recover from them. Extra machines improve tolerance only when they provide redundancy; more components also means more things that can fail.

  • A failure occurs when the service doesn’t meet its specification.
  • An error is the incorrect state that leads to a failure.
  • A fault is the underlying cause of the error (e.g., hardware defect, software bug).

Types of Faults

  • Transient: Happens once, then disappears.
  • Intermittent: Appears, disappears, and reappears unpredictably (e.g., a loose connector).
  • Permanent: Persists until fixed.
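
Because a transient fault disappears on its own, retrying the operation is often enough to mask it, while a permanent fault exhausts any retry budget. The sketch below illustrates that difference. The names `retry`, `flaky_op`, and `broken_op` are hypothetical and introduced only for this example; the faults are simulated, not real.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation type: returns true on success, false on a fault. */
typedef bool (*operation_fn)(void *ctx);

/* Call op up to max_attempts times. Returns the 1-based number of the
 * attempt that succeeded, or 0 if every attempt failed. */
int retry(operation_fn op, void *ctx, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (op(ctx)) {
            return attempt;
        }
    }
    return 0;
}

/* Simulated transient fault: fails while *remaining_failures > 0. */
bool flaky_op(void *ctx) {
    int *remaining_failures = ctx;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return false;
    }
    return true;
}

/* Simulated permanent fault: never succeeds, no matter how often it is tried. */
bool broken_op(void *ctx) {
    (void)ctx;
    return false;
}
```

With two simulated failures, `retry(flaky_op, &failures, 5)` succeeds on the third attempt, while `retry(broken_op, NULL, 5)` gives up and returns 0. A real system would likely also wait between attempts; that is left out here to keep the sketch short.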

Handling Failures

Failures are usually handled with non-volatile storage and replication, where multiple copies of data are stored. Replication adds complexity: when a user changes one copy, the other copies must be updated to stay consistent. There are a few ways to ensure consistency across replicas:

Consistency Models:
  • Strong consistency: Every read sees the latest write, as if all copies were updated at once (e.g., banks must always agree on account balances).
  • Weak (eventual) consistency: Copies may disagree for a while but converge once updates stop (e.g., social media feeds).

Scalability

A system can scale along three dimensions:

  • By size (number of users, data volume).
  • By geography (multiple regions).
  • By administration (different teams or organizations).


Networking

Networking is the foundation of distributed systems. It allows processes on different machines to communicate and share data using protocols like TCP and UDP.

IP Addresses & Ports

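  • An IP address identifies a machine on the network; a port number identifies a specific endpoint (usually one process) on that machine.
  • By default, each port on Unix/Linux can be bound by only one process at a time (per transport protocol).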
  • Source ports can be assigned explicitly or chosen by the system (ephemeral/temporary).
  • Destination ports must always be specified.
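
To make the idea of a system-chosen (ephemeral) port concrete, the sketch below binds a UDP socket to port 0, which asks the OS to pick a free port, and then reads the chosen port back with `getsockname()`. The function name `pick_ephemeral_port` is hypothetical, and binding to the loopback address is an assumption made to keep the example local.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a UDP socket on loopback to port 0 and return the port the OS
 * assigned (host byte order), or -1 on error. */
int pick_ephemeral_port(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd == -1) {
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); /* 0 = let the system choose a port */

    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }

    close(fd);
    return ntohs(addr.sin_port);
}
```

The socket is closed before returning, so the port is only reported, not reserved; another process could claim it afterwards.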

Sockets

A socket is an endpoint for sending and receiving data.

Berkeley sockets (Unix) provide the standard API used by most operating systems.

In practice, creating a socket is done via the system call:

#include <sys/socket.h>

int socket(int domain, int type, int protocol);
// domain: IPv4 (AF_INET) or IPv6 (AF_INET6)
// type: TCP (SOCK_STREAM) or UDP (SOCK_DGRAM)
// protocol: usually 0 (default for the chosen type)

The call returns a file descriptor (a non-negative integer the OS uses to identify the socket). If it fails, it returns -1 and sets errno.
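
A small sketch of calling `socket()` and checking its return value is shown here. The wrapper name `open_ipv4_socket` is hypothetical; reporting the error with `perror()` is one common choice, not the only one.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create an IPv4 socket of the given type (SOCK_STREAM or SOCK_DGRAM).
 * Returns the file descriptor, or -1 after printing the reason to stderr. */
int open_ipv4_socket(int type) {
    int fd = socket(AF_INET, type, 0);
    if (fd == -1) {
        perror("socket"); /* prints a message based on errno */
    }
    return fd;
}
```

The caller owns the returned descriptor and should release it with `close()` when done.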

TCP vs UDP

Two transport protocols are commonly used to transmit messages between machines. A reliable transport protocol aims to deliver error-free, in-sequence messages with no duplicates or lost packets; TCP provides these guarantees, while UDP gives them up in exchange for lower overhead.

  • TCP (Transmission Control Protocol)
    • Reliable, connection-oriented.
    • Guarantees ordered, lossless delivery.
    • More overhead, less efficient for many-to-many communication.
  • UDP (User Datagram Protocol)
    • Unreliable, connectionless.
    • Fast and lightweight.
    • Useful for real-time apps (gaming, video calls).
    • Supports multicasting (sending one message to multiple receivers).

Typical Workflow

Server-Side Flow (TCP)
  1. socket() → create socket.
  2. bind() → assign IP address + port to socket.
  3. listen() → mark as listening for connections.
  4. accept() → accept an incoming connection.
  5. read() / write() → exchange data.
  6. close() → free resources.
Client-Side Flow (TCP)
  1. socket() → create socket.
  2. connect() → connect to server IP + port.
  3. read() / write() → exchange data.
  4. close() → free resources.
UDP Flow (simpler, no connection setup)
  1. socket() → create socket.
  2. bind() → optional, assign local port.
  3. sendto() → send a message to destination.
  4. recvfrom() → receive a message.
  5. close() → free resources.
  • Server (TCP): socket → bind → listen → accept → read/write → close
  • Client (TCP): socket → connect → read/write → close
  • UDP: socket → (optional bind) → sendto/recvfrom → close
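
The TCP flows above can be sketched in a single process by running both ends over the loopback interface. The function name `tcp_loopback_send` is hypothetical, and the single-process layout is an assumption made so the example is self-contained; a real server and client would be separate programs. The client's `connect()` can finish before the server calls `accept()` because the kernel queues completed connections on the listening socket.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg from a client socket to a server socket over loopback TCP.
 * Copies what the server read into out (NUL-terminated) and returns the
 * number of bytes read, or -1 on error. msg must be non-empty. */
ssize_t tcp_loopback_send(const char *msg, char *out, size_t out_size) {
    int listener = -1, client = -1, conn = -1;
    ssize_t received = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (out_size == 0 || msg[0] == '\0') {
        return -1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); /* let the OS pick a free port */

    /* Server side: socket -> bind -> listen */
    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener == -1) goto done;
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) == -1) goto done;
    if (listen(listener, 1) == -1) goto done;
    /* Learn which port was assigned so the client knows where to connect. */
    if (getsockname(listener, (struct sockaddr *)&addr, &len) == -1) goto done;

    /* Client side: socket -> connect */
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client == -1) goto done;
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) == -1) goto done;

    /* Server side: accept the queued connection */
    conn = accept(listener, NULL, NULL);
    if (conn == -1) goto done;

    /* Exchange data: client writes, server reads. */
    if (write(client, msg, strlen(msg)) == -1) goto done;
    received = read(conn, out, out_size - 1);
    if (received >= 0) {
        out[received] = '\0';
    }

done:
    /* close -> free resources on every path */
    if (conn != -1) close(conn);
    if (client != -1) close(client);
    if (listener != -1) close(listener);
    return received;
}
```

Note that `accept()` returns a new descriptor for the connection; the listening socket stays open and could accept further clients. A single `read()` is enough here only because the message is short; in general TCP is a byte stream and a reader may need to loop.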
Multicasting & Broadcasting

  • Multicasting: Send one message to many machines that have joined the same multicast address (224.0.0.0–239.255.255.255). Works only with UDP.
  • Broadcasting: Send a message to all machines on a local network segment. Less efficient, since every machine receives it whether or not it is interested.
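
The multicast range above corresponds to IPv4 addresses whose top four bits are 1110. A minimal sketch of checking whether an address falls in that range is shown here; the function name `is_multicast_ipv4` is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/* Return true if text is a valid dotted-quad IPv4 address inside
 * 224.0.0.0-239.255.255.255, false otherwise. */
bool is_multicast_ipv4(const char *text) {
    struct in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1) {
        return false; /* not a parseable IPv4 address */
    }
    /* Convert to host byte order, then compare the top four bits to 1110. */
    return (ntohl(addr.s_addr) >> 28) == 0xE;
}
```

A receiver would typically go on to join the group with `setsockopt()` and the `IP_ADD_MEMBERSHIP` option. That step is omitted because it depends on the host's network interfaces.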

Persistent vs Transient Communication

  • Persistent: Message stored until delivered (e.g., email).
  • Transient: Message exists only while sender and receiver are active (e.g., VoIP, FaceTime).

OSI Model (7 Layers)

The layers are listed from the top (layer 7) down to the bottom (layer 1):

  7. Application – end-user applications (web browsers, email).
  6. Presentation – translation, encryption, compression.
  5. Session – manages communication sessions.
  4. Transport – TCP/UDP, end-to-end delivery between processes.
  3. Network – IP addresses, routing.
  2. Data Link – MAC addresses, Ethernet.
  1. Physical – cables, Wi-Fi, hardware signals.