
Overview of Distributed Systems & Networking
Distributed Systems
A distributed system is a collection of independent machines that appear to users as a single coherent system. These machines communicate and coordinate over a network to share resources, improve efficiency, and provide fault tolerance.
It is important to distinguish between applications that run on distributed systems (e.g., cloud-based software) and the infrastructure that makes distributed systems possible (networking, protocols, storage). Applications usually interact with the system as if it were a single machine, due to abstractions that hide the complexity.
Distributed vs Decentralized
- Distributed system: Tasks and resources are spread across multiple servers to increase efficiency and reliability.
- Decentralized system: Systems are spread across multiple locations by design (e.g., air traffic control systems at each airport that communicate with one another).
Distributed and decentralized are not mutually exclusive. For example, air traffic control systems are distributed (multiple servers provide redundancy and backups) and decentralized (servers are physically located in different regions).
Goals of Distributed Systems
- Resource Sharing – share hardware, software, and data.
- Distribution Transparency – hide system complexity:
  - Access transparency: Access without knowing the machine.
  - Location transparency: Resource location doesn’t matter.
  - Replication transparency: Multiple copies exist but are invisible to the user.
  - Failure transparency: The system keeps running despite failures.
Core Concepts
- Availability: The system is working correctly at any given moment.
- Reliability: The system keeps working without failure over a period of time.
- Safety: Avoid catastrophic errors (e.g., incorrect results).
- Recoverability: Ability to recover from failure.
- Maintainability: Easy to update and repair.
- Security: Protecting confidentiality, integrity, and availability (CIA triad).
Fault Tolerance
Failures are inevitable in distributed systems, so a system must detect, handle, and recover from them. Adding redundant machines generally increases fault tolerance, because another machine can take over when one fails.
- A failure occurs when the service doesn’t meet its specification.
- An error is the incorrect state that leads to a failure.
- A fault is the underlying cause of the error (e.g., hardware defect, software bug).
Types of Faults
- Transient: Happens once, then disappears.
- Intermittent: Occurs occasionally.
- Permanent: Persists until fixed.
Handling Failures
Failures are usually handled with non-volatile storage and replication, where multiple copies of the data are stored. Replication adds complexity: when the user changes one copy, the other copies must be updated to stay consistent. There are a few ways to ensure consistency across replicas:
Consistency Models:
- Strong consistency: All copies are updated immediately (e.g., banks must always agree on account balances).
- Weak consistency: Updates may take time but eventually synchronize (e.g., social media feeds).
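The difference between the two models can be sketched with a toy in-memory example. This is only an illustration under loud assumptions: the `replica_set` type and the functions `write_strong`, `write_weak`, `sync_replicas`, and `read_replica` are hypothetical names, and the arrays stand in for copies that would really live on separate machines.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_REPLICAS 3

/* Hypothetical stand-in for copies of one value held on separate machines. */
typedef struct {
    int  value[NUM_REPLICAS];  /* each replica's copy of the data */
    bool stale[NUM_REPLICAS];  /* true if a replica has not seen the latest write */
} replica_set;

/* Strong consistency: every copy is updated before the write returns. */
void write_strong(replica_set *rs, int v) {
    for (size_t i = 0; i < NUM_REPLICAS; i++) {
        rs->value[i] = v;
        rs->stale[i] = false;
    }
}

/* Weak consistency: only replica 0 is updated now; the rest are marked stale. */
void write_weak(replica_set *rs, int v) {
    rs->value[0] = v;
    rs->stale[0] = false;
    for (size_t i = 1; i < NUM_REPLICAS; i++)
        rs->stale[i] = true;
}

/* Later synchronization: copy replica 0's value to every stale replica. */
void sync_replicas(replica_set *rs) {
    for (size_t i = 1; i < NUM_REPLICAS; i++) {
        if (rs->stale[i]) {
            rs->value[i] = rs->value[0];
            rs->stale[i] = false;
        }
    }
}

/* A read may be served by any replica, so it may return an old value. */
int read_replica(const replica_set *rs, size_t i) {
    return rs->value[i];
}
```

After `write_weak`, a read from replica 2 still returns the old value until `sync_replicas` runs, which is the window in which a social media feed might look out of date. After `write_strong`, every replica returns the new value, at the cost of touching every copy on each write.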
Scalability
- By size (number of users, data volume).
- By geography (multiple regions).
- By administration (different teams or organizations).
Networking
Networking is the foundation of distributed systems. It allows processes on different machines to communicate and share data using protocols like TCP and UDP.
IP Addresses & Ports
- By default, each port on Unix/Linux can be bound by only one socket at a time for a given protocol and local address.
- Source ports can be assigned explicitly or chosen by the system (ephemeral/temporary).
- Destination ports must always be specified.
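As a rough illustration of a system-chosen source port, the sketch below binds a UDP socket to port 0, which asks the OS to pick an ephemeral port, and then reads back the choice with `getsockname()`. The function name `pick_ephemeral_port` is made up for this example, and the code assumes a POSIX system with IPv4 available.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Binds a UDP socket to port 0 so the OS picks an ephemeral port, then
 * reports which port was chosen (host byte order). Returns -1 on error.
 * The function name is illustrative, not part of any API. */
int pick_ephemeral_port(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd == -1)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local address */
    addr.sin_port = htons(0);                  /* 0 = let the system choose */

    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }

    int port = ntohs(addr.sin_port);  /* convert from network byte order */
    close(fd);
    return port;
}
```

The returned number varies from run to run. The port is released when the socket is closed, so this is only a way to observe the mechanism, not a way to reserve a port.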
Sockets
A socket is an endpoint for sending and receiving data.
Berkeley sockets (Unix) provide the standard API used by most operating systems.
In practice, creating a socket is done via the system call:
int socket(int domain, int type, int protocol);
// domain: IPv4 (AF_INET) or IPv6 (AF_INET6)
// type: TCP (SOCK_STREAM) or UDP (SOCK_DGRAM)
// protocol: usually 0 (default)
The call returns a file descriptor (an integer ID used by the OS). If it fails, it returns -1.
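A minimal sketch of checking that return value is shown below. The wrapper name `open_ipv4_socket` is hypothetical. It only adds error reporting around the real `socket()` call and assumes a POSIX system.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates an IPv4 socket of the given type (SOCK_STREAM or SOCK_DGRAM).
 * Returns the file descriptor, or -1 after printing why the call failed. */
int open_ipv4_socket(int type) {
    int fd = socket(AF_INET, type, 0);  /* protocol 0 = default for this type */
    if (fd == -1)
        perror("socket");               /* errno holds the reason for failure */
    return fd;
}
```

A caller would use it as `int fd = open_ipv4_socket(SOCK_STREAM);` and release the descriptor with `close(fd)` when finished.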
TCP vs UDP
Two transport protocols are commonly used to transmit messages between machines. A reliable transport protocol such as TCP aims to deliver error-free, in-sequence messages with no duplicates or lost packets, while UDP gives up those guarantees in exchange for lower overhead.
- TCP (Transmission Control Protocol)
  - Reliable, connection-oriented.
  - Guarantees ordered, lossless delivery.
  - More overhead, less efficient for many-to-many communication.
- UDP (User Datagram Protocol)
  - Unreliable, connectionless.
  - Fast and lightweight.
  - Useful for real-time apps (gaming, video calls).
  - Supports multicasting (sending one message to multiple receivers).
Typical Workflow
Server-Side Flow (TCP)
- socket() → create socket.
- bind() → assign IP address + port to socket.
- listen() → mark as listening for connections.
- accept() → accept an incoming connection.
- read() / write() → exchange data.
- close() → free resources.
Client-Side Flow (TCP)
- socket() → create socket.
- connect() → connect to server IP + port.
- read() / write() → exchange data.
- close() → free resources.
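The server and client flows above can be exercised together in a single process over the loopback interface. The sketch below is one possible arrangement, not a production server: the helper names `tcp_listen_loopback`, `tcp_connect_loopback`, and `tcp_demo` are invented for this example, the backlog of 8 is an arbitrary choice, and a single `read()` is assumed to be enough for a short message (real code would loop, because TCP is a byte stream).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Server side: socket() -> bind() -> listen() on 127.0.0.1.
 * Port 0 lets the OS choose; the port actually used is stored in *port_out.
 * Returns the listening descriptor, or -1 on error. */
int tcp_listen_loopback(unsigned short port, unsigned short *port_out) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, 8) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

/* Client side: socket() -> connect() to 127.0.0.1:port.
 * Returns the connected descriptor, or -1 on error. */
int tcp_connect_loopback(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Runs both flows in one process: the client writes msg and the server
 * reads it into buf (at most cap bytes). Returns bytes read, or -1. */
ssize_t tcp_demo(const char *msg, char *buf, size_t cap) {
    unsigned short port = 0;
    int listener = tcp_listen_loopback(0, &port);
    if (listener == -1)
        return -1;

    /* connect() can complete before accept() because the kernel queues
     * the finished handshake in the listen backlog. */
    int client = tcp_connect_loopback(port);
    int conn = (client == -1) ? -1 : accept(listener, NULL, NULL);

    ssize_t n = -1;
    if (conn != -1 && write(client, msg, strlen(msg)) != -1)
        n = read(conn, buf, cap);

    /* close() frees the resources held by each descriptor. */
    if (conn != -1) close(conn);
    if (client != -1) close(client);
    close(listener);
    return n;
}
```

Calling `tcp_demo("hello", buf, sizeof buf)` walks through every step of both lists. In a real deployment the two sides would run in separate processes, usually on separate machines, and the server would loop on `accept()` to handle many clients.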
UDP Flow (simpler, no connection setup)
- socket() → create socket.
- bind() → optional, assign local port.
- sendto() → send a message to destination.
- recvfrom() → receive a message.
- close() → free resources.
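The UDP steps can be tried the same way on the loopback interface. In this sketch the function name `udp_demo` is hypothetical, and the 2-second receive timeout is an added assumption so the call cannot block forever if a datagram is lost. Note that the sender never calls `bind()`; the OS assigns it an ephemeral source port on the first `sendto()`.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Sends msg from one UDP socket to another on 127.0.0.1 and stores the
 * received datagram in buf (at most cap bytes).
 * Returns the number of bytes received, or -1 on error or timeout. */
ssize_t udp_demo(const char *msg, char *buf, size_t cap) {
    ssize_t n = -1;
    int receiver = socket(AF_INET, SOCK_DGRAM, 0);
    int sender = socket(AF_INET, SOCK_DGRAM, 0);

    /* Receiver address: loopback, port 0 so the OS chooses a free port. */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    socklen_t len = sizeof addr;

    /* UDP gives no delivery guarantee, so bound the wait in recvfrom(). */
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };

    if (receiver != -1 && sender != -1 &&
        bind(receiver, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(receiver, (struct sockaddr *)&addr, &len) == 0 &&
        setsockopt(receiver, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == 0 &&
        sendto(sender, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof addr) != -1) {
        /* No connection setup: the datagram is simply read when it arrives. */
        n = recvfrom(receiver, buf, cap, 0, NULL, NULL);
    }

    if (sender != -1) close(sender);
    if (receiver != -1) close(receiver);
    return n;
}
```

Compared with the TCP flow there is no `listen()`, `accept()`, or `connect()`: each `sendto()` names its destination, which is why destination ports must always be specified.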
Multicasting & Broadcasting
- Multicasting: Send one message to many machines that join the same multicast address (224.0.0.0–239.255.255.255). Works only with UDP.
- Broadcasting: Send a message to all machines on a local network segment. Less efficient since everyone receives it.
Persistent vs Transient Communication
- Persistent: Message stored until delivered (e.g., email).
- Transient: Message exists only while sender and receiver are active (e.g., VoIP, FaceTime).
OSI Model (7 Layers)
- Application – end-user applications (web browsers, email).
- Presentation – translation, encryption, compression.
- Session – manages communication sessions.
- Transport – end-to-end delivery between processes (TCP, UDP).
- Network – IP addresses, routing.
- Data Link – MAC addresses, Ethernet.
- Physical – cables, Wi-Fi, hardware signals.