
Marshalling and Serialization

Serialization: Converting an object or data structure into a sequence of bytes (or another
transferable format) so it can be stored or transmitted.
Example: turning a struct into a binary stream.
Marshalling: the process of transforming structured in-memory data (objects, structs, messages) into a portable, serialized representation suitable for transmission over a network or storage.

Compared to serialization, marshalling is a broader concept: it includes serializing multiple parameters together into one transferable representation. This ensures that complex function calls (with multiple arguments) can be transmitted correctly across different systems.

In distributed systems: marshalling prepares data for transmission by converting its representation (for example, serializing multiple parameters into a single representation) so that the receiving system can interpret it correctly.

Distributed systems need marshalling because communicating parties may run on different architectures (endianness, word size, language runtimes, etc.), so data must be represented in a standard, agreed-upon format. Communication typically happens through a mechanism such as <A href="distributedSystems_RPC.php">Remote Procedure Calls</A>.
In distributed systems, marshalling is crucial:
- The sender marshals parameters so they can be sent over the network.
- The receiver unmarshals (deserializes) them to reconstruct the original data in its local representation.
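
As a minimal sketch of the idea above, the following Python marshals the arguments of a hypothetical remote call into one byte string in network (big-endian) byte order and unmarshals them on the receiving side. The procedure ID, the argument types, and the field layout are assumptions made for illustration; this is not a real RPC wire format.

```python
import struct

# Assumed layout (illustrative only):
#   uint16 procedure id | int32 x | float64 y | uint16 label length | label bytes (UTF-8)
# The "!" prefix selects network byte order with fixed sizes and no padding,
# so both sides agree on the layout regardless of their native architecture.
HEADER = "!HidH"


def marshal_call(proc_id: int, x: int, y: float, label: str) -> bytes:
    """Pack several parameters into a single transferable byte string."""
    data = label.encode("utf-8")
    return struct.pack(f"{HEADER}{len(data)}s", proc_id, x, y, len(data), data)


def unmarshal_call(payload: bytes):
    """Rebuild the parameters in the receiver's local representation."""
    proc_id, x, y, label_len = struct.unpack_from(HEADER, payload, 0)
    offset = struct.calcsize(HEADER)
    (data,) = struct.unpack_from(f"!{label_len}s", payload, offset)
    return proc_id, x, y, data.decode("utf-8")


payload = marshal_call(7, -42, 2.5, "héllo")
print(payload.hex())
print(unmarshal_call(payload))
```

The fixed-size header is followed by a length-prefixed string, which is one common way to carry variable-length data in a single message.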
ASN.1 (Abstract Syntax Notation One)
- A standard way of describing and encoding structured data.
- It is a formal standard that defines both an abstract data model and several sets of encoding rules (such as BER, DER, and PER). It dates from the 1980s, is widely used in telecom and security systems, and is very flexible, though its notation and tooling can be verbose.
- The Basic Encoding Rules (BER) and their subsets use a Tag, Length, Value (TLV) format:
  - Tag: type of data (e.g., integer, string, sequence).
  - Length: number of bytes in the value.
  - Value: the actual data.
- Efficient because it uses binary encoding (more compact than text formats like JSON or XML).
Example: Interface/Rect(Point(2,300), Point(65537,2)) is serialized into a compact TLV
sequence.
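
To make the TLV layout concrete, here is a minimal Python sketch that encodes the Rect example using DER-style rules. It assumes that Point and Rect are each modelled as an ASN.1 SEQUENCE of INTEGERs (the original schema is not shown, so this mapping is a guess), and it only handles non-negative integers.

```python
TAG_INTEGER = 0x02   # universal tag for INTEGER
TAG_SEQUENCE = 0x30  # universal tag for a constructed SEQUENCE


def encode_length(n: int) -> bytes:
    """Short form for lengths below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_tlv(tag: int, value: bytes) -> bytes:
    """Tag byte, then length, then the value bytes."""
    return bytes([tag]) + encode_length(len(value)) + value


def encode_integer(value: int) -> bytes:
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # One extra leading bit keeps the two's-complement sign bit clear.
    size = value.bit_length() // 8 + 1
    return encode_tlv(TAG_INTEGER, value.to_bytes(size, "big"))


def encode_sequence(*members: bytes) -> bytes:
    return encode_tlv(TAG_SEQUENCE, b"".join(members))


def encode_point(x: int, y: int) -> bytes:
    return encode_sequence(encode_integer(x), encode_integer(y))


def encode_rect(top_left, bottom_right) -> bytes:
    return encode_sequence(encode_point(*top_left), encode_point(*bottom_right))


encoded = encode_rect((2, 300), (65537, 2))
print(encoded.hex(" "))
# 30 13 30 07 02 01 02 02 02 01 2c 30 08 02 03 01 00 01 02 01 02
```

Each integer takes only as many value bytes as it needs (one for 2, two for 300, three for 65537), so the whole rectangle fits in 21 bytes under these assumptions.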
Protocol Buffers (Protobuf)
A Google-developed serialization format optimized for performance and cross-language compatibility. It has been used inside Google since the early 2000s, was open-sourced in 2008, and is well documented.
- Works by defining a schema in a .proto file.
- A compiler (protoc) generates code in the target language (C++, Java, Python, etc.).
Example .proto file:
syntax = "proto3";
package E472Example;

message Person {
  int32 id = 1;
  string email = 2;
}
The numbers (1, 2) are field numbers (tags) that uniquely identify each field in the binary encoding; the field names themselves are not sent on the wire.
Generated code can serialize a Person object into binary and deserialize it back.
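
To show what the field numbers do in the binary encoding, here is a hand-written Python sketch of the Protobuf wire format for the Person message above. It is not the code that protoc generates; it covers only the two field types used here, and it assumes non-negative id values.

```python
WIRE_VARINT = 0  # used for int32
WIRE_LEN = 2     # length-delimited, used for string


def encode_varint(n: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """Every field starts with a key: (field_number << 3) | wire_type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_person(person_id: int, email: str) -> bytes:
    out = b""
    if person_id != 0:  # proto3 does not emit fields that hold their default value
        out += encode_key(1, WIRE_VARINT) + encode_varint(person_id)
    if email:
        data = email.encode("utf-8")
        out += encode_key(2, WIRE_LEN) + encode_varint(len(data)) + data
    return out


message = encode_person(150, "a@b.c")
print(message.hex(" "))
# 08 96 01 12 05 61 40 62 2e 63
```

The first byte, 0x08, is field number 1 with wire type 0, and 0x12 is field number 2 with wire type 2. The names "id" and "email" never appear in the output.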
Features:
- Cross-language: the same .proto file works for many programming languages.
- Compact: encodes data in binary (smaller than JSON or XML).
- Fast: optimized serialization and deserialization.
Backwards/Forwards compatibility:
- Old clients can ignore new fields.
- New clients can handle missing fields gracefully.
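
The compatibility behaviour above follows from the wire format: each field carries its number and wire type, so a reader can skip fields it does not know. The Python sketch below is a hand-written decoder for an "old" Person with fields 1 (int32 id) and 2 (string email); the extra field 3 in the usage example is a hypothetical addition by a newer writer, and the decoder is an illustration rather than generated Protobuf code.

```python
def read_varint(buf: bytes, pos: int):
    """Return (value, new position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_person_v1(buf: bytes) -> dict:
    person = {"id": 0, "email": ""}  # defaults cover fields missing from the input
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        # The wire type alone says how many bytes the field occupies.
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        elif wire_type == 1:  # fixed 64-bit
            value = buf[pos:pos + 8]
            pos += 8
        elif wire_type == 5:  # fixed 32-bit
            value = buf[pos:pos + 4]
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if field_number == 1 and wire_type == 0:
            person["id"] = value
        elif field_number == 2 and wire_type == 2:
            person["email"] = value.decode("utf-8")
        # Any other field number is unknown to this reader and is skipped.
    return person


# Written by a newer client that added a hypothetical field 3 (a string).
newer = bytes.fromhex("08960112056140622e631a03353535")
print(decode_person_v1(newer))

# Written by a client that never set the email field.
print(decode_person_v1(bytes.fromhex("089601")))
```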
Special Keywords:
- required: Field must always be present (proto2 only; not supported in proto3).
- optional: Field may or may not appear.
- repeated: Field can appear zero or more times (like an array or list).
- oneof: Defines a set of alternative fields; only one may be set at a time.
Example with repeated + oneof:
message Person {
  repeated string emails = 1;
  oneof id_type {
    int32 user_id = 2;
    string username = 3;
  }
}
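
The semantics of repeated and oneof can be mimicked in plain Python. The class below is a hypothetical stand-in for the Person message above, not code generated by protoc; its method names are invented for this sketch. One simplification: it returns None for an unset oneof member, whereas generated code typically returns the field's default value.

```python
class PersonSketch:
    """Plain-Python illustration of 'repeated' and 'oneof' behaviour."""

    _ID_TYPE_MEMBERS = ("user_id", "username")

    def __init__(self):
        self.emails = []       # repeated: zero or more values, order preserved
        self._id_type = None   # oneof: (member name, value) or None

    def set_id_type(self, member: str, value):
        if member not in self._ID_TYPE_MEMBERS:
            raise ValueError(f"{member!r} is not part of oneof id_type")
        # Setting any member replaces whichever member was set before.
        self._id_type = (member, value)

    def which_id_type(self):
        """Name of the member currently set, or None if none is set."""
        return self._id_type[0] if self._id_type else None

    def get(self, member: str):
        if self._id_type and self._id_type[0] == member:
            return self._id_type[1]
        return None


person = PersonSketch()
person.emails.append("a@b.c")
person.emails.append("d@e.f")
person.set_id_type("user_id", 42)
person.set_id_type("username", "ada")  # clears user_id
print(person.emails, person.which_id_type(), person.get("user_id"))
```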
Why Protobuf is better than JSON/XML
- Binary, not text → smaller size and faster to parse.
- Cross-platform → compiler generates efficient code for many languages.
- Schema-based → ensures type safety and structure.
- Efficient → avoids redundancy in field names (unlike JSON).
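
A rough size comparison illustrates the points above. The sketch below encodes one Person (id 150, email "a@b.c") as compact JSON and compares it with the Protobuf wire encoding of the same values for the two-field Person schema, written out by hand as a hex string. It is a single data point under these assumptions, not a benchmark.

```python
import json

person = {"id": 150, "email": "a@b.c"}

# Compact JSON: the field names travel with every message.
as_json = json.dumps(person, separators=(",", ":")).encode("utf-8")

# Protobuf wire bytes for the same values (field 1 = id, field 2 = email):
# 08 96 01 | 12 05 'a@b.c'
as_protobuf = bytes.fromhex("08960112056140622e63")

print(len(as_json), "bytes as JSON")
print(len(as_protobuf), "bytes as Protobuf")
```

Most of the difference comes from the field names and punctuation that JSON repeats in every message, which the binary encoding replaces with one-byte keys.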
Compilation Flow
.proto file → protoc (compiler) → language-specific code (C++/Java/etc.)
→ compiled with project → executable with Protobuf support
This makes Protobuf (and serialization in general) a backbone of RPC systems and distributed architectures, since data must move seamlessly between heterogeneous systems.