Replication
/ˌrɛplɪˈkeɪʃən/
noun … “Copy data across nodes to ensure reliability.”
Replication is the process of creating and maintaining multiple copies of data across different nodes in a Distributed System. Its purpose is to enhance Availability, fault tolerance, and performance by allowing data to remain accessible even if some nodes fail. Replication is fundamental to distributed databases, file systems, and cloud storage platforms.
Raft
/ræft/
noun … “Simplified consensus algorithm for distributed systems.”
Paxos
/ˈpæk.sɒs/
noun … “Consensus algorithm for unreliable networks.”
Paxos is a fault-tolerant Consensus algorithm designed to achieve agreement among nodes in a Distributed System, even when some nodes fail or messages are delayed or lost. It ensures that a single value is chosen and consistently replicated across all non-faulty nodes, providing a foundation for reliable state machines, replicated databases, and coordination services.
Consistency
/kənˈsɪstənsi/
noun … “All nodes see the same data at the same time.”
Availability
/əˌveɪləˈbɪləti/
noun … “System responds to requests, even under failure.”
Partition Tolerance
/ˈpɑːrtɪʃən ˈtɒlərəns/
noun … “System keeps working despite network splits.”
Consensus
/kənˈsɛnsəs/
noun … “Agreement among distributed nodes.”
Consensus is the process by which multiple nodes in a Distributed System agree on a single value or state despite failures, message delays, or node crashes. Consensus ensures that all non-faulty nodes make consistent decisions, which is crucial for maintaining data integrity, coordinating actions, and implementing replicated state machines. It underpins critical operations in databases, blockchain networks, and fault-tolerant services.
CAP Theorem
/kæp ˈθiərəm/
noun … “You can only fully guarantee two out of three.”
Distributed Systems
/dɪˈstrɪbjʊtɪd ˈsɪstəmz/
noun … “Independent computers acting as one system.”
Distributed Systems are computing systems composed of multiple independent computers that communicate over a network and coordinate their actions to appear as a single coherent system. Each component has its own memory and execution context, and failures or delays are expected rather than exceptional. The defining challenge of distributed systems is managing coordination, consistency, and reliability in the presence of partial failure and unpredictable communication.
Cloud Computing
/klaʊd kəmˈpjuː.tɪŋ/
noun — "delivering computing resources over the Internet on demand."