01 Introduction - Course Introduction
02 Introduction - Who will benefit from the course and how
03 Introduction - Course overview
04 How to define System requirements - System requirements
05 How to define System requirements - Functional requirements
06 How to define System requirements - High availability
07 How to define System requirements - Fault tolerance, resilience, reliability
08 How to define System requirements - Scalability
09 How to define System requirements - Performance
10 How to define System requirements - Durability
11 How to define System requirements - Consistency
12 How to define System requirements - Maintainability, security, cost
13 How to define System requirements - Summary of system requirements
14 How to achieve certain system qualities with the help of hardware - Regions, availability zones, data centers, servers
15 How to achieve certain system qualities with the help of hardware - Physical servers, virtual machines, containers, serverless
16 Fundamentals of reliable, scalable, and fast communication - Synchronous vs asynchronous communication
17 Fundamentals of reliable, scalable, and fast communication - Asynchronous messaging patterns
18 Fundamentals of reliable, scalable, and fast communication - Network protocols
19 Fundamentals of reliable, scalable, and fast communication - Blocking vs non-blocking IO
20 Fundamentals of reliable, scalable, and fast communication - Data encoding formats
21 Fundamentals of reliable, scalable, and fast communication - Message acknowledgement
22 How to improve system performance with caching - Deduplication cache
23 How to improve system performance with caching - Metadata cache
24 The importance of queues in distributed systems - Queue
25 The importance of queues in distributed systems - Full and empty queue problems
26 The importance of queues in distributed systems - Start with something simple
27 The importance of queues in distributed systems - Blocking queue and producer-consumer pattern
28 The importance of queues in distributed systems - Thread pool
29 The importance of queues in distributed systems - Big compute architecture
30 Data store internals - Log
31 Data store internals - Index
32 Data store internals - Time series data
33 Data store internals - Simple key-value database
34 Data store internals - B-tree index
35 Data store internals - Embedded database
36 Data store internals - RocksDB
37 Data store internals - LSM-tree and B-tree
38 Data store internals - Page cache
39 How to build efficient communication in distributed systems - Push vs pull
40 How to build efficient communication in distributed systems - Host discovery
41 How to build efficient communication in distributed systems - Service discovery
42 How to build efficient communication in distributed systems - Peer discovery
43 How to build efficient communication in distributed systems - How to choose a network protocol
44 How to build efficient communication in distributed systems - Network protocols in real-life systems
45 How to build efficient communication in distributed systems - Video over HTTP
46 How to build efficient communication in distributed systems - CDN
47 How to build efficient communication in distributed systems - Push and pull technologies
48 How to build efficient communication in distributed systems - Push and pull technologies in real-life systems
49 How to build efficient communication in distributed systems - Large-scale push architectures
50 How to deliver data reliably - What else to know to build reliable, scalable, and fast systems
51 How to deliver data reliably - Timeouts
52 How to deliver data reliably - What to do with failed requests
53 How to deliver data reliably - When to retry
54 How to deliver data reliably - How to retry
55 How to deliver data reliably - Message delivery guarantees
56 How to deliver data reliably - Consumer offsets
57 How to deliver data quickly - Batching
58 How to deliver data quickly - Compression
59 How to deliver data at large scale - How to scale message consumption
60 How to deliver data at large scale - Partitioning in real-life systems
61 How to deliver data at large scale - Partitioning strategies
62 How to deliver data at large scale - Request routing
63 How to deliver data at large scale - Rebalancing partitions
64 How to deliver data at large scale - Consistent hashing
65 How to protect servers from clients - System overload
66 How to protect servers from clients - Autoscaling
67 How to protect servers from clients - Autoscaling system design
68 How to protect servers from clients - Load shedding
69 How to protect servers from clients - Rate limiting
70 How to protect clients from servers - Synchronous and asynchronous clients
71 How to protect clients from servers - Circuit breaker
72 How to protect clients from servers - Fail-fast design principle
73 How to protect clients from servers - Bulkhead
74 How to protect clients from servers - Shuffle sharding
75 Epilogue - The end (but not quite)