I foundations of data systems
1 reliable, scalable, and maintainable applications
1.1 thinking about data systems
1.2 reliability
1.3 scalability
1.4 maintainability
2 data models and query languages
2.1 relational model vs document model
2.2 query languages for data
2.3 graph-like data models
3 storage and retrieval
3.1 data structures that power your database
3.2 transaction processing or analytics?
3.3 column-oriented storage
4 encoding and evolution
4.1 formats for encoding data
4.2 modes of dataflow
II distributed data
5 replication
5.1 leaders and followers
5.2 problems with replication lag
5.3 multi-leader replication
5.4 leaderless replication
6 partitioning
6.1 partitioning and replication
6.2 partitioning of key-value data
6.3 partitioning and secondary indexes
6.4 rebalancing partitions
6.5 request routing
7 transactions
7.1 the slippery concept of a transaction
7.2 weak isolation levels
7.3 serializability
8 the trouble with distributed systems
8.1 faults and partial failures
8.2 unreliable networks
8.3 unreliable clocks
8.4 knowledge, truth and lies
9 consistency and consensus
9.1 consistency guarantees
9.2 linearizability
9.3 ordering guarantees
9.4 distributed transactions and consensus
III derived data
10 batch processing
10.1 batch processing with unix tools
10.2 mapreduce and distributed filesystems
10.3 beyond mapreduce
11 stream processing
11.1 transmitting event streams
11.2 database and streams
11.3 processing streams
12 the future of data systems
12.1 data integration
12.2 unbundling databases
12.3 aiming for correctness
12.4 doing the right thing
I computing in the cloud
1 introduction
2 the way of the cloud
3 client perspective
4 network perspective
5 the structure of cloud data centers
6 remote procedure calls and the client/server model
7 corba: the common object request broker architecture
8 system support for fast client/server communication
II reliable distributed computing
9 how and why computer systems fail
10 overcoming failures in a distributed system
11 dynamic membership
12 group communication systems
13 point to point and multi-group considerations
14 the virtual synchrony execution model
15 consistency in distributed systems
III applications of reliability techniques
16 retrofitting reliability into complex systems
17 software architectures for group communication
IV related technologies
18 security options for distributed settings
19 clock synchronization and synchronization systems
20 transactional systems
21 peer-to-peer systems and probabilistic protocols
22 appendix A: virtualy synchronous methodology for building dynamic reliable services
23 appendix B: isis API
1 introduction
1.1 what is a protocol?
1.2 protocols as processes
1.3 techniques for actual proofs
1.4 real protocols
1.5 readers's guide
2 CSP descriptions and proof rules
2.1 processes and process synchronization
2.2 channel history semantics
2.3 failure semantics
3 protocols and services
3.1 providing a service
3.2 service features
3.3 OSI and other layered architectures
4 basic protocol mechanisms
4.1 sequence control and error control
4.2 flow control
4.3 indication of change of peer state
4.4 change of service mode
4.5 multiplexing and splitting
4.6 segmentation and reassembly
4.7 prioritisation
5 multi-peer consensus
5.1 reliable consensus
5.2 election
5.3 commitment
5.4 byzantine agreement
5.5 clock synchronization
5.6 finding the global state
6 security
6.1 cryptographic methods
6.2 integrity
6.3 digital signatures
6.4 entity authentication
6.5 key exchange
6.6 non-cryptographic methods
7 naming addressing and routing
7.1 general principles of naming and addressing
7.2 addressing structures
7.3 routing
7.4 congestion
8 protocol encoding
8.1 simple binary encoding
8.2 TLV encoding
8.3 ASN.1 encoding
8.4 ASCII encodings
9 protocols in the OSI lower layers
9.1 data link layer
9.2 network layer
9.3 transport layer
10 application support protocols
10.1 session layer
10.2 presentation layer
10.3 application layer
10.4 basic application service elements
10.5 commitment, concurrency and recovery
10.6 client-server systems
10.7 security middleware
11 application protocols
11.1 file transfer
11.2 distributed transaction processing
11.3 message handling
11.4 hypertext and the world wide web
11.5 web services
A notation
A.1 data types and variables
A.2 data values and expressions
A.3 processes and process expressions
A.4 traces, failures, and transitions
A.5 inference rules for process specifications
A.6 security
B standarization
B.1 standards organizations
B.2 standards documents
1 introduction
1.1 motivation
1.2 distributed programming abstractions
1.3 the end-to-end argument
1.4 software components
1.5 classes of algorithms
2 basic abstractions
2.1 distributed computation
2.2 abstracting processes
2.3 cryptographics abstractions
2.4 abstracting communication
2.5 timing assumptions
2.6 abstracting time
2.7 distributed system models
3 reliable broadcast
3.1 motivation
3.2 best-effort broadcast
3.3 regular reliable broadcast
3.4 uniform reliable broadcast
3.5 stubborn broadcast
3.6 logged best-effort broadcast
3.7 logged uniform reliable broadcast
3.8 probabilistic broadcast
3.9 FIFO and causal broadcast
3.10 byzantine consistent broadcast
3.11 byzantine reliable broadcast
3.12 byzantine broadcast channels
4 shared memory
4.1 introduction
4.2 (1,N) regular register
4.3 (1,N) atomic register
4.4 (N,N) atomic register
4.5 (1,N) logged regular register
4.6 (1,N) byzantine safe register
4.7 (1,N) byzantine regular register
4.8 (1,N) byzantine atomic register
5 consensus
5.1 regular consensus
5.2 uniform consensus
5.3 uniform consensus in the fail-noisy model
5.4 logged consensus
5.5 randomized consensus
5.6 byzantine consensus
5.7 byzantine randomized consensus
6 consensus variants
6.1 total-order broadcast
6.2 byzantine total order broadcast
6.3 terminating reliable broadcast
6.4 fast consensus
6.5 fast byzantine consensus
6.6 nonblocking atomic commit
6.7 group membership
6.8 view-synchronous communication
7 concluding remarks
7.1 implementation in appia
7.2 further implementations
7.3 further reading
1 introduction
1.1 definition of distributed system
1.2 goals
1.3 types of distributed systems
2 architectures
2.1 architectural styles
2.2 system architectures
2.3 architectures versus middleware
2.4 self-management in distributed systems
3 processes
3.1 threads
3.2 virtualization
3.3 clients
3.4 servers
3.5 code migration
4 communication
4.1 fundamentals
4.2 remote procedure call
4.3 message-oriented communication
4.4 stream-oriented communication
4.5 multicast communication
5 naming
5.1 names, identifiers and addresses
5.2 flat naming
5.3 structured naming
5.4 attribute-based naming
6 synchronization
6.1 clock synchronization
6.2 logical clocks
6.3 mutual exlusion
6.4 global positioning of nodes
6.5 election algorithms
7 consistency and replication
7.1 introduction
7.2 data-centric consistency models
7.3 client-centric consistency models
7.4 replica management
7.5 consistency protocols
8 fault tolerance
8.1 introduction to fault tolerance
8.2 process resilience
8.3 reliable client-server communication
8.4 reliable group communication
8.5 distributed commit
8.6 recovery
9 security
9.1 introduction to security
9.2 secure channels
9.3 access control
9.4 security management
10 distributed object-based systems
10.1 architecture
10.2 processes
10.3 communication
10.4 naming
10.5 synchronization
10.6 consistency and replication
10.7 fault tolerance
10.8 security
11 distributed file systems
11.1 architecture
11.2 processes
11.3 communication
11.4 naming
11.5 synchronization
11.6 consistency and replication
11.7 fault tolerance
11.8 security
12 distributed web-based systems
12.1 architecture
12.2 processes
12.3 communication
12.4 naming
12.5 synchronization
12.6 consistency and replication
12.7 fault tolerance
12.8 security
13 distributed coordination-based systems
13.1 introduction to coordination models
13.2 architectures
13.3 processes
13.4 communication
13.5 naming
13.6 synchronization
13.7 consistency and replication
13.8 fault tolerance
13.9 security
14 suggestions for further reading and bibliography
1 characterization of distributed systems
1.1 introduction
1.2 examples of distributed systems
1.3 trends in distributed systems
1.4 focus on resource sharing
1.5 challenges
1.6 case study: the world wide web
2 system models
2.1 introduction
2.2 physical models
2.3 architectural models
2.4 fundamental models
3 networking and internetworking
3.1 introduction
3.2 types of network
3.3 network principles
3.4 internet protocols
3.5 case studies: ethernet, wifi and bluetooth
4 interprocess communication
4.1 introduction
4.2 the api for the internet protocols
4.3 external data representation and marshalling
4.4 multicast communication
4.5 network virtualization: overlay networks
4.6 case study: MPI
5 remote invocation
5.1 introduction
5.2 request-reply protocols
5.3 remote procedure call
5.4 remote procedure invocation
5.5 case study: java RMI
6 indirect communication
6.1 introduction
6.2 group communication
6.3 publish-subscribe systems
6.4 message queues
6.5 shared memory approaches
7 operating systems support
7.1 introduction
7.2 the operating system layer
7.3 protection
7.4 processes and threads
7.5 communication and invocation
7.6 operating system architecture
7.7 virtualization at the operating system level
8 distributed objects and components
8.1 introduction
8.2 distributed objects
8.3 case study: CORBA
8.4 from objects to components
8.5 case studies: enterprise javabeans and fractal
9 web services
9.1 introduction
9.2 web services
9.3 service descriptions and IDL for web services
9.4 a directory service for use with web services
9.5 XML security
9.6 coordination of web services
9.7 applications of web services
10 peer-to-peer systems
10.1 introduction
10.2 napster and its legacy
10.3 peer-to-peer middleware
10.4 routing overlays
10.5 overlay case studies: pastry, tapestry
10.6 application case studies: squirrel, oceanstore, ivy
11 security
11.1 introduction
11.2 overview of security techniques
11.3 digital signatures
11.4 cryptography pragmatics
11.6 case studies: needham-schroeder, kerberos, tls, 802.11 wifi
12 distributed file systems
12.1 introduction
12.2 file service architecture
12.3 case study: sun network file system
12.4 case study: the andrew file system
12.5 enhancements and further developments
13 name services
13.1 introduction
13.2 name services and the domain name system
13.3 directory services
13.5 case study: the X.500 directory service
14 time and global states
14.1 introduction
14.2 clocks, events and process states
14.3 synchronizing physical clocks
14.4 logical time and logical clocks
14.5 global states
14.6 distributed debugging
15 coordination and agreement
15.1 introduction
15.2 distributed mutual exclusion
15.3 elections
15.4 coordination and agreement in group communication
15.5 consensus and related problems
1 introduction to distributed systems
1.1 what is a distributed system?
1.2 goals
1.3 hardware concepts
1.4 software concepts
1.5 design issues
2 communication in distributed systems
2.1 layered protocols
2.2 asynchronous transfer mode networks
2.3 the client-server model
2.4 remote procedure call
2.5 group communication
3 synchronization in distributed systems
3.1 clock synchronization
3.2 mutual exclusion
3.3 election algorithms
3.4 atomic transactions
3.5 deadlocks in distributed systems
4 processes and processors in distributed systems
4.1 threads
4.2 system models
4.3 processor allocation
4.4 scheduling in distributed systems
4.5 fault tolerance
4.6 real-time distributed systems
5 distributed file systems
5.1 distributed file system design
5.2 distributed file system implementation
5.3 trends in distributed file systems
6 distributed shared memory
6.1 introduction
6.2 what is shared memory?
6.3 consistency models
6.4 paged-based distributed shared memory
6.5 shared-variable distributed shared memory
6.6 object-based distributed shared memory
6.7 comparison
7 case study 1: amoeba
7.1 introduction to amoeba
7.2 objects and capabilities in amoeba
7.3 process management in amoeba
7.4 memory management in amoeba
7.5 communication in amoeba
7.6 the amoeba servers
8 case study 2: mach
8.1 introduction to mach
8.2 process management in mach
8.3 memory management in mach
8.4 communication in mach
8.5 unix emulation in mach
9 case study 3: chorus
9.1 introduction to chorus
9.2 process management in chorus
9.3 memory management in chorus
9.4 communication in chorus
9.5 unix emulation in chorus
9.6 cool: an object oriented subsystem
9.7 comparison of amoeba, mach and chorus
10 case study 4: DCE
10.1 introduction to DCE
10.2 threads
10.3 remote procedure call
10.4 time service
10.5 directory service
10.6 security service
10.7 distributed file system
1 formal aspects of concurrent systems
1.1 a formal basis for the specification of concurrent systems
1.2 on the construction of distributed programs
1.3 derivation of distributed algorithms
2 design issues for distributed systems
2.1 design of highly decentralised operating systems
2.2 communication models for distributed computation
2.3 new concepts for distributed system structuring
3 hardware support for distributed computing systems
3.1 distributed computing system architectures: hardware
3.2 hardware support for the distributed operating system of the heidelberg polyp processor
4 case studies
4.1 the apollo domain distributed file system
4.2 the chorus distributed operating system: some design issues
4.3 the conic support environment for distributed systems
4.4 an experience in solving a transactions ordering problem in a distributed system
4.5 distributed transaction processing and the camelot system
4.6 work programs
I synchronous network algorithms
1 introduction
2 modelling 1: synchronous network model
3 leader election in a synchronous ring
4 algorithms in general synchronous networks
5 distributed consensus with link failures
6 distributed consensus with process failures
7 more consensus problems
II asynchronous algorithms
8 modelling 2: asynchronous system model
IIA asynchronous shared memory algorithms
9 modelling 3: asynchronous shared memory model
10 mutual exclusion
11 resource allocation
12 consensus
13 atomic objects
IIB asynchronous network algorithms
14 modelling 4: asynchronous network model
15 basic asynchronous network algorithms
16 synchronizers
17 shared memory versus networks
18 logical time
19 global snapshots and stable properties
20 network resource allocation
21 asynchronous network with process failures
22 data link protocols
III partially synchronous algorithms
23 partially synchronous system models
24 mutual exlusion with partial synchrony
25 consensus with partial synchrony
I distributed graph algorithms
1 basic definitions and network traversal algorithms
2 dsitributed graph algorithms
3 an algorithmic framework to compute global functions on a process graph
4 leader election algorithms
5 mobile objects navigating a network
II logical time and global states in distributed systems
6 nature of distributed computations and the concept of a global state
7 logical time in asynchronous distributed systems
8 asynchronous distributed checkpointing
9 simulating synchrony on top of asynchronous systems
III mutual exclusion and resource allocation
10 permission-based mutual exclusion algorithms
11 distributed resource allocation
IV high-level communication abstractions
12 order constraints on message delivery
13 rendezvous (synchronous) communication
V detection properties on distributed executions
14 distributed termination detection
15 distributed deadlock detection
VI distributed shared memory
16 atomic consistency (linearizability)
17 sequential consistency
18 afterword
I background materials
1 introduction
1.1 what is a distributed system
1.2 why distributed systems
1.3 examples of distributed systems
1.4 important issues in distributed systems
1.5 common subproblems
1.6 implementing a distributed system
1.7 parallel versus distributed systems
2 interprocess communication: an overview
2.1 introduction
2.2 network protocols
2.3 naming
2.4 remote procedure call
2.5 remote method invocation
2.6 messages
2.7 web services
2.8 event notification
2.9 virtualization: cloud computing
2.10 mobile agents
2.11 basic group communication services
2.12 concluding remarks
II foundational topics
3 models for communication
3.1 need for a model
3.2 message-passing model for interprocess communication
3.3 shared variables
3.4 modeling mobile agents
3.5 relationship among models
3.6 classification based on special properties
3.7 complexity measures
3.8 concluding remarks
4 representing distributed algorithms: syntax and semantics
4.1 introduction
4.2 guarded actions
4.3 nondeterminism
4.4 atomic operations
4.5 fairness
4.6 central versus distributed schedulers
4.7 concluding remarks
5 program correctness
5.1 introduction
5.2 correctness criteria
5.3 correctness proofs
5.4 assertional reasoning: proving safety priorities
5.5 proving liveness properties using well-founded sets
5.6 programming logic
5.7 predicate transformers
5.8 concluding remarks
6 time in a distributed system
6.1 introduction
6.2 logical clocks
6.3 vector clocks
6.4 physical clock synchronization
6.5 concluding remarks
III important paradigms
7 mutual exclusion
7.1 introduction
7.2 solutions on message-passing systems
7.3 token-passing algorithms
7.4 solutions on the shared-memory model
7.5 mutual exclusion using special instructions
7.6 group mutual exclusion
7.7 concluding remarks
8 distributed snapshot
8.1 introduction
8.2 properties of consisten snapshots
8.3 chandy-lamport algorithm
8.4 lai-yang algorithm
8.5 distributed debugging
8.6 concluding remarks
9 global state collection
9.1 introduction
9.2 elementary algorithm for all-to-all broadcasting
9.3 termination-detection algorithms
9.4 wave algorithms
9.5 distributed deadlock detection
9.6 concluding remarks
10 graph algorithms
10.1 introduction
10.2 routing algorithms
10.3 graph traversal
10.4 graph coloring
10.5 cole-vishkin reduction algorithm for tree coloring
10.6 maximal independent set: luby's algorithm
10.7 concluding remarks
11 coordination algorithms
11.1 introduction
11.2 leader election
11.3 synchronizers
11.4 concluding remarks
IV faults and fault-tolerant systems
12 fault-tolerant systems
12.1 introduction
12.2 classification of faults
12.3 specification of faults
12.4 fault-tolerant systems
12.5 tolerating crash failures
12.6 tolerating omission failures
12.7 concluding remarks
13 distributed consensus
13.1 introduction
13.2 consensus in asynchronous systems
13.3 consensus in synchronous systems: byzantine generals problem
13.4 paxos algorithm
13.5 failure detectors
13.6 concluding remarks
14 distributed transactions
14.1 introduction
14.2 classification of transactions
14.3 implementing transactions
14.4 concurrency control and serializability
14.5 atomic commit protocols
14.6 recovery from failures
14.7 conclusing remarks
15 group communication
15.1 introduction
15.2 atomic multicast
15.3 ip multicast
15.4 application layer multicast
15.5 ordered multicasts
15.6 reliable multicast
15.7 open groups
15.8 overview of transis
15.9 concluding remarks
16 replicated data management
16.1 introduction
16.2 architecture of replicated data management
16.3 data-centric consistency tools
16.4 client-centric consistency protocols
16.5 implementation of data-centric consistency models
16.6 quorum-based protcols
16.7 replica placement
16.8 brewer's cap theorem
16.9 case studies
16.10 concluding remarks
17 self-stabilizing systems
17.1 introduction
17.2 theoretical foundation
17.3 stabilizing mutual exclusion
17.4 stabilizing graph coloring
17.5 stabilizing spanning tree protocol
17.6 stabilizing maximal matching
17.7 distributed reset
17.8 stabilizing clock phase synchronization
17.9 concluding remarks
V real-world issues
18 distributed discrete-event simulation
18.1 introduction
18.2 distributed simulation
18.3 conservative simulation
18.4 optimistic simulation and time warp
18.5 concluding remarks
19 security in distributed systems
19.1 introduction
19.2 security mechanisms
19.3 common security attacks
19.4 encryption
19.5 secret key cryptosystem
19.6 public key cryptosystems
19.7 digital signatures
19.8 hashing algorithms
19.9 elliptic curve cryptography
19.10 authentication server
19.11 digital certificates
19.12 case studies
19.13 virtual private networks and firewalls
19.14 sharing a secret
19.15 concluding remarks
20 sensor networks
20.1 vision
20.2 architecture of sensor nodes
20.3 challenges in wireless sensor networks
20.4 routing algorithms
20.5 time synchronization using reference broadcast
20.6 localization algorithms
20.7 security in sensor networks
20.8 applications
20.9 concluding remarks
21 social and peer-to-peer networks
21.1 introduction to social networks
21.2 metrics of social networks
21.3 modeling socail networks
21.4 centrality measures in social networks
21.5 community detection
21.6 introduction to peer-to-peer networks
21.7 first-generation p2p systems
21.8 second-generation p2p systems
21.9 koorde and de bruijn graph
21.10 skip graph
21.11 replication management
21.12 bittorrent and free riding
21.13 censorship resistance, anonimity
21.14 concluding remarks
I message passing
1 introduction
2 preliminaries
3 snapshots
4 waves
5 deadlock detection
6 terminatino detection
7 garbage collection
8 routing
9 election
10 anonymous networks
11 synchronous networks
12 crash failures
13 byzantine failures
14 mutual exlusion
II shared memory
15 preliminaries
16 mutual exclusion II
17 barriers
18 self-stabilization
19 online scheduling
I message passing
1 introduction
2 model
3 coordinated attack
4 broadcast and convergecast
5 distributed breadth-first search
6 leader election
7 synchronous agreement
8 byzantine agreement
9 impossibility of asynchronous agreement
10 paxos
11 failure detectors
12 logical clocks
13 synchronizers
14 quorum systems
II shared memory
15 model
16 distributed shared memory
17 mutual exclusion
18 the wait-free hiararchy
19 atomic snapshots
20 lower bounds on perturbable objects
21 restricted-use objects
22 common2
23 randomized consensus and test-and-set
24 renaming
25 software transactional memory
26 obstruction-freedom
27 BG simulation
28 topological methods
29 approximate agreement
1 INTRODUCTION
Controlling shared state is the only way to build reliable scalable systems.
State shared by multiple units of computation limits scalability due to high synchronisation and communication costs.
Moreover, shared state is a threat for reliability as failures corrupting or permanently locking shared state may poison the entire system.
To facilitate the development of scalable Erlang systems and make them maintainable, we have developed three new tools—Devo, SDMon, and WombatOAM—and enhanced two others—the visualisation tool Percept and the refactorer Wrangler.
2 CONTEXT
2.1 Scalable Reliable Programming Models
2.2 Actor Languages
2.3 Erlang’s Support for Concurrency
2.4 Scalability and Reliability in Erlang Systems
2.5 ETS: Erlang Term Storage
3 BENCHMARKS FOR SCALABILITY AND RELIABILITY
3.1 Orbit
3.2 Ant Colony Optimisation (ACO)
4 ERLANG SCALABILITY LIMITS
4.1 Scaling Erlang on a Single Host
4.2 Distributed Erlang Scalability
4.3 Persistent Storage
5 IMPROVING LANGUAGE SCALABILITY
5.1 SD Erlang Design
5.2 S_group Semantics
5.3 Semantics Validation
6 IMPROVING VM SCALABILITY
6.1 Improvements to Erlang Term Storage
6.2 Improvements to Schedulers
6.3 Improvements to Time Management
7 SCALABLE TOOLS
7.1 Refactoring for Scalability
7.2 Scalable Deployment
7.3 Monitoring and Visualisation
8 SYSTEMIC EVALUATION
8.1 Orbit
8.2 Ant Colony Optimisation (ACO)
8.3 Evaluation Summary
9 DISCUSSION
APPENDIX: ARCHITECTURE SPECIFICATIONS