Advanced Distributed Systems Design Course Summary

I attended Udi Dahan’s Advanced Distributed Systems Design course (see https://particular.net/adsd ) in late February. It is an excellent course - definitely one of those “take the opportunity if you get it” situations.

Perhaps the greatest thing about the course is the exposure to a completely different approach to designing systems than what we’re all taught in school and what we see on a day-to-day basis at many companies. Udi’s take on SOA and DDD has deep implications in how one models the domain, carves up service boundaries, identifies consistency boundaries, etc., even extending into what artifacts go into a git repo, and what services should be named. In all, it’s a dramatic shift in mindset.

For a taste of the shift in mindset I’m describing, watch the first 30 minutes of this talk entitled All Our Aggregates are Wrong: https://youtu.be/MotE7e30jGM. That talk captures many of the essential ideas that Udi goes over in depth in his course.

Overview

The course progressed almost like a persuasive argument, from first principles, to an observation of how much implicit coupling is present in “traditional” domain modeling and in synchronous request/response style systems, to messaging, SOA, CQRS, long running processes, and touching on the organizational support necessary to adopt some of his suggestions.

The one paragraph summary is this: traditional system design implicitly commits you to an enormous amount of unnecessary coupling between software components. The cost of that high implicit coupling is significant and it goes unnoticed for a long time because of all the nice things it affords - familiarity/ease, conformity with industry best practices and formal education, nice tooling, rapid prototyping, straightforward domain and data modeling, implicit organizational buy-in, etc. Traditional system design is how we’ve been conditioned to think, so it’s intuitive. Udi suggests that good architecture is not intuitive, and requires a good deal more analysis and up-front design than we’ve been conditioned to believe is necessary. In practice, that means services should be aligned on business capability boundaries, service data models are aligned on consistency boundaries, and service domain models only encompass those details relevant for the business capability that the service is the technical authority for. Figuring out what those boundaries are is very much an iterative guess-and-test process, and the resulting boundaries generally aren’t accurately captured by a word or phrase as neat and clean as a noun or verb.

I wish I could convey the full richness of all the course content, but there’s just too much. In this summary, I’m going to focus on four main highlights:

  1. Fallacies
  2. Coupling
  3. SOA
  4. Long Running Processes

(1) and (2) capture the motivation for looking for an alternative to traditional design, while (3) and (4) propose an alternative.

Fallacies

Udi identifies a list of common but false assumptions that developers and architects make when designing systems. He attributes these fallacies to three individuals, Peter Deutsch, James Gosling, and Ted Neward, and suggests “these fallacies are kind of like the laws of physics - if you fight them, you will lose.”

The first eight fallacies are taken from Deutsch’s and Gosling’s fallacies of distributed computing (see https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing), and the last three are taken from Neward’s fallacies of enterprise computing (see http://blogs.tedneward.com/post/enterprise-computing-fallacies/).

1. The network is reliable

It’s common for systems to make remote calls to other systems as if those remote calls were local function calls - RPC calls have been abstracted away to appear like a regular function call. This practice is pervasive. We all know they aren’t the same, but we tend to ignore the differences.

How do we handle timeouts?

We know the network is problematic, but sometimes we ignore it.

Sometime we handle network timeouts locally but ignore the wider implications. For example, if we’re making a remote call and handling timeouts while keeping a database transaction open, at some level of load we will cause lock contention in the database as well as database connection exhaustion.

2. Latency isn’t a problem

Latency is the time to cross the network in one direction. Round-trip request/response latency conflates several things: request latency, processing time, response latency.

Our evaluation of the impact of network latency tends to be understated - it is many times slower than in-memory access. Even if a remote call takes 1ms round-trip, that’s in the range of 8,000-10,000 times as long as a memory access takes.

Event Latency Scaled
1 CPU Cycle 0.3 ns 1 s
L1 Cache Access 0.3 ns 3 s
L2 Cache Access 2.8 ns 9 s
L3 Cache Access 12.9 ns 43 s
Main Memory Access (DRAM, from CPU) 120 ns 6 min
SSD I/O 50-150 µs 2-6 days
Rotational Disk I/O 1-10 ms 1-12 months
Internet: San Francisco to New York 40 ms 4 years
Internet: San Francisco to United Kingdom 81 ms 8 years
Internet: San Francisco to Australia 183 ms 19 years
TCP Packet Retransmit 1-3 s 105-317 years
VM Reboot 4 s 423 years
Physical Host Reboot 30 s - 5 minutes 3-32 millennia

Remote objects, lazy loading via ORM, granular RPC calls, etc. assume low latency access. It isn’t sustainable at scale.

3. Bandwidth isn’t a problem

The demand for bandwidth is growing faster than bandwidth capacity.

Lots of data transfer tends to cause network congestion, which in turn causes increased latency, which in turn causes less available bandwidth.

We have much less bandwidth than we intuitively think we have. Depending on a variety of factors, it isn’t uncommon for network/transport protocol metadata to use between 6% and 20% of available network bandwidth. See https://packetpushers.net/tcp-over-ip-bandwidth-overhead/ for further explanation.

The effects of wasted bandwidth, perhaps by an ORM that fetches too much data, exacerbate the congestion issue.

4. The network is secure

The assertion that “the network is secure” is impossible to prove - it’s a matter of belief. It is easy to be lulled into a false sense of security.

“Unknown unknowns” are what bite you.

5. The topology won’t change

Hosts fail and must be replaced.

What happens if the host is provisioned on a different subnet that the host it replaces?

What happens if clients appear, disappear, and reappear in different places on the network?

If server logic ties shared resources to clients, and clients disappear and reappear all the time, the shared resource pool may be exhausted.

6. The administrator/ops-person will know what to do

Bus-factor may be high - what happens if they leave?

Are all admins equally informed about the state of the network and the hosts on the network as they relate to a given system?

What happens when maintenances for patching require system downtime?

Is high availability preserved?

7. Transport cost isn’t a problem

Serialization/deserialization on either side of a network boundary takes time. Depending on cost structure, the cost of serialization/deserialization may be significant.

Intra-site network transfer may cost much less than inter-site transfer - e.g. network transfer within an AWS region vs cross-region transfer.

8. The network is homogeneous

Regarding interop between systems, we rely heavily on integrating over RESTful interfaces.

Have you ever been bitten by serialization/deserialization incompatibilities between systems, e.g. your system can’t read the JSON I sent over, or I can’t read the XML you sent over.

Application layer incompatibilities regarding API semantics also arise, e.g. one system submits a request that the other considers invalid because of a nullability constraint that isn’t clear in the protocol documentation or API contract.

9. The system is atomic

The assertion here is that many systems are logically “big balls of mud” - tight coupling is rampant, numerous dependencies running throughout, little to no modularization, etc.

These non-modular monolithic systems/”big balls of mud” may be deployed as a single monolithic unit or they may be deployed across many machines, perhaps even as “microservices”.

We want to distinguish between a physical monolith and a logical monolith. Many organizations believe they no longer have a monolithic system because they’re “doing microservices”, when in fact their microservices are tightly coupled, each microservice depends on a number of other microservices, individual microservices are not the technical authority for any particular part of the domain, etc. These cases are instances where the organizations have merely physically distributed their monolith, and in many cases replaced in-memory function calls with cross-network RPC calls. The system is still monolithic, it’s just a distributed monolith with more network congestion, weaker consistency guarantees, and dramatically more complicated deployment stories.

Many systems that started out as monolithic designs were never intended to scale out, they can only really scale up.

Scaling out has to be taken into account during the design phase, such that one of the design goals is modularity and internal loose coupling.

System boundaries are frequently - perhaps even usually - wrong. Defining system boundaries is one of the hardest things to get right.

10. The system is finished

If you can’t buy off-the-shelf, then the thing you are building is a product, and it will need a forever commitment. If you can’t commit to lifetime work, buy off-the-shelf.

The cost of development and maintenance of a system only increases. Most changes to a system occur over the long term, after the version one release. The incremental evolution of a system over the long term is where most of the system’s lifetime development cost is; not in the first 3-18 months of greenfield development prior to the version 1.0 release.

The insight here is two-fold: (1) the system is never “finished” and needs a “forever” commitment, and (2) the system you build should be designed so that it is amenable to incremental change over the long term.

11. Business logic can and should be centralized

Don’t repeat yourself (DRY) is an oft-repeated mantra, as is “Re-use is good”. Conventional wisdom suggests duplication of logic is the most painful thing, and ought to be avoided.

Duplication of logic is not the most painful thing. Tooling can help identify code that needs to be changed if the rules change. The most painful thing to deal with is inappropriate coupling.

Inappropriate coupling is much harder to cope with than duplication of logic. Inappropriate centralization of logic frequently implies tight coupling.

Duplication isn’t the worst problem, coupling is.

Coupling

At some threshold, a high degree of coupling suggests that a component doesn’t have a well defined responsibility. Lots of coupling is a smell that may indicate single responsibility is being violated.

Coupling to stable things or volatile things? Lots of coupling to generic libraries (e.g. logging) isn’t necessarily bad because the logging dependency is stable. Lots of coupling to dependencies that capture domain logic is probably bad because domain logic is volatile - it changes much more frequently than generic things like logging.

The number of outgoing dependencies is indicative of how self-contained a component is.

Implied coupling is easy to miss. Sometimes design decisions are made with an eye toward reducing coupling, but mistakenly increase the degree of coupling between components. For example, entity-relationship modeling is one practice that usually implies a great deal of hidden coupling that only becomes apparent after the system is in production and the business requests a change. The reason traditional ER modeling introduces a lot of unnecessary coupling is because the evaluation of any given business rule or policy frequently only depends on one or two attributes from a given entity, however, when the entity is read or updated within a transaction that guarantees consistency within the business rule/policy’s consistency boundary, the entire database record associated with the entity being updated, including all the entity's unrelated attributes, are locked within the transaction. Even if attributes are being updated in different services/components independently, the entities are still under high contention because the entity is shared - no single component really owns the data.

SOA

Udi presents an opinionated view of what a service oriented architectural style should look like.

Udi adopts IBM’s definition of a service, “the technical authority for a specific business capability”. The big idea is that a service fully encapsulates all that is necessary to be the authority for a specific business capability - data, view, business rules/policy - such that the service is autonomous and has well defined boundaries. A big takeaway here is that the service doesn’t represent a noun or a verb. A service isn’t a wrapper around a database table. A service isn’t a wrapper around a function.

Entities

Entities and their relationships with one another are how we were all taught to model things, and it’s what we practice in industry on a daily basis. Tradition and formal education suggest we put all attributes related to a single type of noun in a single domain entity, which usually translates into a database table with those same attributes. Implicit in ER modeling is a high degree of coupling between anything that needs to read or write any of the attributes in the model.

We need to stop making traditional entities representing nouns. Don’t model “order”, “customer”, “product”, etc. Instead of thinking of nouns and their corresponding list of attributes, try to think in terms of individual attributes being tied to an identifier belonging to a particular business entity, such that only the attributes that change together are encapsulated in the same service and stored in the same database.

In a distributed system, the model is distributed. Entity properties are distributed. This should be the big aha moment, as it flips traditional modeling on its head. It means that a distributed system that implements an ordering system may not have an order model at all - there will be an order identifier, but perhaps no centralized order model. It’s counterintuitive because we’ve studied the traditional style in school and we’ve been practicing it in industry for years - it’s how we’ve been conditioned to think. Third normal form is not the holy grail.

The big takeaway is that we should design our models around transaction (consistency) boundaries. What attributes need to change together in the same transaction? This is the thing that requires a lot of careful consideration. Carefully analyzing the problem domain and figuring out consistency boundaries is the hard part - it is time consuming, requires domain experts in the room at design time, and it is an iterative guess and test process. That is the hard pill to swallow.

Services

A service may consist of multiple applications (e.g. a backend web app, an android app, an IOS app, and a javascript client component), but should all reside within a single repository. A service will capture a full vertical slice - data, domain logic, view/presentation.

Services should encapsulate associated sets of entity attributes.

A workflow is a composition of services. Within a workflow, an entity identifier may be threaded through the services, and each service maintains the relevant state associated with the entity identifier.

Services boundaries are determined by the consistency boundaries discovered at design time, as mentioned in the previous section, “Entities”.

Design

Moving away from traditional entity-relationship modeling is the hardest thing.

Analysis and design is a much more involved process than most realize. In general, development teams don’t spend sufficient time and effort on analysis and design. System design is usually done without a domain expert in the room. A domain expert should be in the room during system design.

In order to identify the high level system boundaries, you have to go deep to figure out what changes together - attribute by attribute. High level system design requires much greater depth - you have to do a deep dive into the domain. That is why domain experts must be on hand during high level system design. Development teams need to be able to ask domain experts whether one attribute would change with another attribute, or if one attribute has any bearing on the value of another attribute. Domain experts know those nuanced characteristics of the domain that development teams may ignore or improperly assume.

Engineering organizations need to be more assertive that we need a domain expert in design meetings. Domain experts can give guidance on future elements that would inform design.

System design is like diamond cutting - you have to analyze the problem domain and look at it from every angle in order to figure out how the diamond wants to be cut up.

Traditional design is “easy” because we arbitrarily define system boundaries around sets of related entities - related nouns.

Traditional design works best for monolithic systems because the entire domain model is in one system - all data is in one system boundary. Service design requires much more careful analysis and design of the data model.

How to find boundaries

System boundaries are more discovered than defined. Discovery is an iterative trial and error process.

The pain of this discovery process is less pain than building a traditional system, putting it into production, encountering foundational problems, and then trying to work around them with added complexity, e.g. distributed locking, trying to synchronize transaction commits across multiple applications in order to maintain system consistency, or performing complex workflows serially in order to retain system-wide consistency guarantees, etc.

Heuristics:

  • As we group data together, we do so with respect to a use case or a business process we are modeling.
  • If you need to do request-response across services, your system boundaries are wrong!
  • Words ending in “-ing” usually indicate a process (e.g. billing, shipping, etc.) made up of a series of steps that multiple services take part in.

Do not focus on entities. Focus on attributes.

Break existing entities apart:

  • which attributes influence one another?
  • which attributes are related?
  • which attributes change together in a transaction?
  • which attributes depend on one another?

Entity services are a bad idea.

Domain data should not be copied across services.

If you see a bunch of domain data being passed back and forth among services, either via RPC or by pub/sub, that is a red flag that your service boundaries are wrong!

Org structure doesn’t indicate service ownership.

Adopting SOA Incrementally

The decision to transition to a service oriented architectural style is a costly one. There is no easy way to just switch from whatever you have today to SOA.

The transition to SOA is going to be an incremental process over a long time.

Don’t try to “switch to SOA” with a big bang rewrite. Rewrites usually fail.

SOA adoption should be done in a phased approach. Each phase imposes new requirements in the areas of ops/infrastructure, development, and organizational maturity.

The entire organization has to be brought along on each phase; individual teams can’t effectively jump ahead to phases that the org isn’t ready for.

Phase 1

Ops/infra: continuous integration (CI)

Dev: unit tests; emit events from existing codebases; extract generic behaviors into their own services, e.g. email, pdf report generation, etc.

Org: celebrate wins publicly within org; lunch and learn; “skunk works”; invest 10% budget on building proper SOA, continue business as usual other 90% of time; borrow domain expert(s) as available during design meetings

Duration: 6-12 months

Phase 2

Continue practices from phase 1, plus the following:

Ops/infra: continuous delivery (CD)

Dev: pub/sub

Org: celebrate wins publicly within org; training; expand “skunk works”; expand to 20% budget on building proper SOA, continue business as usual other 80% of time; discuss UI composition; bring domain expert to design sessions more regularly as they’re available

Duration: 1-2 years

Phase 3

Continue practices from phase 2, plus the following:

Ops/infra: monitoring, HA

Dev: UI composition, rip-replace-stabilize-repeat

Org: celebrate wins publicly within org; “skunk works” transitions to accepted “new way”

Duration: 18 months to 3 years

Phase 4

Continue practices from phase 3, plus the following:

Ops/infra: Data migration

Dev: Data migration

Org: celebrate wins publicly within org; re-org

Duration: 2 years+

Long Running Processes

“A process is a set of activities that are performed in a certain sequence as a result of internet and external triggers.”

“A long running process is a process whose execution lifetime exceeds the time to process a single external event or message”

Long running processes can be likened to state machines, as they maintain their own state, and a single instance will handle multiple external events/triggers.

Long running processes can be considered a Domain Model, as described in the PoEAA book: "Domain Model mingles data and process…" to be used "if you have complicated and everchanging business rules involving validation, calculations, and derivations…"

Saga Pattern

The Saga pattern is one way to implement complicated business processes.

The Saga pattern as it was originally defined in 1987 implies that actions must have compensating actions, but Udi’s take on Sagas ignores that requirement.

A Saga is a multi-trigger process with state:

[trigger 1] → {process phase 1} → [save interim process state] → (time elapses while the process is otherwise idle/sleeping) → [trigger 2] → {process phase 2} → [save interim process state] → … → [trigger N] → {process phase N} → [save interim process state]

Each trigger/phase/save block is one transaction.

A saga sits within a service boundary.

Saga state constitutes an aggregate root.

Domain driven sagas encapsulate business policy, while services, which are broader in scope, encapsulate business capability. In practice, this means Sagas can share the same database used by the service. Multiple policies are modeled on the same data encapsulated by the common service that the sagas reside within.

Domain driven sagas are microservices done right. The microservices currently in vogue are not SOA style services at all - they don’t encapsulate a business capability. Entity services are a good example of an anti-pattern.