Friday, May 12, 2017

Reading Note on DDD (Domain-Driven Design) - Aggregates



The Aggregate is the DDD concept that draws the most attention in the Microservice community, because it helps answer two questions: how big should a Microservice be, and what belongs in one Microservice rather than another? The common recommendation is to put one Aggregate in one Microservice.

There are some terms defined in the book that we should be familiar with. 

Entity: an entity is an object that keeps the same identity throughout its lifecycle, for example Customer or Employee.

Value: a value object is an object that has no identity. For example, the Address of a Customer is composed of Country, City, and Street, and has no meaningful identity of its own. In a mapping application, however, an Address might be modeled as an Entity.
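The difference between the two usually shows up in how equality is defined. The following is a minimal sketch, not code from the book: the field names (customerId, country, city, street) are assumptions made for illustration. The Entity compares by identity only, while the Value compares by all of its attributes.

```java
import java.util.Objects;

// Entity: two Customers are the same if they share an identity,
// even when their other attributes differ.
final class Customer {
    private final String customerId; // hypothetical identity field
    private String name;

    Customer(String customerId, String name) {
        this.customerId = customerId;
        this.name = name;
    }

    void rename(String newName) { this.name = newName; }

    String getName() { return name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Customer
                && ((Customer) o).customerId.equals(customerId);
    }

    @Override
    public int hashCode() { return customerId.hashCode(); }
}

// Value object: immutable, and equal whenever all attributes are equal.
final class Address {
    private final String country;
    private final String city;
    private final String street;

    Address(String country, String city, String street) {
        this.country = country;
        this.city = city;
        this.street = street;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Address)) return false;
        Address a = (Address) o;
        return country.equals(a.country)
                && city.equals(a.city)
                && street.equals(a.street);
    }

    @Override
    public int hashCode() { return Objects.hash(country, city, street); }
}
```

With these definitions, a Customer that has been renamed is still the same Customer, while two Address objects built from the same Country, City, and Street are interchangeable.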

Aggregate: An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes. Each AGGREGATE has a root and a boundary. The boundary defines what is inside the AGGREGATE. The root is a single, specific ENTITY contained in the AGGREGATE. The root is the only member of the AGGREGATE that outside objects are allowed to hold references to, although objects within the boundary may hold references to each other. ENTITIES other than the root have local identity, but that identity needs to be distinguishable only within the AGGREGATE, because no outside object can ever see it out of the context of the root ENTITY.

An Aggregate needs to follow these rules:
  • The root ENTITY has global identity and is ultimately responsible for checking invariants.
  •  ENTITIES inside the boundary have local identity, unique only within the AGGREGATE.
  •  Nothing outside the AGGREGATE boundary can hold a reference to anything inside, except to the root ENTITY.
  •  Only AGGREGATE roots can be obtained directly with database queries. All other objects must be found by traversal of associations.
  •  Objects within the AGGREGATE can hold references to other AGGREGATE roots.
  •  A delete operation must remove everything within the AGGREGATE boundary at once.
  • When a change to any object within the AGGREGATE boundary is committed, all invariants of the whole AGGREGATE must be satisfied.
  • When a change spans across Aggregate boundaries, invariants spanning across boundaries might not be enforced at all times.


If we implement an Aggregate as a Microservice, the last two rules carry over directly: transactions that span Microservices are not expected to be consistent at all times, and the usual recommendation is “eventual consistency” (see The devil is in the details – eventual consistency).

Take the following example:


A Car is obviously an Entity with a global identity. Outside of the Car context, we probably don’t care about a Tire: no one would query the database to find out which Car a particular Tire is attached to. Our interest in a Tire is through its Car, so Tire identity is meaningful only within the Car context. If you need to track an Engine independently of a Car, then Engine sits outside of the Car context.
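As a rough sketch of how the Car Aggregate could look in code, the block below makes Car the root and identifies each Tire only by its position on the Car. The names WheelPosition, mountTire, and tireAt, the use of a VIN as the global identity, and the "all tires share one diameter" invariant are all assumptions chosen for illustration, not something the book prescribes.

```java
import java.util.EnumMap;
import java.util.Map;

// Local identity: a position is unique only within one Car.
enum WheelPosition { FRONT_LEFT, FRONT_RIGHT, REAR_LEFT, REAR_RIGHT }

// Tire lives inside the Car boundary and has no global identity here.
final class Tire {
    private final int diameterInches; // hypothetical attribute

    Tire(int diameterInches) { this.diameterInches = diameterInches; }

    int getDiameterInches() { return diameterInches; }
}

// Car is the Aggregate root: the only object outsiders hold a reference to.
final class Car {
    private final String vin; // global identity (assumed to be the VIN)
    private final Map<WheelPosition, Tire> tires =
            new EnumMap<>(WheelPosition.class);

    Car(String vin) { this.vin = vin; }

    String getVin() { return vin; }

    // Every change goes through the root, so the root can check invariants.
    void mountTire(WheelPosition position, Tire tire) {
        for (Tire mounted : tires.values()) {
            if (mounted.getDiameterInches() != tire.getDiameterInches()) {
                throw new IllegalArgumentException(
                        "all tires must have the same diameter");
            }
        }
        tires.put(position, tire);
    }

    // Tires are reached by traversal from the root, never queried directly.
    Tire tireAt(WheelPosition position) { return tires.get(position); }

    int tireCount() { return tires.size(); }
}
```

Because Tire is immutable in this sketch, handing one out from tireAt() does not let a caller bypass the root's invariant check.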

Repository: A repository functions like a DAO: it saves objects to and restores them from storage (usually a database). In DDD, however, you should only provide a Repository for the Aggregate root. In the Car context, there will only be a CarRepository, whose getCarById() method returns a fully assembled Car, for example a Car with 4 wheels in the correct positions and 4 tires. There will be no TireRepository or WheelRepository.
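A minimal in-memory sketch of that idea follows. Only getCarById() is named in the text above; the save() method, the Optional return type, the InMemoryCarRepository class, and the simplified Car (which records tires as plain position strings) are assumptions made to keep the example short. A real implementation would sit on top of a database.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Simplified stand-in for the Car Aggregate root (hypothetical fields).
final class Car {
    private final String id;
    private final List<String> tirePositions;

    Car(String id, List<String> tirePositions) {
        this.id = id;
        this.tirePositions = new ArrayList<>(tirePositions);
    }

    String getId() { return id; }

    // Read-only view: changes must go through the root.
    List<String> getTirePositions() {
        return Collections.unmodifiableList(tirePositions);
    }
}

// The only Repository in the Car context: there is no TireRepository.
interface CarRepository {
    void save(Car car);

    Optional<Car> getCarById(String id);
}

// In-memory stand-in for a database-backed implementation.
final class InMemoryCarRepository implements CarRepository {
    private final Map<String, Car> store = new HashMap<>();

    @Override
    public void save(Car car) { store.put(car.getId(), car); }

    @Override
    public Optional<Car> getCarById(String id) {
        return Optional.ofNullable(store.get(id));
    }
}
```

The whole Aggregate is saved and loaded as one unit, so a caller who wants a tire first loads the Car and then traverses to it.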

The book provides a more extended example in Chapter 7 (Using the language: An Extended Example). The example is about a delivery company delivering cargoes for customers:



The business logic is:
  •  Multiple Customers are involved in a Cargo with different roles, shipper, receiver, payer etc.
  •  A Delivery Specification defines the goal of shipping a Cargo.
  •  A Handling Event is an action taken with Cargo, such as loading it to a ship, or clearing it through customs.
  •  A Carrier Movement represents a trip by a Carrier (a ship or a truck) from one Location to another. 
  •  A Delivery History represents what has happened to a Cargo.
With this model, how should we define boundaries (and Aggregates)?

Cargo, Delivery Specification, and Delivery History obviously belong together: Delivery Specification and Delivery History have no interesting identity of their own, and our interest in them is through Cargo.

Customers, Locations, and Carrier Movements stand on their own with global identities: we want to look them up directly, not through a Cargo. So each of them will be the root of its own Aggregate.

Handling Event is a tricky one. It can be included in the Cargo context; if so, we will reach it through Cargo -> Delivery History -> Handling Event. (Remember, we can only query the Aggregate root directly; other objects must be reached by traversing associations.) On the other hand, a Carrier Movement is shared by many Cargoes, and for a particular Carrier Movement we might want to know which Handling Events must be carried out. In this sense, Handling Event is meaningful outside of the Cargo context. Handling Events can also happen outside of a Carrier Movement, for example when clearing a cargo through customs.

If it is hard for you to decide, think about how this application will be used:
  • There should be a “booking function” that allows customers to book a cargo delivery, specify a delivery specification, and track the delivery history. The Cargo context will be used mostly here.
  • There should be a “logging function” that allows operations staff to log handling events for all cargoes.
From these business functions, we can see that Handling Event stands on its own.

With the above analysis, we arrive at the following Aggregates: Cargo, Delivery History, and Delivery Specification are in one Aggregate with Cargo as the Aggregate root; each of the other entities is the root of its own Aggregate.





Notice that in this diagram there is no HandlingEventRepository, even though Handling Event is an Aggregate root. Let us play along to see what happens. Without its own Repository, a Handling Event can only be saved and retrieved through Cargo:

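// Date is assumed to be java.util.Date.
public static HandlingEvent newLoading(
        Cargo cargo, CarrierMovement loadedOnto, Date timeStamp) {
    HandlingEvent event =
            new HandlingEvent(cargo, LOADING_EVENT, timeStamp);
    event.setCarrierMovement(loadedOnto);
    // The event is attached to the Cargo's Delivery History,
    // so it is saved and found only through the Cargo.
    cargo.getDeliveryHistory().addEvent(event);
    return event;
}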

What is wrong with this approach?

  • It is cumbersome to maintain this relationship.
  • If other users are modifying the Cargo while an event is being added to it, the transaction that adds the event will fail. The “logging function” is used by operations people and needs to be efficient.
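The contention problem can be illustrated with a small sketch. It assumes the Cargo Aggregate is protected by optimistic locking with a version number, which is one common way such a failure arises; the text above does not say which mechanism is used. The class VersionedCargo and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Cargo Aggregate guarded by a version number
// (optimistic locking is an assumption of this sketch).
final class VersionedCargo {
    private int version = 0;
    private String destination = "UNKNOWN";
    private final List<String> handlingEvents = new ArrayList<>();

    synchronized int readVersion() { return version; }

    synchronized String getDestination() { return destination; }

    synchronized int eventCount() { return handlingEvents.size(); }

    // A booking user changes the destination of the Cargo.
    synchronized boolean changeDestination(int versionRead, String newDestination) {
        if (versionRead != version) return false; // someone committed first
        destination = newDestination;
        version++;
        return true;
    }

    // An operations user logs a handling event on the same Aggregate.
    synchronized boolean addHandlingEvent(int versionRead, String event) {
        if (versionRead != version) return false; // stale read: commit fails
        handlingEvents.add(event);
        version++;
        return true;
    }
}
```

If a booking user and an operations user both read version 0 and the booking change commits first, the attempt to add the handling event is rejected and has to be retried, even though the two changes have nothing to do with each other.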


To address these issues, we add “Handling Event Repository”:



This approach has its own problem: now that Handling Event is saved independently of Cargo, there is no guarantee that Cargo will get an up-to-date view of its history. This speaks to the last rule of Aggregates: when a change spans Aggregate boundaries, invariants spanning those boundaries might not be enforced at all times.
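A minimal sketch of a separate Handling Event Repository might look like the following. The class names, the findByCargoTrackingId() query, and the choice to reference the Cargo by a tracking id string are assumptions for illustration. The point is that saving an event never touches the Cargo Aggregate, and the history is obtained by a query instead.

```java
import java.util.ArrayList;
import java.util.List;

// Handling Event as the root of its own Aggregate. It refers to the
// Cargo root by identity only (hypothetical tracking id).
final class HandlingEvent {
    private final String cargoTrackingId;
    private final String type;

    HandlingEvent(String cargoTrackingId, String type) {
        this.cargoTrackingId = cargoTrackingId;
        this.type = type;
    }

    String getCargoTrackingId() { return cargoTrackingId; }

    String getType() { return type; }
}

// In-memory stand-in for a database-backed Handling Event Repository.
final class InMemoryHandlingEventRepository {
    private final List<HandlingEvent> events = new ArrayList<>();

    // Saving an event does not load or lock any Cargo.
    void save(HandlingEvent event) { events.add(event); }

    // The delivery history of a Cargo is derived by query, in save order.
    List<HandlingEvent> findByCargoTrackingId(String trackingId) {
        List<HandlingEvent> result = new ArrayList<>();
        for (HandlingEvent event : events) {
            if (event.getCargoTrackingId().equals(trackingId)) {
                result.add(event);
            }
        }
        return result;
    }
}
```

A Cargo that was loaded before a new event was saved will not see that event until the history is queried again, which is exactly the relaxed consistency described above.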

Now you can see why DDD sparks so much interest in the Microservice community. Microservices are not easy (see Start with Microservice (in mind) - I think Martin Fowler is wrong). A Microservice architecture has 3 layers. The “infrastructure layer” and the “application infrastructure layer” are largely technical and are well served by many open-source projects today, but a well-behaved “application layer” requires you to construct your models carefully so that there are minimal interdependencies among them. If the boundaries of Microservices are not designed carefully, there will be many dependencies among them, and maintaining data consistency becomes very painful. Aggregates give you a way to reason about those boundaries.
