Constellation Mesh

Because Pleiades is, by design, unlike anything else that exists, it's hard to describe how it differs. Initially, Pleiades was compared to a globally distributed data fabric, but that comparison missed a core aspect of Pleiades: autonomy. We tried comparing it to a data mesh, but Pleiades has no architectural alignment with the specific business domains a data mesh expects. After many iterations, and comparisons with many different system models, Sienna landed on *constellation mesh*.

What is a constellation mesh?
A mesh, as defined by Oracle, is a distributed architecture for data management. That definition fits the distributed architecture of Pleiades without requiring domain alignment, but it doesn't say what type of mesh Pleiades is. In systems engineering, the term "constellation" is commonly used to refer to autonomous satellites which work in coordination to provide different aspects of a distributed, unified, and autonomous data set. Pleiades is a distributed, autonomous system working in independent coordination, with data management as its primary feature (i.e., a fancy distributed database).

So what makes Pleiades a distributed, autonomous system working in coordination? Pleiades is designed from the ground up to handle an exabyte's worth of data with only a single operator. This means the internal architecture needs a mixture of distribution and autonomy wherever possible, giving the operator visibility without requiring hands-on operational management. Configurations, workload scheduling, and many other internal operations in the constellation must happen independently, and the scale requires a leaderless, decentralized design. This also increases the complexity, but only if modeled incorrectly.

Understanding Configuration Propagation
One of the most useful models for understanding the systemic impacts of automated decision-making in a decentralized network is a force-directed graph. In force-directed graphs, attraction is generally modeled as a spring with Hooke's law, $F_s = kx$, and repulsion is generally modeled with Coulomb's law, $|F| = \frac{1}{4\pi\epsilon_0}\frac{|q_1q_2|}{r^2}$. Iterative simulations demonstrate how mechanical equilibrium can be achieved across the entire graph. Here is a force-directed graph using [Verlet integration](https://en.wikipedia.org/wiki/Verlet_integration) to demonstrate network propagation between nodes.

> **Note:** *Click and drag the nodes to see how force applied to one node affects the rest*



While Pleiades doesn’t use Verlet integration or operate in this exact way, it’s a useful visual example. To understand how it relates to Pleiades, it’s helpful to understand the nuances of this visual.
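For readers without the interactive visual, the force model described above can be approximated in a few lines. This is only a minimal sketch, not Pleiades code: the constants (`k`, `rest`, `q`, `dt`, `damping`) are arbitrary illustrative values, the Coulomb constant is folded into `q`, every node has unit mass, and the damping term is an assumption added so the simulation settles instead of oscillating forever.

```python
import math


def step(pos, prev, edges, k=1.0, rest=1.0, q=0.5, dt=0.1, damping=0.9):
    """Advance every node by one damped position-Verlet step (2D, unit mass)."""
    nodes = list(pos)
    force = {n: [0.0, 0.0] for n in nodes}

    # Coulomb-style repulsion between every pair of nodes: |F| = q^2 / r^2.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            r = max(math.hypot(dx, dy), 1e-6)
            f = q * q / (r * r)
            force[a][0] += f * dx / r
            force[a][1] += f * dy / r
            force[b][0] -= f * dx / r
            force[b][1] -= f * dy / r

    # Hooke-style attraction along each edge: F = k * x, x = length - rest.
    for a, b in edges:
        dx, dy = pos[b][0] - pos[a][0], pos[b][1] - pos[a][1]
        r = max(math.hypot(dx, dy), 1e-6)
        f = k * (r - rest)
        force[a][0] += f * dx / r
        force[a][1] += f * dy / r
        force[b][0] -= f * dx / r
        force[b][1] -= f * dy / r

    # Position Verlet: x_next = x + damping * (x - x_prev) + a * dt^2.
    new_pos = {
        n: tuple(
            pos[n][i] + damping * (pos[n][i] - prev[n][i]) + force[n][i] * dt * dt
            for i in range(2)
        )
        for n in nodes
    }
    return new_pos, pos


def simulate(start, edges, steps=2000):
    """Iterate from rest (prev == pos) until the graph has settled."""
    pos, prev = dict(start), dict(start)
    for _ in range(steps):
        pos, prev = step(pos, prev, edges)
    return pos


triangle = {"a": (0.0, 0.0), "b": (0.4, 0.0), "c": (0.1, 0.3)}
print(simulate(triangle, [("a", "b"), ("b", "c"), ("c", "a")]))
```

Moving any one starting position changes where every other node ends up, which is the property the visual is meant to convey: a local disturbance is absorbed by the whole graph until the forces balance again.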

Automated decision-making in a distributed system can have a very similar set of characteristics, but instead of a mechanical equilibrium being simulated, *virtual change equilibrium* is achieved through network propagation. Virtual Change Equilibrium (VCE) is the result of a Constellation Runtime Event (CRE, i.e., a change) being successfully propagated throughout the entire constellation.

Pleiades' internal clustering model is built on graph connectivity instead of centralized membership. Each node in the constellation is only aware of its neighbours, some top-level metadata, and how to handle CREs. A CRE is a neighbour-only broadcast. For example, when a node is shutting down, it broadcasts a `leave` CRE to its local neighbours, who repeat the same message to their own neighbours, and so on, until the constellation has reached virtual change equilibrium. Other CREs include `join`, `query`, and `update`, all of which follow the same propagation model to achieve VCE.
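The neighbour-only broadcast can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions rather than Pleiades' actual code: the `propagate` and `reached_vce` helpers and the five-node ring are hypothetical, delivery is modeled as a simple queue in one process, and suppressing duplicates by remembering which nodes have already seen the event is an assumption the text above does not spell out.

```python
from collections import deque


def propagate(neighbours, origin, event):
    """Flood `event` outward from `origin`.

    Each node only re-broadcasts to its own neighbours. Returns a dict
    mapping each reached node to the hop count at which it first saw
    the event.
    """
    seen = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in neighbours[node]:
            if peer not in seen:  # assumed: already-seen events are dropped
                seen[peer] = seen[node] + 1
                queue.append(peer)
    return seen


def reached_vce(neighbours, seen):
    """VCE in this sketch: every node in the constellation saw the event."""
    return set(seen) == set(neighbours)


# Hypothetical five-node ring; each node knows only its two neighbours.
ring = {
    "n0": ["n1", "n4"],
    "n1": ["n0", "n2"],
    "n2": ["n1", "n3"],
    "n3": ["n2", "n4"],
    "n4": ["n3", "n0"],
}
hops = propagate(ring, "n0", "leave")
print(hops, reached_vce(ring, hops))
```

No node needs a full membership list for this to work. As long as the neighbour graph is connected, the event reaches everyone; if it is partitioned, `reached_vce` returns `False` for the unreachable part.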

The constellation model is enabled through SWIM, and Pleiades uses HashiCorp's implementation with their Lifeguard extensions. While SWIM allows for the constellation's members to be modeled concretely, Pleiades also leverages network tomography to handle VCE. Pleiades uses the HashiCorp network tomography library, `coordinate`, to provide real-time computed network coordinates for each node in the constellation. Together, SWIM (constellation membership), `coordinate` (locality), and the internal lifecycle state machines make Pleiades an autonomous distributed system.
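To make "locality" concrete, here is a rough sketch of how network coordinates can turn into a nearest-neighbour ordering. It is written in the style of Vivaldi network coordinates (a position vector plus a per-node "height" for access-link latency). The `Coord` class, `estimated_rtt`, `nearest`, and all of the values are hypothetical and are not the `coordinate` library's API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Coord:
    """Hypothetical network coordinate; all values are in seconds."""
    vec: tuple           # position in the latency space
    height: float = 0.0  # per-node access-link latency


def estimated_rtt(a, b):
    """Estimate round-trip time without probing: distance plus both heights."""
    return math.dist(a.vec, b.vec) + a.height + b.height


def nearest(me, peers):
    """Return peer names ordered from closest to farthest from `me`."""
    return sorted(peers, key=lambda name: estimated_rtt(me, peers[name]))


me = Coord((0.0, 0.0), height=0.001)
peers = {
    "n1": Coord((0.003, 0.004), height=0.001),
    "n2": Coord((0.030, 0.040), height=0.002),
    "n3": Coord((0.001, 0.000), height=0.001),
}
print(nearest(me, peers))
```

The point of the sketch is that a node can rank its peers by expected latency using only the coordinates it already holds, without measuring every pair directly.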

A workload adjustment event makes the implementation nuances easier to understand. When a node containing the leader of a range replica receives an internal scaling event (i.e., scale up or scale down), it will send a CRE with some change metadata, and the constellation will quickly achieve VCE asynchronously. From the triggering node's perspective, VCE happens once the broadcast call returns. After triggering the CRE, the node will broadcast a `query` CRE asking for the nearest neighbours with available capacity to create a new range replica. Once the neighbours have been identified, the node will communicate directly with the closest identified neighbour to instantiate a range replica. The identified neighbour will broadcast the new range replica CRE as part of the initial VCE, and the original node will start the relevant scaling workflow. Once the workflow has finished, the original node which triggered the CRE will broadcast a final CRE with the final change metadata.
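The sequence of events in that walkthrough can be sketched as a single function that records which node broadcasts which CRE. Everything here is illustrative and assumed: the `run_scaling_event` helper, the metadata keys, and the choice of `update` as the CRE type for scaling changes are not taken from Pleiades itself, and the capacity and round-trip figures are made up.

```python
def run_scaling_event(origin, capacity, rtt_from_origin):
    """Return (events, target) for one scale-up, in broadcast order.

    `capacity` maps node name -> free replica slots and `rtt_from_origin`
    maps node name -> estimated round-trip time in seconds. Both are
    hypothetical inputs standing in for the `query` CRE responses.
    """
    events = []

    # 1. The triggering node announces the change.
    events.append((origin, "update", {"change": "scale-up", "state": "started"}))

    # 2. It asks for neighbours with spare capacity.
    events.append((origin, "query", {"want": "capacity"}))
    candidates = [n for n, free in capacity.items() if free > 0 and n != origin]
    if not candidates:
        return events, None

    # 3. It picks the closest candidate and contacts it directly (not a CRE).
    target = min(candidates, key=lambda n: rtt_from_origin[n])

    # 4. The chosen neighbour announces the new range replica.
    events.append((target, "update", {"change": "new-range-replica"}))

    # 5. After the scaling workflow, the origin broadcasts the final state.
    events.append((origin, "update", {"change": "scale-up", "state": "finished"}))
    return events, target


events, target = run_scaling_event(
    "n0",
    capacity={"n1": 0, "n2": 2, "n3": 1},
    rtt_from_origin={"n1": 0.001, "n2": 0.009, "n3": 0.004},
)
for sender, cre, meta in events:
    print(sender, cre, meta)
```

Note that `n1` is the closest node but has no free capacity, so the sketch selects `n3`: locality only decides among the nodes that answered the `query`.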

While there are many things not covered in that example, the internal autonomy of the constellation allows for things like scaling events to be handled by any node, while keeping the entire constellation aware of the change. This is what makes Pleiades a constellation mesh database. Hopefully that helps! Please feel free to open a discussion if you have questions.