Transitive Closure in PostgreSQL
At Remind we operate one of the largest communication tools for education in the United States and Canada. Millions of parents, students, teachers, and administrators use our application each day to improve learning outcomes, sending tens of millions of messages per day. This scale has given us many opportunities to innovate, and some of our most recent work has been around using PostgreSQL to capture graph-based structures and query them efficiently.
Our particular dataset consists of multiple organization trees. In education, most classes belong to a school and most schools belong to a district. But some classes may belong to a department, which may belong to a school, which may belong to a college, which may belong to a campus, which may belong to a university. We wanted to build a system that could not only represent these hierarchies but do so in an efficient manner.
As a concrete example, here’s how a middle school math class might be represented:
In our database, we store these entities as an adjacency list. Note: We store these with UUIDs, but here we’ll use the names to make it easier to follow.
organization | parent |
---|---|
Algebra 1 | Maple Middle School |
Maple Middle School | Springfield District |
Springfield District | NULL |
Common query patterns for us include finding all the classes in a district, or finding the district a class belongs to. Naively, we need to traverse through `Maple Middle School` to determine that `Algebra 1` is a descendant of `Springfield District`. Our goal was to make such queries fast while maintaining accuracy.
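To see why the naive approach is costly, here’s a minimal sketch (Python standing in for SQL, names standing in for UUIDs) of walking the adjacency list upward; in the database, each step of this loop is another lookup against the parent column:

```python
# A toy in-memory model of the adjacency list above: org -> parent,
# None for roots. Illustrative only.
parents = {
    "Algebra 1": "Maple Middle School",
    "Maple Middle School": "Springfield District",
    "Springfield District": None,
}

def is_descendant(org, ancestor):
    """Walk parent pointers upward until we find the ancestor or a root."""
    node = parents.get(org)
    while node is not None:
        if node == ancestor:
            return True
        node = parents.get(node)
    return False
```

One step per level: the deeper the hierarchy, the more lookups (or the more work a recursive CTE does) per query.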
There are a few techniques for doing this inside a PostgreSQL database:

1. Traverse the adjacency list at query time with recursive CTEs
2. Maintain a materialized path for each node
3. Maintain a transitive closure table
We started with option 1, but performance wasn’t quite where we wanted it to be. Option 2 was a viable path, but concerns around maintenance and our update pattern (where leaf nodes are frequently added and removed) meant it didn’t seem like the right fit. So we went with option 3.
The transitive closure of a graph tells us whether one node is reachable from another. In our `Algebra 1` example above, we have 6 edges in the transitive closure:
Note: We include self references here for reasons we’ll explain in a bit.
`organization`, `ancestor`, and `distance`
These are the columns that describe the edges in our graph. Given a class `Algebra 1` in a school `Maple Middle School` in a district `Springfield District`, the edges look like:
organization | distance | ancestor |
---|---|---|
Algebra 1 | 0 | Algebra 1 |
Algebra 1 | 1 | Maple Middle School |
Algebra 1 | 2 | Springfield District |
Maple Middle School | 0 | Maple Middle School |
Maple Middle School | 1 | Springfield District |
Springfield District | 0 | Springfield District |
`organization` is the start node, `ancestor` is the end node, and `distance` captures how many other nodes we had to go through to get there.
This is a really great schema for querying. We have organization tables like `organization_memberships`, and we can perform simple queries like:
```sql
select
  count(distinct "user")
from organizations_closure
join organization_memberships using (organization)
where ancestor = 'Springfield District'
```
This query will count the number of users enrolled in any class in the `Springfield District`. We didn’t have to traverse through the schools; we could enumerate the user enrollments directly by following the connection from `district -> class`.
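As an in-memory sketch of what that join does, assume hypothetical closure and membership rows shaped like the tables above (the user names are made up for illustration):

```python
# Hypothetical rows mirroring the closure and membership tables.
closure = [
    ("Algebra 1", "Algebra 1"),
    ("Algebra 1", "Maple Middle School"),
    ("Algebra 1", "Springfield District"),
    ("Maple Middle School", "Maple Middle School"),
    ("Maple Middle School", "Springfield District"),
    ("Springfield District", "Springfield District"),
]
memberships = [
    ("alice", "Algebra 1"),            # (user, organization)
    ("bob", "Algebra 1"),
    ("alice", "Maple Middle School"),
]

def users_under(ancestor):
    """Equivalent of joining closure to memberships on organization."""
    orgs = {org for (org, anc) in closure if anc == ancestor}
    return {user for (user, org) in memberships if org in orgs}
```

Because the closure already enumerates every descendant of the district, the lookup is a single filter plus a join; no per-level traversal happens at query time.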
`through_org`, `distance_to_through_org`
Something missing from the above data is why an organization is considered an ancestor. Let’s capture those reasons:
organization | ancestor | reason |
---|---|---|
Algebra 1 | Algebra 1 | Self |
Algebra 1 | Maple Middle School | Parent of Algebra 1 |
Algebra 1 | Springfield District | Parent of Maple Middle School |
Maple Middle School | Maple Middle School | Self |
Maple Middle School | Springfield District | Parent of Maple Middle School |
Springfield District | Springfield District | Self |
Each `reason` is either `Self` or `Parent of X`, where `X` is an established ancestor of the organization. In our schema we capture this by recording the `through_org` for each edge, as well as the distance to the `through_org`. For self references these will be `NULL`. For any other reference they describe the node that must be travelled through to reach the `ancestor`. The resulting data looks like this:
organization | distance | ancestor | through_org | distance_to_through_org |
---|---|---|---|---|
Algebra 1 | 0 | Algebra 1 | NULL | NULL |
Maple Middle School | 0 | Maple Middle School | NULL | NULL |
Springfield District | 0 | Springfield District | NULL | NULL |
Algebra 1 | 1 | Maple Middle School | Algebra 1 | 0 |
Maple Middle School | 1 | Springfield District | Maple Middle School | 0 |
Algebra 1 | 2 | Springfield District | Maple Middle School | 1 |
With our columns in place, we want to ensure this data is correct and that we will not drift from our adjacency list.
Our constraints:

- Self references, and only self references, have no `through_org`.
- Self references must have a `distance` of 0. All other rows must have a `distance` of `distance_to_through_org + 1`.
- `through_org` must be an ancestor of `organization` (which can mean itself), and `distance_to_through_org` must match the distance from `organization` to `through_org`.
- Given a `through_org`, `ancestor` must be a direct parent of the `through_org`.

We can enforce these constraints in PostgreSQL:
"organizations_closure_check" CHECK (through_org IS NOT NULL OR organization = ancestor)
"organizations_closure_distance_check" CHECK (distance = 0 OR organization <> ancestor)
"organizations_closure_distance_to_through_org_check" CHECK (distance_to_through_org IS NULL AND distance = 0 OR distance_to_through_org = (distance - 1))
"organizations_closure_through_org_distance_fk" FOREIGN KEY (organization, through_org, distance_to_through_org) REFERENCES remind.organizations_closure(organization, ancestor, distance)
"organizations_closure_through_org_fkey" FOREIGN KEY (through_org, ancestor) REFERENCES remind.organizations(organization, parent) ON DELETE CASCADE
Finally, we want to ensure referential integrity, so we ensure that `ancestor` is an organization that exists:
"organizations_closure_ancestor_fkey" FOREIGN KEY (ancestor) REFERENCES remind.organizations(organization) ON DELETE CASCADE
(It’s not necessary to have a foreign key on `organization`, as every row has an eventual dependency on a self reference. For self references, `organization = ancestor`, and so the foreign key on `ancestor` gives us what we need. Having the additional key wouldn’t hurt, but it’s not necessary.)
With these constraints in place, we are protected against stale data from intermediate nodes being improperly removed.
To populate this table, we can follow a recursive approach:

1. Every organization is its own ancestor, at a `distance` of 0 (a self reference).
2. For every known edge from `organization` to `ancestor`, if that `ancestor` has a parent, there is an edge from `organization` to that parent at `distance + 1`, with the `ancestor` as the `through_org`.
Returning to our `Algebra 1` example, from the above rules we can first establish the self references:
organization | distance | ancestor | through_org | distance_to_through_org |
---|---|---|---|---|
Algebra 1 | 0 | Algebra 1 | NULL | NULL |
Maple Middle School | 0 | Maple Middle School | NULL | NULL |
Springfield District | 0 | Springfield District | NULL | NULL |
We can then add immediate parents. Because each `organization` is its own `ancestor` (from the self references), we can use those rows as the `through_org` and use their parent as the `ancestor`:
organization | distance | ancestor | through_org | distance_to_through_org |
---|---|---|---|---|
Algebra 1 | 0 | Algebra 1 | NULL | NULL |
Maple Middle School | 0 | Maple Middle School | NULL | NULL |
Springfield District | 0 | Springfield District | NULL | NULL |
Algebra 1 | 1 | Maple Middle School | Algebra 1 | 0 |
Maple Middle School | 1 | Springfield District | Maple Middle School | 0 |
Now that we have parents in our table, we can use those parents to add grandparents:
organization | distance | ancestor | through_org | distance_to_through_org |
---|---|---|---|---|
Algebra 1 | 0 | Algebra 1 | NULL | NULL |
Maple Middle School | 0 | Maple Middle School | NULL | NULL |
Springfield District | 0 | Springfield District | NULL | NULL |
Algebra 1 | 1 | Maple Middle School | Algebra 1 | 0 |
Maple Middle School | 1 | Springfield District | Maple Middle School | 0 |
Algebra 1 | 2 | Springfield District | Maple Middle School | 1 |
And in this case, because `Springfield District` has no parent, we’re done.
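The whole walkthrough condenses into a short sketch (Python standing in for the SQL shown later): seed the self references, then repeatedly extend each known edge by one parent, recording the `through_org` and its distance as we go.

```python
# Adjacency list from the example: org -> parent, None for roots.
parents = {
    "Algebra 1": "Maple Middle School",
    "Maple Middle School": "Springfield District",
    "Springfield District": None,
}

def build_closure(parents):
    # Rows are (organization, distance, ancestor, through_org,
    # distance_to_through_org), matching the table columns above.
    rows = {(org, 0, org, None, None) for org in parents}
    frontier = set(rows)
    while frontier:
        nxt = set()
        for org, dist, anc, _through, _dtt in frontier:
            parent = parents.get(anc)
            if parent is not None:
                # Extend the edge by one level: the current ancestor
                # becomes the through_org for the new, longer edge.
                nxt.add((org, dist + 1, parent, anc, dist))
        frontier = nxt - rows
        rows |= frontier
    return rows

closure = build_closure(parents)
print(len(closure))  # 6 rows, matching the final table above
```

Each pass of the loop corresponds to one level of the walkthrough: self references first, then parents, then grandparents, stopping when no new edges appear.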
Whenever data is denormalized, the greatest risk is inconsistency. As we traverse more of the graph, a given relation depends on more and more rows. If a great-grandparent moves under another node, all of its children will need to be updated.
Our goal was to find a way to leverage database constraints to ensure correctness.
This is the schema and constraints we landed on:
```
                     Table "remind.organizations_closure"
         Column          |  Type   | Collation | Nullable | Default
-------------------------+---------+-----------+----------+---------
 organization            | uuid    |           | not null |
 through_org             | uuid    |           |          |
 ancestor                | uuid    |           | not null |
 distance                | integer |           | not null |
 distance_to_through_org | integer |           |          |
Indexes:
    "organizations_closure_pkey" PRIMARY KEY, btree (organization, ancestor)
    "organizations_closure_ancestor_idx" btree (ancestor)
    "organizations_closure_organization_ancestor_distance_idx" UNIQUE, btree (organization, ancestor, distance)
    "organizations_closure_through_org_idx" btree (through_org)
Check constraints:
    "organizations_closure_check" CHECK (through_org IS NOT NULL OR organization = ancestor)
    "organizations_closure_distance_check" CHECK (distance = 0 OR organization <> ancestor)
    "organizations_closure_distance_to_through_org_check" CHECK (distance_to_through_org IS NULL AND distance = 0 OR distance_to_through_org = (distance - 1))
Foreign-key constraints:
    "organizations_closure_ancestor_fkey" FOREIGN KEY (ancestor) REFERENCES remind.organizations(organization) ON DELETE CASCADE
    "organizations_closure_through_org_distance_fk" FOREIGN KEY (organization, through_org, distance_to_through_org) REFERENCES remind.organizations_closure(organization, ancestor, distance)
    "organizations_closure_through_org_fkey" FOREIGN KEY (through_org, ancestor) REFERENCES remind.organizations(organization, parent) ON DELETE CASCADE
```
We also want to prevent any invalid states that we couldn’t fully capture in our schema. The primary case we need to prevent is a cycle.
To prevent a cycle, we use the following function as a `before insert` trigger:
```sql
CREATE FUNCTION remind.organizations_block_cycle() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  if exists (select from remind.organizations_closure where ancestor = NEW.organization and remind.organizations_closure.organization = NEW.parent) then
    raise exception 'Cannot add descendant as parent, would create cycle';
  else
    return NEW;
  end if;
end
$$;
```
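The logic of the check can be modeled in a few lines of Python: setting `new_parent` on `org` creates a cycle exactly when `org` is already an ancestor of `new_parent`, which is one lookup against the closure.

```python
def would_create_cycle(closure, org, new_parent):
    """closure is a set of (organization, ancestor) reachability pairs.

    Mirrors the trigger's EXISTS query: is org an ancestor of new_parent?
    """
    return (new_parent, org) in closure

# Closure pairs from the running example (self references included).
closure = {
    ("Algebra 1", "Algebra 1"),
    ("Algebra 1", "Maple Middle School"),
    ("Algebra 1", "Springfield District"),
    ("Maple Middle School", "Maple Middle School"),
    ("Maple Middle School", "Springfield District"),
    ("Springfield District", "Springfield District"),
}

# Re-parenting the district under one of its own classes is rejected:
print(would_create_cycle(closure, "Springfield District", "Algebra 1"))  # True
```

Note that because the closure includes self references, this check also rejects an organization being made its own parent.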
Now that we have a schema and constraints that ensure correctness, we need a way to update it easily. While the schema will prevent invalid states, we don’t want to hit these errors. We want the normal range of CRUD operations to propagate changes to these tables and keep them in sync.
We used the following functions in an `after insert` trigger:
```sql
CREATE FUNCTION remind.organizations_rematerialize_closure(uuid) RETURNS void
    LANGUAGE plpgsql
    AS $_$
begin
  -- This could maybe be reworked to recursively build the list of orgs that need to be
  -- rematerialized and then materialize them all in one fell swoop.
  perform remind.materialize_org_ancestors($1);

  -- When we update the ancestors of the parent, we need to update all the children
  -- so that they reflect the new grandparent, great-grandparent, etc.
  perform remind.organizations_rematerialize_closure(organization) from (
    select remind.organizations.organization
    from remind.organizations
    where remind.organizations.parent = $1
  ) child_orgs;
end;
$_$;
```
```sql
CREATE FUNCTION remind.materialize_org_ancestors(VARIADIC uuid[]) RETURNS void
    LANGUAGE plpgsql
    AS $_$
begin
  with recursive closure as (
    select
      organizations.organization,
      null::uuid as through_org,
      organizations.organization as ancestor,
      0 as distance,
      null::integer as distance_to_through_org
    from remind.organizations
    where organization = any ($1)
    union
    select
      closure.organization as organization,
      organizations.organization as through_org,
      organizations.parent as ancestor,
      closure.distance + 1 as distance,
      closure.distance as distance_to_through_org
    from remind.organizations
    join closure on (closure.ancestor = organizations.organization)
    where organizations.parent is not null
  )
  insert into remind.organizations_closure (organization, through_org, ancestor, distance, distance_to_through_org)
  select * from closure
  on conflict (organization, ancestor) do nothing;
end
$_$;
```
Essentially, we rematerialize all ancestors for a given org and then iterate through its children and rematerialize their ancestors (and continue until we’ve reached all leaf nodes). This is a recursive process and it is why preventing cycles is essential.
When rematerializing ancestors, we use a recursive CTE. On each iteration we track the `through_org` and `distance_to_through_org`.
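The propagation can be sketched in Python (an in-memory stand-in for the SQL functions, not the production code): rebuild one organization's ancestor rows by walking up the parent chain, then recurse into its children so new grandparents and great-grandparents are reflected everywhere below.

```python
def rematerialize(parents, closure, org):
    """Model of organizations_rematerialize_closure over a dict forest.

    closure maps org -> set of (ancestor, distance) pairs.
    """
    # Rebuild this org's ancestors (the recursive CTE's job in SQL).
    closure[org] = {(org, 0)}
    node, dist = parents.get(org), 1
    while node is not None:
        closure[org].add((node, dist))
        node, dist = parents.get(node), dist + 1
    # Recurse into children so they pick up any new ancestry.
    for child, parent in parents.items():
        if parent == org:
            rematerialize(parents, closure, child)

parents = {
    "Algebra 1": "Maple Middle School",
    "Maple Middle School": "Springfield District",
    "Springfield District": None,
}
closure = {}
rematerialize(parents, closure, "Springfield District")
# Algebra 1 now has ancestors at distances 0, 1, and 2.
```

This also makes clear why the cycle guard matters: the recursion terminates only because the parent chain and the child traversal can never loop back on themselves.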
For deletes and updates we rely on our foreign keys’ `on delete cascade` behavior, as well as a before update trigger to clean up rows that are no longer relevant.
This is a lot of complexity. While we are confident the data in the table will be accurate, we were (rightfully) very concerned about application errors when a constraint blocks us from making a change. We decided that robust testing was essential to having confidence operationally.
When we started this, we had been separately exploring a style of testing called property testing in a couple of different contexts. The core idea is that if you can create a simple model that captures your behavior, you can generate broad and nearly exhaustive test cases for a more complex implementation. In our case, the model is fairly straightforward (it’s just a tree structure) and the materialization is fairly complex. It has worked very well for us.
Our property tests (using the `proptest` crate) build an organization forest (a collection of organization trees) in memory. We then sync this structure into the database and assert the following for each organization:
- An `organizations_closure` row exists for the organization and each of its ancestors, with the correct distance values. No other ancestors should exist.
- An `organizations_closure` row exists for each of its descendants, with the correct ancestor and the correct distance values. No other descendants should exist.

Then we mutate the tree with arbitrary actions. For each action we perform the mutation on our simple model as well as in the database.
Action | Simple Model | Complex Model |
---|---|---|
Nothing | Nothing | Nothing |
Detach | Remove from children, add to tree list | `update organizations set parent = null where organization = ?` |
Update (1 step) | Remove from old parent, add to new parent | `update organizations set parent = ? where organization = ?` |
Update (2 step) | Remove from old parent, add to new parent | `update organizations set parent = null where organization = ?` followed by `update organizations set parent = ? where organization = ?` |
Delete (roll up children) | Move children to grandparent, remove organization | `update organizations set parent = ? where parent = ?` followed by `delete from organizations where organization = ?` |
Delete (detach children) | Add children to tree list, remove organization | `update organizations set parent = null where parent = ?` followed by `delete from organizations where organization = ?` |
In addition to these, we have a handful of other cases more specific to our product.
We then assert the same properties as before.
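A hedged sketch of the property being asserted (our real tests are Rust with `proptest`; this Python stand-in just shows the shape): for every organization, the ancestors derived from the simple model must exactly match the rows the database materialized.

```python
def model_ancestors(parents, org):
    """Derive (ancestor, distance) pairs by walking parent pointers."""
    out, node, dist = {(org, 0)}, parents.get(org), 1
    while node is not None:
        out.add((node, dist))
        node, dist = parents.get(node), dist + 1
    return out

def check_closure(parents, closure_rows):
    """closure_rows: set of (organization, ancestor, distance) tuples."""
    for org in parents:
        expected = {(org, anc, d) for anc, d in model_ancestors(parents, org)}
        actual = {row for row in closure_rows if row[0] == org}
        assert actual == expected, f"closure drifted for {org}"

parents = {"Algebra 1": "Maple Middle School",
           "Maple Middle School": "Springfield District",
           "Springfield District": None}
rows = {("Algebra 1", "Algebra 1", 0),
        ("Algebra 1", "Maple Middle School", 1),
        ("Algebra 1", "Springfield District", 2),
        ("Maple Middle School", "Maple Middle School", 0),
        ("Maple Middle School", "Springfield District", 1),
        ("Springfield District", "Springfield District", 0)}
check_closure(parents, rows)  # passes; drop or add a row and it raises
```

Because `actual` is compared with set equality, the check also catches extra rows, which covers the "no other ancestors should exist" half of the property.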
This testing strategy has been great. The `proptest` crate captures generated tests that fail and includes them in all future runs as a regression suite. While developing, we ran with 10k cases; in our CI suite we now run with “just” the default of 256 cases. The thoroughness of this approach has made adding additional actions (for example, “archiving” an organization) easy to implement and has given us very high confidence as we roll out new features.
We rely heavily on AWS Aurora with PostgreSQL. One of the great features available with Aurora is cloning. In about 15 minutes, a copy-on-write clone can be spun up and used for testing. With our full production dataset in a clone, we created the table and constraints and started populating it. It was no surprise that we had some cycles in our dataset (a decade of organic data at scale means nothing is a surprise), but otherwise everything went smoothly. We worked with our support team to clean up the cycles, added the tables to prod, and backfilled the data. Everything ran smoothly, and we’ve been using this table ever since.
We use this to power a product that serves millions of parents, students, teachers, and administrators every day. It’s improved performance, reduced load, and accelerated our team’s ability to deliver value to our customers. A special thanks to all the team members at Remind who worked on this from conceptualization to running it in prod, especially Phil Frost, who held the vision and set us on this path.