When contexts were introduced in Phoenix, I was really excited. Baking in Domain Driven Design (DDD) concepts felt like a good guardrail. It would make us draw lines in our code: boundaries that would force us to get to know our business better, and in the process become better developers.
Fine and dandy until you bump up against your business’s “God” entity: the thing that is literally all things to everyone. Working at Flatiron School, you can imagine what our big entity in the sky is.
Students are everywhere. Well, sort of. When we’re trying to find the right fit for them, they’re applicants. Before that, they’re leads. When they join, they’re students, and then on to becoming graduates! Then they become Job Seekers, and then awesome developers.
They’re a lot of things to different departments, and those departments care about different parts of the student. Admissions officers care about how to get in contact with a student, their schedules, and program of interest. They’re interested in how applicants will handle tuition, and finding relevant scholarship awards.
Our instructors don’t care how you got there. If you’re here, it’s time to work. They need your GitHub username, a Slack username, and your name.
Our career services team cares about the program you’ve completed, your interests as a developer, and where they can help best connect you with the wider developer network.
What’s this look like in the context of… contexts?
Buckets of functionality
For your first pass you might try something like this:
You’d have all of your contexts just reach in and get the generic user data, then dump the data you don’t need. You’d then run joins on your data and attach your transformed user data at the last minute before responding to your requests.
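A sketch of what that reach-in-and-transform pass might look like; the `MyApp` namespace, `get_student/1`, and the field names here are my own illustration, not from the original:

```elixir
defmodule MyApp.Education do
  # Hypothetical first pass: reach into the Accounts context for the
  # full record, keep the handful of fields this domain cares about,
  # and throw the rest away.
  alias MyApp.Accounts

  def get_student(account_id) do
    account = Accounts.get(account_id)

    %{
      first_name: account.first_name,
      last_name: account.last_name,
      github_username: account.github_account && account.github_account.username,
      slack_username: account.slack_account && account.slack_account.username
    }
  end
end
```

Every context ends up with a transformer like this, each discarding a different slice of the same Accounts data.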
What this diagram is showing is that there’s ONE source of truth for account information. None of these other contexts can directly assert FACTS regarding account information. They can use it to deliver an experience, but it’s not theirs. Any changes have to go through the Accounts context.
To be honest, I hate this. I either have to make clunky requests for more data than I need, or I wind up making functions that require a slew of GraphQL-like query parameters to refine a query to what I want. All the while, the Accounts context gets more and more bloated under the weight of having to support all of these “different things to different people” cases. This sort of reminded me of something…
Are we doing microservices?
On the surface, this approach sort of feels like a style of microservices where you make synchronous requests at run time to fulfill requests sent to your system. The same problems apply here. You need a user? Well, you’ve got to go to the User service. Want a course catalog? Hit the Course Management system. Oh, it’s not the data you needed? Too bad, that’s what the service gives you. This approach always bothered me since it makes YOUR system vulnerable to another system’s availability. Now you’re on the hook for circuit breakers and graceful degradation. FUN!
There’s another way of doing Microservices that might work as good inspiration for us when working with Phoenix Contexts.
If you haven’t heard of CQRS (Command Query Responsibility Segregation), it’s a method of organizing a distributed system where the central means of communication is messaging. Here’s a sample interaction.
I glossed over the CQRS magic because the thing we care about here is the messaging as a result of a change. Let’s see what’s happening in the education service.
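A rough sketch of the kind of handler the Education service might run. The `AccountUpdated` message shape, the `StudentRecord` local table, and all module names here are illustrative assumptions:

```elixir
defmodule Education.AccountSubscriber do
  # Hypothetical message handler: whenever Accounts publishes a change,
  # upsert ONLY the fields Education cares about into its local store.
  alias Education.{Repo, StudentRecord}

  def handle_message(%{"type" => "AccountUpdated", "payload" => payload}) do
    %StudentRecord{
      account_id: payload["id"],
      first_name: payload["first_name"],
      github_username: payload["github_username"]
    }
    |> Repo.insert(
      on_conflict: :replace_all,
      conflict_target: :account_id
    )
  end
end
```

The upsert means the local copy converges on whatever the last message said, which is exactly the eventual consistency described below.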
From this point on, anytime the Education service needs user information, it can read it from its eventually consistent LOCAL data source. The beauty of this is that if at any point our Account service goes down, our instructors are able to keep on chuggin’. Here a read model is storing ONLY the bits you need to do the job. Don’t be afraid to take in data that would be joined across different databases when writing one of these. You’re optimizing for reading, after all.
Ok, what’s this got to do with Phoenix?
This isn’t a bait and switch for me to write about CQRS. It ties into Phoenix Contexts quite nicely. Say we look at all of our Contexts as if they were microservices using CQRS to keep “copies” of data. You wouldn’t want to use Phoenix PubSub to handle internal messages to keep these copies up to date. That’d be crazy. What you DO want is a way to keep your underlying “local” data up to date with what the source of truth says is FACT. You also want to view this data in the way that makes the most sense for your domain.
Introducing SQL Views
For all intents and purposes you interact with SQL views the same way you do any other table. You can create an Ecto schema just like you would with any other table. You can filter it like any other table. From your application’s perspective, it’s just a table.
Let’s look at some code.
Accounts and Education
We’re going to model a couple of contexts. Accounts has an
underlying Account schema, SlackAccount schema, and a
GithubAccount schema. Then we’ll make the Education
Context. Its world revolves around student progress and
cohort membership. You’ll see that the representation of
Account doesn’t really map on to what we’d need for a
student. The data is there, it’s just not easy to get at.
On sign up we create an account, and you can link a GitHub and Slack account if you want. This is what the Account schema looks like.
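Something along these lines, with field names guessed from the query results later in the post (the `MyApp` namespace is my own placeholder):

```elixir
defmodule MyApp.Accounts.Account do
  use Ecto.Schema

  schema "accounts" do
    field :first_name, :string
    field :last_name, :string
    field :display_name, :string

    # Optional accounts, linked after sign up
    has_one :github_account, MyApp.Accounts.GithubAccount
    has_one :slack_account, MyApp.Accounts.SlackAccount

    timestamps()
  end
end
```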
Let’s add a top-level get/1 function to our context that fetches WAY too much data.
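A sketch of that function, assuming a `MyApp.Accounts.Account` schema with `github_account` and `slack_account` associations (names are my assumptions):

```elixir
defmodule MyApp.Accounts do
  import Ecto.Query

  alias MyApp.Repo
  alias MyApp.Accounts.Account

  # Fetches the account plus EVERY association, whether or not the
  # caller actually needs them.
  def get(id) do
    Account
    |> preload([:github_account, :slack_account])
    |> Repo.get(id)
  end
end
```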
So… even looking at this, we’re probably going to have to do a lot of work to make this data useful. Check out what this get query looks like.
While having all of this data as joins makes sense in this context, it’s a pain in the neck for me to work with, since now I have to manage nested relationships every time I need to render the parts of this account relevant to my domain. Every time I want to load a student’s progress, I’ll have to unwind and wade through data I just don’t care about. Let’s leverage some SQL to see if it can help.
Generating a view
We’re going to generate a view that returns a single record for an account, with only the parts we care about. SQL TIME!
Let’s perfect our query first. We’re going to bring in all of the tables and join them, ensuring we always return an account if it exists, even if it’s incomplete.
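A sketch of that query; the table and column names are my guesses at the underlying schema:

```sql
SELECT
  accounts.id,
  accounts.first_name,
  accounts.last_name,
  accounts.display_name,
  slack_accounts.display_name  AS slack_display_name,
  github_accounts.display_name AS github_display_name
FROM accounts
LEFT OUTER JOIN slack_accounts
  ON slack_accounts.account_id = accounts.id
LEFT OUTER JOIN github_accounts
  ON github_accounts.account_id = accounts.id;
```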
We use LEFT OUTER JOINs to prevent filtering out accounts that don’t have GitHub or Slack accounts linked.
SO GREAT! This data is what I would expect if I were working with a “Student”. I can even rename columns to ones that make sense for what I’m building. A field name that makes total sense nested under a SlackAccount struct can get a clearer, flattened name when I’m looking at this unified view.
Here’s the view:
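In migration form, it might look something like this (names are assumed; the original view also includes the “small_hack” discussed below, which I haven’t reproduced here):

```elixir
defmodule MyApp.Repo.Migrations.CreateStudentsView do
  use Ecto.Migration

  def up do
    execute """
    CREATE VIEW students AS
    SELECT
      accounts.id,
      accounts.first_name,
      accounts.last_name,
      accounts.display_name,
      slack_accounts.display_name  AS slack_display_name,
      github_accounts.display_name AS github_display_name
    FROM accounts
    LEFT OUTER JOIN slack_accounts
      ON slack_accounts.account_id = accounts.id
    LEFT OUTER JOIN github_accounts
      ON github_accounts.account_id = accounts.id
    """
  end

  def down do
    execute "DROP VIEW students"
  end
end
```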
That’s it! Now if you run a query against the view, you’ll get this:
| 3 | Steven | Nunez | Steven The guy | Steven the chatter | Steven The Coder |
Ok, ok… I see you squinting your eyes at the words “small_hack” and wondering if I’m stealing all of your Bitcoin with this blog post. Rest assured, I have your best interests at heart. You see, views have a weird feature: in most cases, you can write to them. That means you’d be writing to the underlying table, which means WE’VE BROKEN OUR CONTRACT WITH THE SOURCE OF TRUTH. Wouldn’t it be nice if we could make this query read-only? Well, that little hack spooks Postgres just enough to prevent it from letting us run an insert.
Let’s generate the Student schema and run a query. Then we’ll try to insert a record. Let’s take it for a spin.
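A sketch of the schema, assuming the view is named `students` and has the columns from the query above (all names are my guesses):

```elixir
defmodule MyApp.Education.Student do
  use Ecto.Schema

  # "students" is a database view, not a real table, so this schema
  # is effectively read-only.
  schema "students" do
    field :first_name, :string
    field :last_name, :string
    field :display_name, :string
    field :slack_display_name, :string
    field :github_display_name, :string
  end
end
```

`Repo.all(Student)` behaves like a query against any other table, while `Repo.insert/1` with this schema blows up, since Postgres refuses the write.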
It’s beautiful! From here you can treat this view like any other table. You can join it against other “local” data, and always know your data is backed by the real source of truth.
Odds and ends
So, this is cool. You’re going to change all of your contexts to use views now. There are a couple of things to think about. One is that you’re now tied to the source of truth, and any changes to that table’s structure can break your views! Not great. It’s the same reason you don’t want microservices that share a database. One benefit on our side is that our unit tests will fail the build, since a missing column would likely be catastrophic to the app.
I wanted to mention Materialized Views. We’re using regular old views, which means that when you query one, the underlying query runs RIGHT THEN AND THERE. This means you’ll always have the latest data available. So what’s a Materialized View? It’s a view whose results are stored on disk and rerun based on some trigger, say, a record being inserted into a specific table. Why would you want this? What do you lose? If your queries are slow, Materialized Views get you a wicked fast response since you’re reading a cached value. What you lose is freshness: your data might lag a bit depending on how expensive the query is. Setup is a little different, but the concept is the same.
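As a sketch, assuming our view is named `students`, the materialized flavor looks like this:

```sql
-- Results are computed once and stored on disk.
CREATE MATERIALIZED VIEW students_cached AS
SELECT * FROM students;

-- Re-run the underlying query on demand, e.g. from a trigger
-- or a scheduled job:
REFRESH MATERIALIZED VIEW students_cached;
```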
That’s all for today. Thanks for reading!

Copyright © 2020 Steven Nuñez - HostileDeveloper