When contexts were introduced in Phoenix, I was really excited. Baking in Domain Driven Design (DDD) concepts felt like a good guardrail. It would make us draw lines in our code: boundaries that would force us to get to know our business better, and in the process become better developers.
Fine and dandy until you bump up against your business’s “God” entity: the thing that is literally all things to everyone. Working at Flatiron School, you can imagine what our big entity in the sky is.
Students are everywhere. Well, sort of. When we’re trying to find the right fit for them, they’re applicants. Before that, they’re leads. When they join, they’re students, and then on to becoming graduates! Then they become Job Seekers, and then awesome developers.
They’re a lot of things to different departments, and those departments care about different parts of the student. Admissions officers care about how to get in contact with a student, their schedules, and program of interest. They’re interested in how applicants will handle tuition, and finding relevant scholarship awards.
Our instructors don’t care how you got there. If you’re here, it’s time to work. They need your GitHub username, a Slack username, and your name.
Our career services team cares about the program you’ve completed, your interests as a developer, and where they can help best connect you with the wider developer network.
What’s this look like in the context of… contexts?
Buckets of functionality
For your first pass you might try something like this:
You’d have all of your contexts just reach in and get the generic user data, then dump the data you don’t need. You’d then run joins on your data and attach your transformed user data at the last minute before responding to your requests.
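A sketch of what that reach-in-and-transform pass might look like; the `MyApp` namespace, `get_student/1`, and the field names here are my own illustration, not from the original:

```elixir
defmodule MyApp.Education do
  # Hypothetical first pass: reach into the Accounts context for the
  # full record, keep the handful of fields this domain cares about,
  # and throw the rest away.
  alias MyApp.Accounts

  def get_student(account_id) do
    account = Accounts.get(account_id)

    %{
      first_name: account.first_name,
      last_name: account.last_name,
      github_username: account.github_account && account.github_account.username,
      slack_username: account.slack_account && account.slack_account.username
    }
  end
end
```

Every context ends up with a transformer like this, each discarding a different slice of the same Accounts data.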
What this diagram is showing is that there’s ONE source of truth for account information. None of these other contexts can directly assert FACTS regarding account information. They can use it to deliver an experience, but it’s not theirs. Any changes have to go through the Accounts context.
To be honest, I hate this. I either have to make clunky requests for more data than I need, or I wind up making functions that require a slew of GraphQL-like query parameters to refine a query to what I want. All the while, the Accounts context gets more and more bloated under the weight of having to support all of these “different things to different people” cases. This sort of reminded me of something…
Are we doing microservices?
On the surface, this approach sort of feels like a style of microservices where you make synchronous requests at run time to fulfill requests sent to your system. The same problems apply here. You need a user? Well, you’ve got to go to the User service. Want a course catalog? Hit the Course Management system. Oh, it’s not the data you needed? Too bad, that’s what the service gives you. This approach always bothered me since it makes YOUR system vulnerable to another system’s availability. Now you’re on the hook for circuit breakers and graceful degradation. FUN!
There’s another way of doing Microservices that might work as good inspiration for us when working with Phoenix Contexts.
If you haven’t heard of CQRS (Command Query Responsibility Segregation), it’s a method of organizing a distributed system where the central means of communication is messaging. Here’s a sample interaction.
I glossed over the CQRS magic because the thing we care about here is the messaging as a result of a change. Let’s see what’s happening in the education service.
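A rough sketch of the kind of handler the Education service might run. The `AccountUpdated` message shape, the `StudentRecord` local table, and all module names here are illustrative assumptions:

```elixir
defmodule Education.AccountSubscriber do
  # Hypothetical message handler: whenever Accounts publishes a change,
  # upsert ONLY the fields Education cares about into its local store.
  alias Education.{Repo, StudentRecord}

  def handle_message(%{"type" => "AccountUpdated", "payload" => payload}) do
    %StudentRecord{
      account_id: payload["id"],
      first_name: payload["first_name"],
      github_username: payload["github_username"]
    }
    |> Repo.insert(
      on_conflict: :replace_all,
      conflict_target: :account_id
    )
  end
end
```

The upsert means the local copy converges on whatever the last message said, which is exactly the eventual consistency described below.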
From this point on, anytime the Education service needs user information, it can read it from its eventually consistent LOCAL data source. The beauty of this is that if at any point our Account service goes down, our instructors are able to keep on chuggin’. Here a read model is storing ONLY the bits you need to do the job. Don’t be afraid to take in data that would be joined across different databases when writing one of these. You’re optimizing for reading, after all.
Ok, what’s this got to do with Phoenix?
This isn’t a bait and switch for me to write about CQRS. It ties into Phoenix Contexts quite nicely. Say we look at all of our Contexts as if they were microservices using CQRS to keep “copies” of data. You wouldn’t want to use Phoenix PubSub to handle internal messages to keep these copies up to date. That’d be crazy. What you DO want is a way to keep your underlying “local” data up to date with what the source of truth says is FACT. You also want to view this data in the way that makes the most sense for your domain.
Introducing SQL Views
For all intents and purposes you interact with SQL views the same way you do any other table. You can create an Ecto schema just like you would with any other table. You can filter it like any other table. From your application’s perspective, it’s just a table.
Let’s look at some code.
Accounts and Education
We’re going to model a couple of contexts. Accounts has an
underlying Account schema, SlackAccount schema, and a
GithubAccount schema. Then we’ll make the Education
Context. Its world revolves around student progress and
cohort membership. You’ll see that the representation of
Account doesn’t really map on to what we’d need for a
student. The data is there, it’s just not easy to get at.
On sign up we create an account, and you can link a GitHub and Slack account if you want. This is what the Account schema looks like.
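Something along these lines, with field names guessed from the query results later in the post (the `MyApp` namespace is my own placeholder):

```elixir
defmodule MyApp.Accounts.Account do
  use Ecto.Schema

  schema "accounts" do
    field :first_name, :string
    field :last_name, :string
    field :display_name, :string

    # Optional accounts, linked after sign up
    has_one :github_account, MyApp.Accounts.GithubAccount
    has_one :slack_account, MyApp.Accounts.SlackAccount

    timestamps()
  end
end
```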
Let’s add a top-level get/1 function to our context that fetches WAY too much data.
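A sketch of that function, assuming a `MyApp.Accounts.Account` schema with `github_account` and `slack_account` associations (names are my assumptions):

```elixir
defmodule MyApp.Accounts do
  import Ecto.Query

  alias MyApp.Repo
  alias MyApp.Accounts.Account

  # Fetches the account plus EVERY association, whether or not the
  # caller actually needs them.
  def get(id) do
    Account
    |> preload([:github_account, :slack_account])
    |> Repo.get(id)
  end
end
```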
So… even looking at this, we’re probably going to have to do a lot of work to make this data useful. Check out what this get query looks like.
While having all of this data as joins makes sense in this context, it’s a pain in the neck for me to work with, since now I have to manage nested relationships every time I need to render the parts of this account relevant to my domain. Every time I want to load a student’s progress, I’ll have to unwind and wade through data I just don’t care about. Let’s leverage some SQL to see if it can help.
Generating a view
We’re going to generate a view that returns a single record for an account, with only the parts we care about. SQL TIME!
Let’s perfect our query first. We’re going to bring in all of the tables and join them, ensuring we always return an account if it exists, even if it’s incomplete.
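A sketch of that query; the table and column names are my guesses at the underlying schema:

```sql
SELECT
  accounts.id,
  accounts.first_name,
  accounts.last_name,
  accounts.display_name,
  slack_accounts.display_name  AS slack_display_name,
  github_accounts.display_name AS github_display_name
FROM accounts
LEFT OUTER JOIN slack_accounts
  ON slack_accounts.account_id = accounts.id
LEFT OUTER JOIN github_accounts
  ON github_accounts.account_id = accounts.id;
```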
We use LEFT OUTER JOINs to prevent filtering out accounts that don’t have GitHub or Slack accounts linked.
SO GREAT! This data is what I would expect if I were working with a “Student”. I can even rename columns to ones that make sense for what I’m building. A field name that makes total sense nested under a SlackAccount struct can get a clearer, flattened name when I’m looking at this unified view.
Here’s the view:
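In migration form, it might look something like this (names are assumed; the original view also includes the “small_hack” discussed below, which I haven’t reproduced here):

```elixir
defmodule MyApp.Repo.Migrations.CreateStudentsView do
  use Ecto.Migration

  def up do
    execute """
    CREATE VIEW students AS
    SELECT
      accounts.id,
      accounts.first_name,
      accounts.last_name,
      accounts.display_name,
      slack_accounts.display_name  AS slack_display_name,
      github_accounts.display_name AS github_display_name
    FROM accounts
    LEFT OUTER JOIN slack_accounts
      ON slack_accounts.account_id = accounts.id
    LEFT OUTER JOIN github_accounts
      ON github_accounts.account_id = accounts.id
    """
  end

  def down do
    execute "DROP VIEW students"
  end
end
```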
That’s it! Now if you run a query against the view, you’ll get this:
| 3 | Steven | Nunez | Steven The guy | Steven the chatter | Steven The Coder |
Ok, ok… I see you squinting your eyes at the words “small_hack” and wondering if I’m stealing all of your Bitcoin with this blog post. Rest assured, I have your best interests at heart. You see, views have a weird feature: in most cases, you can write to them. That means you’d be writing to the underlying table, which means WE’VE BROKEN OUR CONTRACT WITH THE SOURCE OF TRUTH. Wouldn’t it be nice if we could make this query read-only? Well, that little hack spooks Postgres just enough to prevent it from letting us run an insert.
Let’s generate the Student schema and run a query. Then we’ll try to insert a record. Let’s take it for a spin.
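A sketch of the schema, assuming the view is named `students` and has the columns from the query above (all names are my guesses):

```elixir
defmodule MyApp.Education.Student do
  use Ecto.Schema

  # "students" is a database view, not a real table, so this schema
  # is effectively read-only.
  schema "students" do
    field :first_name, :string
    field :last_name, :string
    field :display_name, :string
    field :slack_display_name, :string
    field :github_display_name, :string
  end
end
```

`Repo.all(Student)` behaves like a query against any other table, while `Repo.insert/1` with this schema blows up, since Postgres refuses the write.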
It’s beautiful! From here you can treat this view like any other table. You can join it against other “local” data, and always know your data is backed by the real source of truth.
Odds and ends
So, this is cool. You’re going to change all of your contexts to use views now. There are a couple of things to think about. One is that you’re now tied to the source of truth, and any changes to that table’s structure can break your views! Not great. It’s the same reason you don’t want microservices that share a database. One benefit on our side is that our unit tests will fail the build, since a missing column would likely be catastrophic to the app.
I wanted to mention Materialized Views. We’re using regular old views, which means that when you query one, the underlying query runs RIGHT THEN AND THERE. This means you’ll always have the latest data available. So what’s a Materialized View? It’s a view whose results are stored on disk and rerun based on some trigger, say, a record being inserted into a specific table. Why would you want this? What do you lose? If your queries are slow, Materialized Views get you a wicked fast response since you’re reading a cached value. What you lose is freshness: your data might lag a bit depending on how expensive the query is. Setup is a little different, but the concept is the same.
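As a sketch, assuming our view is named `students`, the materialized flavor looks like this:

```sql
-- Results are computed once and stored on disk.
CREATE MATERIALIZED VIEW students_cached AS
SELECT * FROM students;

-- Re-run the underlying query on demand, e.g. from a trigger
-- or a scheduled job:
REFRESH MATERIALIZED VIEW students_cached;
```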
That’s all for today. Thanks for reading!

Copyright © 2020 Steven Nuñez - HostileDeveloper