Goodbye Document-Relational. Until we meet again.
PreRamble
The last several weeks have been a lot for my family and me, and it's taken me a while to put my thoughts together in a way that's productive for sharing. I could get stuck in the weeds for hours on this one! But I will try to stay focused on a few key points.
Initial thoughts
A couple of weeks ago, Fauna announced that it is sunsetting its service. To say that I am disappointed is an understatement, and not just because it meant losing my job. I still believe that what we built was unique and one of the best operational databases available. So to lose that... I'm super bummed. I am heartened by the commitment to open-sourcing Fauna technology, though. I hope that it delivers a real tool for the community to use and acts as a touchstone to help push the whole database ecosystem forward.
Fauna certainly isn't perfect, but I can't help but reflect on its unique perspective and what we can learn from it. To that end, I want to take some time to call out what went well, what perhaps could have been better, and what I want to see in other DBs.
What I am sharing is what has already been shared publicly; for example, in the architectural white paper. I hope that open-sourcing will shed some additional light on the inner workings.
I don't honestly know how long the Fauna blogs will be available, but I, for one, have been capturing them into Obsidian notes using a browser plugin: https://obsidian.md/clipper
Vectors and AI
I don't have enough energy to fully dig into AI in this post, but let's start up front with a quick callout that Fauna missed the boat. Everybody in the game has access to vector storage and indexing, if not integrations or tooling directly related to connecting those features with AI vendors. Here's a list of just the big players:
- pgvector for Aurora, Cockroach, Supabase, Neon, or any self-hosted PostgreSQL
- Vectors are built into MySQL
- MongoDB Vector Search
- Couchbase Vector Search
- Graph databases like Neo4j and ArangoDB
- You can connect services like DynamoDB to Amazon OpenSearch
- Vector Search in Azure Cosmos DB for NoSQL
- ScyllaDB promises "forthcoming vector storage and search features."
I mean, just look at the truly MASSIVE list of vector stores that LangChain supports!
Even Cassandra supports indexing and searching vectors now. Despite Fauna's storage being based on a fork of Cassandra, it seems the fork diverged too much to keep up with upstream changes this big.
I'll argue that Fauna needed deeper integration with AI to stay relevant (certainly to investors), and... it didn't. Fauna made the bet that an operational database should stick to operational data, that vectors weren't that, and that it was sufficient to hoist a separate solution for vectors (e.g. Pinecone) when you need AI integration for your application. Hindsight is 20/20.
I will say, however (and this is important), that you can only focus your limited resources on so many things at once, and Fauna had plenty to work on over the past several years. I do think it was a reasonable call at the time because of that, and I have no intention of beating anybody up over this. We were a small team, and I saw everybody busting their butts on everything we worked on. These folks were rockstars 🤩! Even so, it became a non-trivial problem that Fauna lacked deeper support and integration with AI.
Whether the AI integration in any of these databases is actually being used, valued by customers, and the most efficient solution is probably debatable case-by-case, but what's less debatable is that AI is the hype train that investors are on. Working with AI seems to be table stakes now for any tech venture.
All that said, there's still a TON that was great about Fauna. So having gotten my quick rant out of the way, let's dive into that!
"NoSQL" does not mean sacrificing consistency
Fauna gave a rude gesture to much of what folks have been saying about the CAP Theorem when it comes to NoSQL databases. Not the theorem itself (it still very much applies), but rather the assumption that "NoSQL" means "not consistent" and often "not relational". As Fauna put it:
Fauna is more specifically a Relational NoSQL database platform. The term "NoSQL" refers only to the interface; Fauna currently supports an execution-transparent, procedural interface instead of declarative SQL.
We'll come back to "execution-transparent, procedural interface" in a moment!
After years of workshopping it, Fauna finally settled on the term document-relational, or "docu-relational". Whatever you call it, enabling relationships, joins, and traversals on top of document storage is one of Fauna's most powerful features. It's also how I discovered and came to love the product.
My education and background are originally in Mechanical and Manufacturing engineering. In 2018, I was trying to build a database model for product design and lifecycle data. It's all very hierarchical, heterogeneous, and dynamic -- assemblies containing parts, which can be fabricated or purchased or modified, with multiple configurations and revisions in different approval states, all mixed up and composed together. That's not so simple to do in a traditional RDBMS.
Fauna is still the only solution I have ever found that has let me (so easily) define a data model with dynamic components and relationships. My brain thinks about data very much in terms of the relationships between things. Having fallen down the rabbit hole that is functional programming and category theory, it's only gotten worse 😂 but now I have better words to describe these things, such as algebraic data types, which document stores can actually excel at.
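To make that concrete, here's a minimal FQL (v10) sketch of the kind of traversal I mean. The `Assembly` and `Part` collections and their fields are hypothetical, but the pattern is real: documents hold references to other documents, and projection resolves them in a single query.

```
// Fetch an assembly and traverse into its referenced parts in one query.
// `parts` is an array of references to Part documents; projecting into it
// resolves each referenced document.
let asm = Assembly.byId("401")
asm {
  name,
  revision,
  parts {
    name,
    procurement,   // e.g. "fabricated" | "purchased"
    approvalState
  }
}
```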
Another important note is that "flexible schema" doesn't need to mean "no schema". I would say it primarily means that it's easy to modify and evolve. To me, it also means supporting complex schemas through composition, inheritance, etc.
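As a rough sketch of what that looks like in Fauna's schema language (FSL), you could enforce the fields you care about while leaving the rest open. The collection and fields here are hypothetical:

```
collection Part {
  // Enforced: every Part must have these fields with these types.
  name: String
  revision: Int

  // Optional: present on some documents, absent on others.
  cost: Double?

  // Wildcard: tolerate ad hoc fields while the model evolves.
  *: Any
}
```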
Have you ever worked with a game engine like Unity or Bevy? Fauna makes me feel like I am working with an ECS in the cloud.* Mmmmm... Okay, not quite, but the ability to handle dynamic schema gets it close. Maybe a mature column-family store would be a better comparison, but Fauna as an affordable, serverless option was a game changer for me.
Regardless, the archetypal storage in Bevy's ECS is a beautiful thing. Being built in Rust, the Bevy ECS is naturally designed to schedule around systems with immutable reads and mutable writes. There's some absolutely fascinating overlap between the ECS, its dynamic storage, and database systems. I have an outline to research it further!
* UPDATE 2025-04-11
I spoke too soon about "ECS in the cloud". I just learned today that the title actually exists and belongs to SpacetimeDB. If you are interested in a transactional, relational, multiplayer/collaborative computing service, then you need to check this out!
On the "execution-transparent, procedural interface"
Okay, so let's talk about FQL. FQL was one of Fauna's greatest strengths and, at the same time, one of its greatest weaknesses.
Fauna is a pay-for-what-you-use service. So in addition to needing fine control over transactions, you want to be able to predict how a query is executed and charged. With Fauna, there is no query planner. YOU are the query planner. You tell the database exactly what data to fetch and how to do it. In some respects this is great. Do you need to implement efficient search by geo-location? Use FQL to implement Uber's H3 algorithm within indexes and on the fly! On the other hand, YOU are the query planner, and you need to know how to tell the database exactly what to do and how to do it! 😅
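To give a feel for it, here's a hedged FQL (v10) sketch of that geo-lookup pattern. The `Place` collection, its `h3` field, and the `byH3` index are all hypothetical, and the covering cell would be computed client-side (or in a user-defined function) with an H3 library:

```
// Because there is no query planner, the query names the index explicitly:
// a predictable index read instead of a scan a planner might choose for you.
// "8928308280fffff" stands in for an H3 cell covering the search area.
Place.byH3("8928308280fffff") {
  name,
  location
}
```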
As powerful as FQL is, it has been hard for a lot of folks to leverage it.
- It is a new language and comes with a learning curve.
- Folks have (clearly valid) concerns over vendor lock-in.
- It's not always clear how to approach a specific query pattern, especially with non-trivial use cases.
FQL stands out for its approach to providing powerful access to the database. If you have ever tried to perform a complex search or transaction consisting of hundreds of lines of SQL, you understand the challenges of working with SQL as an interface for complex business logic. Yet many folks found it hard to get on board the FQL train.
I wonder what an ideal approach would be. 🤔 I think anything new needs to support SQL and some semblance of query planning. There are several vendors offering enterprise-ready serverless support for PostgreSQL, and their customers seem to have the query-planning determinism problem in check. But I keep coming back to this lower-level interface and how to avoid the issues we've seen. Can drivers be provided to support using existing programming languages as the interface? Maybe not for ad-hoc queries in a transactional way, but as stored procedures? It's something I'll be thinking about.
Serverless and distributed by default
Perhaps one of the clearest things that stands out about Fauna is its protocol for consistent transactions over a distributed architecture, which was inspired by Calvin. Daniel Abadi and Matt Freels wrote an awesome article called Consistency without Clocks, which is worth checking out.
Briefly: Fauna supports strictly serializable, externally consistent transactions, without relying on physical clock synchronization to maintain consistency, or placing any constraints on replica distance. As one example, I worked with one customer running a cluster with replicas in the US West, Ireland, and Japan.
As far as I know, nothing using Calvin has gotten as big as Fauna, but a quick search shows there continues to be some chatter about Calvin in recent years (some positive, some negative). It would be foolish to claim Calvin is the answer to life, the universe, and everything, but it certainly pushed the boundaries of the state of the art. The original paper didn't have all of the answers. Fauna filled some of them in and made some of its own decisions, but there is plenty more to explore.
Now, not everyone needs data replicated across the globe! 😜 But the scalability, availability, and durability granted from being natively distributed is something I saw folks come for time and again. If serverless is, in part, about scaling down to zero, then with Fauna you didn't have to sacrifice multi-regionality to do so. This is no small deal. Many customers came to Fauna after grief they suffered trying to scale within the same platform after choosing whatever small-scale option they started with ("serverless" or not).
Separating compute from storage, or pluggability, branching and time travel
I was recently reading articles from the folks at Neon about their architectural decisions and separating compute from storage. It had me thinking a lot about what Fauna does and doesn't do to separate things in this way.
I already hit on the fact that Fauna is distributed by default. It has always separated the transaction log from the underlying storage. Yet it definitely takes more care to provision enough storage for active data than Neon's approach of putting older pages into object storage. Very interesting, that. Fauna's storage is based on a custom fork of Cassandra, and Neon has some notes about LSM storage (the structure Cassandra uses, for example):
The storage engine is inspired by and shares some similarities with Log-Structured Merge (LSM) trees... Despite the similarities, we could not find an LSM tree implementation that fit our needs directly.
What I understand of Fauna's architecture (and there are a lot of assumptions here until open-sourcing can validate them) is that the keyspaces (essentially the "databases" in Cassandra terms) are used pretty broadly for a lot of data. Your databases and collections create a kind of "scope", or ID prefix, so your various collections' data, while sorted, all sits together (or their respective partitions do, anyway) in the same Sorted String Tables. The temporal history is in there too. This is a tradeoff between managing many more files on disk and managing fewer, more complex ones.
Neon also highlights its decision's impact on temporal features. Neon provides branching and time travel through its use of neatly ordered pages, whereas the temporal information in Fauna is deeply integrated with the active data. Regarding time travel, Fauna made it easy (and still efficient) to query at snapshots in time. Point-in-time recovery is also technically possible with Fauna, but... it's painful to implement as a user.
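As an illustration, a snapshot read in classic FQL (v4) was a one-liner; this sketch assumes a hypothetical `scores` collection with a known document ID:

```
// Classic FQL (v4): read a document as it existed at a point in time.
At(
  Time("2024-11-05T00:00:00Z"),
  Get(Ref(Collection("scores"), "1234"))
)
```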
We know that branching and time travel are useful and important tools for a lot of folks. I think any new service needs to consider these in its architectural decisions.
One takeaway from this is that careful, intentional separation of concerns can enable the pluggability of different systems. PostgreSQL is awesome for that when you look at the success of something like pgvector, but it's also clearly lacking when you consider storage.
Remember when I talked about vectors and AI? You can't miss the value of being flexible enough to adapt and integrate additional solutions like vector storage, indexing and search alongside everything else.
I have a habit of trying to abstract systems to an extreme and build for every possible outcome. It's very useful when troubleshooting and developing tests, but it's problematic at times. I had a great conversation with a friend of mine about building more narrow-purpose applications for very specific audiences. I was picking at an idea around crowdfunding when they pointed me to https://numarket.co/ as an example. There is nothing about the tech that means the app must stay limited to food businesses, but there is something important about working with a limited scope that lets you focus and solve the problem. So, while I can't help but think about an omni-super-perfect operational database, I do realize not every solution needs to solve every problem. Maybe I'll get over it. Maybe not 😮💨 I'll continue to talk about this like maybe we'll find a perfect solution, with the understanding that anything in reality will be a mixed bag of compromises.
Authentication
Fauna provides a built-in Credentials collection with special functionality that automatically hashes passwords and generates authentication tokens. Fauna is not opinionated about what a "user" is, which grants developers a lot of flexibility. As always with new and more complex methods, I saw folks struggle with this at times, especially when getting started. The tradeoff between giving developers more powerful tools and steepening the learning curve is difficult to land, and it's something for any software-as-a-service to consider carefully.
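For a feel of the flow, here's a minimal sketch in classic FQL (v4), assuming a hypothetical `users` collection and a `users_by_email` index:

```
// Create a user; Fauna hashes the password, and it can never be read back.
Create(Collection("users"), {
  data: { email: "alice@example.com" },
  credentials: { password: "correct horse battery staple" }
})

// Exchange the password for a token scoped to that user document.
// The returned `secret` is what the client uses for subsequent queries.
Login(
  Match(Index("users_by_email"), "alice@example.com"),
  { password: "correct horse battery staple" }
)
```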
Attribute-based Access Controls (ABAC)
ABAC controls access to specific documents by evaluating some logic on the caller and the document first. The combination of flexible auth tokens and FQL enables incredible control over access rules. You can define what is essentially middleware in FQL that runs on each query to determine if a caller has permissions to a specific document.
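Here's a sketch of that middleware feel in classic FQL (v4): a role whose read privilege on a hypothetical `orders` collection is a predicate that compares the caller's identity against a field on each document:

```
CreateRole({
  name: "customer",
  membership: [{ resource: Collection("users") }],
  privileges: [{
    resource: Collection("orders"),
    actions: {
      // Evaluated per document: allow the read only when the calling
      // user is the owner recorded on the order.
      read: Query(Lambda("ref",
        Equals(
          CurrentIdentity(),
          Select(["data", "owner"], Get(Var("ref")))
        )
      ))
    }
  }]
})
```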
PostgreSQL calls this Row-Level Security (RLS).
This goes one step further than the role-based access controls that some databases, like MongoDB or MySQL, are limited to.
I would like to see more control over access expressed in a language that's easy to write, understand, and maintain; complex PostgreSQL policies written in SQL can quickly get verbose and hard to follow.
Column-level security, or attribute masking
Attribute masking, where some fields are hidden from users without required privileges, is something Fauna does not have. I heard a lot of requests for it, and it is available in a wide range of other databases, either natively (as with MongoDB's Field-level redaction) or with extensions (as with Anonymizer for PostgreSQL). Workarounds typically included adding more user-defined functions (AKA stored procedures) to strip away values or to split data across collections with different privileges. I think these are reasonable, but having this baked into the auth system is delightful.
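For example, the UDF workaround might look something like this sketch in FSL/FQL v10, with a hypothetical `User` collection: roles are granted permission to call the function but no direct read access to the collection, so sensitive fields never leave the database.

```
// A user-defined function as the only sanctioned read path.
function publicProfile(id) {
  // Project only the non-sensitive fields; email, address, etc. are
  // stripped before anything is returned to the caller.
  User.byId(id) { name, avatarUrl }
}
```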
JSON Web Tokens (JWT)
Fauna added support for JWTs which greatly expanded the ways in which applications could connect. As I understand it, this works kind of like a middleware that verifies the signature on the JWT and hands off a special kind of temporary Token to the query. I can't say how deeply integrated this needed to be in the database itself, but it highlights another example of where built-in extensibility can be key to maintaining and evolving the product.
Open source, or lack thereof
I wasn't there way back in the day, so I can't comment on the call to not open-source Fauna sooner. As someone who interfaced with a lot of customers and potential customers, though, I observed Fauna's closed-source nature as a real barrier to community growth. As Fauna said in the announcement, "Driving broad based adoption of a new operational database that runs as a service globally is very capital intensive." No kidding! Enterprise Sales isn't cheap in its own right, but it was necessary given poor bottom-up adoption. Truly hitting virality with product-led growth was hard for Fauna. Fauna was new, had a new language, a learning curve, and lock-in to a closed-source system.
I can't say that being open-source would have saved Fauna. Probably not 😅. But I will argue that it would have reduced friction for a lot of folks. It probably would have opened up the possibility for faster development on features that were low priority for the company but very high priority for select customers, like geo-data or vectors. It definitely would have made life easier for folks leaving the hosted service, now that it is closing. Was it the right choice to make in the moment? 🤷
Conclusion
There's so much to consider and talk about, but I tried to hit on some of the bigger thoughts I have on what I want a modern database to deliver. It's clear to me that, at some point, SQL as an interface just doesn't cut it, but it's still an open question in my mind how that should be resolved. What I think we need is to focus on architectures that are built inherently for distributed systems, with well-defined interfaces between components to enable scalability and extensibility, including things like integration with AI and other ML applications. Doing so will put a database in a good posture to join the latest hype trains and stay ahead, providing folks with valuable features. And open-source, I think, is critical to establishing trust and, in turn, adoption. Community insight and contributions shouldn't be undervalued.