3 posts tagged with "database"

· 10 min read
Himank Chaudhary
Yevgeniy Firsov

Building a new database

The most complicated and time-consuming parts of building a new database system are usually the edge cases and low-level details. Concurrency control, consistency, handling faults, load balancing, that kind of thing. Almost every mature storage system will have to grapple with all of these problems at one point or another. For example, at a high level, load balancing hot partitions across brokers in Kafka is not that different from load balancing hot shards in MongoDB, but each system ends up re-implementing a custom load-balancing solution instead of focusing on its differentiating value to end-developers.

This is one of the most confusing aspects of the modern data infrastructure industry: why does every new system have to completely rebuild (not even reinvent!) the wheel? Most of them reimplement common processes and components without adding substantial value by doing so. For instance, many database builders start from scratch when building their own storage and query systems, but often merely emulate existing solutions. These components usually take a massive effort just to get basic features working, let alone working correctly.

Take the OLTP database industry as an example. Ensuring that transactions always execute with "linearizable" or "serializable" guarantees is just one of the dozens of incredibly challenging problems that must be solved before even the most basic correctness guarantees can be provided. Other challenges that have to be solved include fault handling, load shedding, QoS, durability, and load balancing. The list goes on and on, and every new system has to at least make a reasonable attempt at all of them! Some vendors rode the NoSQL hype wave to avoid providing meaningful guarantees, in many cases not even linearizability, but we think those days are long gone.

Vendors are spending so much time rebuilding existing solutions that they end up not solving the actual end users’ problems, although ostensibly that's why they decided to create a new data platform in the first place! Instead of reinventing the wheel, we think database vendors should focus on solving the user’s actual pain points, like the "database sprawl" problems we covered in our previous blog post: "Why do companies need so many different database technologies".

The good news is there is a growing trend in the industry to reuse great open source software as a building block for higher level data infrastructure. For example, some vendors realized that a local key-value store like RocksDB could be used to deliver many features and enforce an abstraction boundary between the higher level database system and storage. They still hide RocksDB behind a custom API, but get to leverage all the development and testing effort it has received over the years. This can result in new systems that are both robust (durable, correct) and performant at the single-node level in a very short amount of time.

That said, even if all these storage systems magically used RocksDB under the hood, the problem of a complicated data infrastructure with many different components and data pipelines wouldn’t be solved, because there would still be no common domain in which to perform transactions across them. After all, RocksDB is only a local key-value store.

While a local key-value store is a great abstraction for separating the storage engine from the higher-level system, Google took this a step further with Spanner, which is a SQL database built on a distributed key-value store. CockroachDB is another example of a SQL database built on top of a distributed key-value store. The distributed key-value store interface shifts the hard problems of consistency, fault tolerance, and scalability down into the storage layer and allows database developers to focus on building value-added features database users actually need.

Why Tigris Leverages FoundationDB to Replace Multiple Database Systems

Until a few years ago, the open-source community did not have any viable options for a stand-alone, production-ready distributed key-value store with interactive transactions. FoundationDB, used by Apple, Snowflake, and others, is such a system. It provides the same consistency and isolation guarantees as Spanner (strict serializability) and has an amazing correctness story through simulation testing. FoundationDB exposes a key-value API, similar to RocksDB, but it automatically scales horizontally to support petabytes of data and millions of operations per second in a single deployment on a modest amount of commodity hardware.

Tigris uses FoundationDB’s transactional key-value interface as its underlying storage engine. This separation of compute from storage allows Tigris to focus its attention on higher level concerns, instead of lower level ones. For example, we’d prefer to spend our time figuring out how to incorporate an OLTP database, search indexing system, job queue, and event bus into a single transactional domain instead of spending our time building and testing a custom solution for storing petabytes of data in a durable and replicated manner. In other words, we leverage FoundationDB to handle the hard problems of durability, replication, sharding, transaction isolation, and load balancing so we can focus on higher level concerns.

Building a new data platform on top of an existing distributed transactional key value interface like FoundationDB may not be the most efficient approach, but it heavily skews the implementation towards "correct and reliable by default", even in the face of extreme edge cases. Many existing distributed databases still struggle to pass the most basic aspects of Jepsen testing even after over a decade of development. At Tigris, we value correctness, reliability, and user experience over raw performance. The generic abstraction of a transactional, sorted key-value store is flexible enough to support many kinds of access patterns, as well as enable rapid feature development. For example, a secondary index is just another key-value pair pointing from the secondary index key to the primary index key. Searching for records via this secondary index is as simple as a range scan on the key-value store and doing point reads on the primary index to retrieve the records. Ensuring the secondary index is always consistent only requires that the secondary index key is written in the same FoundationDB transaction in which the primary record is updated.
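
To make the secondary index idea concrete, here is a minimal Python sketch. It is not Tigris or FoundationDB code: `SortedKV` is a hypothetical in-memory stand-in for a sorted key-value store, and the `users` collection, the `city` field, and the `put_user` and `find_by_city` helpers are names invented for illustration. The toy store has no real transactions, so the comments mark where a single transaction would wrap both writes.

```python
import bisect


class SortedKV:
    """Toy in-memory sorted key-value store (hypothetical stand-in, no real transactions)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._data = {}   # key -> value

    def set(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        if key in self._data:
            del self._data[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def range(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._data[key]


def put_user(kv, user_id, doc):
    """Write the primary record and its index entry together.

    In a real system both writes would share one transaction, so a reader
    never sees the record without its index entry or the reverse.
    """
    primary_key = ("users", "pk", user_id)
    old = kv.get(primary_key)
    if old is not None:
        kv.delete(("users", "idx_city", old["city"], user_id))  # drop stale entry
    kv.set(primary_key, doc)
    # The index entry points from the secondary key to the primary key.
    kv.set(("users", "idx_city", doc["city"], user_id), primary_key)


def find_by_city(kv, city):
    """Range scan over the index, then one point read per matching primary key."""
    start = ("users", "idx_city", city)
    end = ("users", "idx_city", city + "\x00")  # smallest string greater than city
    return [kv.get(primary_key) for _, primary_key in kv.range(start, end)]


kv = SortedKV()
put_user(kv, 1, {"name": "Ada", "city": "Paris"})
put_user(kv, 2, {"name": "Bob", "city": "Oslo"})
put_user(kv, 3, {"name": "Cy", "city": "Paris"})
print([user["name"] for user in find_by_city(kv, "Paris")])
```

Because the index entry and the record are always written in the same step, moving a user to another city never leaves a dangling index key behind.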

Our goal is to provide a cohesive API for collections and streams that all work together. We use FoundationDB’s transactions to ensure the invariants of these data structures remain intact regardless of the different failure modes your cluster might encounter.

  • Tigris encodes index keys using FoundationDB’s key encoder. It accepts high level data structures like arrays, strings, and numbers and encodes them into a lexicographically sortable byte string. Encoded keys serve as a common language that different components inside Tigris use to communicate.
  • Tigris Streams are powered by FoundationDB’s support for "baking" a transaction’s commit timestamp into a key, which allows reading all collection mutations in the total order of transactions committed into the database.
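
The two ideas in this list can be sketched in a few lines of Python. This is a simplified illustration, not FoundationDB's actual key encoder or wire format: `encode_key` is a hypothetical encoder that handles only strings and non-negative 64-bit integers, and `ChangeLog` uses a plain counter as a stand-in for the commit timestamp that the database would assign.

```python
import struct


def encode_key(*parts):
    """Encode strings and ints (0 <= n < 2**64) into an order-preserving byte string."""
    out = bytearray()
    for part in parts:
        if isinstance(part, int):
            # Fixed-width big-endian bytes sort in the same order as the numbers.
            out += b"\x01" + struct.pack(">Q", part)
        elif isinstance(part, str):
            # Escape embedded NUL bytes, then end with a NUL so that a shorter
            # string sorts before any longer string it is a prefix of.
            raw = part.encode("utf-8").replace(b"\x00", b"\x00\xff")
            out += b"\x02" + raw + b"\x00"
        else:
            raise TypeError(f"unsupported key part: {part!r}")
    return bytes(out)


class ChangeLog:
    """Toy change stream whose keys embed a commit version (hypothetical)."""

    def __init__(self):
        self._entries = {}       # encoded key -> mutation
        self._next_version = 1   # stand-in for the database-assigned commit timestamp

    def commit(self, collection, mutations):
        """Record all mutations of one transaction under a single commit version."""
        version = self._next_version
        self._next_version += 1
        for seq, mutation in enumerate(mutations):
            key = encode_key("stream", collection, version, seq)
            self._entries[key] = mutation
        return version

    def read(self, collection):
        """Return the collection's mutations in commit order via sorted keys."""
        prefix = encode_key("stream", collection)
        return [self._entries[k] for k in sorted(self._entries) if k.startswith(prefix)]


log = ChangeLog()
log.commit("orders", ["insert A", "insert B"])
log.commit("users", ["insert U"])
log.commit("orders", ["update A"])
print(log.read("orders"))
```

Sorting the encoded bytes is enough to recover commit order, which is why a plain range read over such keys can act as an ordered change stream.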

FoundationDB’s powerful primitives allow us to focus on the high-level API and experience of using Tigris, so you can build an application with rich functionality like search, background jobs, data warehouse sync, and more without having to grapple with a unique solution for each, or figure out how to keep them all in sync with each other. We’ve built a cohesive set of tools for data management within a modern application. And if we don’t support exactly what you need today, you can use the Streams API to ensure correct data sync to any other system.

While low level concerns like durability and data replication are not where we want to spend our engineering time, they are still extremely important! So why do we feel comfortable trusting FoundationDB with our most critical data? In a nutshell, it is because while perhaps lesser known, FoundationDB is one of the most well tested and battle hardened databases in the world.

FoundationDB’s Correctness and Fault Tolerance

FoundationDB has more comprehensive testing than almost any other database system on the market. It relies on simulation testing, an approach where an entire FoundationDB cluster can be deterministically simulated in a single operating system process, alongside a wide variety of test workloads which exercise the database and assert on various properties. Simulation testing allows the test runtime to speed up time much faster than wall clock time by skipping over all the boring moments where code is waiting for a timeout to happen, or a server to come back up after being rebooted. This means many more "test hours" can pass for each real hour dedicated to running tests, and these tests can be run in parallel across many servers to explore the state of possible failure modes even further.
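
The core trick behind simulation testing, a virtual clock combined with a single seeded source of randomness, fits in a short Python sketch. This is a toy illustration of the general technique, not FoundationDB's simulator; the `Simulator` class, the `run_scenario` helper, and the crash and reboot timings are all made up for the example.

```python
import heapq
import random


class Simulator:
    """Single-threaded event loop driven by a virtual clock."""

    def __init__(self, seed):
        self.now = 0.0
        self.rng = random.Random(seed)   # the only source of randomness
        self._queue = []                 # entries are (fire_time, sequence, callback)
        self._seq = 0                    # tie-breaker keeps ordering deterministic

    def call_later(self, delay, callback):
        heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self, until):
        # Jump straight to the next event instead of waiting in real time.
        while self._queue and self._queue[0][0] <= until:
            self.now, _, callback = heapq.heappop(self._queue)
            callback()


def run_scenario(seed, hours=24):
    """Simulate a server that crashes at random and a once-a-minute heartbeat check."""
    sim = Simulator(seed)
    state = {"up": True}
    trace = []

    def crash():
        state["up"] = False
        trace.append((round(sim.now, 3), "crash"))
        sim.call_later(sim.rng.uniform(30, 300), reboot)       # downtime in seconds

    def reboot():
        state["up"] = True
        trace.append((round(sim.now, 3), "reboot"))
        sim.call_later(sim.rng.expovariate(1 / 3600), crash)   # about one crash per hour

    def heartbeat():
        if not state["up"]:
            trace.append((round(sim.now, 3), "missed heartbeat"))
        sim.call_later(60, heartbeat)

    sim.call_later(sim.rng.expovariate(1 / 3600), crash)
    sim.call_later(60, heartbeat)
    sim.run(until=hours * 3600)
    return trace


first = run_scenario(seed=42)
second = run_scenario(seed=42)
print(first == second)
```

A full simulated day runs without any real waiting, because the loop jumps from event to event. Re-running with the same seed replays exactly the same sequence of failures, which is what makes a failing run reproducible.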

There are very few databases which even come close to this level of testing. Those that do typically only simulate a small portion of their system, leaving many components untested by simulation and only relying on unit tests. FoundationDB even simulates the backup-restore process to ensure the operational tools are just as good as the database itself. After all, those kinds of tools are often just as important! No database is perfect, but simulation catches problems before they make it into a user facing release.

FoundationDB also receives extensive real-world testing at Apple and Snowflake. They beta test releases long before those releases reach the community, to ensure simulation results match the reality of running in chaotic cloud environments. FoundationDB also provides Tigris with extremely powerful workload management features. The database constantly evaluates the load on the cluster to determine when it is "too busy", and when that happens it artificially delays the start of new transactions until the load is stable again. By forcing all the latency to the beginning of your transaction, FoundationDB ensures every operation after that experiences a consistent, low latency. Many other database systems lack workload management features entirely, which manifests as overloaded clusters that require operators to shut down all the clients to get them back under control.
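
The idea of pushing all the waiting to the start of a transaction can be sketched as a small admission controller in Python. This is a simplified illustration and not FoundationDB's actual algorithm: the `AdmissionController` class, its back-off rule, and the load figures are assumptions made for the example, and time is passed in explicitly so the behavior is easy to follow.

```python
class AdmissionController:
    """Delays transaction starts so that admitted work matches capacity (hypothetical)."""

    def __init__(self, max_tps):
        self.max_tps = float(max_tps)
        self.allowed_tps = float(max_tps)
        self._next_start = 0.0   # earliest time the next transaction may begin

    def report_load(self, load):
        """Adjust the admitted rate; load is the fraction of capacity in use."""
        if load > 1.0:
            # Overloaded: cut the admitted rate in proportion to the overload.
            self.allowed_tps = max(1.0, self.allowed_tps / load)
        else:
            # Healthy: recover gradually, never exceeding the configured maximum.
            self.allowed_tps = min(self.max_tps, self.allowed_tps * 1.1)

    def begin(self, now):
        """Return the time at which a transaction requested at `now` may start."""
        start = max(now, self._next_start)
        self._next_start = start + 1.0 / self.allowed_tps
        return start


ctl = AdmissionController(max_tps=4)
normal = [ctl.begin(0.0) for _ in range(3)]      # spaced 0.25 s apart
ctl.report_load(2.0)                             # cluster reports twice its capacity
throttled = [ctl.begin(0.0) for _ in range(2)]   # now spaced 0.5 s apart
print(normal, throttled)
```

Once `begin` returns, the transaction runs without further throttling, so any queueing delay is paid once, up front, rather than spread across every read and write.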

We previously mentioned that backup-restore functionality is tested in simulation. That’s true of other features like disaster recovery, logging, and metrics too. These built-in tools are a core part of FoundationDB’s developer experience. Many other database systems force you to use a third-party backup tool once you cross a small scale due to table locking or other concurrency problems. FoundationDB’s backups and disaster recovery system are both streaming (so your recovery point objective can be measured in seconds instead of hours) and they work for large, busy databases, not just the toy sized ones.

Tigris Operations and Reliability

Tigris is built following a microservice architecture approach. It is deployed as a layer in front of FoundationDB. This allows us to have separate components that can each be scaled independently. For example, separating storage from compute ensures a better distribution of resources. Some deployments of Tigris will store a small amount of data that requires tons of CPU for processing queries, while others will store large amounts of data and use a tiny amount of CPU. You can right-size your Tigris and FoundationDB clusters separately so that valuable CPU and memory resources are never again stranded on a big database instance which mostly sits idle because of high storage utilization.

Tigris expands on FoundationDB’s integrated workload management features to provide more fine-grained workload management down to the individual collection level.

FoundationDB provides Tigris with a strong foundation for building a high quality data platform. By leaving the low level details to a battle-tested system, Tigris can focus on building a cohesive, flexible set of tools that enable application developers to take an idea to production without stepping into the sinkhole of data pipelines, broken sync jobs, and complicated concurrency bugs present in many modern application architectures.


Tigris is the all-in-one open source developer data platform. Use it as a scalable transactional document store. Perform real-time search across your data stores automatically. Build event-driven apps with real-time event streaming. All provided to you through a unified serverless API enabling you to focus on building applications and stop worrying about the data infrastructure.

· 7 min read
Ovais Tariq

In our inaugural blog post Hello world, we talked about the problem of data infrastructure sprawl that, over the years, has complicated modern application development, while putting a lot of burden on the operations team who must manage, maintain and secure many different databases and technologies.

In this blog post, we’ll explain what we mean when we say "data infrastructure sprawl" by walking through a typical example, and then we’ll explain why it doesn’t have to be this way.

The standard evolution story of a modern application

Let's start by taking a look at what an application looks like at the beginning of the journey. The application is simple: it talks to a database, typically a traditional OLTP database such as Postgres or MySQL.

Application and Database

To scale the application further and increase reliability, some of the application logic is moved to background processing. This requires introducing a message queue such as RabbitMQ or ActiveMQ. Now the application architecture looks something like this:

Application, Database and Message Queue

Over time, the application grows in popularity and more features are added. The development team starts to feel the pain of working with a relational database. They want to be able to scale reads and writes independently, so they introduce read replicas and update the application logic to split the read and write requests. This exposes the development team to infrastructure-related concerns. At the same time, it also increases the application complexity as developers now need to decide when it’s safe to read from the read replicas and when they need to query the primaries instead.

Application, Database with replicas, Message Queue

At some point, the business analysts may need to analyze the data to guide business decisions. They could do it for a while using the primary database, but eventually these requests will have to be isolated from production. One easy solution that companies adopt involves running a nightly job that exports the data out of production into a data warehouse like Redshift or BigQuery.

Application, Database with replicas, Message Queue, Data Warehouse

As the business continues to grow, so does the complexity of the application infrastructure required to support it. The developers continue to expand the application's functionality by adding full text search capabilities. They decide that introducing Elasticsearch or Solr would be best since the primary OLTP database does not have a capable search engine. The search functionality is introduced via a new microservice so that the team can iterate quickly and experiment with new technology without disrupting the existing business. The new search microservice needs to know when certain operations are performed on the primary OLTP database, so a message bus, such as Kafka, is introduced as well. Now the application architecture looks something like this:

Application, Database with replicas, Message Queue, Data Warehouse, Search

Does this look familiar? The application isn’t doing anything crazy from a technological perspective. All the team has built is a backend that can:

  1. Power CRUD operations.
  2. Provide search functionality.
  3. Expose key business data to the analysts in a data warehouse.
  4. Allow experimentation and quick iteration in a “safe” manner that doesn’t put the entire business in jeopardy.

The backend though has become a complex distributed system with many different moving components. Each component must be deployed, configured, secured, monitored, and maintained. The team has to think about things like data replication, data consistency, availability, and scalability for each of the individual components as well.

Over time, the team starts spending less time building the functionality required to grow the business, and more time managing incidental infrastructure.

Fundamental reasons for the data infrastructure sprawl

This type of “data infrastructure sprawl” is the norm in the industry today, and justifiably so. At every point in the story above, engineering leadership made well-founded and logical architectural decisions, but the end result was high amounts of incidental complexity that made even straightforward feature development time-consuming. But at Tigris, we don’t think it has to be this complicated. Users are forced into this situation because the open source marketplace is filled with rich and powerful building blocks, but it’s lacking in holistic solutions.

Most open source databases today take a narrow view of the world:

  • They have rich query functionality, but they can’t scale beyond a single node. Or, they can scale horizontally, but push all the hard correctness and consistency problems to the application developers.
  • They can function as your primary data store, or they support rich full text search functionality, but rarely both.
  • Multi-tenancy and isolation between workloads are an afterthought or not present at all. Applications can’t be logically isolated while sharing the same underlying “database platform” in a safe manner. Often the database itself is “unsafe by default” and won’t provide so much as a warning before happily performing a full table scan every time someone navigates to your home page.

What should a modern database look like?

We believe that “Correctness, reliability, and user experience over raw performance” is a good guiding principle that, if followed, would lead to the development of a modern database platform that could keep the data infrastructure sprawl at bay.

This of course sounds great on paper, but prioritizing user experience over micro-benchmarks is not how most modern database companies try to differentiate themselves.

What most database comparisons look like today

This may seem obvious, but building a holistic database platform that can keep data infrastructure sprawl at bay means that the developers of this new system must take a principled stand to always do right by the user instead of the benchmarks. This is a very difficult thing to do in the current competitive climate, which is dominated by benchmark-obsessed marketing material. The recent controversy between Snowflake and Databricks is a great example of this. It’s much easier to quantify rows/s/core than it is to quantify (a) sleepless nights, (b) apologies issued to your customers, and (c) developer productivity.

What most database comparisons should look like

So what would a database that helps customers maximize business value instead of micro-benchmarks look like? At Tigris, it boils down to a system with the following characteristics:

  1. A flexible document model that enables developers to model data in a way that best suits their applications’ needs. The data model should also be easy to evolve, and schema changes should be as simple and painless as regular feature development.
  2. Simple and intuitive APIs that allow developers to quickly insert and retrieve data while continuing to use their preferred programming language - no new database query language to learn.
  3. Strictly serializable isolation by default. This ensures that developers never need to think about how transactions work or what their limitations are, nor do they need to configure things like read and write concerns or transaction isolation levels.
  4. “Distributed by default” with no painful transition point when the needs of the application begin to exceed the capabilities of a single node. Sharding should also be transparent, and the system should ensure that the database scales seamlessly as the application traffic increases, while core operations such as transactions do not have any restrictions in terms of the records or shards that can participate.
  5. Multi-tenant by default so that learning how to deploy, operate, and secure a single database technology is enough for the majority of the applications.
  6. An integrated search engine that provides developers with real-time search functionality eliminating the need to run a separate search platform and synchronize data.
  7. Built-in low latency replication to cloud data lakes for OLAP workloads that eliminates the need for developers to configure, track, and maintain separate ETL processes.

This is exactly what we’re building with Tigris! Stay tuned for future posts where we’ll discuss in detail how we’re leveraging open source technology like FoundationDB to build a rock solid and intuitive database platform in record time. We’re also hiring!

· 3 min read
Ovais Tariq
Himank Chaudhary
Yevgeniy Firsov

We're excited to announce the launch of Tigris Data, a company on a mission to simplify data management for developers.

Over the years, data has become increasingly complex and difficult to manage. Developers' lives have become much harder, due in large part to all the different technologies, data models, APIs, and databases they're expected to put together to build modern applications.

The database sprawl also puts a lot of pressure on operations teams, who must manage, maintain and secure these different databases and technologies. Then there is the onerous task of operationalizing these databases across multiple different cloud platforms.

Complexity to Simplicity with Tigris

We have personally experienced the pain and financial cost of this complexity. We want to make it easier for you to work with data by providing a unified database that helps you quickly and easily structure the data in any way the application requires, with one consistent experience across a broad set of workloads.

Furthermore, we are building the database to be open source, so you can avoid vendor lock-in and have the transparency to be sure that the product will continue to evolve and improve over time. Additionally, it enables anyone to contribute improvements and extensions to the codebase. All in all, being open source makes the product more accessible, robust, and secure for everyone.

We're grateful to have assembled an amazing team that is passionate about simplifying data management for developers. Our team has a wealth of experience in data management, distributed systems, and databases. We're excited to bring our knowledge and experience to bear on the problem of data management.

We're also proud to announce that in December of last year, we successfully raised a seed round led by Basis Set with additional funding from General Catalyst and many individual angels including founders at technology unicorns.

Now we're well on our way to a preview of the Tigris database later this year!

If you're interested in learning more about Tigris Data or staying up-to-date on our progress, subscribe to our mailing list or follow us on Twitter.

The founders

Before starting Tigris Data, the founders, Ovais, Himank, and Yevgeniy, spent almost six years working closely together at Uber. They developed and operated Uber's storage infrastructure, leading projects like Docstore, Herb, and DBEvents.

Their experiences have taught them many important lessons: the importance of making architectural choices that can scale, establishing efficiency as a core principle, hiring the right talent, and cultivating a culture that supports diversity, innovation, and growth.