
· 6 min read
Ovais Tariq

Data is the lifeblood of modern applications, powering rich and diverse user experiences. We rely on these applications in our daily lives for everything from managing logistics, finances, and shopping to the most mundane and basic of tasks.

However, as data-dependent as modern applications have become, the supporting data infrastructure hasn't evolved to match these rich data needs. Today's applications still rely on "databases" in the classical sense, a concept designed in the 1970s and 1980s.

How we got here

Over the past five years, we have seen not only the exponential growth of new applications being launched but also the diversification of tools and infrastructure components needed to support those applications. Most new applications are built using a microservice architecture.

Each of these microservices often has its own disparate database components, frequently more than one. This variance is usually driven by each service's requirements around data structure, model, and access patterns. The need to support multiple data models and access patterns has made a specific toolset or component per use case the norm. There has been a gold rush among infrastructure companies to stake a place in this new landscape, creating new tools and resulting in increasingly complex architectures. The tools being released generally focus on specific use cases or components instead of solving the larger problem.

Looking at the current CNCF landscape, you can see how evident the variety and sprawl are:

CNCF landscape - databases and streaming

This complex map shows the hundreds of different infrastructure components that can be bundled together, most of them focused on one specific niche task. We often need many of these components to build a robust modern application because the core infrastructure components (like the database) were never designed to meet today's needs. We are now in a time when DIY builders stitch together a Frankenstein's monster to support their applications.

The impact of increased complexity

While purpose-built tooling and components have some benefits, the downside is that the environments in which we currently build applications are getting larger and more complex.

MongoDB recently surveyed 2,000 IT professionals; over 50% described their data architecture as "Complex".

Percona (an open source database vendor) ran a similar survey and found that the number of organizations supporting thousands of databases in production more than doubled in the last year. That same survey found that over 90% of respondents run multiple databases, with most running five or more.

This complexity has not only grown the footprint of companies' data infrastructure; it has also forced developers to learn and maintain the skills to build and support dozens of disparate components, each with its own APIs, characteristics, access methods, operations, and best practices.

And time is a resource developers are lacking!

A 2022 study by Reveal.io showed that over 40% of developers are challenged not only by the high demands placed on them but by the constant raising of those demands.

Stack Overflow's 2022 survey showed that a quarter of all developers spend more than 25% of their time every week looking for solutions to problems. No wonder that same study showed nearly 53% of developers are looking at low-code or no-code solutions to help reduce the time to release applications.

The more complex the environment, the more time is spent maintaining, troubleshooting, learning, and getting components to play nicely together. Much of the complexity of the infrastructure comes from how data is stored and accessed using legacy database management systems.

Databases are dead

Databases are legacy

One of the main reasons for the explosion in data infrastructure tools is that databases were never designed for the rich use cases applications have today. Instead of rethinking the developer experience around data and widening the functions of the database, we added more bolt-on tools.

Our idea of how to store and interact with data in applications dates back to the 1970s. While databases have evolved in stability, security, and scalability, the core functionality remains the same. At its core, every database shares these common tenets:

  • Stores data in a self-describing universal format
  • Allows you to query data with a specific language
  • Separates the data from the application, allowing many applications to connect and use the same data source
  • Handles the persistence of the data
  • Secures data in some manner

By definition, database management systems are designed to separate the application and data, focusing on interacting with data in a narrow way. This is archaic thinking in today's modern environments.

Today, applications and data are more connected than ever before. The application should dictate the best way to store and retrieve data, not the other way around.

The Developer Data Platform

If we were to go back and redesign the core systems we use to store, access, and use data with what we know now, we would build a completely different experience. Instead of adding more disparate components and bolt-ons, we need a new solution designed for modern requirements. We need a Data Platform purpose-built for Developers with the following characteristics:

  1. Has the capabilities to meet the demands of a multi-model application.
  2. Data stored is optimized for application access patterns, not the application designed to adhere to the database's preferred patterns.
  3. All data is queryable and searchable directly from common application frameworks (the tools, libraries, and packages developers use daily) and APIs without learning a new language.
  4. Data is sharable in common formats for end users, applications, APIs, and services in real time.
  5. Infrastructureless from the developers' perspective: indexing, sharding, HA, recovery, and standard database administration operations are handled automatically.
  6. Built with modern cloud-native architecture with independently scalable components.
  7. Provides control of their data to the users in a secure way that complies with laws and regulations.

Many database companies like to say they are taking a developer-first approach to designing their products, but most of the time, they remain stuck in the past.

It is time to start building the data platforms that conform to how developers think, code, and create.

This is why we are building Tigris. We hope other vendors will evolve as well to better support a modern developer's needs.


Tigris is the all-in-one open source developer data platform. Use it as a scalable transactional document store. Perform real-time search across your data stores automatically. Build event-driven apps with real-time event streaming. All provided to you through a unified serverless API enabling you to focus on building applications and stop worrying about the data infrastructure.

Sign up for the beta

Get early access and try out Tigris for your next application. You can also follow the documentation to learn more.

· 11 min read
Anshuman Bhardwaj

The client-server model is one of the most used patterns in web development. In its simplest form, the client-server model can be described as a resource seeker (client) requesting the resource from a computer (server) serving it.

Advancements in web technologies have intensified the need for real-time client-server interactions. Protocols like WebSocket solve many of the problems in the older client-server model. WebSocket provides convenient bidirectional communication between the client and server while allowing messages to be broadcast among a variety of clients. But its flexible approach encourages bad practices among developers, such as not setting up an API contract for request/response or overusing WebSockets where HTTP would do fine. This overuse has diluted the protocol's strengths.

HTTP/2—the successor of the HTTP protocol—provides advantages over its predecessor, such as multiplexing and server push. Although it's a significant improvement over HTTP/1, HTTP/2 is not a replacement for WebSockets. So what is the future of real-time client-server interactions?

In this article, you'll learn more about real-time client-server interactions as well as the HTTP/2 and WebSocket protocols and their use cases. If you're planning to build real-time interactions, you'll be able to choose the right tools for your application needs.

Why Do You Need Real-Time Client-Server Interactions?

Real-time client-server interactions have become vital for many modern applications. This means it's crucial for any developer who's looking to develop fast and resilient applications to understand the inner workings of real-time communication.

Below are some popular use cases for real-time client-server interactions:

  • Modern-day chat applications like WhatsApp are an implementation of real-time client-server interactions, hence the name instant messaging. Such real-time communication can also be implemented using solutions like Tigris, which provide you with an event stream to listen to the latest messages published.
  • Multiplayer online games use real-time client-server interactions to broadcast different players' locations and maintain the leaderboard. Low latency is crucial for these games because the lower the latency, the closer it is to real time.
  • Stockbroker platforms rely on real-time client-server interactions to broadcast ever-changing stock prices. Real-time interactions ensure that buying and selling prices are as accurate as possible so users avoid losses due to system delays.

What's the Future of Real-Time Client-Server Interactions?

Both HTTP/2 and WebSockets are potential solutions for your real-time communication needs, but each protocol comes with benefits and drawbacks.

How Does the HTTP/2 Protocol Work?

HTTP/2 is an advanced version of HTTP. It follows the same semantics and external format (headers, methods, and response codes), but its internal implementation offers performance enhancements.

HTTP/2 Single TCP

HTTP/2 follows the same request and response model as its predecessor, but unlike HTTP/1.x, HTTP/2 uses a single TCP connection per origin and supports multiplexing for requests and responses. This means that there can be more than one request/response in flight on the same TCP connection, utilizing the bandwidth to its full potential.

This amazing efficiency of HTTP/2 is made possible through two important features: binary encoding for transmission and splitting requests/responses into multiple frames.

  • HTTP/1.x used plain-text communication, which was easier for humans to understand but imposed unnecessary overhead on computer systems. HTTP/2, on the other hand, encodes messages and frames into a binary format and decodes them on the client/server. This encoding affects only the transmission of the messages, not how they appear on the sending or receiving side, which means the client and server still follow the same HTTP semantics, like methods and headers.
  • The binary framing splits the messages into smaller units called frames, which carry portions of the message like headers or data along with the ID of the stream to which they belong. Frames from different streams can be interleaved in any order during transmission and are reassembled into the correct structure at the receiving end by the application protocol layer. Frames allow HTTP/2 to send multiple requests/responses in parallel without worrying about mixing up the messages.
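The splitting and reassembly described above can be sketched as a toy model: messages tagged with a stream ID are split into frames, interleaved on one "connection", and reassembled per stream on the other side. This is an illustration of the idea only; the real HTTP/2 frame layout is binary and defined by the HTTP/2 specification.

```typescript
// Toy model of HTTP/2-style framing (illustrative only; the real frame
// layout is binary and defined by the HTTP/2 specification).
interface Frame {
  streamId: number;
  payload: string;
}

// Split one message into frames tagged with its stream ID.
function splitIntoFrames(streamId: number, message: string, frameSize: number): Frame[] {
  const frames: Frame[] = [];
  for (let i = 0; i < message.length; i += frameSize) {
    frames.push({ streamId, payload: message.slice(i, i + frameSize) });
  }
  return frames;
}

// Interleave two streams' frames on one "connection", preserving the
// per-stream frame order (which HTTP/2 also guarantees).
function interleave(a: Frame[], b: Frame[]): Frame[] {
  const wire: Frame[] = [];
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    if (i < a.length) wire.push(a[i]);
    if (i < b.length) wire.push(b[i]);
  }
  return wire;
}

// Reassemble per-stream messages by concatenating payloads per stream ID.
function reassemble(frames: Frame[]): Map<number, string> {
  const messages = new Map<number, string>();
  for (const { streamId, payload } of frames) {
    messages.set(streamId, (messages.get(streamId) ?? "") + payload);
  }
  return messages;
}

// Two requests in flight on the same connection:
const wire = interleave(
  splitIntoFrames(1, "GET /stock-prices", 6),
  splitIntoFrames(3, "GET /leaderboard", 6)
);
const result = reassemble(wire);
console.log(result.get(1)); // "GET /stock-prices"
console.log(result.get(3)); // "GET /leaderboard"
```

Because each frame carries its stream ID, neither side ever confuses which response a given chunk of data belongs to, no matter how the frames are interleaved on the wire.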

How Does the WebSocket Protocol Work?

WebSockets provide a persistent channel for interaction between the client and server. A WebSocket connection is created by the client initiating a handshake request with the server. After both parties agree on the request, a bidirectional connection persists until one of the parties chooses to close it.

WebSocket connection

The following headers are sent in the first HTTP request for connection upgrade:

Upgrade: websocket
Connection: Upgrade

The Upgrade header indicates that the client wants to upgrade the connection to the WebSocket protocol. The Connection header is sent because the Upgrade header is a hop-by-hop header.

As you can see, the WebSocket protocol depends on HTTP for the initial handshake. After that, the WebSocket connection stays open for any number of messages to be sent across the channel.

The request and response formatting is set more loosely in WebSockets; it's left to the messaging layer (that is, the developers) to settle on a specific request and response structure.
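Because the structure is left open, teams typically invent their own convention. A common one, sketched below with hypothetical names (nothing here is part of the WebSocket spec), is to wrap every message in an envelope carrying a type and a correlation ID so responses can be matched to the requests that triggered them:

```typescript
// A hypothetical message envelope for request/response over WebSocket.
// Nothing here is defined by the WebSocket protocol itself; the shape
// is a convention the messaging layer has to agree on.
type Envelope =
  | { kind: "request"; id: number; method: string; params: unknown }
  | { kind: "response"; id: number; ok: boolean; body: unknown };

let nextId = 0;
const pending = new Map<number, (res: Envelope) => void>();

// Send a request over an open socket; the returned promise resolves
// when the matching response envelope arrives.
function request(
  socket: { send(data: string): void },
  method: string,
  params: unknown
): Promise<Envelope> {
  const id = ++nextId;
  return new Promise((resolve) => {
    pending.set(id, resolve);
    socket.send(JSON.stringify({ kind: "request", id, method, params }));
  });
}

// Wire this up to the socket's onmessage handler.
function handleMessage(raw: string): void {
  const msg = JSON.parse(raw) as Envelope;
  if (msg.kind === "response") {
    pending.get(msg.id)?.(msg);
    pending.delete(msg.id);
  }
}
```

With a contract like this in place, request resolution and error handling become explicit instead of ad hoc, at the cost of the application having to build and maintain that contract itself.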

Why Is HTTP/2 the Future of Real-Time Client-Server Interactions?

While WebSockets provide several advantages, you can also use HTTP/2 to implement many real-time applications. Consider:

  • Stream prioritization in HTTP/2 allows you to prioritize streams for loading urgent data. This feature is built into the protocol as long as the server supports it. WebSocket doesn't provide any such feature, leaving the heavy lifting to developers.
  • HTTP/2 caters to one of the fundamental needs of the modern web—edge caching. This makes it a good choice for serving cacheable information for faster responses, like stock prices from the past year. Messages delivered using WebSockets are not cached.
  • HTTP/2 follows widely known request methods. Each method (GET, POST, PUT) has a designated meaning, making the APIs accessible across teams without knowledge overhead. With WebSockets, the messaging layer is responsible for implementing a similar mechanism to request methods, which creates ambiguity.
  • As with the requests, the responses of HTTP/2 are well-defined and standardized. An error in the request resolution will reflect in its response code. Because there's a definite request resolution, it's easier to synchronize multiple requests. WebSocket, though, doesn't provide request resolution out of the box. The protocol doesn't guarantee any acknowledgment from the receiver, leaving the error handling on the messaging layer.

HTTP/2 doesn't support the Upgrade handshake that WebSocket relies on in HTTP/1.1, making the protocols an either-or choice in practice. One good workaround for this situation is the gRPC protocol, which uses HTTP/2 under the hood.

Alternatively, you can simplify the process with a tool like Tigris, which uses gRPC with HTTP/2 to provide event streaming to your application without much setup. Tigris provides SDKs in TypeScript, Java, and Go that you can use to get started quickly.

Building Real-Time Client-Server Interactions with Tigris

To demonstrate, you're going to build an application that adds new users to a Tigris database and updates the frontend (containing a table of users) in real time.

Prerequisites

You'll need the following installed on your system:

  • Docker
  • Node.js and npm

Getting Started

Once you have Docker running on your system, run the following command to get the Tigris local development environment running on port 8081:

docker run -d -p 8081:8081 tigrisdata/tigris-local:latest

Clone the tigris-starter-ts repository:

git clone https://github.com/tigrisdata/tigris-starter-ts.git

Now open it in a code editor like Visual Studio Code. The starter repository provides a basic ecommerce application with a REST API for CRUD operations on users, products, and orders.

We will extend it to add a social element by allowing users to submit social messages.

Run npm i to install all dependencies.

Create a new public folder inside the project and add an index.html file to build the client for your application.

Then, update the public/index.html file to show two input fields (nickName and message) and a button to publish a new message using the /social_messages/messages/publish API endpoint:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta
      name="viewport"
      content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0"
    />
    <meta http-equiv="X-UA-Compatible" content="ie=edge" />
    <title>Tigris real-time example</title>
  </head>
  <body>
    <label for="nickName">Nick name</label>
    <input type="text" placeholder="Nick name" id="nickName" />
    <label for="message">Message</label>
    <input type="text" placeholder="Message" id="message" />
    <button onclick="addNewMessage()">Send message</button>
    <table class="messages">
      <thead>
        <tr>
          <th>NickName</th>
          <th>Message</th>
        </tr>
      </thead>
      <tbody id="messages-table"></tbody>
    </table>
    <script>
      const baseURL =
        "http://localhost:8081/api/v1/databases/tigris_starter_ts/collections";

      // publish new messages
      function addNewMessage() {
        const nickNameInput = document.getElementById("nickName");
        const messageInput = document.getElementById("message");
        const nickName = nickNameInput.value;
        const message = messageInput.value;
        nickNameInput.value = "";
        messageInput.value = "";
        fetch(`${baseURL}/social_messages/messages/publish`, {
          method: "POST",
          headers: {
            "content-type": "application/json",
          },
          body: JSON.stringify({
            messages: [
              {
                nickName,
                message,
              },
            ],
          }),
        });
      }
    </script>
  </body>
</html>

Adding a Message Row

Now, use the Tigris Event Streaming API to subscribe to the messages published to the social_messages topic using the /social_messages/messages/subscribe API endpoint and add a new row inside the #messages-table.

Update the public/index.html file with the following:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta
      name="viewport"
      content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0"
    />
    <meta http-equiv="X-UA-Compatible" content="ie=edge" />
    <title>Tigris real-time example</title>
  </head>
  <body>
    <label for="nickName">Nick name</label>
    <input type="text" placeholder="Nick name" id="nickName" />
    <label for="message">Message</label>
    <input type="text" placeholder="Message" id="message" />
    <button onclick="addNewMessage()">Send message</button>
    <table class="messages">
      <thead>
        <tr>
          <th>NickName</th>
          <th>Message</th>
        </tr>
      </thead>
      <tbody id="messages-table"></tbody>
    </table>
    <script>
      const baseURL =
        "http://localhost:8081/api/v1/databases/tigris_starter_ts/collections";

      // publish new messages
      function addNewMessage() {
        const nickNameInput = document.getElementById("nickName");
        const messageInput = document.getElementById("message");
        const nickName = nickNameInput.value;
        const message = messageInput.value;
        nickNameInput.value = "";
        messageInput.value = "";
        fetch(`${baseURL}/social_messages/messages/publish`, {
          method: "POST",
          headers: {
            "content-type": "application/json",
          },
          body: JSON.stringify({
            messages: [
              {
                nickName,
                message,
              },
            ],
          }),
        });
      }

      // listen for new messages
      fetch(`${baseURL}/social_messages/messages/subscribe`, {
        method: "post",
      }).then(async (response) => {
        const streamReader = response.body.getReader();
        while (true) {
          const { value, done } = await streamReader.read();
          if (done) break;
          const string = new TextDecoder().decode(value);
          const strLines = string.split("\n");
          for (let i in strLines) {
            if (strLines[i].length === 0) continue;
            let {
              result: { message },
            } = JSON.parse(strLines[i]);
            const newMessageRow = document.createElement("tr");
            newMessageRow.innerHTML = `<td>${message.nickName}</td><td>${message.message}</td>`;
            document
              .getElementById("messages-table")
              .appendChild(newMessageRow);
          }
        }
      });
    </script>
  </body>
</html>

The code above uses the streamReader.read() function to consume the stream, parses the JSON string from the response, and then adds the message to the table until the stream is closed.

You can learn more about parsing HTTP streaming responses in this article.
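One subtlety the snippet above glosses over: a chunk boundary can fall in the middle of a JSON line, in which case splitting each chunk in isolation would fail to parse. A small buffering parser handles that case by carrying the partial trailing line over to the next chunk (the helper name below is ours, not a Tigris API):

```typescript
// A small buffering parser for newline-delimited JSON streams. Unlike
// splitting each chunk in isolation, it carries a partial trailing line
// over to the next chunk. (Helper name is illustrative, not a Tigris API.)
function createNdjsonParser(onMessage: (msg: unknown) => void) {
  let buffer = "";
  return (chunk: string): void => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the incomplete tail, if any
    for (const line of lines) {
      if (line.length > 0) onMessage(JSON.parse(line));
    }
  };
}

// A JSON object arriving split across two chunks is still parsed once:
const seen: unknown[] = [];
const push = createNdjsonParser((m) => seen.push(m));
push('{"result":{"message":{"nick');
push('Name":"ada","message":"hi"}}}\n');
```

In the page above you would call the returned `push` function with each decoded chunk from `streamReader.read()` instead of splitting the chunk directly.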

Updating the setup Method

Update the setup() method inside the src/app.ts file to serve static content from the public folder:

import express from "express";
import { DB, Tigris } from "@tigrisdata/core";
import { User, userSchema } from "./models/user";
import { Product, productSchema } from "./models/product";
import { Order, orderSchema } from "./models/order";
import { UserEvent, userEventSchema } from "./models/user-event";
import { SocialMessage, socialMessageSchema } from "./models/social-message";
import { UserController } from "./controllers/user-controller";
import { ProductController } from "./controllers/product-controller";
import { OrderController } from "./controllers/order-controller";
import { SocialMessageController } from "./controllers/social-message-controller";

export class App {
  private readonly app: express.Application;
  private readonly port: string | number;
  private readonly dbName: string;
  private readonly tigris: Tigris;
  private db: DB;

  constructor() {
    this.app = express();
    this.port = 8080;
    this.dbName = "tigris_starter_ts";
    this.tigris = new Tigris({
      serverUrl: "localhost:8081",
      insecureChannel: true,
    });

    this.setup();
  }

  public async setup() {
    this.app.use(express.json());
    this.app.use(express.static("public")); // add this line
    await this.initializeTigris();
    await this.setupControllers();
  }

  public async initializeTigris() {
    // create database (if not exists)
    this.db = await this.tigris.createDatabaseIfNotExists(this.dbName);
    console.log("db: " + this.dbName + " created successfully");

    // register collections schema and wait for it to finish
    await Promise.all([
      this.db.createOrUpdateCollection<User>("users", userSchema),
      this.db.createOrUpdateCollection<Product>("products", productSchema),
      this.db.createOrUpdateCollection<Order>("orders", orderSchema),
      this.db.createOrUpdateTopic<UserEvent>("user_events", userEventSchema),
      this.db.createOrUpdateTopic<SocialMessage>(
        "social_messages",
        socialMessageSchema
      ),
    ]);
  }

  public setupControllers() {
    new UserController(this.db, this.app);
    new ProductController(this.db, this.app);
    new OrderController(this.db, this.app);
    new SocialMessageController(this.db, this.app);
  }

  public start() {
    this.app.listen(this.port, () => {
      console.log(
        `⚡️[server]: Server is running at http://localhost:${this.port}`
      );
    });
  }
}

Run npm run build to compile the TypeScript code into JavaScript at dist/index.js.

Lastly, run npm run start to start the server and open http://localhost:8080 in a web browser.

Your application with real-time client-server interaction is ready.

Real-time application with Tigris

Conclusion

Real-time client-server interactions will only become more vital as applications and use cases continue to evolve. Though both the WebSocket and HTTP/2 protocols can work well for real-time communication, HTTP/2 is the better option in certain situations. You can improve operations with HTTP/2 even further by using an alternate protocol like gRPC.

As you build your real-time websites and apps, you can enable seamless development with Tigris. Tigris is an open source developer data platform available with a simple yet powerful, unified API that spans search, event streaming, and transactional document store. It enables you to focus on building your applications rather than on managing databases.



· 5 min read
Ovais Tariq

Complexity to Simplicity with Tigris

  • Launch offers an open source, serverless, API-based developer data platform to solve problems around scalability, search, and support
  • Platform is based on the open source database FoundationDB and backed by leading figures in the world of open source as well as General Catalyst, Basis Set Ventures, and Netlify

Sunnyvale, CA - 23rd September 2022 - Tigris Data, a groundbreaking open source developer data platform provider, today announced the launch of its serverless and open source developer data platform Tigris. Tigris is aimed at web and mobile apps, providing a unified API that spans search, event streaming, and transactional document store, along with smart features like automatic indexing and management. By using serverless and a single API, this approach simplifies the data infrastructure for application developers and enables them to focus on building.

The company launched with seed funding of $6.9 million, including backing from General Catalyst and Basis Set Ventures as well as a range of open source and software leaders, including Guillermo Rauch (CEO at Vercel) and Rob Skillington (CTO and Co-Founder, Chronosphere). In addition, Peter Zaitsev, CEO at Percona, is an advisor to Tigris. Tigris Data also received funding from Netlify's Jamstack Innovation Fund, which supports the most promising companies in the community.

Maintaining multiple data services is not only expensive but also disruptive to development time. Tigris is an all-in-one open source developer data platform with a single, unified API to ensure developers spend less time on tedious infrastructure management and more time on what they do best: coding and feature development. Tigris is based on FoundationDB, a distributed database that was open sourced by Apple in 2018 under the Apache 2.0 license. With FoundationDB at its core, Tigris is designed to help developers:

  • Eliminate the need for deploying, configuring, securing, monitoring, and maintaining data infrastructure
  • Prevent data silos and data infrastructure sprawl
  • Stop the need for administration operations such as cluster management, query tuning, index optimization, and other maintenance activities
  • Reduce the number of moving parts within applications that can cause outages and slowdowns
  • Remove reliability and scalability issues as applications scale and need changes
  • Avoid vendor lock-in and maintain control of their data

"Developers want to build applications and scale up those services, and they don't want to spend time managing databases or linking up infrastructure components. They want to use an API like approach to interact with data, in just the same way as they interact with other services. And they want to achieve all this fast. This is why we built and launched Tigris," commented Ovais Tariq, CEO at Tigris Data. "Providing that single approach to data management in one developer-friendly environment makes it easier to prevent data infrastructure sprawl and let the developers focus on what they like best - building - while still providing all the services that developers need. Lastly, building this as an open source platform means that developers can avoid lock-in."

As a company, Tigris Data was founded by senior engineers Ovais Tariq, Himank Chaudhary, and Yevgeniy Firsov, who led the development of data storage and management at Uber, where they faced exactly these challenges around data growth and infrastructure sprawl. Before that, the team worked across various data and infrastructure teams, including time at Yahoo, SanDisk, Percona, and Khoros. This experience inspired the team to create a developer data platform that simplifies data applications without sacrificing speed or scalability.

"Ovais and the whole team at Tigris are super focused on customer value - building a true platform solution that scales with the developer," said Quentin Clark, managing director at General Catalyst. "Out in the industry, we are seeing innovation towards eliminating the depth of 'infrastructure' expertise that was once necessary to build to scale. Tigris is exactly this."

"Developers increasingly make the decisions that set their applications – and their companies – on the route to success. Being the developer data platform of choice for developers is a huge market that existing companies have not fully solved, so there is a great opportunity for new entrants to win here. We are pleased to support the Tigris Data team in their approach," added Xuezhao Lan, Founder and Managing Partner, Basis Set Ventures.

"The open source market continues to innovate and find solutions to the problems that developers and data professionals face. Simplifying the interactions around data using API interface fits with how developers want to build and implement their applications in the cloud. I'm happy to support Ovais and his team in building Tigris as they start their journey," commented Peter Zaitsev, advisor to Tigris Data and CEO at open source database company Percona.

About Tigris Data

Tigris Data provides an open source developer data platform that enables developers to store, access, stream, and search data from the tools and languages they already use and love. It provides all the base functionality developers need to build robust, data-driven applications in seconds, making it easier to run and manage the applications that deliver what customers want. Join the Tigris waitlist and be the first to try the new open-source developer data platform for your next application.

· 10 min read
Himank Chaudhary
Yevgeniy Firsov

Building a new database

The most complicated and time-consuming parts of building a new database system are usually the edge cases and low-level details. Concurrency control, consistency, handling faults, load balancing, that kind of thing. Almost every mature storage system will have to grapple with all of these problems at one point or another. For example, at a high level, load balancing hot partitions across brokers in Kafka is not that different from load balancing hot shards in MongoDB, but each system ends up re-implementing a custom load-balancing solution instead of focusing on their differentiating value to end-developers.

This is one of the most confusing aspects of the modern data infrastructure industry: why does every new system have to completely rebuild (not even reinvent!) the wheel? Most decide to reimplement common processes and components without substantially increasing the value gained from reimplementing them. For instance, many database builders start from scratch when building their own storage and query systems, but often merely emulate existing solutions. Getting even basic features working, let alone correct, is usually a massive undertaking.

Take the OLTP database industry as an example. Ensuring that transactions always execute with "linearizable" or "serializable" isolation is just one of the dozens of incredibly challenging problems that must be solved before even the most basic correctness guarantees can be provided. Other challenges that will have to be solved: fault handling, load shedding, QoS, durability, and load balancing. The list goes on and on, and every new system has to at least make a reasonable attempt at all of them! Some vendors rode the hype wave of NoSQL to avoid providing meaningful guarantees, such as not providing linearizability in a lot of cases, but we think those days are long gone.

Vendors are spending so much time rebuilding existing solutions, they end up not solving the actual end users’ problems, although ostensibly that's why they decided to create a new data platform in the first place! Instead of re-inventing the wheel, we think database vendors should be focused on solving the user’s actual pain points like the "database sprawl" problems we covered in our previous blog post: "Why do companies need so many different database technologies".

The good news is there is a growing trend in the industry to reuse great open source software as a building block for higher level data infrastructure. For example, some vendors realized that a local key-value store like RocksDB could be used to deliver many features and enforce an abstraction boundary between the higher level database system and storage. They still hide RocksDB behind a custom API, but get to leverage all the development and testing effort it has received over the years. This can result in new systems that are both robust (durable, correct) and performant at the single-node level in a very short amount of time.

That said, even if all these storage systems magically used RocksDB under the hood, the problem of a complicated data infrastructure with many different components and data pipelines wouldn’t be solved: there would still be no common domain in which to perform transactions across them. After all, RocksDB is only a local key-value store.

While a local key-value store is a great abstraction for separating the storage engine from the higher-level system, Google took this a step further with Spanner, which is a SQL database built on a distributed key-value store. CockroachDB is another example of a SQL database built on top of a distributed key-value store. The distributed key-value store interface shifts the hard problems of consistency, fault tolerance, and scalability down into the storage layer and allows database developers to focus on building value-added features database users actually need.
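To make the layered design concrete, here is a minimal sketch of how a relational row can be mapped onto an ordered key-value store, in the spirit of Spanner and CockroachDB. The key layout and a plain Python dict standing in for the distributed store are illustrative assumptions, not any vendor's actual encoding:

```python
import json

# An in-memory ordered key-value store stands in for the distributed
# storage layer. Real systems replace this dict with a replicated,
# sharded store; the key-encoding idea stays the same.
kv = {}

def row_key(table: str, pk: int) -> bytes:
    # "/table/pk" keeps each table's rows contiguous in key order, so a
    # table scan is just a key-range scan. Zero-padding the primary key
    # makes byte order match numeric order.
    return f"/{table}/{pk:012d}".encode()

def insert_row(table: str, pk: int, row: dict) -> None:
    kv[row_key(table, pk)] = json.dumps(row).encode()

def scan_table(table: str) -> list:
    prefix = f"/{table}/".encode()
    return [json.loads(v) for k, v in sorted(kv.items()) if k.startswith(prefix)]

insert_row("users", 2, {"name": "bo"})
insert_row("users", 1, {"name": "al"})
print(scan_table("users"))  # rows come back in primary-key order
```

With rows laid out this way, the storage layer only has to get ordered key-value operations right; everything relational lives above it.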

Why Tigris Leverages FoundationDB to Replace Multiple Database Systems

Database

Until a few years ago, the open-source community did not have any viable options for a stand-alone, production-ready distributed key-value store with interactive transactions. FoundationDB, used by Apple, Snowflake, and others, is such a system. It provides the same consistency and isolation guarantees as Spanner - strict serializability - and has an amazing correctness story through simulation testing. FoundationDB exposes a key-value API, similar to RocksDB, but it automatically scales horizontally to support petabytes of data and millions of operations per second in a single deployment on a modest amount of commodity hardware.

Tigris uses FoundationDB’s transactional key-value interface as its underlying storage engine. This separation of compute from storage allows Tigris to focus its attention on higher level concerns, instead of lower level ones. For example, we’d prefer to spend our time figuring out how to incorporate an OLTP database, search indexing system, job queue, and event bus into a single transactional domain instead of spending our time building and testing a custom solution for storing petabytes of data in a durable and replicated manner. In other words, we leverage FoundationDB to handle the hard problems of durability, replication, sharding, transaction isolation, and load balancing so we can focus on higher level concerns.

Building a new data platform on top of an existing distributed transactional key value interface like FoundationDB may not be the most efficient approach, but it heavily skews the implementation towards "correct and reliable by default", even in the face of extreme edge cases. Many existing distributed databases still struggle to pass the most basic aspects of Jepsen testing even after over a decade of development. At Tigris, we value correctness, reliability, and user experience over raw performance. The generic abstraction of a transactional, sorted key-value store is flexible enough to support many kinds of access patterns, as well as enable rapid feature development. For example, a secondary index is just another key-value pair pointing from the secondary index key to the primary index key. Searching for records via this secondary index is as simple as a range scan on the key-value store and doing point reads on the primary index to retrieve the records. Ensuring the secondary index is always consistent only requires that the secondary index key is written in the same FoundationDB transaction in which the primary record is updated.
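The secondary-index idea above can be sketched in a few lines. This is a toy model, not Tigris's actual implementation: a dict update stands in for a FoundationDB transaction, and the key layout is invented for illustration:

```python
# A toy transactional key-value store: a batch of writes applied atomically.
class KV:
    def __init__(self):
        self.data = {}

    def commit(self, writes: dict) -> None:
        # All-or-nothing: in FoundationDB this atomicity comes from the
        # transaction; here a single dict.update stands in for it.
        self.data.update(writes)

    def range_scan(self, prefix: bytes):
        return [(k, v) for k, v in sorted(self.data.items()) if k.startswith(prefix)]

kv = KV()

def put_user(user_id: bytes, email: bytes) -> None:
    # Primary record and secondary-index entry are written in ONE commit,
    # so the index can never disagree with the primary data.
    kv.commit({
        b"users/" + user_id: email,
        b"idx/email/" + email + b"/" + user_id: b"",  # points back at the PK
    })

def find_by_email(email: bytes):
    # Range scan on the secondary index, then point reads on the primary.
    hits = kv.range_scan(b"idx/email/" + email + b"/")
    return [kv.data[b"users/" + k.rsplit(b"/", 1)[1]] for k, _ in hits]

put_user(b"u1", b"a@x.com")
print(find_by_email(b"a@x.com"))
```

The point of the sketch is the invariant: because both keys land in the same atomic commit, there is no window where a reader can see the record without its index entry, or vice versa.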

Our goal is to provide a cohesive API for collections and streams that all work together. We use FoundationDB’s transactions to ensure the invariants of these data structures remain intact regardless of the different failure modes your cluster might encounter.

  • Tigris encodes index keys using FoundationDB’s key encoder. It accepts high-level data structures like arrays, strings, and numbers and encodes them into a lexicographically sortable byte string. Encoded keys serve as a common language for the different components inside Tigris.
  • Tigris Streams are powered by FoundationDB’s support for "baking" a transaction’s commit timestamp into a key, which allows reading all collection mutations in the total order of transactions committed into the database.
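The key-encoding property described above - byte-wise comparison of encodings matching logical comparison of the values - can be demonstrated with a deliberately simplified encoder. The type tags below echo FoundationDB's tuple layer but the scheme is trimmed down (it only handles strings and non-negative integers):

```python
import struct

def encode_key(*parts) -> bytes:
    # Each element gets a type tag, and the byte representation is chosen
    # so that sorting the encoded byte strings reproduces the logical
    # order of the tuples. Simplified: strings and non-negative ints only.
    out = b""
    for p in parts:
        if isinstance(p, str):
            out += b"\x02" + p.encode() + b"\x00"  # tag, utf-8, terminator
        elif isinstance(p, int) and p >= 0:
            out += b"\x15" + struct.pack(">Q", p)  # big-endian sorts byte-wise
        else:
            raise TypeError(p)
    return out

keys = [encode_key("users", 10), encode_key("users", 2), encode_key("events", 1)]
# Byte-wise sort == logical sort: ("events", 1) < ("users", 2) < ("users", 10)
assert sorted(keys) == [
    encode_key("events", 1),
    encode_key("users", 2),
    encode_key("users", 10),
]
```

A naive string encoding like `"users/10"` would fail here (`"10"` sorts before `"2"`); the fixed-width big-endian integer encoding is what keeps numeric and byte order aligned.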

FoundationDB’s powerful primitives allow us to focus on the high-level API and experience of using Tigris, so you can build an application with rich functionality like search, background jobs, data warehouse sync, and more without having to grapple with a unique solution for each, or figure out how to keep them all in sync with each other. We’ve built a cohesive set of tools for data management within a modern application. And if we don’t support exactly what you need today, you can use the Streams API to ensure correct data sync to any other system.

While low level concerns like durability and data replication are not where we want to spend our engineering time, they are still extremely important! So why do we feel comfortable trusting FoundationDB with our most critical data? In a nutshell, it is because while perhaps lesser known, FoundationDB is one of the most well tested and battle hardened databases in the world.

FoundationDB’s Correctness and Fault Tolerance

Data Integrity

FoundationDB has more comprehensive testing than almost any other database system on the market. It relies on simulation testing, an approach where an entire FoundationDB cluster can be deterministically simulated in a single operating system process, alongside a wide variety of test workloads which exercise the database and assert on various properties. Simulation testing allows simulated time to advance much faster than wall-clock time by skipping over all the boring moments where code is waiting for a timeout to happen, or a server to come back up after being rebooted. This means many more "test hours" can pass for each real hour dedicated to running tests, and these tests can be run in parallel across many servers to explore the space of possible failure modes even further.
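The two ingredients described above - a single seeded source of randomness and an event clock that skips idle time - can be shown in miniature. This toy harness is our own illustration of the technique, not FoundationDB's simulator:

```python
import random

def simulate(seed: int, ops: int = 100) -> dict:
    # All "nondeterminism" (event timing, crashes) is drawn from one
    # seeded RNG, so a failing seed can be replayed exactly.
    rng = random.Random(seed)
    clock = 0.0
    store, log = {}, []
    for i in range(ops):
        clock += rng.expovariate(1.0)  # jump to the next event; never sleep
        if rng.random() < 0.1:
            # Simulated crash: throw away memory state, recover from the log.
            log.append(("crash", round(clock, 3)))
            store = dict(replay(log))
        else:
            store[f"k{i % 10}"] = i
            log.append(("put", f"k{i % 10}", i))
    return store

def replay(log):
    for entry in log:
        if entry[0] == "put":
            yield entry[1], entry[2]

# Determinism is the whole point: the same seed always yields the same
# final state, so any bug found can be reproduced on demand.
assert simulate(42) == simulate(42)
```

A real simulator additionally checks workload invariants (e.g. that recovered state matches committed state) after every injected fault; here the crash-and-replay path hints at that shape.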

There are very few databases which even come close to this level of testing. Those that do typically only simulate a small portion of their system, leaving many components untested by simulation and only relying on unit tests. FoundationDB even simulates the backup-restore process to ensure the operational tools are just as good as the database itself. After all, those kinds of tools are often just as important! No database is perfect, but simulation catches problems before they make it into a user facing release.

FoundationDB also receives extensive real-world testing at Apple and Snowflake. Releases are beta tested there long before being released to the community, to ensure simulation results match the reality of running in chaotic cloud environments. FoundationDB also provides Tigris with extremely powerful workload management features. The database constantly evaluates the load of the cluster to determine when it is "too busy", and when that happens it artificially slows down starting new transactions until the load is stable again. By forcing all the latency to the beginning of your transaction, FoundationDB ensures every operation after that experiences a consistent, low latency. Many other database systems lack any workload management features at all, which manifests as overloaded clusters requiring operators to shut down all the clients to get things back under control.
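The admission-control idea - delay *starting* transactions when the cluster is busy, instead of letting every in-flight operation degrade - can be sketched as follows. The class name, thresholds, and queueing policy here are illustrative assumptions, not FoundationDB's actual ratekeeper:

```python
import collections

class Ratekeeper:
    """Toy admission control: cap in-flight transactions, queue the rest."""

    def __init__(self, max_inflight: int = 100):
        self.max_inflight = max_inflight
        self.inflight = 0
        self.waiting = collections.deque()

    def start_txn(self, txn_id: str) -> bool:
        if self.inflight < self.max_inflight:
            self.inflight += 1
            return True  # admitted: runs at consistent, low latency
        self.waiting.append(txn_id)  # all the latency is paid up front, here
        return False

    def finish_txn(self) -> None:
        self.inflight -= 1
        if self.waiting:
            self.start_txn(self.waiting.popleft())

rk = Ratekeeper(max_inflight=2)
print([rk.start_txn(t) for t in ("a", "b", "c")])  # [True, True, False]
```

Once a transaction is admitted it never competes with an unbounded number of peers, which is what keeps per-operation latency predictable under overload.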

We previously mentioned that backup-restore functionality is tested in simulation. That’s true of other features like disaster recovery, logging, and metrics too. These built-in tools are a core part of FoundationDB’s developer experience. Many other database systems force you to use a third-party backup tool once you cross a small scale due to table locking or other concurrency problems. FoundationDB’s backups and disaster recovery system are both streaming (so your recovery point objective can be measured in seconds instead of hours) and they work for large, busy databases, not just the toy sized ones.

Tigris Operations and Reliability

Reliable Data Flow

Tigris is built following a microservice architecture approach. It is deployed as a layer in front of FoundationDB. This allows us to have separate components that can each be scaled independently. For example, separating storage from compute ensures a better distribution of resources. Some deployments of Tigris will store a small amount of data that requires tons of CPU for processing queries, while others will store large amounts of data and use a tiny amount of CPU. You can right-size your Tigris and FoundationDB clusters separately so that valuable CPU and memory resources are never again stranded on a big database instance which mostly sits idle because of high storage utilization.

Tigris expands on FoundationDB’s integrated workload management features to provide more fine-grained workload management down to the individual collection level.

FoundationDB provides Tigris a strong foundation for building a high quality data platform. By leaving the low level details to a battle-tested system, Tigris can focus on building a cohesive, flexible set of tools that enable application developers to take an idea to production without stepping into the sinkhole of data pipelines, broken sync jobs, and complicated concurrency bugs present in many modern application architectures.


Tigris is the all-in-one open source developer data platform. Use it as a scalable transactional document store. Perform real-time search across your data stores automatically. Build event-driven apps with real-time event streaming. All provided to you through a unified serverless API enabling you to focus on building applications and stop worrying about the data infrastructure.

Sign up for the beta

Get early access and try out Tigris for your next application. You can also follow the documentation to learn more.

· 4 min read
Ovais Tariq

Complexity to Simplicity with Tigris

Join the Tigris waitlist and be the first to try the new open-source developer data platform for your next application

Over the past year, we’ve been building a revolutionary new data platform for developers to handle all their applications’ data needs without all the data infrastructure complexity. This is the first truly open source developer data platform available with a simple yet powerful, unified API that spans search, event streaming, and transactional document store. It enables you to focus on building your applications and stop worrying about the data infrastructure.

Today we are opening up the waitlist for you to join our beta program. Tigris is focused on the needs of the developer community and committed to delivering a solution that addresses the unique needs of developers in a data driven environment. The best way for us to build a better platform is to work with our users on a daily basis.

If you are interested in helping us out please add yourself to our waitlist here. We will be providing access to the platform on a rolling basis in the coming weeks.


Why we created Tigris

While working on the database platforms for companies like Uber, Percona and other startups, we noticed a common problem - Developers are held back by their databases.

Building modern cloud native applications often requires you to bolt on systems and infrastructure components to overcome the shortcomings of a data infrastructure that was never designed to operate in the modern ecosystem.

This means more tools to manage and deploy, and infrastructure complexity that distracts you from building your applications.

Maintaining multiple different systems is not only expensive but also disruptive to development time. Time has become an extremely limited resource for developers. The more complex the environment, the more time is spent troubleshooting, learning new database languages, and maintaining various systems, instead of shipping features and building applications.

This is exactly why we built Tigris. Now developers will spend less time on tedious infrastructure management and more time on what they do best - coding and feature development.

Features to simplify your workflow

  1. Simple APIs: Quickly add data and easily retrieve or edit that data through simple and intuitive APIs.
  2. Flexible Document Model: Our JSON data structure makes it easy to map to the objects in your code while providing the schema enforcement seen in traditional databases. Furthermore, its flexibility makes it easy to evolve your data models.
  3. Zero Cost Schema Evolution: Schemas evolve in a lightweight manner without any downtime. Changes are performed in a transactional manner, take only a few milliseconds to complete and do not require a collection rebuild.
  4. Automatic Index Maintenance & Management: The system makes query tuning a thing of the past with automatic index management and maintenance, meaning you will never have to worry about slow queries due to missing indexes.
  5. Transactions: Strictly serializable isolation, and unlike some other document databases, no confusing read / write concerns to configure, and no cross-shard caveats.
  6. Event Streaming: Built-in event streaming allows you to subscribe and publish events using the same API you use to query and search data. Since data is automatically indexed, you can query all data and search through all events with ease.
  7. Global Search: Search across all your collections using full text or faceted search. All with an integrated search engine that eliminates the need to run a separate search platform and synchronize data.
  8. Cloud Native Architecture: Built as individual components that can be scaled independently to keep performance and scalability high while keeping costs low. Core data storage is built on FoundationDB, a distributed backend that enables nearly limitless scalability.
  9. Open Source: Tigris is truly open source, adhering to an Apache 2 license. Built on open source principles we’re committed to providing a secure, stable, and innovative product.
  10. Multi-tenant by default: Save on investment costs and maximize resource usage. Multi-tenancy provides the much needed flexibility to add new customers and applications quickly and easily.

If you are interested in helping us out please add yourself to our waitlist below. We will be providing access to the platform on a rolling basis in the coming weeks. If you’re selected to participate in the beta, you’ll receive an email invitation to get started.

🚀 Click here to signup for the Tigris beta and get early access!

· One min read
Ovais Tariq

We're excited to announce that Tigris Data has joined Netlify's Jamstack Innovation Fund as one of the 10 most promising Jamstack startups.

Tigris Data joins Netlify's Jamstack Innovation Fund

As the world increasingly shifts to digital-first interactions, the need for fast, reliable, and secure web applications has never been greater. The Jamstack movement is a response to this need, focused on building web applications that provide rich experiences while at the same time being easy to deploy and scale. Data is, of course, crucial to building rich experiences, and the support from Netlify will accelerate our mission to provide the fast, reliable and secure data layer that Jamstack applications need to thrive.

· 7 min read
Ovais Tariq

In our inaugural blog post Hello world, we talked about the problem of data infrastructure sprawl that, over the years, has complicated modern application development, while putting a lot of burden on the operations team who must manage, maintain and secure many different databases and technologies.

In this blog post, we’ll explain what we mean when we say "data infrastructure sprawl" by walking through a typical example, and then we’ll explain why it doesn’t have to be this way.

The standard evolution story of a modern application

Let's start by taking a look at what an application looks like at the beginning of the journey. The application is simple, it talks to a database, typically a traditional OLTP database such as Postgres, or MySQL.

Application and Database

To scale the application further and increase reliability, some of the application logic is moved to background processing. This requires introducing a message queue such as RabbitMQ or ActiveMQ. Now the application architecture looks something like this:

Application, Database and Message Queue

Over time, the application grows in popularity and more features are added. The development team starts to feel the pain of working with a relational database. They want to be able to scale reads and writes independently, so they introduce read replicas and update the application logic to split the read and write requests. This exposes the development team to infrastructure-related concerns. At the same time, it also increases the application complexity as developers now need to decide when it’s safe to read from the read replicas and when they need to query the primaries instead.
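The read/write split pushed onto the application team can be sketched as a small routing decision the app now has to make on every query. The class and parameters below are hypothetical, purely to show where the complexity lands:

```python
class Router:
    """Toy query router: the app decides, per query, whether a replica's
    possibly-stale view is safe or the primary is required."""

    def __init__(self, primary: str, replicas: list, max_staleness_s: float):
        self.primary = primary
        self.replicas = replicas
        self.max_staleness_s = max_staleness_s
        self.rr = 0  # round-robin cursor over replicas

    def pick(self, is_write: bool, needs_own_writes: bool, replica_lag_s: float) -> str:
        # Writes and read-your-own-writes queries must hit the primary;
        # everything else may go to a replica if its lag is acceptable.
        if is_write or needs_own_writes or replica_lag_s > self.max_staleness_s:
            return self.primary
        self.rr = (self.rr + 1) % len(self.replicas)
        return self.replicas[self.rr]

r = Router("primary", ["replica1", "replica2"], max_staleness_s=1.0)
print(r.pick(is_write=True, needs_own_writes=False, replica_lag_s=0.1))   # primary
print(r.pick(is_write=False, needs_own_writes=False, replica_lag_s=0.1))  # a replica
```

Every one of these `needs_own_writes` / staleness judgments is application logic that didn't exist before replicas were introduced - which is exactly the complexity cost described above.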

Application, Database with replicas, Message Queue

At some point, the business analysts may need to analyze the data to guide business decisions. They could do it for a while using the primary database, but eventually these requests will have to be isolated from production. One easy solution that companies adopt involves running a nightly job that exports the data out of production into a data warehouse like Redshift or BigQuery.

Application, Database with replicas, Message Queue, Data Warehouse

As the business continues to grow, so does the complexity of the application infrastructure required to support it. The developers continue to expand the application's functionality by adding full text search capabilities. They decide that introducing Elasticsearch or Solr would be best since the primary OLTP database does not have a capable search engine. The search functionality is introduced via a new microservice so that the team can iterate quickly and experiment with new technology without disrupting the existing business. The new search microservice needs to know when certain operations are performed on the primary OLTP database, so a message bus, such as Kafka, is introduced as well. Now the application architecture looks something like this:

Application, Database with replicas, Message Queue, Data Warehouse, Search

Does this look familiar? The application isn’t doing anything crazy from a technological perspective. All the team has built is a backend that can:

  1. Power CRUD operations.
  2. Provide search functionality.
  3. Expose key business data to analysts in a data warehouse.
  4. Allow experimentation and quick iteration in a “safe” manner that doesn’t put the entire business in jeopardy.

The backend though has become a complex distributed system with many different moving components. Each component must be deployed, configured, secured, monitored, and maintained. The team has to think about things like data replication, data consistency, availability, and scalability for each of the individual components as well.

Over time, the team starts spending less time building the functionality required to grow the business, and more time managing incidental infrastructure.

Fundamental reasons for the data infrastructure sprawl

This type of “data infrastructure sprawl” is the norm in the industry today, and justifiably so. At every point in the story above, engineering leadership made well-founded and logical architectural decisions, but the end result was high amounts of incidental complexity that made even straightforward feature development time-consuming. But at Tigris, we don’t think it has to be this complicated. Users are forced into this situation because the open source marketplace is filled with rich and powerful building blocks, but it’s lacking in holistic solutions.

Most open source databases today take a narrow view of the world:

  • They have rich query functionality, but they can’t scale beyond a single node. Or, they can scale horizontally, but push all the hard correctness and consistency problems to the application developers.
  • They can function as your primary data store, or they support rich full text search functionality, but rarely both.
  • Multi-tenancy and isolation between workloads are an afterthought or not present at all. Applications can’t be logically isolated while sharing the same underlying “database platform” in a safe manner. Often the database itself is “unsafe by default” and won’t provide so much as a warning before happily performing a full table scan every time someone navigates to your home page.

What should a modern database look like?

We believe that, “Correctness, reliability, and user experience over raw performance” is a good guiding principle that, if followed, would lead to the development of a modern database platform that could keep the data infrastructure sprawl at bay.

This of course sounds great on paper, but prioritizing user experience over micro-benchmarks is not how most modern database companies try to differentiate themselves.

Database Comparisons Today

What most database comparisons look like today

This may seem obvious, but building a holistic database platform that can keep data architectural sprawl at bay means that the developers of this new system must take a principled stand to always do right by the user instead of the benchmarks. This is a very difficult thing to do in the current competitive climate that is dominated by benchmark-obsessed marketing material. The recent controversy between Snowflake and Databricks is a great example of this. It’s much easier to quantify rows/s/core than it is to quantify (a) sleepless nights, (b) apologies issued to your customers, and (c) developer productivity.

Business Value Created

Product Success

What most database comparisons should look like

So what would a database that helps customers maximize business value instead of micro-benchmarks look like? At Tigris, it boils down to a system with the following characteristics:

  1. A flexible document model that enables developers to model data in a way that best suits their applications’ needs. The data model should also be easy to evolve, and schema changes should be as simple and painless as regular feature development.
  2. Simple and intuitive APIs that allow developers to quickly insert and retrieve data while continuing to use their preferred programming language - no new database query language to learn.
  3. Strictly serializable isolation by default. This ensures that developers never need to think about how transactions work or what their limitations are, nor do they need to configure things like read and write concerns or transaction isolation levels.
  4. “Distributed by default” with no painful transition point when the needs of the application begin to exceed the capabilities of a single node. Sharding should also be transparent, and the system should ensure that the database scales seamlessly as the application traffic increases, while core operations such as transactions do not have any restrictions in terms of the records or shards that can participate.
  5. Multi-tenant by default so that learning how to deploy, operate, and secure a single database technology is enough for the majority of the applications.
  6. An integrated search engine that provides developers with real-time search functionality eliminating the need to run a separate search platform and synchronize data.
  7. Built-in low latency replication to cloud data lakes for OLAP workloads that eliminates the need for developers to configure, track, and maintain separate ETL processes.

This is exactly what we’re building with Tigris! Stay tuned for future posts where we’ll discuss in detail how we’re leveraging open source technology like FoundationDB to build a rock solid and intuitive database platform in record time. We’re also hiring!

· 3 min read
Ovais Tariq
Himank Chaudhary
Yevgeniy Firsov

We're excited to announce the launch of Tigris Data, a company on the mission of simplifying data management for developers.

Over the years, data has become increasingly complex and difficult to manage. Developers' lives have been made exponentially more difficult, due in large part to all the different technologies, data models, APIs, and databases they're expected to put together to build modern applications.

The database sprawl also puts a lot of pressure on operations teams, who must manage, maintain and secure these different databases and technologies. Then there is the onerous task of operationalizing these databases across multiple different cloud platforms.

Complexity to Simplicity with Tigris

We have personally experienced the pain and financial cost of this complexity. We want to make it easier for you to work with data by providing a unified database that helps you quickly and easily structure data in whatever way your application requires, while providing one consistent experience across a broad set of workloads.

Furthermore, we are building the database to be open source, so you can avoid vendor lock-in and have the transparency to be sure that the product will continue to evolve and improve over time. Additionally, it enables anyone to be able to contribute improvements and extensions to the codebase. All-in-all being open source makes the product more accessible, robust, and secure for everyone.

We're grateful to have assembled an amazing team that is passionate about simplifying data management for developers. Our team has a wealth of experience in data management, distributed systems, and databases. We're excited to bring our knowledge and experience to bear on the problem of data management.

We're also proud to announce that in December of last year, we successfully raised a seed round led by Basis Set with additional funding from General Catalyst and many individual angels including founders at technology unicorns.

Now we're well on our way to a preview of the Tigris database later this year!

If you're interested in learning more about Tigris Data or staying up-to-date on our progress, subscribe to our mailing list or follow us on Twitter.

The founders

Before starting Tigris Data, the founders, Ovais, Himank, and Yevgeniy, spent almost six years working closely together at Uber. They developed and operated Uber's storage infrastructure, leading projects like Docstore, Herb, and DBEvents.

Their experiences have given them a lot of important lessons. They've learned about the importance of making architectural choices that can scale, establishing efficiency as a core principle, hiring the right talent, and cultivating a culture that supports diversity, innovation, and growth.