Industry: Media and Entertainment
Author: Lianghong Xu (Engineering Manager for Storage & Cache Team at Pinterest)
Editors: Yajing Wang, Caitin Chen, Tom Dewan
Pinterest is an American visual discovery engine for finding ideas like recipes, home and style inspiration, and more. As of March 2021, it had more than 478 million global monthly active users.
For years, we used the HBase ecosystem to serve our critical online applications. But this infrastructure was complex: it offered limited functionality and carried high maintenance and infrastructure costs. We looked for a new storage solution that supported distributed transactions and SQL compatibility.
After comparing YugaByteDB, CockroachDB, and TiDB, we chose TiDB, an open-source scale-out database, because it best met our requirements for stability and performance. TiDB helps us reduce system complexity and infrastructure cost and achieve stronger consistency.
In this article, I'll describe:
In 2013, we introduced HBase into Pinterest. Today, we have about 50 clusters, hosting 10+ PB of data, aggregating 100 million+ queries per second (QPS), and serving 100+ critical use cases. Since it was introduced, HBase has been a building block of our Pinterest infrastructure. Today, it still serves our critical online path.
We have built the following components with or on top of HBase:
This diagram gives you an architectural overview of our HBase ecosystem:
In this ecosystem:
Over the years, HBase served us well. However, we faced a number of challenges, including:
High maintenance cost
Over the past eight years, the old HBase version we ran left us with long-standing tech debt. HBase was a complicated system with high barriers to entry, so it took a lot of effort to keep it up and running. This was one of our top pain points.
Limited functionalities
HBase was powerful, but its interface was simple. While our users initially enjoyed its simple key-value (KV) interface, they came to want richer functionality, like stronger consistency. It was hard to build middle-layer systems on top of a NoSQL store to satisfy all these requirements.
High complexity
Because we kept having to satisfy customer requirements, we built these "Lego pieces" over time to support SQL-like features on top of a NoSQL data store. On top of that, HBase itself has many components and dependencies, such as ZooKeeper and the Hadoop Distributed File System (HDFS).
High infrastructure cost
Our infrastructure cost was high because we ran primary-standby cluster pairs to achieve high availability, which meant storing six replicas of each unique dataset.
Our customers also had pain points with our existing HBase system.
Zen is our in-house graph service that provides a graph data model. Essentially, users can perform CRUD operations on a data model of nodes and edges, and they can define customized indexes such as unique, non-unique, edge-query, or local indexes.
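As a rough illustration of this data model, the sketch below lays out nodes and edges as tables with a unique index and an edge-query index in a MySQL-compatible database. All table and column names here are hypothetical; Zen's actual schema on HBase is different and not shown.

```python
import pymysql

# Hypothetical node/edge layout for a Zen-like graph model.
# "uniq_node" plays the role of a unique index; "idx_edge_query"
# plays the role of an edge-query index.
DDL = [
    """CREATE TABLE IF NOT EXISTS nodes (
           node_id     BIGINT PRIMARY KEY,
           node_type   VARCHAR(64) NOT NULL,
           external_id VARCHAR(255),
           UNIQUE KEY uniq_node (node_type, external_id)
       )""",
    """CREATE TABLE IF NOT EXISTS edges (
           src_id    BIGINT NOT NULL,
           edge_type VARCHAR(64) NOT NULL,
           dst_id    BIGINT NOT NULL,
           PRIMARY KEY (src_id, edge_type, dst_id),
           KEY idx_edge_query (edge_type, dst_id)
       )""",
]

# Placeholder connection details; a real deployment would differ.
conn = pymysql.connect(host="127.0.0.1", port=4000, user="root",
                       password="", database="zen_demo")
with conn.cursor() as cur:
    for stmt in DDL:
        cur.execute(stmt)
conn.commit()
```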
Zen had two major pain points:
Data inconsistencies
Because HBase did not provide cross-table or cross-row transactions, data inconsistencies caused a lot of problems. Our customers were confused, and they had to write extra code to keep data consistent; the sketch after these pain points illustrates the kind of atomic multi-table write HBase could not provide.
Limited query support
Because HBase had limited functionality, complicated queries, such as joins, were hard to run.
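To illustrate the consistency gap, the sketch below reuses the hypothetical nodes and edges tables from the previous sketch and shows the kind of atomic, cross-table write Zen needed: either the node and its edge are both written, or neither is. On a MySQL-compatible database this is a single transaction; on HBase each put was an independent operation, so a partial failure could leave the graph and its indexes out of sync.

```python
import pymysql

# Placeholder connection details, reusing the hypothetical schema above.
conn = pymysql.connect(host="127.0.0.1", port=4000, user="root",
                       password="", database="zen_demo")

def add_node_with_edge(node_id, node_type, external_id, src_id, edge_type):
    """Insert a node and an edge pointing at it atomically.

    With cross-row/cross-table transactions, a failure anywhere rolls back
    both writes, so readers never observe a dangling edge or a node that
    is missing from its unique index.
    """
    try:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO nodes (node_id, node_type, external_id) "
                "VALUES (%s, %s, %s)",
                (node_id, node_type, external_id),
            )
            cur.execute(
                "INSERT INTO edges (src_id, edge_type, dst_id) "
                "VALUES (%s, %s, %s)",
                (src_id, edge_type, node_id),
            )
        conn.commit()      # both rows become visible together
    except Exception:
        conn.rollback()    # neither row is written
        raise

# Hypothetical usage: create a node and link it to an existing node 7.
add_node_with_edge(42, "board", "home-decor-ideas", src_id=7, edge_type="owns")
```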
Ixia is our in-house indexed data store, built on top of HBase and Muse, our in-house indexing engine. We use change data capture (CDC) for asynchronous indexing: source-of-truth data is written to HBase, and indexes are built asynchronously in the search engine.
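To make the indexing path concrete, here is a rough sketch of a CDC-driven indexer loop. This is not Ixia's implementation; the change-event format and the index_client interface are hypothetical and only illustrate the asynchronous pattern described above.

```python
import json

def consume_change_events(stream, index_client):
    """Tail a change data capture (CDC) stream and apply each mutation
    to the search index.

    Because indexing happens after the source-of-truth write has already
    been acknowledged, the index lags the data store by some replication
    delay and can miss events if the pipeline fails.
    """
    for raw_event in stream:                  # e.g. a Kafka consumer loop
        event = json.loads(raw_event)
        doc_id = event["row_key"]
        if event["op"] == "DELETE":
            index_client.delete(doc_id)
        else:                                 # PUT / UPDATE
            index_client.upsert(doc_id, event["columns"])
        # The gap between the source write time and this point is the
        # index freshness lag that callers observe.
```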
Ixia's major pain points were:
As a result, we looked for new storage solutions that could gracefully handle these situations for us.
Currently, we have a gap in our storage solutions. We're missing:
We thought of something like Google Spanner, which inspired us to explore open-source distributed SQL systems.
We evaluated about 15 technologies including in-house systems, open-source technologies, and cloud solutions.
There were many dimensions that we considered, including operational load, migration cost, programming language, community support, and cost efficiency. Open-source distributed SQL systems stood out. In the last round of our search, we mainly focused on three: YugaByteDB, CockroachDB, and TiDB. We chose TiDB because it best met our requirements for stability and performance.
TiDB is an open-source, distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability. You can learn more about TiDB's architecture here.
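Because TiDB speaks the MySQL wire protocol, applications can connect to it with a stock MySQL driver. Below is a minimal sketch assuming a local TiDB instance on its default port 4000 with a passwordless root user; adjust the connection details for a real deployment.

```python
import pymysql

# TiDB listens on port 4000 by default and accepts standard MySQL clients.
conn = pymysql.connect(host="127.0.0.1", port=4000, user="root", password="")

with conn.cursor() as cur:
    cur.execute("SELECT VERSION()")   # returns a MySQL-compatible version string
    print(cur.fetchone())
    cur.execute("SHOW DATABASES")
    for (db_name,) in cur.fetchall():
        print(db_name)
```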
We ran a dark traffic evaluation with Pinterest production traffic, using Ixia as the target system. Ixia is our near real-time indexing system on HBase, shown in the following figure:
In this figure:
In contrast, the TiDB architecture is much simpler.
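For context, dark traffic evaluation means production requests keep being served by the existing system while a copy of each request is also sent to the candidate system purely for measurement. The sketch below is a simplified, hypothetical illustration of that mirroring pattern (the primary_store, shadow_store, and metrics objects are placeholders), not Pinterest's actual traffic-mirroring infrastructure.

```python
import time
from concurrent.futures import ThreadPoolExecutor

_shadow_pool = ThreadPoolExecutor(max_workers=8)

def handle_query(query, primary_store, shadow_store, metrics):
    """Serve the request from the primary store and mirror it to the
    shadow store asynchronously, recording latency for both systems.
    The caller only ever sees the primary's response."""
    start = time.monotonic()
    result = primary_store.query(query)
    metrics.record("primary_latency_ms", (time.monotonic() - start) * 1000)

    def mirror():
        s = time.monotonic()
        try:
            shadow_store.query(query)   # shadow response is discarded
            metrics.record("shadow_latency_ms", (time.monotonic() - s) * 1000)
        except Exception:
            metrics.record("shadow_error", 1)   # errors never affect users

    _shadow_pool.submit(mirror)
    return result
```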
TiDB has given us:
These results are from just one of our use cases. We evaluated other use cases as well, and, in general, we are satisfied with TiDB's performance and reliability.
We really appreciate the continuous support from the TiDB community and PingCAP. As we continue our collaboration, we're thinking about new challenges and how and where we can make TiDB even more useful and powerful for companies like Pinterest. For example:
If you'd like to learn more about our experience with TiDB, or if you have any questions, you can join the TiDB community on Slack. If you're interested, you can get started with TiDB here.
This article is based on a talk given by Lianghong Xu at PingCAP DevCon 2021.