Industry: Logistics
Author: Mu Yang (JD Cloud Database Product Manager)
Transcreator: Caitin Chen; Editor: Tom Dewan
JD.com is China's leading one-stop e-commerce platform and a member of the Global Fortune 500. It provides 500 million+ active customers with an unrivalled range of authentic, high-quality products. The vast majority of orders placed on JD can be delivered right to the customer's doorstep on the same day or the day after the order is placed.
As our business developed and data size boomed, we faced severe database challenges. Stand-alone MySQL had limited storage capacity, complex queries took a long time to execute, and the database was hard to maintain. We tried sharding, Elasticsearch, and ClickHouse®, but they were not ideal.
Finally, we found our optimal solution: TiDB, an open-source, NewSQL database. TiDB helps us scale out our databases and increases our large parcel sorting system's performance by 8x. We estimate that in the next two years it can reduce the IT costs of our logistics billing system by 67%.
We have deployed dozens of TiDB clusters on the cloud to support multiple top-priority application systems. These clusters have passed the rigorous test of JD's big sales campaign with very stable performance.
In this post, I'll share our database challenges, why we chose TiDB instead of other solutions, and how we use it as a MySQL alternative in three applications.
As our business developed, our data volume grew fast. We faced these challenges:
It was hard to scale out our databases.
Because our data size increased quickly, we had to scale out our databases frequently. Every time we did, we had to apply for resources, assess risks, prepare plans, and coordinate with the application team. This process was a real burden for our architects.
Complex SQL queries slowed us down.
As our data volume grew and our applications became more complicated, our SQL queries became less and less efficient. Developers had to continuously optimize the system, which put a lot of pressure on them.
Database operations and maintenance were complicated.
To store so much data, we tried database sharding on many of our systems. This architecture was difficult to operate and maintain, and it was easy to make mistakes.
To solve our business pain points, we tried sharding, Elasticsearch, and ClickHouse. But they were unsatisfactory. Our experts from various application teams conducted thorough research, and we found that TiDB was an ideal solution. We cooperated with PingCAP and jointly launched Cloud-TiDB, a distributed database on the cloud, to provide TiDB services on JD Cloud.
We found sharding has these disadvantages:
In some analytical and query scenarios, we tried to use Elasticsearch and ClickHouse. We found they had these limitations:
TiDB is an open-source, distributed SQL database that supports Hybrid Transactional/Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability. You can learn more about TiDB's architecture here.
Based on TiDB, we developed Cloud-TiDB along with PingCAP to run TiDB on JD Cloud.
Cloud-TiDB has these features:
One-click deployment
When we create a cluster, we can customize the node size and the number of nodes, and we can upgrade the database version online. All of these operations take just a few clicks.
Single-command scaling
To improve the cluster's storage capacity and processing capabilities, we can add TiDB, TiKV, and TiFlash nodes online. The scaling process does not impact the application.
Multi-dimensional monitoring and alerting
To help our developers and operations engineers better use TiDB, we combine cloud monitoring with TiDB's Grafana and Dashboard.
We chose Cloud-TiDB to support some of our applications with massive data, because:
We also compared TiDB with other distributed databases. In the following table, we call them Databases A and H. In general, TiDB is superior to them in the following areas:
| Feature or capability | Cloud-TiDB | Database A | Database H |
|---|---|---|---|
| Cluster architecture | Supports hundreds of nodes | 1 primary and 15 secondaries | 1 primary and 15 secondaries |
| Node read and write | Multi-active: all nodes that provide external services can both read and write | Only the primary node can write | Only the primary node can write |
| Storage capacity | Petabytes of data | 100 TB of data | 128 TB of data |
| Cluster QPS | 1 million+ | 1 million+ | 1 million+ |
| MySQL compatibility | Yes | Yes | Yes |
| Scaling out in minutes | Yes | Yes | Yes |
| Backup in seconds | No | Yes | No |
| Whole-cluster read latency | No latency | No latency | Unknown |
| Distributed transactions with strong consistency | Yes | Yes | Yes |
| Online Analytical Processing (OLAP) support | Yes | No | No |
Our logistics billing system stores a large amount of data. Its three primary tables had 2 billion, 5 billion, and 10 billion rows of data. After the system had been in production for six months, its largest table reached 22 billion rows. Because the data size was so large, we used sharded MySQL at the very beginning. However, after a while, we encountered some problems with MySQL sharding. For example:
The following diagram shows our original MySQL-based logistics billing system and how it looks now after we switched to TiDB:
After we migrated to TiDB, TiDB showed good performance:
* Writes and updates both took about 100 milliseconds.
* Both non-sum queries and sum queries took only 20-30 milliseconds. (A sketch of these query shapes follows this list.)
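To make those query shapes concrete, here is a minimal JDBC sketch of the two kinds of statements we benchmark. The table and column names (a `billing_detail` table with `waybill_id`, `fee_type`, `amount`, `customer_id`, and `billing_date`) are hypothetical; the point is that both the point lookup and the SUM aggregation go through the same MySQL-compatible driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BillingQueries {
    // Non-sum query: fetch the fee records of a single waybill.
    static void feeRecords(Connection conn, String waybillId) throws Exception {
        String sql = "SELECT fee_type, amount FROM billing_detail WHERE waybill_id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, waybillId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%s: %s%n",
                            rs.getString("fee_type"), rs.getBigDecimal("amount"));
                }
            }
        }
    }

    // Sum query: aggregate one customer's fees for a given billing date.
    static void dailyTotal(Connection conn, String customerId, java.sql.Date day) throws Exception {
        String sql = "SELECT SUM(amount) FROM billing_detail "
                   + "WHERE customer_id = ? AND billing_date = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, customerId);
            ps.setDate(2, day);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    System.out.println("Daily total: " + rs.getBigDecimal(1));
                }
            }
        }
    }
}
```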
To migrate a system with tens of billions of rows of data from MySQL to TiDB, we didn't modify a single line of code.
We just changed the username and password in our Java Database Connectivity (JDBC) configuration, and the migration from MySQL to TiDB was seamless. Because TiDB is highly compatible with MySQL, our trial-and-error, testing, and migration costs were low, and we saw benefits from TiDB quickly.
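As a rough illustration of how small the change was, here is the kind of JDBC setup involved. The host name, port, database, and account below are hypothetical placeholders (TiDB listens on port 4000 by default); the ordinary MySQL Connector/J driver and all existing SQL stay the same because TiDB speaks the MySQL wire protocol.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BillingDataSource {
    // Only the connection settings change; the MySQL Connector/J driver on the
    // classpath and the application SQL are untouched.
    private static final String URL =
            "jdbc:mysql://tidb.billing.example.internal:4000/billing?useSSL=false";
    private static final String USER = "billing_app";  // hypothetical account
    private static final String PASSWORD = System.getenv("BILLING_DB_PASSWORD");

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT VERSION()")) {
            if (rs.next()) {
                // TiDB reports a MySQL-compatible version string, e.g. "...-TiDB-v<version>".
                System.out.println("Connected to: " + rs.getString(1));
            }
        }
    }
}
```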
TiDB greatly reduced our IT infrastructure cost.
We estimate that over a two-year period, the cost of using TiDB will be only 37% of the cost of MySQL. This is because TiDB has a high data compression ratio. Our test found that after migration, 10.8 TB of data in MySQL shrank to only 3.2 TB in TiDB, and that 3.2 TB already includes TiDB's three data replicas. The space usage ratio of MySQL to TiDB is therefore about 3.4:1.
| Cluster | Space usage | Ratio |
|---|---|---|
| MySQL | 10.8 TB | 3.4 |
| TiDB | 3.2 TB | 1 |
Before we switched to TiDB, our architecture was a four-shard MySQL cluster. At our data growth rate, we needed to scale out the database every six months, and each time we had to add four sets of active-standby instances. Now that we have switched to TiDB, we can scale out the database based on our actual needs.
| Cluster | Scale | Scaling scheme |
|---|---|---|
| MySQL | 4 * (16-core, 64 GB, 3 TB) * 2, active-standby | Add four sets of active-standby instances every six months |
| TiDB | 2 TiDB nodes, 7 TiKV nodes, 3 PD nodes, and 1 monitor node; each node has 16 cores | Add nodes based on actual needs |
The following figure shows that in the 24th month:
Previously, we used MySQL to support real-time kanbans (visual boards for workflow management) and mission-critical reports for our large parcel sorting system. As the data size increased and SQL queries became more complex, kanban and report performance degraded, and the user experience suffered. We also tried MySQL sharding, but it was intrusive to our application code, it required major changes to the architecture, and the risk was high.
When we switched from MySQL to TiDB, we used dcomb, our self-developed data replication system, to replicate data in near real time. With TiDB, the overall performance of hundreds of metrics increased by 8x.
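dcomb is an internal system, so we can't show its code here. The sketch below only illustrates, under our own assumptions, what the downstream write path can look like: row-level change events applied to a hypothetical `sorting_metrics` table in TiDB through standard MySQL-style batched upserts, which TiDB accepts because it is MySQL compatible.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class TidbChangeApplier {
    // Hypothetical row-change event produced by the replication pipeline (Java 16+ record).
    record MetricRow(String stationId, String metricName, long value, java.sql.Timestamp ts) {}

    // Apply a batch of replicated rows with MySQL-style upserts in one transaction.
    static void applyBatch(Connection conn, List<MetricRow> rows) throws Exception {
        String sql = "INSERT INTO sorting_metrics (station_id, metric_name, metric_value, updated_at) "
                   + "VALUES (?, ?, ?, ?) "
                   + "ON DUPLICATE KEY UPDATE metric_value = VALUES(metric_value), "
                   + "updated_at = VALUES(updated_at)";
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (MetricRow r : rows) {
                ps.setString(1, r.stationId());
                ps.setString(2, r.metricName());
                ps.setLong(3, r.value());
                ps.setTimestamp(4, r.ts());
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        } catch (Exception e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```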
The data in our waybill accounting system grew by tens of millions of rows each day, and a single table could hold close to 20 billion rows. We couldn't use MySQL to store that much data. We once tried Presto; its performance and storage capacity were acceptable, but it was expensive. Later, we used Elasticsearch to query the data, but Elasticsearch was unstable. In our business, billing items change often, so the table schemas need to change as well, and the database maintenance workload was heavy.
After we migrated to TiDB:
TiDB has helped us:
TiDB has been so successful for us that it has become a benchmark for cost saving and efficiency within our company. We expect that by the end of 2021, the total number of CPU cores in JD.com's TiDB clusters will increase by 100% and exceed 10,000.
If you want to know more details about our story or have any questions, you're welcome to join the TiDB community on Slack and send us your feedback.