We are pleased to announce that TiFlash, TiDB’s analytics engine, is now open source under the Apache 2.0 license. TiFlash is an essential component of TiDB’s Hybrid Transactional and Analytical Processing (HTAP) architecture, enabling users to unlock the value of their operational data in real time. By open sourcing TiFlash, we aim to accelerate innovation on the platform and grow the ecosystem of analytical applications it can power.
Since day one, open source has been a core belief at PingCAP. As a major force in the TiDB community, we collaborate with other contributors to grow the code base, while community users help shape and iterate on the project through their adoption.
TiFlash started out over two years ago as an exploratory extension for TiDB. We didn’t open source it at first because we wanted to wait until it was in better shape for the community. Over the course of its development, we have benefited immensely from the community, without which TiFlash would not have been possible.
TiFlash stands on the shoulders of giants. Beyond the libraries it depends on, the framework code of TiFlash is based on ClickHouse. We used ClickHouse as a standalone compute runtime and server framework for TiFlash, while reusing its storage interface. To achieve online transactional analytics, we added transaction-related logic, MPP capabilities, and a column storage engine that can be updated in real time. We also introduced the Raft protocol and MySQL compatibility so that TiFlash could be fully integrated with the TiDB architecture. We are grateful to ClickHouse for providing the community with a high-performance compute engine, and a building block that let us accelerate the development of TiFlash. It is worth mentioning, however, that TiFlash and ClickHouse target completely different scenarios: TiFlash is mostly focused on real-time analytics of transactional data, whereas ClickHouse is more about non-transactional data.
As a young analytical engine, TiFlash owes much of its growth to its community users. We iterate on the project quickly by listening closely to our users, which in turn has helped them shorten their analytics cycle and make better use of real-time insights. From the early failed experiments with a friendly user that led directly to our first project-level refactoring, to the successful adoptions in the community that take users smoothly through the most demanding scenarios, TiFlash has been battle-tested by hyper-growth companies like ZTO Express and Xiaohongshu. Along the way, our users from the TiDB community have always trusted TiFlash and shared their scenarios with us, even though TiFlash was not open source at the time. The trust and acknowledgment from the community have been paramount in helping TiFlash mature and land in more intensive and valuable scenarios. We believe open source can provide an even greater venue for TiFlash to be integrated into the community on a mutually beneficial basis.
As a newly open-sourced project, TiFlash still lacks some of the onboarding materials you need to get to know it better. We are planning a series of articles on TiFlash, which will be published over the next few months. To learn more about the architecture of TiFlash and how to contribute to it, you can refer to:
- TiFlash’s repository on GitHub
- TiDB: A Raft-based HTAP Database
- How We Build an HTAP Database That Simplifies Your Data Platform
- Delivering Real-time Analytics and True HTAP by Combining Columnstore and Rowstore
We look forward to your contributions to this project and your participation in our community. Let’s build our real-time analytical engine together.