TiDB+Pinterest

Title: How Pinterest Used TiDB to Modernize Their HBase Workloads

Time: September 14, 6 PM Pacific Daylight Time. Doors will open at 5:30 PM.

Introduction:

The Pinterest Storage and Caching team is responsible for several critical business functions, including ads, shopping, and trust and safety. Pinterest storage services are built on the HBase ecosystem, and they have one of the largest HBase production deployments: approximately 50 production clusters spanning more than 9,000 virtual machines, with 6 PB of source-of-truth data on disk. The HBase ecosystem has several advantages over other systems, including strong row-level consistency at high data volumes, a flexible schema, low-latency access to data, and Hadoop integration. However, HBase cannot serve the needs of Pinterest’s clients for the next 3–5 years: it is costly to operate, it is complex, and it lacks important functionality such as secondary indexes and support for transactions.

To find a solution, Pinterest evaluated more than 10 storage backends and benchmarked the three most promising ones. The team used shadow traffic, which asynchronously copies production traffic to a nonproduction environment, to perform an in-depth performance evaluation. Pinterest decided to adopt TiDB for its next-generation Unified Storage Service.

In this talk, Ankita Girish Wagh from Pinterest will discuss what the team learned from adopting TiDB and how it performed in the first few use cases.

Speaker:

Ankita Girish Wagh
Senior Software Engineer, Pinterest

At Pinterest, Ankita’s work focuses on TiDB migration, Ixia (a secondary indexing service built on HBase), and the caching infrastructure. Before Pinterest, she was a Software Engineer in Compute Platform at Uber, where she worked on a server provisioning service and helped build an ecosystem around it for Uber’s in-house data centers. Ankita has nine years of industry experience and holds a master’s degree in computer science from Texas A&M University.