Introduction to TiDB and a guide to Apache Kafka partitions

Join PingCAP and Aiven for an informative tech evening and network with other like-minded people.

Time: July 13th, 6 PM – 9 PM CEST

Location: Aiven office in Berlin

Address: Schönhauser Allee 148 · 10435 Berlin

Registration: Berlin Open Source Data Infrastructure Meetup – July 2023 (on Meetup), Thu, Jul 13, 2023, 6:00 PM

Details

Are you interested in learning more about open-source data technologies? Do you want to network with other like-minded people in a fun, relaxed environment?

Program:
6.15 PM – Open Doors
6.30 PM – Welcome
6.40 PM – 7.00 PM – Food & refreshments
7.00 PM – Introduction to TiDB – a distributed SQL database

We will go through TiDB's architecture and how it distributes data and provides high availability. Besides its row-based storage for efficient transaction handling, TiDB also provides optional columnar storage to speed up analytical queries.

About the speaker: Mattias Jonsson – Senior Database Engineer at PingCAP. He has worked with MySQL for more than 15 years. Prior to joining PingCAP, he worked for Booking.com, where he spent a significant amount of time on MySQL.

7.30 PM – Beginner's guide to balancing your data across Apache Kafka partitions

Apache Kafka is a distributed system. At the heart of Apache Kafka is a set of brokers that contain topics. Topics are split into partitions. Dividing topics into smaller pieces allows us to work with data in parallel and achieve higher data throughput.

Such parallelization is the key to a performant cluster, but it comes at a price. First, reading from multiple partitions does not preserve the overall order of records, so the resulting order can differ from the order in which the data was pushed into the cluster. Another big challenge is an uneven distribution of data across partitions.

Overloaded partitions are a serious performance problem for all involved parties, but especially for brokers and consumers. Therefore, when building our product architecture, we should carefully weigh how many partitions we need, how to ensure proper message ordering, how to balance records across partitions, and how the data load will be distributed over time, all while still maintaining good cluster performance.
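To make the balancing problem concrete, here is a minimal Python sketch of key-based partitioning. It is an illustration only, not part of the talk: the function names and the sample workload are hypothetical, and `zlib.crc32` stands in for the hash used by a real Kafka client (the default Java partitioner hashes keys with murmur2). Records with the same key always land in the same partition, which keeps their relative order but lets a single hot key overload one partition.

```python
import zlib
from collections import Counter


def pick_partition(key: str, num_partitions: int) -> int:
    """Map a record key to a partition index.

    Assumption: crc32 is a stand-in hash chosen for illustration;
    real Kafka clients use their own hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_load(keys, num_partitions):
    """Count how many records each partition would receive."""
    counts = Counter(pick_partition(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical workload: one very active customer plus 100 quiet ones.
keys = ["customer-1"] * 900 + [f"customer-{i}" for i in range(2, 102)]
load = partition_load(keys, 3)
print(load)  # one partition receives at least 900 of the 1000 records
```

Typical remedies for this kind of skew include choosing a key with more evenly spread values or adding a suffix to hot keys, at the cost of losing strict ordering for that key.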

If you’re fresh to Apache Kafka, or looking for good practices to design your partitions and avoid common pitfalls, you’ll find this session useful!

About the speaker: Olena is a Sr. Developer Advocate at Aiven. With a background in software engineering, she has led teams and developed mission-critical applications at Nokia, HERE Technologies, and AWS. At Aiven, she supports developers and customers in using open-source data technologies such as Apache Kafka, ClickHouse, and OpenSearch. She is also an international public speaker who regularly presents at conferences around the world. She holds AWS Developer and Solutions Architect certifications and is a Confluent Catalyst.

8 PM – 9 PM – More food & Socialising

*Please note that this is an alcohol-free event.
