migrating mysql2tidb banner

In our previous article, MySQL 5.7 EOL: Migrating to a MySQL Alternative, we explored the various alternatives to MySQL 5.7 after its imminent end of life (EOL). Among the alternatives discussed, migrating from MySQL to TiDB, an advanced open-source, distributed SQL database, emerged as a promising solution for those seeking scalability, high availability, and MySQL compatibility. It is especially suitable for handling large-scale data and accommodating the demands of rapidly evolving applications.

This article will serve as a practical guide to migrating data from MySQL to TiDB. We will delve into the intricacies of the data migration process, highlighting the various tools and best practices that can ensure a smooth and successful migration.

Preparing for Migration

Before diving into the actual data migration, it’s essential to make some necessary preparations. This phase is critical for ensuring that the migration process runs smoothly and without issues. We’ll take you through the steps that need to be taken for preparing your database, considering the compatibility, and setting up your TiDB cluster for the migration.

Assessing Your Data and Compatibility

Before starting the migration process, it’s important to understand the volume, structure, and type of data you’re migrating. This will help you select the appropriate tool and mode (full or incremental) for migration. 

Furthermore, although TiDB is designed to be compatible with MySQL, it’s essential to verify the compatibility of your current MySQL schema and queries with TiDB. You can refer to the TiDB compatibility documentation  for detailed information on compatibility with MySQL.

Setting Up the TiDB Cluster

Before you can migrate your data, you need to have a TiDB cluster ready. Depending on your preference and requirements, you can either set up a TiDB cluster on-premises or in the cloud. Follow the instructions TiDB documentation to to get your TiDB cluster up and running.

Ensuring Sufficient Resources and Backup

Make sure that the TiDB cluster you are migrating to has enough resources to handle the data you are transferring. Monitor CPU, memory, disk space, and network bandwidth. This is especially crucial if you are dealing with large datasets.

Additionally, before making any changes to your MySQL database, it’s highly recommended to create a full backup of your data. In case anything goes wrong during the migration process, this will allow you to restore your database to its original state.

Configuring Security and Permissions

For both MySQL and TiDB, ensure that the necessary permissions are granted for the data transfer. This typically involves setting up user accounts with the necessary privileges to read from the MySQL database and write to the TiDB database.

Also, consider the security of your data during the migration. Data should be transferred over secure connections, especially if migrating over the internet or cloud. Utilize encryption where necessary to ensure that sensitive data is protected.

Planning for Downtime

Keep in mind that the migration process may impact your applications. Therefore, you need to plan for potential downtime and reconfigure your applications with new data source names (DSNs) or connection strings. Or, to achieve zero downtime, you can configure both source and target databases to generate non-overlapping primary keys and perform a “rolling restart” of all applications.  

Choosing the Right Migration Tool

Having prepared your data and set up your TiDB cluster, the next crucial step is selecting the right tool for migrating your data from MySQL to TiDB. TiDB offers multiple options for data migration such as TiDB Data Migration (DM), TiDB Lightning, and Dumpling. The choice of tool depends on the size of your dataset and your specific requirements for full or incremental migration.

TiDB Data Migration (DM)

TiDB Data Migration (DM) is an integrated data migration task management platform, capable of handling both full data migration and incremental data replication from MySQL-compatible databases such as MySQL, MariaDB, Google Cloud MySQL, and Aurora MySQL to TiDB. It is particularly useful for full data migrating of small datasets less than 1 TiB. For datasets exceeding 1 TiB, it is recommended to use the DM physical import mode to expedite the migration. Alternatively, you can consider TiDB Lightning for full data import and then use DM to migrate the incremental data. For a comprehensive guide on DM usage, refer to the TiDB Data Migration documentation.

TiDB Lightning

TiDB Lightning is a tool used for importing large volumes of data into a TiDB cluster quickly. It is primarily used for full data migration, specifically for importing data dumps.

However, be cautious that TiDB Lightning consumes a significant amount of resources, and operations on the target table will be affected during the import. Refer to the TiDB Lightning documentation for a comprehensive guide on how to use this tool.

Dumpling

Dumpling is a data export tool that makes it easy to dump entire TiDB/MySQL databases in various formats, including SQL and CSV files. It can be paired with TiDB Lightning for a complete migration solution. Refer to the Dumpling documentation for detailed instructions on how to export data using Dumpling. 

Decision Matrix

Here’s a quick summary to help you decide:

ToolBest forData SizeMigration Type
DMSmall datasets with incremental changesLess than 1 TiBFull + Incremental
TiDB LightningLarge datasets and fast full migrationLarge datasets > 1 TiBFull
DumplingExporting data in SQL or CSV formatsAny sizeExport (For import use with TiDB Lightning)

By selecting the right tool based on your dataset size and migration requirements, you can ensure a more efficient and smooth migration process. For a comprehensive overview of the migration options, see TiDB Migration Tools Overview.

Migrating and Validating Data

Now that you have chosen the appropriate migration tool based on the size of your dataset and your specific requirements, it’s time to execute the data migration. In this section, we will walk through the process of full data migration and cover two different MySQL-to-TiDB migration scenarios: small datasets and large datasets.

Full Data Migration for Small Datasets

For datasets smaller than 1 TiB, TiDB DM is a suitable tool that reduces the operation cost of data migration and simplifies the troubleshooting process. Here’s how to use DM for migrating small datasets. 

 Prerequisites

 Create the Data Source

  1. Configure the data source in a yaml file, which specifies the MySQL configuration such as host, user, password, and port.
    
    # The ID must be unique.
    source-id: "mysql-01"
    
    # Configures whether DM-worker uses GTID to pull binlogs.
    enable-gtid: true
    
    from:
      host: "${host}"          
      user: "root"
      password: "${password}" 
      port: 3306
    
  1. Load this data source configuration to the DM cluster using the tiup dmctl command.
    
    tiup dmctl --master-addr ${advertise-addr} operate-source create source1.yaml
    

Create the Migration

Configure the migration task in a yaml file:

name: "test"
task-mode: "all"
target-database:
  host: "${host}"   
  port: 4000
  user: "root"
  password: "${password}"    

# The configuration of all MySQL instances  
mysql-instances:
-
  source-id: "mysql-01"
  block-allow-list: "listA"

# The global configuration of blocklist and allowlist.
block-allow-list:
  listA:                       # name
    do-tables:    		    # Allowlist of the upstream table to be migrated.
    - db-name: "test_db"       # Schema name of the table to be migrated.
      tbl-name: "test_table"   # Name of the table to be migrated.

Start the Migration

  1. Check the status and start the migration task:
    tiup dmctl --master-addr ${advertise-addr} check-task task.yaml tiup dmctl --master-addr ${advertise-addr} start-task task.yaml
  1. Check the migration status. Monitor the migration task using the query-status command: 
    tiup dmctl --master-addr ${advertise-addr} query-status ${task-name} 

You can also view DM-related monitoring metrics in the DM dashboard if you have deployed Prometheus, Alertmanager, and Grafana when deploying DM using TiUP.

Full Data Migration for Large Datasets

For datasets larger than 1 TiB, it’s recommended to use Dumpling in conjunction with TiDB Lightning for full data migration.  After the full migration is completed, you can replicate the incremental data using DM.

Prerequisites

Export Data from MySQL

Export data from MySQL using Dumpling. For more Dumpling parameters, refer to Dumpling Overview.

dumpling --host ${host} --user ${user} --password ${password} --port ${port} -o ${output_directory}

Create the Import Task

Configure the tidb-lightning.toml file as shown below. For more information on TiDB Lightning configuration, refer to TiDB Lightning Configuration.

[lightning]
# log.
level = "info"
file = "tidb-lightning.log"

[tikv-importer]
backend = "local" #"local": Default backend for importing large volumes.
sorted-kv-dir = "${sorted-kv-dir}" # Temporary storage directory for sorted KV files. 

[mydumper]
data-source-dir = "${data-path}" # the export path in step 1. 

[tidb]
# The target TiDB cluster information.
host = ${host}                # e.g.: 172.16.32.1
port = ${port}                # e.g.: 4000
user = "${user_name}"         # e.g.: "root"
password = "${password}"      # e.g.: "rootroot"
status-port = ${status-port}  # e.g.: 10080
pd-addr = "${ip}:${port}"     # e.g.: 172.16.31.3:2379

Start Importing

  1. Start the import by running tidb-lightning. 
  2. Check progress in the TiDB Lightning web interface, or the monitoring dashboard if you have also deployed Prometheus, Alertmanager, and Grafana.

Finish Importing

After TiDB Lightning completes the import, it exits automatically. Check whether tidb-lightning.log contains the whole procedure completed in the last lines. If yes, the import is successful.

Validating the Data 

After initiating the data migration process, it’s crucial to monitor the migration progress and validate the data to ensure accuracy and completeness. TiDB provides sync-diff-inspector, a powerful data validation tool that lets you compare the data in MySQL with that in TiDB or reversely. You can also use it to repair data in a scenario where a small amount of data is inconsistent. See sync-diff-inspector User Guide for more instructions. 

Conclusion

Data migration is a crucial task that requires careful planning and execution. When migrating data from MySQL to TiDB, understanding the nature of your data, planning for downtime, choosing the right migration strategy, and validating your data after migration are critical steps.

This wraps up our guide on migrating from MySQL to TiDB. However, keep in mind that every migration scenario is unique, and it may require custom adjustments. Please don’t hesitate to contact our experts for assistance with your migration process.


Connect with PingCAP


Experience modern data infrastructure firsthand.

Try TiDB Serverless

Have questions? Let us know how we can help.

Contact Us

TiDB Cloud Dedicated

A fully-managed cloud DBaaS for predictable workloads

TiDB Cloud Serverless

A fully-managed cloud DBaaS for auto-scaling workloads