Using pgloader to Migrate Data from MySQL to openGauss

Ju Peng2021-05-27Using pgloader to Migrate Data from MySQL to openGauss

Using pgloader to Migrate Data from MySQL to openGauss

Introduction to pgloader

pgloader is a data import tool that uses the COPY command to import data to PostgreSQL. pgloader works in two modes: importing data from files and migrating databases. In both modes, pgloader uses the COPY protocol of PostgreSQL to efficiently transfer data.

openGauss is compatible with PostgreSQL communication protocols and most syntaxes. For this reason, you can use pgloader to migrate data from MySQL to openGauss.

pgloader Issues in openGauss

openGauss performs security hardening on native PostgreSQL communication protocols. As a result, it is incompatible with the default PostgreSQL communication protocols, and by default, the native PostgreSQL that uses pgloader cannot connect to openGauss. An error similar to the following is reported:

The solution is to modify the GUC parameter. The involved GUC parameter is password_encryption_type. By default, PostgreSQL uses MD5 encryption, which is insecure. To improve the security capabilities of openGauss, openGauss uses SHA256 encryption by default. Therefore, the preceding error is reported. openGauss does not delete MD5 encryption and its verification logic. As such, MD5 encryption can be enabled by setting the GUC parameter.

Method:

gs_guc reload -D $PGDATA -c "password_encryption_type = 1"

You must create a user after setting the preceding parameter. Then, you can log in to the database as the user.

Next, use pgloader to migrate data from MySQL to openGauss.

Installing pgloader

You can install pgloader directly from apt.postgresql.org or the official Debian repository at packages.debian.org/pgloader.

$ apt-get install pgloader

You can also directly use pgloader through Docker images.

$ docker pull dimitri/pgloader

$ docker run --rm --name pgloader dimitri/pgloader:latest pgloader --version

$ docker run --rm --name pgloader dimitri/pgloader:latest pgloader –help

Configuring pgloader

pgloader provides various configuration items for you to customize migration operations. For example, you can run the include drop command to delete all tables whose names appear in MySQL in the target database, so that the tool can be started automatically in a clean environment by running the same command for multiple consecutive times.

The following describes some common configuration items.

FROM: URL of the source database. The format is as follows:

  mysql://[user[:password]@][netloc][:port][/dbname][?option=value&...]

INTO: URL of the target database. The format is as follows:

  postgresql://[user[:password]@][netloc][:port][/dbname][?option=value&...]

WITH: options when data is loaded from MySQL. The options are include drop, create tables, and create indexes. CAST: user-defined type conversion rule. You are allowed to overwrite existing default conversion rules or modify them in special cases.

For partial migration, you can use the configuration items including only table names matching and excluding table names matching to migrate only specific tables or exclude specific tables during migration.

For details about the configuration items, see the official website:

https://pgloader.readthedocs.io/en/latest/ref/mysql.html

The following is an example of the configuration file for migrating data from MySQL to openGauss:

LOAD DATABASE
FROM mysql://mysql_test:password123@1.1.1.1:3306/mysql_database
INTO postgresql://opengauss_test:********_***@1.1.1.1:5432/opengauss_database

WITH include drop, create tables, create indexes, reset no sequences,
      workers = 8, concurrency = 1,
      multiple readers per thread, rows per range = 50000

 CAST
 type varchar when(= 1 precision) to "boolean" drop typemod keep default keep not null;

The preceding configuration file indicates that the username and password for connecting to MySQL are mysql_test and password123 respectively during data migration. The IP address and port number of the MySQL server are 1.1.1.1 and 3306, respectively. The source database is mysql_database.

The username and password for connecting to openGauss are opengauss_test and **********_*****, respectively. The IP address and port number of the openGauss server are 1.1.1.1 and 5432, respectively, and the target database is opengauss_database.

Note that the user must have the permission to remotely connect to MySQL and openGauss and the read and write permissions on the corresponding databases. For openGauss, the host where pgloader runs must be in the remote access whitelist of openGauss.

Creating Users and Databases

This section describes how to create users and databases in openGauss required for migration.

Runing pgloader to Migrate Data

The following shows how to install pgloader by using Docker images. Name the prepared configuration file openGauss.loader.

Run the docker run -tid --name pgloader_test dimitri/pgloader command to start Docker.

Run the docker cp ./openGauss.loader pgloader_test:/ command to copy the configuration file to Docker.

Run the docker exec -it pgloader_test /bin/bash command to access Docker.

Run the pgloader openGauss.loader command to start pgloader, wait until the data migration is complete, and view the migration result report.

View the migration result in openGauss.