Migration of Etherpad Lite from MariaDB to PostgreSQL

I recently migrated an installation of Etherpad Lite from an old setup based on a git checkout to a newer container-based setup. The old setup used MariaDB as its backing database.

In the new container setup, I wanted to expose the database's UNIX domain socket to the container. Unfortunately, the environment variables used to configure the default Etherpad Lite container image don't permit configuring a MariaDB backend using a UNIX socket. However, such a setup is possible when using a PostgreSQL backend.

Therefore, I wanted to figure out whether the database backend could easily be switched, and the data migrated.

Built-in Migration Command

Now, Etherpad Lite comes with a handy migrateDB command line tool that promises to do exactly what I needed: provide an old and a new database configuration, and it migrates the data from one to the other:

pnpm run --filter bin migrateDB --file1 old.json --file2 new.json

The documentation of the command states:

After some time the data should be copied over to the new database.

However, after some time had passed (I think around 15 minutes), the command still hadn't finished, so I compared the number of records in the old MariaDB database and the new PostgreSQL database:

The old database contained around 6.5 million records. After these 15 minutes, only around 8500 of these records had been migrated to the new database. Extrapolating linearly from this, the entire migration would have taken around 8 days. That is 8 days during which the Etherpad Lite instance would have had to remain offline, because any changes made in the meantime would not have been migrated.

So this unfortunately was not an option.
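
As a quick sanity check of that estimate, here is a minimal Python sketch of the linear extrapolation. It uses only the rounded figures mentioned above and assumes the migration rate stays constant, which may not hold in practice. The function name estimate_total_days is made up for this illustration.

```python
# Rough figures from the text above; all three are approximate.
total_records = 6_500_000   # records in the old MariaDB database
migrated_records = 8_500    # records found in PostgreSQL so far
elapsed_minutes = 15        # time the migrateDB command had been running


def estimate_total_days(total, migrated, minutes):
    """Linearly extrapolate the full migration time, in days.

    Assumes the observed rate stays constant for the whole run.
    """
    records_per_minute = migrated / minutes
    total_minutes = total / records_per_minute
    return total_minutes / (60 * 24)


estimated_days = estimate_total_days(total_records, migrated_records, elapsed_minutes)
print(f"Estimated migration time: {estimated_days:.1f} days")
# prints "Estimated migration time: 8.0 days"
```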

pgloader

The next approach I tried was using pgloader, a "load anything into PostgreSQL" tool. It appeared to be just as straightforward as the migrateDB command: provide it with a source and a destination, and then let the magic happen. So my first try looked like this:

pgloader mysql://etherpad:password@mysql/etherpad postgresql:///etherpad

And, indeed, magic happened: After less than 4 minutes, the migration was completed!

Well, sort of... The data was imported into the schema etherpad in the database etherpad. However, it appeared that Etherpad Lite expected the schema to have the default name, public. Unfortunately, simply renaming the schema after the fact caused some permission issues which I wasn't able to figure out on the spot.

But a schema rename command can also be added to the pgloader invocation. To do this, the migration command has to be provided as a config file, rather than on the command line:

LOAD DATABASE
  FROM mysql://etherpad:password@mysql/etherpad
  INTO postgresql:///etherpad
  ALTER SCHEMA 'etherpad' RENAME TO 'public';

After dropping the erroneous schema, I re-ran pgloader, and this time Etherpad Lite started up with its PostgreSQL backend without any issues:

~$ pgloader my.load
2025-08-17T14:51:47.024000Z LOG pgloader version "3.6.10~devel"
2025-08-17T14:51:47.256001Z LOG Migrating from #<MYSQL-CONNECTION mysql://etherpad@mysql:3306/etherpad {1005C5CE93}>
2025-08-17T14:51:47.256001Z LOG Migrating into #<PGSQL-CONNECTION pgsql://etherpad@UNIX:5432/etherpad {1005C5CF13}>
2025-08-17T14:54:23.139960Z LOG report summary reset
             table name     errors       rows      bytes      total time
-----------------------  ---------  ---------  ---------  --------------
        fetch meta data          0          2                     0.160s
         Create Schemas          0          0                     0.000s
       Create SQL Types          0          0                     0.016s
          Create tables          0          2                     0.052s
         Set Table OIDs          0          1                     0.020s
-----------------------  ---------  ---------  ---------  --------------
           public.store          0    6483656   920.8 MB       1m32.360s
-----------------------  ---------  ---------  ---------  --------------
COPY Threads Completion          0          4                  1m32.368s
 Index Build Completion          0          1                   1m2.708s
         Create Indexes          0          1                   1m2.616s
        Reset Sequences          0          0                     0.152s
           Primary Keys          0          1                     0.004s
    Create Foreign Keys          0          0                     0.000s
        Create Triggers          0          0                     0.004s
       Install Comments          0          0                     0.000s
-----------------------  ---------  ---------  ---------  --------------
      Total import time          ✓    6483656   920.8 MB       3m37.852s
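
To put the two approaches side by side, the following Python sketch turns the "Total import time" line of the report into a rows-per-second figure and compares it with the rough rate observed for migrateDB. The helper parse_duration is hypothetical and only handles the "XmY.ZZZs" and "Y.ZZZs" forms that appear in this particular report; other duration formats are not covered.

```python
import re


def parse_duration(text):
    """Convert a duration such as '3m37.852s' or '0.160s' into seconds.

    Only the minute/second forms seen in the report above are supported.
    """
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# Values copied from the "Total import time" line of the report.
rows = 6_483_656
total_seconds = parse_duration("3m37.852s")
pgloader_rate = rows / total_seconds

# Approximate migrateDB figures from earlier: ~8500 records in ~15 minutes.
migratedb_rate = 8_500 / (15 * 60)

print(f"pgloader:  {pgloader_rate:,.0f} rows/s")
print(f"migrateDB: {migratedb_rate:.1f} rows/s")
print(f"speed-up:  roughly {pgloader_rate / migratedb_rate:,.0f}x")
```

Note that the pgloader figure includes index creation, so the raw copy rate for the store table alone is higher still.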

So, if you're planning to move a workload to a PostgreSQL backend, pgloader is definitely worth looking into.