Back up MySQL databases with Docker and S3

For a complete working solution, check out the GitHub repository.

MySQL is fairly easy to get running with Docker Compose. The following example mounts a named volume so data is persisted across container restarts. The root password is kept in a separate .env file, outside of source control.

version: '3.8'

services:
  mysql:
    image: mysql:latest
    container_name: mysql
    restart: always
    env_file: .env
    command: mysqld --default-authentication-plugin=mysql_native_password
    volumes:
      - mysql-data:/var/lib/mysql
    environment:
      MYSQL_ROOT_PASSWORD: $MYSQL_ROOT_PASSWORD
      
volumes:
  mysql-data:

The mysql_native_password parameter was necessary for my client (MySQL Workbench) to connect correctly.

After bringing this docker-compose.yaml up with docker-compose up -d, other containers on the same Docker network can reach the database at hostname mysql on port 3306.
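
To quickly verify connectivity, a one-off client container can be attached to the same Compose network. The network name below is an assumption; Compose derives it from the project directory, and docker network ls shows the actual name.

# Hypothetical network name; replace with the one shown by `docker network ls`.
docker run --rm -it --network myproject_default mysql:latest \
    mysql -h mysql -P 3306 -u root -p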

Personally, I run a bunch of isolated web applications that share the same MySQL instance. Each application has its own MySQL user and database.
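
Creating such an isolated user and database can be done through the running container. A minimal sketch, assuming a hypothetical application called app1 (name and password are placeholders, and MYSQL_ROOT_PASSWORD must be set in the calling shell):

docker exec -i mysql mysql -uroot -p"$MYSQL_ROOT_PASSWORD" <<'SQL'
CREATE DATABASE app1;
CREATE USER 'app1'@'%' IDENTIFIED BY 'change-me';
GRANT ALL PRIVILEGES ON app1.* TO 'app1'@'%';
FLUSH PRIVILEGES;
SQL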

This setup has been running fine for a long time, but there is always a possibility that something somewhere in the tech stack corrupts the database. It could even be caused unintentionally by me.

For my use cases, I would say a solid backup system has:

  • Automated daily backups
  • Automated restore tests
  • Minimal downtime on running applications
  • Backup storage in multiple locations
  • Cost-effective backup storage

It is usually a good idea to separate responsibilities across containers, so the backup process has its own container definition.

  backup:
    build: ./backup
    container_name: mysql-backup
    environment:
      MYSQL_CONTAINER_NAME: mysql
      MYSQL_ROOT_PASSWORD: $MYSQL_ROOT_PASSWORD
      S3_ENDPOINT: $S3_ENDPOINT
      S3_ACCESS_KEY: $S3_ACCESS_KEY
      S3_SECRET_KEY: $S3_SECRET_KEY
      S3_BUCKET: $S3_BUCKET
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./backup/snapshots:/opt/mysql/backup
    command: crond -f -d 8
    restart: always
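
The $-prefixed values above are substituted by Docker Compose from the .env file in the project directory. A minimal example, with every value a placeholder (the endpoint depends on your provider and region):

# .env - kept out of source control
MYSQL_ROOT_PASSWORD=change-me
S3_ENDPOINT=https://s3.fr-par.scw.cloud
S3_ACCESS_KEY=...
S3_SECRET_KEY=...
S3_BUCKET=mysql-backups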

The backup directory contains a simple Dockerfile for building the container. It copies the backup script into a cron directory and installs the software needed to access the database and to download the MinIO client used for uploading to S3.

FROM alpine:latest

# Install the backup script(s) into busybox crond's daily directory.
COPY ./scripts/daily/* /etc/periodic/daily/

# mysql-client provides mysqldump for accessing the database,
# curl is used to download the MinIO client for uploading to S3.
RUN apk update && \
    apk upgrade && \
    apk add --no-cache mysql-client curl && \
    chmod a+x /etc/periodic/daily/*
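
Alpine's default root crontab makes busybox crond run everything in /etc/periodic/daily once a day, and crond -f -d 8 keeps it in the foreground with logging to stderr. The job can also be triggered by hand for testing:

# Run all daily cron scripts immediately instead of waiting for crond.
docker exec mysql-backup run-parts /etc/periodic/daily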

The actual backup script can be found here (a simplified sketch follows the steps below). It has a few main steps:

  1. Dump all databases to separate files. The mysqldump utility writes everything to a single file when run with --all-databases, but I find the restore process easier to manage when each database has its own file.
  2. Merge all backup files into a single tar file. This makes it easier to treat the entire backup as a "snapshot" and to apply lifecycle management to it in S3. The individual backup files can then be removed.
  3. Upload the tar file to S3, or in my case, the S3-compatible Scaleway Object Storage.
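
The sketch below is not the exact script from the repository: the password handling, file naming and error handling are assumptions, and it expects the mc (MinIO client) binary to have been downloaded during the image build.

#!/bin/sh
set -eu

BACKUP_DIR=/opt/mysql/backup
SNAPSHOT="$BACKUP_DIR/$(date +%Y-%m-%d).tar"

# 1. Dump every database to its own file, skipping MySQL's internal schemas.
DATABASES=$(mysql -h "$MYSQL_CONTAINER_NAME" -uroot -p"$MYSQL_ROOT_PASSWORD" -N -e 'SHOW DATABASES' \
    | grep -Ev '^(information_schema|performance_schema|mysql|sys)$')

for DB in $DATABASES; do
    mysqldump -h "$MYSQL_CONTAINER_NAME" -uroot -p"$MYSQL_ROOT_PASSWORD" \
        --single-transaction "$DB" > "$BACKUP_DIR/$DB.sql"
done

# 2. Merge the dumps into a single snapshot and remove the individual files.
cd "$BACKUP_DIR"
tar -cf "$SNAPSHOT" ./*.sql
rm ./*.sql

# 3. Upload the snapshot to the S3 bucket with the MinIO client.
mc alias set backup "$S3_ENDPOINT" "$S3_ACCESS_KEY" "$S3_SECRET_KEY"
mc cp "$SNAPSHOT" "backup/$S3_BUCKET/"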

A single tar is around 10 MB for me, which is not that big. However, I plan to keep this process running for years to come, so storage costs will accumulate over time. In AWS and Scaleway it is trivial to set up a lifecycle rule that moves tar files to "cold storage" after 30 days.
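
Such a rule can also be created from the command line through the standard S3 lifecycle API. A sketch using the AWS CLI against the existing environment variables; the rule ID is made up, and GLACIER is assumed to be the provider's cold storage class:

# lifecycle.json: move objects to cold storage 30 days after creation.
cat > lifecycle.json <<'JSON'
{
  "Rules": [
    {
      "ID": "backups-to-cold-storage",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [ { "Days": 30, "StorageClass": "GLACIER" } ]
    }
  ]
}
JSON

aws s3api put-bucket-lifecycle-configuration \
    --endpoint-url "$S3_ENDPOINT" \
    --bucket "$S3_BUCKET" \
    --lifecycle-configuration file://lifecycle.json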

A follow-up post will cover automated restore tests and copying the backup files to multiple locations.
