DB-to-S3 attachment migration
Availability
[SINCE Orbeon Forms 2025.1.1]
Introduction
When switching your attachment storage from a relational database to S3 (see Storing attachments in the filesystem or on S3), existing attachments remain in the database. The DB-to-S3 attachment migration tool moves those existing attachments from the database to S3, so that all attachments are stored in a single location.
The tool:
- reads attachments from the orbeon_form_data_attach and orbeon_form_definition_attach database tables
- uploads them to the configured S3 bucket
- verifies data integrity (size and SHA-256 hash)
- nullifies the file_content column in the database after a successful upload
If an attachment already exists on S3 with the correct content, the tool skips the upload and only nullifies the database column.
Prerequisites
It is highly recommended to back up your database before running this tool. While the tool includes integrity checks, a backup ensures you can recover if anything goes wrong.
Running with Docker
The tool is available as a Docker image: orbeon/db-to-s3-attachment-migration.
Basic usage
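A minimal sketch of a command-line invocation, built only from the image name and the parameters documented on this page. It assumes that the image accepts parameters as space-separated `--flag value` pairs and that no image tag is needed; neither is confirmed here. The wrapper name `run_migration`, the JDBC URL, the bucket name, and the shell variables holding the credentials are all hypothetical.

```shell
# Hypothetical wrapper: runs the migration image with the required parameters
# plus database credentials.
# Assumptions: `--flag value` syntax; placeholder JDBC URL and bucket name.
# DB_PASSWORD, S3_ACCESS_KEY and S3_SECRET_ACCESS_KEY are shell variables
# that the caller is expected to set beforehand.
run_migration() {
  docker run --rm \
    orbeon/db-to-s3-attachment-migration \
    --db-url "jdbc:postgresql://db.example.com:5432/orbeon" \
    --db-user "orbeon" \
    --db-password "$DB_PASSWORD" \
    --s3-bucket "my-orbeon-attachments" \
    --s3-access-key "$S3_ACCESS_KEY" \
    --s3-secret-access-key "$S3_SECRET_ACCESS_KEY" \
    "$@"
}
```

Calling `run_migration` starts the container; any extra arguments are appended after the ones shown.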
Using environment variables
Instead of command-line arguments, you can use environment variables. To keep secrets off the command line, store them in a file and pass it with Docker's --env-file option.
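As a sketch of the env-file approach, the following writes the documented `ORBEON_*` variables to a file and passes it to the container. The file name `migration.env`, the wrapper name, and every value are placeholders.

```shell
# Write the settings to an env file (one KEY=VALUE per line, no quotes).
# All values below are placeholders.
cat > migration.env <<'EOF'
ORBEON_DB_URL=jdbc:postgresql://db.example.com:5432/orbeon
ORBEON_DB_USER=orbeon
ORBEON_DB_PASSWORD=change-me
ORBEON_S3_BUCKET=my-orbeon-attachments
ORBEON_S3_ACCESS_KEY=AKIAEXAMPLE
ORBEON_S3_SECRET_ACCESS_KEY=change-me-too
EOF

# Restrict the file to the current user, since it contains secrets.
chmod 600 migration.env

# Hypothetical wrapper: passes the file to the container with --env-file.
run_migration_from_env_file() {
  docker run --rm --env-file migration.env \
    orbeon/db-to-s3-attachment-migration "$@"
}
```

With this in place, `run_migration_from_env_file` starts the container without any secret appearing in the process list or shell history.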
JDBC drivers
The Docker image includes the PostgreSQL JDBC driver. For licensing reasons, drivers for other databases (MySQL, Oracle, SQL Server, DB2) are not included. To use one of these databases, provide the appropriate JDBC driver JAR by mounting it into the /opt/drivers/ directory.
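One way to mount a driver is a read-only bind mount of the JAR file. The sketch below assumes the JAR sits at an absolute path on the host; the wrapper name is hypothetical, and the tool's own parameters are expected from the caller, as arguments or environment variables.

```shell
# Hypothetical wrapper: mounts a JDBC driver JAR from the host into
# /opt/drivers/ (read-only), then runs the migration image.
# $1: absolute host path of the driver JAR; remaining arguments go to the tool.
run_migration_with_driver() {
  driver_jar=$1
  shift
  docker run --rm \
    -v "$driver_jar:/opt/drivers/$(basename "$driver_jar"):ro" \
    orbeon/db-to-s3-attachment-migration "$@"
}
```

For example, `run_migration_with_driver /path/to/mysql-connector-j.jar` would make the JAR visible inside the container as `/opt/drivers/mysql-connector-j.jar`; the file name is a placeholder.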
Network access
The Docker container must be able to reach both the database and the S3 endpoint. If the database runs on the host machine, you may need to use --network host or the appropriate Docker networking option for your setup.
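For a database listening on the host machine, a sketch using host networking could look as follows. It assumes a Linux host, where `--network host` lets a JDBC URL pointing at `localhost` reach the host's database; the wrapper name is hypothetical.

```shell
# Hypothetical wrapper: shares the host's network stack with the container,
# so that "localhost" in the JDBC URL refers to the host machine (Linux).
run_migration_host_network() {
  docker run --rm --network host \
    orbeon/db-to-s3-attachment-migration "$@"
}
```

On Docker Desktop, using the special hostname `host.docker.internal` in the JDBC URL is a common alternative to host networking.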
Parameters
Required
Parameter                Environment variable           Description
--db-url                 ORBEON_DB_URL                  JDBC URL for the database
--s3-bucket              ORBEON_S3_BUCKET               S3 bucket name
--s3-access-key          ORBEON_S3_ACCESS_KEY           AWS access key ID
--s3-secret-access-key   ORBEON_S3_SECRET_ACCESS_KEY    AWS secret access key
Optional
Parameter               Environment variable         Default            Description
--db-user               ORBEON_DB_USER               (empty)            Database username
--db-password           ORBEON_DB_PASSWORD           (empty)            Database password
--db-init-sql           ORBEON_DB_INIT_SQL           (empty)            SQL to execute after connecting
--s3-endpoint           ORBEON_S3_ENDPOINT           s3.amazonaws.com   S3 endpoint URL
--s3-region             ORBEON_S3_REGION             aws-global         S3 region
--s3-base-path          ORBEON_S3_BASE_PATH          (empty)            S3 key prefix
--s3-force-path-style   ORBEON_S3_FORCE_PATH_STYLE   false              Use path-style S3 URLs
--parallelism           ORBEON_PARALLELISM           1                  Number of rows to process concurrently
--dry-run               ORBEON_DRY_RUN               false              Preview what would be migrated, without making changes
Command-line arguments take precedence over environment variables.
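To illustrate the precedence rule, the sketch below sets the parallelism both ways. Under the rule just stated, the command-line value wins. It assumes `--flag value` argument syntax and an env file holding the connection settings; the wrapper name is hypothetical.

```shell
# Hypothetical wrapper illustrating precedence.
# $1: path of an env file with the connection settings.
# ORBEON_PARALLELISM=1 is set in the container environment, but the
# command-line argument --parallelism 4 takes precedence.
run_migration_parallel() {
  docker run --rm --env-file "$1" \
    -e ORBEON_PARALLELISM=1 \
    orbeon/db-to-s3-attachment-migration \
    --parallelism 4
}
```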
Dry run
Start with a dry run to preview what the tool would do.
In dry-run mode, no data is uploaded to S3 and no database rows are modified.
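A dry run could be requested through the documented `ORBEON_DRY_RUN` environment variable, as sketched below. The value `true` is an assumption based on the documented default of `false`; the wrapper name and the env file argument are hypothetical.

```shell
# Hypothetical wrapper: runs the migration in dry-run mode.
# $1: path of an env file with the connection settings.
# Assumption: the boolean is enabled with the value "true".
run_dry_run() {
  docker run --rm --env-file "$1" \
    -e ORBEON_DRY_RUN=true \
    orbeon/db-to-s3-attachment-migration
}
```

Reviewing the output of the dry run before running the real migration shows which attachments would be uploaded.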
Data integrity
The tool verifies each upload by comparing the file size and SHA-256 hash between the database and S3. If a mismatch is detected, the tool aborts the migration and the database content is preserved.
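The same kind of check can be reproduced by hand on two local files, for instance an attachment exported from the database and the corresponding object downloaded from S3. This is an illustration of a size-plus-SHA-256 comparison, not the tool's own code; it assumes the GNU coreutils `sha256sum` command is available, and the function name is hypothetical.

```shell
# Illustration only, not the tool's code: two files are treated as identical
# when both their size in bytes and their SHA-256 digest match.
# Returns 0 on a match, 1 otherwise.
same_content() {
  size_a=$(wc -c < "$1" | tr -d ' ')
  size_b=$(wc -c < "$2" | tr -d ' ')
  [ "$size_a" = "$size_b" ] || return 1
  hash_a=$(sha256sum "$1" | cut -d ' ' -f 1)
  hash_b=$(sha256sum "$2" | cut -d ' ' -f 1)
  [ "$hash_a" = "$hash_b" ]
}
```

For example, `same_content original.pdf downloaded.pdf && echo "match"` prints `match` only when both checks pass; the file names are placeholders.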
S3-compatible services
The tool works with any S3-compatible service (e.g. MinIO, Backblaze B2). Use the --s3-endpoint parameter to point to your service. Some S3-compatible services also require --s3-force-path-style to be set to true.
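For an S3-compatible service, the two settings mentioned above could be supplied as environment variables, as in this sketch. The hostname `minio.example.com` is a placeholder that mirrors the format of the documented default (`s3.amazonaws.com`); whether your service needs a scheme or port in the endpoint is not covered here. The wrapper name and the env file argument are hypothetical.

```shell
# Hypothetical wrapper: points the tool at an S3-compatible service.
# $1: path of an env file with the database and credential settings.
# Assumptions: placeholder endpoint hostname; the boolean is enabled with "true".
run_migration_s3_compatible() {
  docker run --rm --env-file "$1" \
    -e ORBEON_S3_ENDPOINT="minio.example.com" \
    -e ORBEON_S3_FORCE_PATH_STYLE=true \
    orbeon/db-to-s3-attachment-migration
}
```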