Database Version Control

As normal for a Django project, the standard approach to database version control will use Django Migrations to create and apply database migrations.

FDW augments the built-in Django functionality with Pyrseas to maintain a canonical schema definition and generate SQL for complex migrations that are not supported by Django, such as views, triggers, functions or non-default foreign key actions.

The canonical metadata definition for the database is at :source:pyrseas/schema.yaml. This file is the master definition of the correct database structure for the fdw_owner schema.

Environment Variables

The Pyrseas commands (dbtoyaml, yamltodb and dbaugment) require some environment variables to be set up to enable access to the database. These variables (PGHOST, etc.) should already be configured in the virtualenv for the application.

Creating new Migrations

Normal migrations involving changes to managed Django models can be made using ./manage.py makemigrations as normal.

For changes to the unmanaged models, i.e. the ones that are backed by views, Pyrseas provides tools for understanding how the schema has changed over time.

Before making any changes to schema.yaml create a backup of the file in the pyrseas/ directory:

cp pyrseas/schema.yaml pyrseas/schema.backup.yaml

The DBA can make changes to the development database using any method they like. For example, by copying and manually editing a previous migration or by updating the view using pgAdmin4 and then copying the SQL into a new migration created with ./manage.py makemigrations --empty.

After changes are made to the development database, or to a test database created using test --keepdb, they can be extracted to a YAML file and them compared with the canonical definition in schema.yaml:

dbtoyaml --no-owner --no-privileges $PGDATABASE  > pyrseas/extract.yaml

or, to use the test database:

dbtoyaml --no-owner --no-privileges test_${PGDATABASE}  > pyrseas/extract.yaml

The DBA can then use a diff tool such as Meld or the Compare Files option in Eclipse or other IDE to bring the required changes into the schema.yaml file. For example:

meld pyrseas/extract.yaml pyrseas/schema.yaml &

All database changes must be applied using a Django migration. If a suitable migration cannot be created directly, only can be created manually using yamltodb. First, create an empty migation:

./manage.py makemigration --empty <app_name> --name <migration_name>

The SQL required to perform the Migration can be generated by Pyrseas by comparing the updated schema.yaml file with a test database where all changes are performed using migrations:

yamltodb test_${PGDATABASE} pyrseas/schema.yaml

The output from this command can be used to create the sql to be passed to migrations.RunSQL. For example:

sql = """
      ALTER TABLE price_marketproduct DROP CONSTRAINT dataseries_ptr_id_refs_id_51671822;
      ALTER TABLE fdw_owner.price_marketproduct ADD CONSTRAINT dataseries_ptr_id_refs_id_51671822
        FOREIGN KEY (dataseries_ptr_id) REFERENCES fdw_owner.warehouse_dataseries (id)
        ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED;
      ALTER TABLE price_pricedataset DROP CONSTRAINT dataset_ptr_id_refs_id_c494911c;
      ALTER TABLE fdw_owner.price_pricedataset ADD CONSTRAINT dataset_ptr_id_refs_id_c494911c
        FOREIGN KEY (dataset_ptr_id) REFERENCES fdw_owner.warehouse_dataset (id)
        ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED;
      ALTER TABLE warehouse_dataseries_datasets DROP CONSTRAINT dataseries_id_refs_id_21ff6a12;
      ALTER TABLE warehouse_dataseries_datasets ADD CONSTRAINT dataseries_id_refs_id_21ff6a12
      FOREIGN KEY (dataseries_id) REFERENCES fdw_owner.warehouse_dataseries (id)
        ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED;
      """

Note that the SQL created by yamltodb will include schema qualifiers on all objects. If the application is multi-tenant or has some other complex configuration where different users are using different search_paths then it is likely that the schema qualifiers must not be removed from table or domain names, etc. Otherwise it is likely then that when the Migration is run against in the production database it will not work as intended. There is also a strong likelihood that it will fail silently - i.e. the Migration will succeed but will make changes to a schema other than the one intended.

After the sql for the Migration’s migrations.RunSQL parameter has been defined it can be run against the test database to upgrade it (taking a backup first):

pg_dump -v -Fc -f/tmp/${PGDATABASE}.backup ${PGDATABASE}
./manage.py migrate

After the migration scripts have run, Pyrseas can be used to confirm that the database matches the canonical definition for that release:

yamltodb $PGDATABASE pyrseas/schema.yaml

Once the test database has been upgraded, yamltodb can be used to compare the schema backup against the new database structure and generate the commands required to undo the last update. The SQL generated can be pasted into the reverse_sql parameter for the current migration:

yamltodb $PGDATABASE pyrseas/schema.backup.yaml

The down migration should also be tested:

./manage.py migrate app_name --list
./manage.py migrate app_name 00XX # where XX is the number of the previous migration

After the backwards migration script has run, Pyrseas can be used to confirm that the database matches the canonical definition for the previous release:

yamltodb $PGDATABASE pyrseas/schema.backup.yaml

or, for a test database:

yamltodb test_${PGDATABASE} pyrseas/schema.backup.yaml

Note that if the migration fails such that you need to restore the database before continuing you can use the following commands. Note that these commands must be run on the database server via SSH, rather than on the application server:

pg_dump -v -Fc -f/tmp/${PGDATABASE}.broken.backup ${PGDATABASE} # Take another backup just to be safe!
restore() {
if [[ $1 =~ ^[Yy]$ ]]; then
    echo Restoring database ${PGDATABASE}
    # Drop open connections (change procpid to pid for pgsql > 9.2)
    psql -c "SELECT pg_terminate_backend(pg_stat_activity.procpid) FROM pg_stat_activity WHERE pg_stat_activity.datname = '${PGDATABASE}'";
    dropdb -i ${PGDATABASE} && createdb --owner ${PGDATABASE} --template template0 ${PGDATABASE} && echo Database created && pg_restore --role=${PGDATABASE} -d ${PGDATABASE} /tmp/${PGDATABASE}.backup && echo Database restored
fi
}
read -p "Are you sure you want to drop and restore database ${PGDATABASE}? (y/n): " -n 1;echo; restore $REPLY

Once the database changes are complete, the updated schema.yaml file and the corresponding Migration will be committed to the repository in a single commit, preferably with the other changes to the application related to the database change, e.g. updated models.py, admin.py, etc.

When a tagged release is promoted to the test or production systems, ./manage.py migrate will be used to run all the appropriate migrations.