Database Migration¶
Pepys-import repository uses Alembic for data migration. Alembic is a database migration tool which is maintained by SQLAlchemy.
It is possible to apply the existing migrations to the users’ database via pepys_admin
.
Don’t forget to check your configuration file before running pepys_import
or pepys_admin
.
Instructions¶
config.py
). Please check your PEPYS_CONFIG_FILE
environment variable. If it doesn’t exist, it means that default_config.ini
will be used to create a connection.alembic upgrade head
” which is going to create all DB tables and alembic_version
table.alembic_version
table.alembic_version
table and stamp it manually. For doing that, please run the following commands:Postgres¶
CREATE TABLE pepys.alembic_version
(
version_num VARCHAR(32) NOT NULL,
CONSTRAINT alembic_version_pkc PRIMARY KEY (version_num)
);
INSERT INTO pepys.alembic_version VALUES ('XXXXXX');
migrations/postgres_versions
).alembic current
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
5154f7db278d (head)
SQLite¶
CREATE TABLE alembic_version
(
version_num VARCHAR(32) NOT NULL,
CONSTRAINT alembic_version_pkc PRIMARY KEY (version_num)
);
INSERT INTO alembic_version (version_num)
VALUES ('XXXXXX');
Please change XXXXX with the latest migration number (revision variable in the most recent script file in migrations/sqlite_versions
).
So, the last line will stamp your database with the latest migration. For controlling your database version,
you might run the following command:
alembic upgrade head
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
ccc37f794db6 (head)
How to use it? (For Developers)¶
You can add/update/delete Base models (pepys_import.core.store.postgres_db
and pepys_import.core.store.sqlite_db
).
If there is any change, Alembic might automatically generate a migration script:
alembic revision -m "YOUR MESSAGE" --autogenerate
It will create a script under migrations/postgres_versions
or migrations/sqlite_versions
according to your configuration.
Alembic assigns revision IDs using uuid.uuid4()
. When you open the created script, you can see the number (revision = XXXXXX)
It is strongly suggested to control the autogenerated script before DB migration.
Latest Revision IDs¶
LATEST_POSTGRES_VERSION
and LATEST_SQLITE_VERSION
.@write_hooks.register("update_latest_revision")
def update_latest_revision(filename, options):
with open(filename) as file_:
lines = file_.readlines()
# Load json file
json_file_path = os.path.join(MIGRATIONS_DIRECTORY, "latest_revisions.json")
with open(json_file_path, "r") as json_file:
data = json.load(json_file)
for line in lines:
# If line has revision variable, i.e. revision = "bcff0ccb4fbd", remove new line
# character and quote marks, split line into two parts: ('', 'bcff0ccb4fbd'), obtain the
# second element
if line.startswith("revision = "):
split_tokens = line.replace("\n", "").replace('"', "").split("revision = ")
revision_id = split_tokens[1]
if "postgres_versions" in filename:
data["LATEST_POSTGRES_VERSION"] = revision_id
else:
data["LATEST_SQLITE_VERSION"] = revision_id
# Dump updated json
with open(json_file_path, "w") as json_file:
json.dump(data, json_file, indent=4)
revision = ` and takes the :code:`revision_id
from that line. Then, it updates the corresponding key in JSON (LATEST_POSTGRES_VERSION
if it’s a migration script for Postgres, LATEST_SQLITE_VERSION
if it’s a migration script for SQLite).After a migration script is checked, you can also interpret SQL script of it. Please note that this only works for PostgreSQL.
Batch operations are used for SQLite to not lose any data. Due to that, it needs further actions to run --sql
command. These actions were carried out and caused
more troubles. Therefore, it was decided not to support --sql
for SQLite.
For generating SQL codes from postgres migration scripts, the following command might be used:
alembic upgrade XXXXX --sql > new_migration.sql
. (Please change XXXX with the revision ID of the migration script.)
alembic upgrade FROM:TO --sql > new_migration.sql
migrations/postgres_versions
.alembic upgrade 018:b31 --sql
(pepys-import) baris@bariss-MacBook-Pro pepys-import % alembic upgrade 018:b31 --sql
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO [alembic.runtime.migration] Generating static SQL
INFO [alembic.runtime.migration] Will assume transactional DDL.
BEGIN;
INFO [alembic.runtime.migration] Running upgrade 0189cb24812f -> b311affac706, Update nullability of columns of Activations
-- Running upgrade 0189cb24812f -> b311affac706
ALTER TABLE pepys."Activations" ALTER COLUMN "end" DROP NOT NULL;
ALTER TABLE pepys."Activations" ALTER COLUMN start DROP NOT NULL;
UPDATE pepys.alembic_version SET version_num='b311affac706' WHERE pepys.alembic_version.version_num = '0189cb24812f';
COMMIT;
render_as_batch=True
is passed to the Alembic’s context and batch_alter_table
is used in migration scripts. For further information: Running Batch Migrations--autogenerate
flag. For example:alembic revision -m "YOUR MESSAGE"
upgrade()
, downgrade()
functions. You can fill them manually.alembic upgrade head
alembic downgrade head REVISION_ID
alembic downgrade -1
, you might check it out: Relative IdentifiersAddGeometryColumn() error: "UNIQUE constraint failed: geometry_columns.f_table_name, geometry_columns.f_geometry_column"
You can ignore this error because it says that there is already a geometry column entity for your table that has a Geometry column (i.e. States, Geometries, Contact). This error happens because SQLite doesn’t support ALTER TABLE statement.
Instead, it creates a new one, copies values from the previous table, and drops the existing table. However, during the creation of a new table, it tries to push the Geometry column to the geometry_columns
table again.
How to use it? (For Users)¶
Migrations might be applied in pepys_admin
. Its migration option migrates the DB to the latest version.
However, if they would like to check it, after ensuring their config file points to the correct database, they might run this command to upgrade their DB if there are any migrations:
alembic upgrade head
Helpful Commands¶
If you would like to see the current head of Alembic: alembic current
If you would like to see the history of the migration: alembic history
If you have changed schema and want to create a migration script: alembic revision -m "YOUR MESSAGE" --autogenerate
If you would like to see SQL script of migration scripts (Don’t forget to change START and END values with the migration revision IDs):
alembic upgrade START:END --sql
Note: Recall that this only works for PostgreSQL.
alembic -c alembic_sqlite.ini show XXXX
or alembic -c alembic_postgres.ini show XXXX
alembic -c alembic_sqlite.ini edit XXXX
or alembic -c alembic_postgres.ini edit XXXX
Note: Show and Edit commands don’t run env.py
file and it checks the migration scripts in the version_locations
which is not specified in the main config file.
Therefore, one of the extra config files should be passed to show/edit scripts in migrations/postgres_versions
and migrations/sqlite_versions
.
Please check the cookbook and documentation of Alembic for further information.
Possible Errors and Solutions¶
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) Cannot add a NOT NULL column with default value NULL
If you have values in your DB and you try to add a new attribute to the filled table which can’t be NULL, you should make this attribute nullable.
File "migrations/env.py", line 9, in <module>
from config import DB_HOST, DB_NAME, DB_PASSWORD, DB_PORT, DB_TYPE, DB_USERNAME
ModuleNotFoundError: No module named 'config'
If you face this error, it means that the pepys-import
repository should be installed in a development environment. Please run the following command when you are at the root of the repository:
source PATH/TO/YOUR/ENV/bin/activate
pip install -e .
PYTHONPATH
. For example:PYTHONPATH=. alembic current
(pepys-import) baris@bariss-MacBook-Pro pepys-import % alembic revision -m "message" --autogenerate
Database tables are not found! (Hint: Did you initialise the DataStore?)
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
ERROR [alembic.util.messaging] Target database is not up to date.
FAILED: Target database is not up to date.
If you make some changes and try to create a new migration script without having the latest version of the database, you will face this issue. You should upgrade your DB and then run the revision command:
alembic upgrade head
alembic revision -m "your message" --autogenerate
FOR DEVELOPERS ONLY
Database tables are not found! (Hint: Did you initialise the DataStore?)
If you migrated your SQLite database and started to see this message, it might mean that your migration script is wrong.
It would have been because of the extra tables created by Alembic. When a table with Geometry column is included in migration,
please check the SQLite database if there are any temporary tables (with tmp keyword) such as idx_Geometries_tmp_geometry_node
etc. If there is any, please drop these tables. After that,
please open your migration script and add spatial_index=False
argument to the Geometry column. An example is as follows:
# Change this column
geometry = deferred(Column(Geometry(geometry_type="GEOMETRY", management=True), nullable=False))
# Add spatial_index=False parameter
geometry = deferred(
Column(
Geometry(geometry_type="GEOMETRY", management=True, spatial_index=False), nullable=False
)
)
Why do we have many SQLite related problems ?
In general, SQLite’s unsupporting ALTER TABLE statement pushes us to use the move-and-copy approach to not lose any data. But when we use it, all problems arise because we use render_as_batch=True parameter to apply the move-and-copy approach. Alembic calls it the Batch mode and I think many of our problems related to that.
Migration tools are instead expected to produce copies of SQLite tables that correspond to the new structure, transfer the data from the existing table to the new one,
then drop the old table. For our purposes here we’ll call this “move and copy” workflow, and in order to accommodate it in a way that is reasonably predictable,
while also remaining compatible with other databases, Alembic provides the batch operations context.