Database Migrations
This guide covers managing database schema changes in Fission Python projects.
Table of Contents
- Overview
- Migration Files
- Applying Migrations
- Writing Migrations
- Best Practices
- Rollback Strategies
- Automation
Overview
Database schema changes should be managed through versioned migration scripts, not manual CREATE TABLE statements.
This template uses plain SQL migration files (.sql), which provide:
- Version control of schema changes
- Repeatable application to different environments
- Clear upgrade/downgrade paths
- Audit trail of schema evolution
Migration Files
Place SQL migration scripts in the migrates/ directory:
migrates/
├── 001_initial_schema.sql
├── 002_add_user_email.sql
├── 003_create_indexes.sql
└── ...
Naming convention:
- Prefix with sequential number (zero-padded for sorting)
- Descriptive name after underscore
- .sql extension
- Numbers should be unique and monotonically increasing
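The naming rules above can be checked mechanically before anything touches the database. The following is a minimal sketch; the regular expression and the function names `parse_version` and `validate_names` are assumptions for illustration, not part of the template.

```python
import re

# Assumed pattern: numeric prefix, underscore, lowercase description, .sql
MIGRATION_NAME = re.compile(r"^(\d+)_([a-z0-9_]+)\.sql$")


def parse_version(filename: str) -> int:
    """Return the numeric prefix of a migration filename, or raise ValueError."""
    match = MIGRATION_NAME.match(filename)
    if match is None:
        raise ValueError(f"Bad migration name: {filename}")
    return int(match.group(1))


def validate_names(filenames: list[str]) -> list[str]:
    """Return filenames ordered by version; raise ValueError on duplicate versions."""
    seen = {}
    for name in filenames:
        version = parse_version(name)
        if version in seen:
            raise ValueError(f"Version {version} used by {seen[version]} and {name}")
        seen[version] = name
    return [seen[version] for version in sorted(seen)]
```

Sorting by the parsed integer rather than by the raw string keeps the order correct even if someone forgets the zero padding.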
Initial Schema Example
-- migrates/001_create_items_table.sql
-- Create items table
CREATE TABLE IF NOT EXISTS items (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(255) NOT NULL,
description TEXT,
status VARCHAR(50) DEFAULT 'active',
metadata JSONB,
created TIMESTAMPTZ DEFAULT NOW(),
modified TIMESTAMPTZ DEFAULT NOW()
);
-- Add indexes
CREATE INDEX IF NOT EXISTS idx_items_status ON items(status);
CREATE INDEX IF NOT EXISTS idx_items_created ON items(created);
-- Add comments
COMMENT ON TABLE items IS 'Stores item records';
COMMENT ON COLUMN items.status IS 'Item status: active, inactive, pending';
Applying Migrations
Manually
# Connect to database
psql -h localhost -U postgres -d mydb
# Run migration file
\i migrates/001_create_items_table.sql
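# Run all migrations in order (bash script); the glob expands in sorted order
for file in migrates/*.sql; do
    echo "Applying $file..."
    # ON_ERROR_STOP makes psql exit non-zero on the first SQL error
    psql -h localhost -U postgres -d mydb -v ON_ERROR_STOP=1 -f "$file" || break
done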
Automatically from Python
Create a simple migration runner:
# src/migrate.py (not part of function, standalone script)
import os

from helpers import init_db_connection


def run_migrations():
    conn = init_db_connection()
    cursor = conn.cursor()
    try:
        # Create migrations tracking table if not exists
        cursor.execute("""
            CREATE TABLE IF NOT EXISTS schema_migrations (
                version INTEGER PRIMARY KEY,
                name VARCHAR(255) NOT NULL,
                applied_at TIMESTAMPTZ DEFAULT NOW()
            )
        """)
        conn.commit()
        # Get already-applied migrations
        cursor.execute("SELECT version FROM schema_migrations")
        applied = {row[0] for row in cursor.fetchall()}
        # Find migration files
        migrates_dir = os.path.join(os.path.dirname(__file__), "..", "migrates")
        files = sorted(
            f for f in os.listdir(migrates_dir)
            if f.endswith(".sql")
        )
        # Apply pending migrations
        for filename in files:
            # Extract version number from the numeric prefix
            version = int(filename.split("_")[0])
            if version in applied:
                print(f"Skipping {filename} (already applied)")
                continue
            path = os.path.join(migrates_dir, filename)
            print(f"Applying {filename}...")
            with open(path, "r") as f:
                sql = f.read()
            try:
                # Run the migration and record it in the same transaction
                cursor.execute(sql)
                cursor.execute(
                    "INSERT INTO schema_migrations (version, name) VALUES (%s, %s)",
                    (version, filename)
                )
                conn.commit()
                print(f"  ✓ Applied {filename}")
            except Exception as e:
                conn.rollback()
                print(f"  ✗ Failed: {e}")
                raise
        print("All migrations applied")
    finally:
        # Close the connection even when a migration fails
        conn.close()


if __name__ == "__main__":
    run_migrations()
Run:
python src/migrate.py
Using Migration Tools
For more advanced features (rollbacks, branching), consider:
- Alembic - Database migration tool for SQLAlchemy (if using ORM)
- pg migrator - Heroku's migration tool
- goose - Multi-database migration tool written in Go (run it as a CLI alongside Python code)
- yoyo-migrations - Python-based migrations
Writing Migrations
Principles
- Idempotent - Script should succeed if run multiple times
- Additive first - Add columns/tables before removing/dropping
- Backward compatible - New schema should work with old code
- Atomic - One logical change per migration file
- Test locally - Apply to test database before production
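The idempotency principle lends itself to a simple pre-commit check. The sketch below flags statements that lack an `IF NOT EXISTS` or `IF EXISTS` guard. It is a naive text scan (it splits on semicolons and strips `--` comments, so it will misread semicolons inside string literals), and the rule list and the name `find_unguarded` are assumptions for illustration.

```python
import re

# Each rule: (pattern that finds the statement, pattern that shows it is guarded)
RULES = [
    (r"\bCREATE\s+TABLE\b", r"\bCREATE\s+TABLE\s+IF\s+NOT\s+EXISTS\b"),
    (r"\bCREATE\s+(?:UNIQUE\s+)?INDEX\b",
     r"\bCREATE\s+(?:UNIQUE\s+)?INDEX\s+(?:CONCURRENTLY\s+)?IF\s+NOT\s+EXISTS\b"),
    (r"\bADD\s+COLUMN\b", r"\bADD\s+COLUMN\s+IF\s+NOT\s+EXISTS\b"),
    (r"\bDROP\s+(?:TABLE|COLUMN)\b", r"\bDROP\s+(?:TABLE|COLUMN)\s+IF\s+EXISTS\b"),
]


def find_unguarded(sql: str) -> list[str]:
    """Return statements that would fail if the migration ran a second time."""
    without_comments = re.sub(r"--[^\n]*", "", sql)
    findings = []
    for statement in without_comments.split(";"):
        text = " ".join(statement.split())  # collapse whitespace and newlines
        for needs_guard, guard in RULES:
            if re.search(needs_guard, text, re.I) and not re.search(guard, text, re.I):
                findings.append(text)
                break
    return findings
```

A check like this only covers DDL guards; data migrations still need their own `WHERE` conditions to be safe to re-run.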
Common Operations
Create Table
CREATE TABLE IF NOT EXISTS orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
total DECIMAL(10,2) NOT NULL,
status VARCHAR(50) NOT NULL DEFAULT 'pending',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Add foreign key
ALTER TABLE orders
ADD CONSTRAINT fk_orders_user
FOREIGN KEY (user_id)
REFERENCES users(id)
ON DELETE CASCADE;
-- Index for performance
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);
Add Column
-- Add nullable column (safe, backward compatible)
ALTER TABLE orders
ADD COLUMN shipping_address JSONB;
-- Add column with default
-- PostgreSQL 11+ stores a constant default as metadata (no table rewrite);
-- older versions and volatile defaults rewrite the whole table, so use cautiously
ALTER TABLE orders
ADD COLUMN tax_amount DECIMAL(10,2) DEFAULT 0.00;
Rename Column
-- Renaming is a metadata-only change, but it breaks code that still uses the old name
ALTER TABLE orders
RENAME COLUMN total TO order_total;
Modify Column Type
-- Change VARCHAR length
ALTER TABLE users
ALTER COLUMN email TYPE VARCHAR(320);
-- Convert to different type (use USING clause)
ALTER TABLE orders
ALTER COLUMN status TYPE VARCHAR(100)
USING status::VARCHAR(100);
Create Index
-- Simple index
CREATE INDEX idx_users_email ON users(email);
-- Unique index
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);
-- Partial index (only active users)
CREATE INDEX idx_users_active ON users(id)
WHERE status = 'active';
-- Multi-column index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
Drop Column/Table
-- First, ensure no one is using it
-- Consider using SET DEFAULT then dropping in subsequent migration
-- Drop column
ALTER TABLE orders
DROP COLUMN IF EXISTS old_column;
-- Drop table (dangerous!)
DROP TABLE IF EXISTS old_logs;
Data Migrations
Sometimes you need to transform data:
-- Backfill new column from existing data
UPDATE orders
SET shipping_address = jsonb_build_object(
'street', address_street,
'city', address_city,
'zip', address_zip
)
WHERE shipping_address IS NULL;
-- Migrate enum values
UPDATE products
SET status = 'active' WHERE status = 'ACTIVE';
-- Clean up duplicates
WITH duplicates AS (
SELECT id, ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at) AS rn
FROM users
)
DELETE FROM users WHERE id IN (SELECT id FROM duplicates WHERE rn > 1);
Transactional Migrations
Wrap critical migrations in transactions:
BEGIN;
-- Multiple related operations
ALTER TABLE orders ADD COLUMN shipping_id UUID;
UPDATE orders SET shipping_id = gen_random_uuid() WHERE shipping_id IS NULL;
ALTER TABLE orders ALTER COLUMN shipping_id SET NOT NULL;
COMMIT;
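Note: Most DDL statements in PostgreSQL are transactional, so the ALTER TABLE statements above roll back together with the UPDATE if any step fails. A few commands, such as CREATE INDEX CONCURRENTLY, cannot run inside a transaction block and belong in their own migration file without BEGIN/COMMIT. A transaction also holds its locks until COMMIT, so keep transactional migrations short.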
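A runner can decide per file whether a migration may be wrapped in a transaction. The sketch below detects a few commands that PostgreSQL refuses to run inside a transaction block; the list is deliberately incomplete and the name `needs_autocommit` is an assumption for illustration.

```python
import re

# Commands PostgreSQL rejects inside a transaction block (not an exhaustive list)
NON_TRANSACTIONAL = re.compile(
    r"\b(?:CREATE\s+(?:UNIQUE\s+)?INDEX\s+CONCURRENTLY"
    r"|DROP\s+INDEX\s+CONCURRENTLY"
    r"|VACUUM"
    r"|CREATE\s+DATABASE"
    r"|DROP\s+DATABASE)\b",
    re.IGNORECASE,
)


def needs_autocommit(sql: str) -> bool:
    """Return True if the migration must run outside BEGIN/COMMIT."""
    without_comments = re.sub(r"--[^\n]*", "", sql)
    return NON_TRANSACTIONAL.search(without_comments) is not None
```

With psycopg2, such a file would typically be executed on a connection with `conn.autocommit = True`, one statement per file, since several statements sent in one string are still treated as a single implicit transaction.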
Best Practices
✅ Do's
- Test migrations on copy of production database before applying to prod
- Keep migrations small - One logical change per file
- Write data migrations as separate files from schema migrations
- Use IF NOT EXISTS and IF EXISTS to make migrations idempotent
- Never drop columns/tables in the same migration you add them - Separate to allow rollback
- Document why - Add comments explaining the purpose
- Consider indexes - Add indexes for frequently queried columns in same migration as table creation
- Use UUIDs for primary keys (gen_random_uuid() in PostgreSQL 13+)
- Add created_at and updated_at timestamps to all tables
- Version numbers must be unique and sequential
❌ Don'ts
- Don't modify already-applied migrations - They're part of history
- Don't skip version numbers - Creates gaps but not critical
- Don't use destructive operations without backup - DROP COLUMN, DROP TABLE
- Don't run long-running migrations during peak hours - Use low-traffic windows
- Don't add NOT NULL without default on non-empty tables - Will fail due to existing NULL rows
- Don't assume order of execution - Always number sequentially
- Don't mix unrelated changes in one migration file
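The rule against modifying already-applied migrations can be enforced by storing a checksum when a file is applied and comparing it on later runs. The sketch below assumes a hypothetical checksum value recorded next to each version (the schema_migrations table in this guide has no such column); `checksum` and `find_modified` are illustrative names.

```python
import hashlib


def checksum(sql_text: str) -> str:
    """Return a SHA-256 hex digest of the migration text."""
    return hashlib.sha256(sql_text.encode("utf-8")).hexdigest()


def find_modified(recorded: dict[int, str], current: dict[int, str]) -> list[int]:
    """Return versions whose file content no longer matches the recorded digest.

    recorded: version -> digest stored when the migration was applied
    current:  version -> SQL text read from disk now
    """
    return sorted(
        version
        for version, digest in recorded.items()
        if version in current and checksum(current[version]) != digest
    )
```

A runner could refuse to proceed when `find_modified` returns anything, which turns an easy-to-miss edit into a visible failure.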
Zero-Downtime Migrations
Adding Column
-- Step 1: Add column as nullable or with default (fast)
ALTER TABLE orders ADD COLUMN status VARCHAR(50);
-- Step 2: Deploy code that writes to new column
-- Your application updates to populate status
-- Step 3: Backfill existing rows (if needed)
UPDATE orders SET status = 'completed' WHERE status IS NULL AND shipped_at IS NOT NULL;
-- Step 4: Make column NOT NULL (if needed) - only after all rows have values
ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
Renaming Column
-- Step 1: Add new column
ALTER TABLE orders ADD COLUMN order_status VARCHAR(50);
-- Step 2: Deploy code writing to both old and new columns (dual-write)
-- Step 3: Backfill data
UPDATE orders SET order_status = status;
-- Step 4: Deploy code reading from new column, stop writing to old
-- Step 5: Drop old column (in separate migration)
ALTER TABLE orders DROP COLUMN status;
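On a large table, the backfill in step 3 is gentler when it runs in small batches, each committed separately, so that no single UPDATE holds row locks for long. The sketch below assumes a DB-API connection (psycopg2 style) and the `orders` columns from the example above; the function name and batch size are illustrative.

```python
def backfill_in_batches(conn, batch_size: int = 1000) -> int:
    """Copy status into order_status batch by batch; return total rows updated."""
    sql = (
        "UPDATE orders SET order_status = status "
        "WHERE id IN ("
        " SELECT id FROM orders"
        " WHERE order_status IS NULL AND status IS NOT NULL"
        f" LIMIT {int(batch_size)}"
        ")"
    )
    total = 0
    while True:
        cursor = conn.cursor()
        cursor.execute(sql)
        updated = cursor.rowcount
        conn.commit()  # release locks after every batch
        cursor.close()
        if updated <= 0:
            return total
        total += updated
```

Rows whose `status` is NULL are skipped on purpose; otherwise the loop would keep selecting them and never finish.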
Rollback Strategies
Manual Rollback
For each migration, you may want to write a corresponding "down" migration:
-- 002_add_user_email.sql (UP)
ALTER TABLE users ADD COLUMN email VARCHAR(320);
-- 002_add_user_email_rollback.sql (DOWN)
ALTER TABLE users DROP COLUMN IF EXISTS email;
Store rollback scripts alongside migrations or in separate rollbacks/ directory.
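If rollback scripts are kept in a separate directory, a small check can report migrations that have no "down" counterpart. This sketch assumes each rollback file carries the same filename as its migration, which is one possible convention rather than something the template prescribes; `missing_rollbacks` is an illustrative name.

```python
import os


def missing_rollbacks(migrates_dir: str, rollbacks_dir: str) -> list[str]:
    """Return migration filenames with no same-named file in rollbacks_dir."""
    migrations = sorted(f for f in os.listdir(migrates_dir) if f.endswith(".sql"))
    if os.path.isdir(rollbacks_dir):
        existing = set(os.listdir(rollbacks_dir))
    else:
        existing = set()
    return [f for f in migrations if f not in existing]
```

Running it in CI keeps the rollback directory from silently falling behind the migrations.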
Point-in-Time Recovery
Best strategy: Restore database from backup to point before bad migration, then re-apply good migrations.
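# Point-in-time recovery replays archived WAL on top of a base backup (PostgreSQL 12+):
# 1. Stop the server and restore the base backup into the data directory
# 2. In postgresql.conf, set restore_command and
#    recovery_target_time = '2025-03-18 10:30:00'
# 3. Create an empty recovery.signal file in the data directory, then start the server
# Afterwards re-apply only the good migrations. src/migrate.py applies every
# pending file, so remove or fix the bad migration first.
python src/migrate.py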
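Because the runner applies every pending migration, re-applying only up to a known-good version needs a filter. A minimal sketch of that selection step follows; `pending_up_to` and its parameters are assumptions for illustration and would still have to be wired into src/migrate.py.

```python
def pending_up_to(filenames: list[str], applied: set[int], target: int) -> list[str]:
    """Return unapplied migration files with version <= target, in version order."""
    def version_of(name: str) -> int:
        return int(name.split("_")[0])

    return [
        name
        for name in sorted(filenames, key=version_of)
        if version_of(name) <= target and version_of(name) not in applied
    ]
```

For example, with versions 1 to 3 on disk, version 1 already applied, and a target of 2, only the version 2 file is selected.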
Selective Rollback Script
# rollback.py
import os
import sys

from helpers import init_db_connection


def rollback(to_version: int):
    conn = init_db_connection()
    cursor = conn.cursor()
    try:
        # Find migrations after target version, newest first
        cursor.execute("""
            SELECT version, name
            FROM schema_migrations
            WHERE version > %s
            ORDER BY version DESC
        """, (to_version,))
        migrations = cursor.fetchall()
        for version, name in migrations:
            # name is the migration filename (e.g. 002_add_user_email.sql);
            # the rollback script is expected under rollbacks/ with the same name
            rollback_file = os.path.join("rollbacks", name)
            print(f"Rolling back {name} using {rollback_file}...")
            with open(rollback_file, "r") as f:
                sql = f.read()
            cursor.execute(sql)
            cursor.execute("DELETE FROM schema_migrations WHERE version = %s", (version,))
            conn.commit()
            print(f"  Rolled back {name}")
        print(f"Rolled back to version {to_version}")
    except Exception:
        # Undo the partially applied rollback step before re-raising
        conn.rollback()
        raise
    finally:
        conn.close()


if __name__ == "__main__":
    target = int(sys.argv[1])
    rollback(target)
Automation
CI/CD Integration
In your deployment pipeline:
# Before deploying new code
python src/migrate.py
# If migrations fail, abort deployment
if [ $? -ne 0 ]; then
echo "Migrations failed, aborting deployment"
exit 1
fi
# Deploy new code
fission deploy
Pre-deployment Hooks
Use Fission hooks to run migrations automatically:
{
  "hooks": {
    "function_pre_deploy": [
      {
        "type": "http",
        "url": "http://migration-service/migrate",
        "timeout": 300000
      }
    ]
  }
}
Or simpler: run migration as part of build.sh:
#!/bin/sh
# src/build.sh
# Install dependencies
pip3 install -r requirements.txt -t .
# Run migrations against test DB (or do nothing, migrations are separate)
# python ../migrate.py
# Package up
cp -r . ${DEPLOY_PKG}
Database Change Management Tools
Consider specialized tools for larger teams:
- Flyway - Java-based, supports repeatable migrations
- Liquibase - XML/YAML/JSON migrations
- Prisma Migrate - If using Prisma ORM
- Alembic - Python, SQLAlchemy-specific
Example Workflow
- Create migration:
  touch migrates/004_add_orders_table.sql
- Write SQL:
  CREATE TABLE orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL REFERENCES users(id),
  total DECIMAL(10,2) NOT NULL,
  status VARCHAR(50) DEFAULT 'pending',
  created_at TIMESTAMPTZ DEFAULT NOW()
  );
  CREATE INDEX idx_orders_user_id ON orders(user_id);
- Test locally:
  createdb test_migration
  psql test_migration -f migrates/004_add_orders_table.sql
- Commit migration file:
  git add migrates/004_add_orders_table.sql
  git commit -m "Add orders table"
- Apply to staging:
  # Update dev-deployment.json if new env vars needed
  fission deploy --dev
  python src/migrate.py
- Apply to production:
  # Maintenance window or blue-green deployment
  fission deploy
  python src/migrate.py
Troubleshooting
Migration Fails
Check error message:
- syntax error: Validate SQL with psql -c "SQL" manually
- duplicate column: Migration already applied, check schema_migrations
- permission denied: DB user lacks ALTER/CREATE privileges
- lock timeout: Another migration running, wait or kill process
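Two migration runs colliding can be prevented up front with a PostgreSQL session-level advisory lock. The sketch below assumes a psycopg2-style connection; the lock key is an arbitrary hypothetical constant and `migration_lock` is an illustrative name, not part of the template.

```python
from contextlib import contextmanager

# Hypothetical application-wide key; any fixed bigint works
MIGRATION_LOCK_KEY = 4815162342


@contextmanager
def migration_lock(conn, key: int = MIGRATION_LOCK_KEY):
    """Hold a PostgreSQL advisory lock while migrations run."""
    cursor = conn.cursor()
    cursor.execute("SELECT pg_try_advisory_lock(%s)", (key,))
    acquired = cursor.fetchone()[0]
    if not acquired:
        cursor.close()
        raise RuntimeError("Another migration run holds the lock")
    try:
        yield
    finally:
        # Clear any aborted transaction so the unlock statement can run;
        # work that was already committed is unaffected
        conn.rollback()
        cursor.execute("SELECT pg_advisory_unlock(%s)", (key,))
        cursor.close()
```

Wrapping the runner as `with migration_lock(conn): ...` makes a second concurrent run fail fast instead of waiting on table locks.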
Migration Already Applied But Failed
If migration was recorded in schema_migrations but failed mid-way:
- Manually revert partial changes or fix broken state
- Delete row from schema_migrations: DELETE FROM schema_migrations WHERE version = 4;
- Re-run migration
Long-Running Migration
Large table alterations can hold table locks that block reads and writes, causing downtime:
- Run during low-traffic period
- Use CONCURRENTLY for index creation (PostgreSQL): CREATE INDEX CONCURRENTLY idx_orders_created ON orders(created_at);
- For adding NOT NULL, populate values first with UPDATE, then add constraint
- Consider using pg_repack for online table reorganization
Summary
- Store migrations in migrates/ directory, numbered sequentially
- Use init_db_connection() to run migrations programmatically
- Test migrations on staging database before production
- Keep migrations backward compatible when possible
- Have a rollback plan (backups, down scripts)
- Integrate migrations into CI/CD pipeline