# Database Migrations

This guide covers managing database schema changes in Fission Python projects.

## Table of Contents

1. [Overview](#overview)
2. [Migration Files](#migration-files)
3. [Applying Migrations](#applying-migrations)
4. [Writing Migrations](#writing-migrations)
5. [Best Practices](#best-practices)
6. [Rollback Strategies](#rollback-strategies)
7. [Automation](#automation)

## Overview

Database schema changes should be managed through versioned migration scripts, not manual `CREATE TABLE` statements run by hand.

This template uses **plain SQL migration files** (`.sql`), which provide:

- Version control of schema changes
- Repeatable application across environments
- Clear upgrade/downgrade paths
- An audit trail of schema evolution

## Migration Files

Place SQL migration scripts in the `migrates/` directory:

```
migrates/
├── 001_initial_schema.sql
├── 002_add_user_email.sql
├── 003_create_indexes.sql
└── ...
```

**Naming convention**:

- Prefix with a sequential number (zero-padded so files sort correctly)
- Descriptive name after the underscore
- `.sql` extension
- Numbers must be unique and monotonically increasing

### Initial Schema Example

```sql
-- migrates/001_create_items_table.sql
-- Create items table
CREATE TABLE IF NOT EXISTS items (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    description TEXT,
    status VARCHAR(50) DEFAULT 'active',
    metadata JSONB,
    created TIMESTAMPTZ DEFAULT NOW(),
    modified TIMESTAMPTZ DEFAULT NOW()
);

-- Add indexes (IF NOT EXISTS keeps the script idempotent)
CREATE INDEX IF NOT EXISTS idx_items_status ON items(status);
CREATE INDEX IF NOT EXISTS idx_items_created ON items(created);

-- Add comments
COMMENT ON TABLE items IS 'Stores item records';
COMMENT ON COLUMN items.status IS 'Item status: active, inactive, pending';
```

## Applying Migrations

### Manually

```bash
# Connect to database
psql -h localhost -U postgres -d mydb

# Run a single migration file from within psql
\i migrates/001_create_items_table.sql
```

```bash
# Run all migrations in order (the shell glob sorts lexically,
# which matches the zero-padded numbering)
for file in migrates/*.sql; do
    echo "Applying $file..."
    psql -h localhost -U postgres -d mydb -f "$file" || exit 1
done
```

### Automatically from Python

Create a simple migration runner:

```python
# src/migrate.py (standalone script, not part of the function)
import os

from helpers import init_db_connection


def run_migrations():
    conn = init_db_connection()
    cursor = conn.cursor()

    # Create the migrations tracking table if it does not exist
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS schema_migrations (
            version INTEGER PRIMARY KEY,
            name VARCHAR(255) NOT NULL,
            applied_at TIMESTAMPTZ DEFAULT NOW()
        )
    """)
    conn.commit()

    # Get already-applied migrations
    cursor.execute("SELECT version FROM schema_migrations")
    applied = {row[0] for row in cursor.fetchall()}

    # Find migration files
    migrates_dir = os.path.join(os.path.dirname(__file__), "..", "migrates")
    files = sorted(f for f in os.listdir(migrates_dir) if f.endswith(".sql"))

    # Apply pending migrations
    for filename in files:
        # Extract the version number from the filename prefix
        version = int(filename.split("_")[0])
        if version in applied:
            print(f"Skipping {filename} (already applied)")
            continue

        path = os.path.join(migrates_dir, filename)
        print(f"Applying {filename}...")
        with open(path, "r") as f:
            sql = f.read()

        try:
            cursor.execute(sql)
            cursor.execute(
                "INSERT INTO schema_migrations (version, name) VALUES (%s, %s)",
                (version, filename),
            )
            conn.commit()
            print(f"  ✓ Applied {filename}")
        except Exception as e:
            conn.rollback()
            print(f"  ✗ Failed: {e}")
            raise

    conn.close()
    print("All migrations applied")


if __name__ == "__main__":
    run_migrations()
```

Run:

```bash
python src/migrate.py
```

### Using Migration Tools

For more advanced features (rollbacks, branching), consider:

- **[Alembic](https://alembic.sqlalchemy.org/)** - Database migration tool for SQLAlchemy (if using an ORM)
- **[pg-migrator](https://github.com/heroku/pg-migrator)** - Heroku's migration tool
- **[goose](https://github.com/pressly/goose)** - Multi-database migration tool (usable alongside Python)
- **[yoyo-migrations](https://pypi.org/project/yoyo-migrations/)** - Python-based migrations

## Writing Migrations

### Principles

1. **Idempotent** - Running a script more than once should not fail or duplicate changes
2. **Additive first** - Add columns/tables before removing or dropping anything
3. **Backward compatible** - The new schema should work with the old code
4. **Atomic** - One logical change per migration file
5. **Test locally** - Apply to a test database before production

### Common Operations

#### Create Table

```sql
CREATE TABLE IF NOT EXISTS orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    total DECIMAL(10,2) NOT NULL,
    status VARCHAR(50) NOT NULL DEFAULT 'pending',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Add foreign key
ALTER TABLE orders
    ADD CONSTRAINT fk_orders_user
    FOREIGN KEY (user_id)
    REFERENCES users(id)
    ON DELETE CASCADE;

-- Indexes for performance
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);
```

#### Add Column

```sql
-- Add nullable column (safe, backward compatible)
ALTER TABLE orders
    ADD COLUMN shipping_address JSONB;

-- Add column with a default: fast in PostgreSQL 11+ for constant
-- defaults (no table rewrite); on older versions this rewrites the
-- entire table, so use with caution on large tables
ALTER TABLE orders
    ADD COLUMN tax_amount DECIMAL(10,2) DEFAULT 0.00;
```

#### Rename Column

```sql
-- Rename in place; update application code in the same deploy
ALTER TABLE orders
    RENAME COLUMN total TO order_total;
```

#### Modify Column Type

```sql
-- Change VARCHAR length
ALTER TABLE users
    ALTER COLUMN email TYPE VARCHAR(320);

-- Convert to a different type (use a USING clause)
ALTER TABLE orders
    ALTER COLUMN status TYPE VARCHAR(100)
    USING status::VARCHAR(100);
```

#### Create Index

```sql
-- Simple index
CREATE INDEX idx_users_email ON users(email);

-- Unique index
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- Partial index (only active users)
CREATE INDEX idx_users_active ON users(id)
    WHERE status = 'active';

-- Multi-column index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
```

#### Drop Column/Table

```sql
-- First, confirm nothing still reads or writes the column;
-- prefer deprecating it in one release and dropping it in a later migration

-- Drop column
ALTER TABLE orders
    DROP COLUMN IF EXISTS old_column;

-- Drop table (destructive - take a backup first!)
DROP TABLE IF EXISTS old_logs;
```

### Data Migrations

Sometimes you need to transform data:

```sql
-- Backfill new column from existing data
UPDATE orders
SET shipping_address = jsonb_build_object(
    'street', address_street,
    'city', address_city,
    'zip', address_zip
)
WHERE shipping_address IS NULL;

-- Migrate enum values
UPDATE products
SET status = 'active' WHERE status = 'ACTIVE';

-- Clean up duplicates
WITH duplicates AS (
    SELECT id, ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at) AS rn
    FROM users
)
DELETE FROM users WHERE id IN (SELECT id FROM duplicates WHERE rn > 1);
```

### Transactional Migrations

Wrap critical migrations in transactions:

```sql
BEGIN;

-- Multiple related operations
ALTER TABLE orders ADD COLUMN shipping_id UUID;
UPDATE orders SET shipping_id = gen_random_uuid() WHERE shipping_id IS NULL;
ALTER TABLE orders ALTER COLUMN shipping_id SET NOT NULL;

COMMIT;
```

**Note**: Most DDL in PostgreSQL is transactional, so a failed block like this rolls back cleanly. A few statements cannot run inside a transaction - `CREATE INDEX CONCURRENTLY`, `CREATE DATABASE`, and (before PostgreSQL 12) `ALTER TYPE ... ADD VALUE` among them. For complex multi-step changes, consider advisory locks or deployment coordination.

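One way to implement the advisory-lock coordination mentioned above is to take a session-level `pg_advisory_lock` around the whole migration run, so concurrent deployments apply migrations one at a time. A sketch, assuming a psycopg2-style connection object; the lock key is an arbitrary constant that all deployers agree on:

```python
MIGRATION_LOCK_KEY = 723001  # arbitrary bigint; any agreed-upon constant works


def run_with_lock(conn, migrate_fn):
    """Run migrate_fn(conn) while holding a Postgres advisory lock."""
    cursor = conn.cursor()
    # Blocks until no other session holds the lock, serializing migration runs
    cursor.execute("SELECT pg_advisory_lock(%s)", (MIGRATION_LOCK_KEY,))
    try:
        migrate_fn(conn)
    finally:
        # Release even if the migration raised
        cursor.execute("SELECT pg_advisory_unlock(%s)", (MIGRATION_LOCK_KEY,))
```

Session-level advisory locks are also released automatically if the connection drops, so a crashed deploy cannot wedge future runs.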
## Best Practices

### ✅ Do's

1. **Test migrations on a copy of the production database** before applying to prod
2. **Keep migrations small** - One logical change per file
3. **Write data migrations as separate files** from schema migrations
4. **Use `IF NOT EXISTS` and `IF EXISTS`** to make migrations idempotent
5. **Never drop a column/table in the same migration that replaces it** - Separate the steps to allow rollback
6. **Document why** - Add comments explaining the purpose
7. **Consider indexes** - Add indexes for frequently queried columns in the same migration as the table creation
8. **Use UUIDs** for primary keys (`gen_random_uuid()` is built in from PostgreSQL 13; earlier versions need the `pgcrypto` extension)
9. **Add `created_at` and `updated_at` timestamps** to all tables
10. **Keep version numbers unique and sequential**

### ❌ Don'ts

1. **Don't modify already-applied migrations** - They're part of history
2. **Don't skip version numbers** - Gaps make missing migrations hard to spot
3. **Don't run destructive operations without a backup** - `DROP COLUMN`, `DROP TABLE`
4. **Don't run long-running migrations during peak hours** - Use low-traffic windows
5. **Don't add `NOT NULL` without a default** on non-empty tables - It will fail on existing NULL rows
6. **Don't assume order of execution** - Always number sequentially
7. **Don't mix unrelated changes** in one migration file

### Zero-Downtime Migrations

#### Adding Column

```sql
-- Step 1: Add the column as nullable or with a default (fast)
ALTER TABLE orders ADD COLUMN status VARCHAR(50);

-- Step 2: Deploy code that writes to the new column
-- (your application starts populating status)

-- Step 3: Backfill existing rows (if needed)
UPDATE orders SET status = 'completed' WHERE status IS NULL AND shipped_at IS NOT NULL;

-- Step 4: Make the column NOT NULL (if needed) - only after all rows have values
ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
```

#### Renaming Column

```sql
-- Step 1: Add new column
ALTER TABLE orders ADD COLUMN order_status VARCHAR(50);

-- Step 2: Deploy code writing to both old and new columns (dual-write)

-- Step 3: Backfill rows the dual-write has not touched yet
UPDATE orders SET order_status = status WHERE order_status IS NULL;

-- Step 4: Deploy code reading from the new column; stop writing to the old one

-- Step 5: Drop the old column (in a separate migration)
ALTER TABLE orders DROP COLUMN status;
```

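Step 2's dual-write phase lives in application code, not in a migration. A hypothetical helper (the `save_order_status` name and columns are illustrative, not part of the template) might look like:

```python
def save_order_status(cursor, order_id, value):
    # Dual-write phase: keep the old column (status) and the new column
    # (order_status) in sync until every reader has moved to the new one
    cursor.execute(
        "UPDATE orders SET status = %s, order_status = %s WHERE id = %s",
        (value, value, order_id),
    )
```

Once step 4 ships, this collapses back to a single-column update.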
## Rollback Strategies

### Manual Rollback

For each migration, you may want to write a corresponding "down" migration:

```sql
-- 002_add_user_email.sql (UP)
ALTER TABLE users ADD COLUMN email VARCHAR(320);

-- 002_add_user_email_rollback.sql (DOWN)
ALTER TABLE users DROP COLUMN IF EXISTS email;
```

Store rollback scripts alongside migrations or in a separate `rollbacks/` directory.

### Point-in-Time Recovery

**Best strategy**: Restore the database from a backup to a point just before the bad migration, then re-apply the good migrations.

PostgreSQL point-in-time recovery (PITR) works from a base backup plus archived WAL; it is configured on the server rather than through a `pg_restore` flag:

```bash
# 1. Restore the base backup into the data directory
# 2. Set the recovery target in postgresql.conf:
#      recovery_target_time = '2025-03-18 10:30:00'
# 3. Create a recovery.signal file and start the server

# Re-run migrations up to the good version
python src/migrate.py  # applies all pending migrations; use a selective runner if needed
```

### Selective Rollback Script

```python
# rollback.py
import os
import sys

from helpers import init_db_connection


def rollback(to_version: int):
    conn = init_db_connection()
    cursor = conn.cursor()

    # Find migrations applied after the target version, newest first
    cursor.execute("""
        SELECT version, name
        FROM schema_migrations
        WHERE version > %s
        ORDER BY version DESC
    """, (to_version,))
    migrations = cursor.fetchall()

    for version, name in migrations:
        # e.g. 002_add_user_email.sql -> rollbacks/002_add_user_email_rollback.sql
        stem = os.path.splitext(name)[0]
        rollback_file = os.path.join("rollbacks", f"{stem}_rollback.sql")
        print(f"Rolling back {name} using {rollback_file}...")
        with open(rollback_file, "r") as f:
            sql = f.read()
        cursor.execute(sql)
        cursor.execute("DELETE FROM schema_migrations WHERE version = %s", (version,))
        conn.commit()
        print(f"  Rolled back {name}")

    conn.close()
    print(f"Rolled back to version {to_version}")


if __name__ == "__main__":
    target = int(sys.argv[1])
    rollback(target)
```

## Automation

### CI/CD Integration

In your deployment pipeline:

```bash
# Before deploying new code, apply pending migrations;
# if they fail, abort the deployment
if ! python src/migrate.py; then
    echo "Migrations failed, aborting deployment"
    exit 1
fi

# Deploy new code
fission deploy
```

### Pre-deployment Hooks

Use Fission hooks to run migrations automatically:

```json
{
  "hooks": {
    "function_pre_deploy": [
      {
        "type": "http",
        "url": "http://migration-service/migrate",
        "timeout": 300000
      }
    ]
  }
}
```

Or, more simply, run the migration as part of `build.sh`:

```bash
#!/bin/sh
# src/build.sh

# Install dependencies
pip3 install -r requirements.txt -t .

# Run migrations against the test DB here if desired
# (or leave migrations as a separate deployment step)
# python ../migrate.py

# Package up
cp -r . ${DEPLOY_PKG}
```

### Database Change Management Tools

Consider specialized tools for larger teams:

- **[Flyway](https://flywaydb.org/)** - Java-based, supports repeatable migrations
- **[Liquibase](https://www.liquibase.org/)** - XML/YAML/JSON migrations
- **[Prisma Migrate](https://www.prisma.io/docs/concepts/components/prisma-migrate)** - If using the Prisma ORM
- **[Alembic](https://alembic.sqlalchemy.org/)** - Python, SQLAlchemy-specific

## Example Workflow

1. **Create migration**:

   ```bash
   touch migrates/004_add_orders_table.sql
   ```

2. **Write SQL**:

   ```sql
   CREATE TABLE orders (
       id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
       user_id UUID NOT NULL REFERENCES users(id),
       total DECIMAL(10,2) NOT NULL,
       status VARCHAR(50) DEFAULT 'pending',
       created_at TIMESTAMPTZ DEFAULT NOW()
   );

   CREATE INDEX idx_orders_user_id ON orders(user_id);
   ```

3. **Test locally**:

   ```bash
   createdb test_migration
   psql test_migration -f migrates/004_add_orders_table.sql
   ```

4. **Commit the migration file**:

   ```bash
   git add migrates/004_add_orders_table.sql
   git commit -m "Add orders table"
   ```

5. **Apply to staging**:

   ```bash
   # Update dev-deployment.json if new env vars are needed
   fission deploy --dev
   python src/migrate.py
   ```

6. **Apply to production**:

   ```bash
   # Maintenance window or blue-green deployment
   fission deploy
   python src/migrate.py
   ```

## Troubleshooting

### Migration Fails

Check the error message:

- **syntax error**: Validate the SQL manually with `psql -c "SQL"`
- **duplicate column**: The migration was already applied; check `schema_migrations`
- **permission denied**: The DB user lacks ALTER/CREATE privileges
- **lock timeout**: Another migration is running; wait or kill the blocking process

### Migration Recorded But Partially Applied

If a migration was recorded in `schema_migrations` but failed midway:

1. Manually revert the partial changes or fix the broken state
2. Delete the row from `schema_migrations`: `DELETE FROM schema_migrations WHERE version = 4;`
3. Re-run the migration

### Long-Running Migration

Large table alterations can hold locks and cause downtime:

- Run during a low-traffic period
- Use `CONCURRENTLY` for index creation (PostgreSQL):

  ```sql
  CREATE INDEX CONCURRENTLY idx_orders_created ON orders(created_at);
  ```

- For adding `NOT NULL`, populate values first with `UPDATE`, then add the constraint
- Consider `pg_repack` for online table reorganization

## Summary

- Store migrations in the `migrates/` directory, numbered sequentially
- Use `init_db_connection()` to run migrations programmatically
- Test migrations on a staging database before production
- Keep migrations backward compatible when possible
- Have a rollback plan (backups, down scripts)
- Integrate migrations into the CI/CD pipeline