# Database Migrations

This guide covers managing database schema changes in Fission Python projects.

## Table of Contents

1. [Overview](#overview)
2. [Migration Files](#migration-files)
3. [Applying Migrations](#applying-migrations)
4. [Writing Migrations](#writing-migrations)
5. [Best Practices](#best-practices)
6. [Rollback Strategies](#rollback-strategies)
7. [Automation](#automation)

## Overview

Database schema changes should be managed through versioned migration scripts, not manual `CREATE TABLE` statements run by hand.

This template uses **plain SQL migration files** (`.sql`), which provide:

- Version control of schema changes
- Repeatable application across environments
- Clear upgrade/downgrade paths
- An audit trail of schema evolution

## Migration Files

Place SQL migration scripts in the `migrates/` directory:

```
migrates/
├── 001_initial_schema.sql
├── 002_add_user_email.sql
├── 003_create_indexes.sql
└── ...
```

**Naming convention**:

- Prefix with a sequential number (zero-padded so files sort correctly)
- Descriptive name after the underscore
- `.sql` extension
- Numbers must be unique and monotonically increasing

### Initial Schema Example

```sql
-- migrates/001_create_items_table.sql
-- Create items table
CREATE TABLE IF NOT EXISTS items (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    description TEXT,
    status VARCHAR(50) DEFAULT 'active',
    metadata JSONB,
    created TIMESTAMPTZ DEFAULT NOW(),
    modified TIMESTAMPTZ DEFAULT NOW()
);

-- Add indexes (IF NOT EXISTS keeps the script idempotent)
CREATE INDEX IF NOT EXISTS idx_items_status ON items(status);
CREATE INDEX IF NOT EXISTS idx_items_created ON items(created);

-- Add comments
COMMENT ON TABLE items IS 'Stores item records';
COMMENT ON COLUMN items.status IS 'Item status: active, inactive, pending';
```

## Applying Migrations

### Manually

```bash
# Connect to database
psql -h localhost -U postgres -d mydb

# Run a single migration file from within psql
\i migrates/001_create_items_table.sql
```

```bash
# Run all migrations in order (the shell glob sorts lexically,
# which matches the zero-padded numbering)
for file in migrates/*.sql; do
    echo "Applying $file..."
    psql -h localhost -U postgres -d mydb -f "$file" || exit 1
done
```

### Automatically from Python

Create a simple migration runner:

```python
# src/migrate.py (standalone script, not part of the function)
import os

from helpers import init_db_connection


def run_migrations():
    conn = init_db_connection()
    cursor = conn.cursor()

    # Create the migrations tracking table if it does not exist
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS schema_migrations (
            version INTEGER PRIMARY KEY,
            name VARCHAR(255) NOT NULL,
            applied_at TIMESTAMPTZ DEFAULT NOW()
        )
    """)
    conn.commit()

    # Get already-applied migrations
    cursor.execute("SELECT version FROM schema_migrations")
    applied = {row[0] for row in cursor.fetchall()}

    # Find migration files
    migrates_dir = os.path.join(os.path.dirname(__file__), "..", "migrates")
    files = sorted(f for f in os.listdir(migrates_dir) if f.endswith(".sql"))

    # Apply pending migrations
    for filename in files:
        # Extract the version number from the filename prefix
        version = int(filename.split("_")[0])
        if version in applied:
            print(f"Skipping {filename} (already applied)")
            continue

        path = os.path.join(migrates_dir, filename)
        print(f"Applying {filename}...")
        with open(path, "r") as f:
            sql = f.read()

        try:
            cursor.execute(sql)
            cursor.execute(
                "INSERT INTO schema_migrations (version, name) VALUES (%s, %s)",
                (version, filename),
            )
            conn.commit()
            print(f"  ✓ Applied {filename}")
        except Exception as e:
            conn.rollback()
            print(f"  ✗ Failed: {e}")
            raise

    conn.close()
    print("All migrations applied")


if __name__ == "__main__":
    run_migrations()
```

Run:

```bash
python src/migrate.py
```

### Using Migration Tools

For more advanced features (rollbacks, branching), consider:

- **[Alembic](https://alembic.sqlalchemy.org/)** - Database migration tool for SQLAlchemy (if using an ORM)
- **[pg-migrator](https://github.com/heroku/pg-migrator)** - Heroku's migration tool
- **[goose](https://github.com/pressly/goose)** - Multi-database migration tool (usable alongside Python)
- **[yoyo-migrations](https://pypi.org/project/yoyo-migrations/)** - Python-based migrations

## Writing Migrations

### Principles

1. **Idempotent** - Running a script more than once should not fail or duplicate changes
2. **Additive first** - Add columns/tables before removing or dropping anything
3. **Backward compatible** - The new schema should work with the old code
4. **Atomic** - One logical change per migration file
5. **Test locally** - Apply to a test database before production

### Common Operations

#### Create Table

```sql
CREATE TABLE IF NOT EXISTS orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    total DECIMAL(10,2) NOT NULL,
    status VARCHAR(50) NOT NULL DEFAULT 'pending',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Add foreign key
ALTER TABLE orders
    ADD CONSTRAINT fk_orders_user
    FOREIGN KEY (user_id)
    REFERENCES users(id)
    ON DELETE CASCADE;

-- Indexes for performance
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);
```

#### Add Column

```sql
-- Add nullable column (safe, backward compatible)
ALTER TABLE orders
    ADD COLUMN shipping_address JSONB;

-- Add column with a default: fast in PostgreSQL 11+ for constant
-- defaults (no table rewrite); on older versions this rewrites the
-- entire table, so use with caution on large tables
ALTER TABLE orders
    ADD COLUMN tax_amount DECIMAL(10,2) DEFAULT 0.00;
```

#### Rename Column

```sql
-- Rename in place; update application code in the same deploy
ALTER TABLE orders
    RENAME COLUMN total TO order_total;
```

#### Modify Column Type

```sql
-- Change VARCHAR length
ALTER TABLE users
    ALTER COLUMN email TYPE VARCHAR(320);

-- Convert to a different type (use a USING clause)
ALTER TABLE orders
    ALTER COLUMN status TYPE VARCHAR(100)
    USING status::VARCHAR(100);
```

#### Create Index

```sql
-- Simple index
CREATE INDEX idx_users_email ON users(email);

-- Unique index
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- Partial index (only active users)
CREATE INDEX idx_users_active ON users(id)
    WHERE status = 'active';

-- Multi-column index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
```

#### Drop Column/Table

```sql
-- First, confirm nothing still reads or writes the column;
-- prefer deprecating it in one release and dropping it in a later migration

-- Drop column
ALTER TABLE orders
    DROP COLUMN IF EXISTS old_column;

-- Drop table (destructive - take a backup first!)
DROP TABLE IF EXISTS old_logs;
```

### Data Migrations

Sometimes you need to transform data:

```sql
-- Backfill new column from existing data
UPDATE orders
SET shipping_address = jsonb_build_object(
    'street', address_street,
    'city', address_city,
    'zip', address_zip
)
WHERE shipping_address IS NULL;

-- Migrate enum values
UPDATE products
SET status = 'active' WHERE status = 'ACTIVE';

-- Clean up duplicates
WITH duplicates AS (
    SELECT id, ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at) AS rn
    FROM users
)
DELETE FROM users WHERE id IN (SELECT id FROM duplicates WHERE rn > 1);
```

### Transactional Migrations

Wrap critical migrations in transactions:

```sql
BEGIN;

-- Multiple related operations
ALTER TABLE orders ADD COLUMN shipping_id UUID;
UPDATE orders SET shipping_id = gen_random_uuid() WHERE shipping_id IS NULL;
ALTER TABLE orders ALTER COLUMN shipping_id SET NOT NULL;

COMMIT;
```

**Note**: Most DDL in PostgreSQL is transactional, so a failed block like this rolls back cleanly. A few statements cannot run inside a transaction - `CREATE INDEX CONCURRENTLY`, `CREATE DATABASE`, and (before PostgreSQL 12) `ALTER TYPE ... ADD VALUE` among them. For complex multi-step changes, consider advisory locks or deployment coordination.

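One way to implement the advisory-lock coordination mentioned above is to take a session-level `pg_advisory_lock` around the whole migration run, so concurrent deployments apply migrations one at a time. A sketch, assuming a psycopg2-style connection object; the lock key is an arbitrary constant that all deployers agree on:

```python
MIGRATION_LOCK_KEY = 723001  # arbitrary bigint; any agreed-upon constant works


def run_with_lock(conn, migrate_fn):
    """Run migrate_fn(conn) while holding a Postgres advisory lock."""
    cursor = conn.cursor()
    # Blocks until no other session holds the lock, serializing migration runs
    cursor.execute("SELECT pg_advisory_lock(%s)", (MIGRATION_LOCK_KEY,))
    try:
        migrate_fn(conn)
    finally:
        # Release even if the migration raised
        cursor.execute("SELECT pg_advisory_unlock(%s)", (MIGRATION_LOCK_KEY,))
```

Session-level advisory locks are also released automatically if the connection drops, so a crashed deploy cannot wedge future runs.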
## Best Practices

### ✅ Do's

1. **Test migrations on a copy of the production database** before applying to prod
2. **Keep migrations small** - One logical change per file
3. **Write data migrations as separate files** from schema migrations
4. **Use `IF NOT EXISTS` and `IF EXISTS`** to make migrations idempotent
5. **Never drop a column/table in the same migration that replaces it** - Separate the steps to allow rollback
6. **Document why** - Add comments explaining the purpose
7. **Consider indexes** - Add indexes for frequently queried columns in the same migration as the table creation
8. **Use UUIDs** for primary keys (`gen_random_uuid()` is built in from PostgreSQL 13; earlier versions need the `pgcrypto` extension)
9. **Add `created_at` and `updated_at` timestamps** to all tables
10. **Keep version numbers unique and sequential**

### ❌ Don'ts

1. **Don't modify already-applied migrations** - They're part of history
2. **Don't skip version numbers** - Gaps make missing migrations hard to spot
3. **Don't run destructive operations without a backup** - `DROP COLUMN`, `DROP TABLE`
4. **Don't run long-running migrations during peak hours** - Use low-traffic windows
5. **Don't add `NOT NULL` without a default** on non-empty tables - It will fail on existing NULL rows
6. **Don't assume order of execution** - Always number sequentially
7. **Don't mix unrelated changes** in one migration file

### Zero-Downtime Migrations

#### Adding Column

```sql
-- Step 1: Add the column as nullable or with a default (fast)
ALTER TABLE orders ADD COLUMN status VARCHAR(50);

-- Step 2: Deploy code that writes to the new column
-- (your application starts populating status)

-- Step 3: Backfill existing rows (if needed)
UPDATE orders SET status = 'completed' WHERE status IS NULL AND shipped_at IS NOT NULL;

-- Step 4: Make the column NOT NULL (if needed) - only after all rows have values
ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
```

#### Renaming Column

```sql
-- Step 1: Add new column
ALTER TABLE orders ADD COLUMN order_status VARCHAR(50);

-- Step 2: Deploy code writing to both old and new columns (dual-write)

-- Step 3: Backfill rows the dual-write has not touched yet
UPDATE orders SET order_status = status WHERE order_status IS NULL;

-- Step 4: Deploy code reading from the new column; stop writing to the old one

-- Step 5: Drop the old column (in a separate migration)
ALTER TABLE orders DROP COLUMN status;
```

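Step 2's dual-write phase lives in application code, not in a migration. A hypothetical helper (the `save_order_status` name and columns are illustrative, not part of the template) might look like:

```python
def save_order_status(cursor, order_id, value):
    # Dual-write phase: keep the old column (status) and the new column
    # (order_status) in sync until every reader has moved to the new one
    cursor.execute(
        "UPDATE orders SET status = %s, order_status = %s WHERE id = %s",
        (value, value, order_id),
    )
```

Once step 4 ships, this collapses back to a single-column update.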
## Rollback Strategies

### Manual Rollback

For each migration, you may want to write a corresponding "down" migration:

```sql
-- 002_add_user_email.sql (UP)
ALTER TABLE users ADD COLUMN email VARCHAR(320);

-- 002_add_user_email_rollback.sql (DOWN)
ALTER TABLE users DROP COLUMN IF EXISTS email;
```

Store rollback scripts alongside migrations or in a separate `rollbacks/` directory.

### Point-in-Time Recovery

**Best strategy**: Restore the database from a backup to a point just before the bad migration, then re-apply the good migrations.

PostgreSQL point-in-time recovery (PITR) works from a base backup plus archived WAL; it is configured on the server rather than through a `pg_restore` flag:

```bash
# 1. Restore the base backup into the data directory
# 2. Set the recovery target in postgresql.conf:
#      recovery_target_time = '2025-03-18 10:30:00'
# 3. Create a recovery.signal file and start the server

# Re-run migrations up to the good version
python src/migrate.py  # applies all pending migrations; use a selective runner if needed
```

### Selective Rollback Script

```python
# rollback.py
import os
import sys

from helpers import init_db_connection


def rollback(to_version: int):
    conn = init_db_connection()
    cursor = conn.cursor()

    # Find migrations applied after the target version, newest first
    cursor.execute("""
        SELECT version, name
        FROM schema_migrations
        WHERE version > %s
        ORDER BY version DESC
    """, (to_version,))
    migrations = cursor.fetchall()

    for version, name in migrations:
        # e.g. 002_add_user_email.sql -> rollbacks/002_add_user_email_rollback.sql
        stem = os.path.splitext(name)[0]
        rollback_file = os.path.join("rollbacks", f"{stem}_rollback.sql")
        print(f"Rolling back {name} using {rollback_file}...")
        with open(rollback_file, "r") as f:
            sql = f.read()
        cursor.execute(sql)
        cursor.execute("DELETE FROM schema_migrations WHERE version = %s", (version,))
        conn.commit()
        print(f"  Rolled back {name}")

    conn.close()
    print(f"Rolled back to version {to_version}")


if __name__ == "__main__":
    target = int(sys.argv[1])
    rollback(target)
```

## Automation

### CI/CD Integration

In your deployment pipeline:

```bash
# Before deploying new code, apply pending migrations;
# if they fail, abort the deployment
if ! python src/migrate.py; then
    echo "Migrations failed, aborting deployment"
    exit 1
fi

# Deploy new code
fission deploy
```

### Pre-deployment Hooks

Use Fission hooks to run migrations automatically:

```json
{
  "hooks": {
    "function_pre_deploy": [
      {
        "type": "http",
        "url": "http://migration-service/migrate",
        "timeout": 300000
      }
    ]
  }
}
```

Or, more simply, run the migration as part of `build.sh`:

```bash
#!/bin/sh
# src/build.sh

# Install dependencies
pip3 install -r requirements.txt -t .

# Run migrations against the test DB here if desired
# (or leave migrations as a separate deployment step)
# python ../migrate.py

# Package up
cp -r . ${DEPLOY_PKG}
```

### Database Change Management Tools

Consider specialized tools for larger teams:

- **[Flyway](https://flywaydb.org/)** - Java-based, supports repeatable migrations
- **[Liquibase](https://www.liquibase.org/)** - XML/YAML/JSON migrations
- **[Prisma Migrate](https://www.prisma.io/docs/concepts/components/prisma-migrate)** - If using the Prisma ORM
- **[Alembic](https://alembic.sqlalchemy.org/)** - Python, SQLAlchemy-specific

## Example Workflow

1. **Create migration**:

   ```bash
   touch migrates/004_add_orders_table.sql
   ```

2. **Write SQL**:

   ```sql
   CREATE TABLE orders (
       id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
       user_id UUID NOT NULL REFERENCES users(id),
       total DECIMAL(10,2) NOT NULL,
       status VARCHAR(50) DEFAULT 'pending',
       created_at TIMESTAMPTZ DEFAULT NOW()
   );

   CREATE INDEX idx_orders_user_id ON orders(user_id);
   ```

3. **Test locally**:

   ```bash
   createdb test_migration
   psql test_migration -f migrates/004_add_orders_table.sql
   ```

4. **Commit the migration file**:

   ```bash
   git add migrates/004_add_orders_table.sql
   git commit -m "Add orders table"
   ```

5. **Apply to staging**:

   ```bash
   # Update dev-deployment.json if new env vars are needed
   fission deploy --dev
   python src/migrate.py
   ```

6. **Apply to production**:

   ```bash
   # Maintenance window or blue-green deployment
   fission deploy
   python src/migrate.py
   ```

## Troubleshooting

### Migration Fails

Check the error message:

- **syntax error**: Validate the SQL manually with `psql -c "SQL"`
- **duplicate column**: The migration was already applied; check `schema_migrations`
- **permission denied**: The DB user lacks ALTER/CREATE privileges
- **lock timeout**: Another migration is running; wait or kill the blocking process

### Migration Recorded But Partially Applied

If a migration was recorded in `schema_migrations` but failed midway:

1. Manually revert the partial changes or fix the broken state
2. Delete the row from `schema_migrations`: `DELETE FROM schema_migrations WHERE version = 4;`
3. Re-run the migration

### Long-Running Migration

Large table alterations can hold locks and cause downtime:

- Run during a low-traffic period
- Use `CONCURRENTLY` for index creation (PostgreSQL):

  ```sql
  CREATE INDEX CONCURRENTLY idx_orders_created ON orders(created_at);
  ```

- For adding `NOT NULL`, populate values first with `UPDATE`, then add the constraint
- Consider `pg_repack` for online table reorganization

## Summary

- Store migrations in the `migrates/` directory, numbered sequentially
- Use `init_db_connection()` to run migrations programmatically
- Test migrations on a staging database before production
- Keep migrations backward compatible when possible
- Have a rollback plan (backups, down scripts)
- Integrate migrations into the CI/CD pipeline