ref: up
This commit is contained in:
421
.claude/plans/rosy-giggling-flurry.md
Normal file
421
.claude/plans/rosy-giggling-flurry.md
Normal file
@@ -0,0 +1,421 @@
|
||||
# Plan: Enhance Fission Python Projects with Exceptions, Pydantic Models, and Code Quality Improvements
|
||||
|
||||
## Context
|
||||
|
||||
Three Fission Python projects need systematic improvements to enhance error handling, data validation, and code maintainability:
|
||||
|
||||
- **py-eom-storage**: Storage management API (GET/POST /storages, GET/PUT/DELETE /storages/{id})
|
||||
- **py-eom-quota**: Quota management API (GET/POST /quotas, POST/DELETE /users/{userId}/quotas/{quotaId})
|
||||
- **py-ailbl-scheduler**: Background worker system for scheduled tasks
|
||||
|
||||
Currently, all projects use generic `Exception` with simple error messages returned as `{"error": str(err)}` with 500 status. There's no structured error handling, request validation, or consistent response formatting. Some projects have Pydantic models but not comprehensively used.
|
||||
|
||||
## Goals
|
||||
|
||||
1. **Custom Exceptions**: Implement domain-specific exception classes with:
|
||||
- `error_code`: Machine-readable error identifier
|
||||
- `http_status_code`: Appropriate HTTP status (400, 404, 409, 500, etc.)
|
||||
- `error_msg`: Human-readable message
|
||||
- `x_user`: User identifier from request header (X-Fission-Params-UserId or similar)
|
||||
|
||||
2. **Pydantic Models**: Add comprehensive request/response models for all endpoints:
|
||||
- Request body validation (POST/PUT)
|
||||
- Query parameter validation (GET)
|
||||
- Structured response schemas
|
||||
- Consistent error response format
|
||||
|
||||
3. **Code Quality**: Improve maintainability with:
|
||||
- Detailed docstrings for all functions and classes
|
||||
- Refactoring of complex, multi-responsibility functions
|
||||
- Consistent error handling patterns
|
||||
- Fix broken imports and type issues
|
||||
|
||||
## Project-Specific Plans
|
||||
|
||||
### 1. py-eom-storage
|
||||
|
||||
**Current State:**
|
||||
- Has Pydantic models: `S3Resource`, `S3Credential` (unused)
|
||||
- Uses dataclasses: `Page`, `Filter` (should be Pydantic)
|
||||
- Endpoints: `/eom/admin/storages` (filter_or_insert.py), `/eom/admin/storages/{StorageId}` (update_or_delete.py)
|
||||
|
||||
**Changes Needed:**
|
||||
|
||||
**A. Create `src/exceptions.py`:**
|
||||
```python
|
||||
class StorageException(Exception):
|
||||
"""Base exception for storage-related errors."""
|
||||
def __init__(self, error_code: str, http_status: int, error_msg: str, x_user: str = None):
|
||||
self.error_code = error_code
|
||||
self.http_status = http_status
|
||||
self.error_msg = error_msg
|
||||
self.x_user = x_user
|
||||
super().__init__(self.error_msg)
|
||||
|
||||
class ValidationError(StorageException):
|
||||
"""Invalid input data."""
|
||||
class NotFoundError(StorageException):
|
||||
"""Resource not found."""
|
||||
class ConflictError(StorageException):
|
||||
"""Resource conflict (e.g., duplicate name)."""
|
||||
class DatabaseError(StorageException):
|
||||
"""Database operation failed."""
|
||||
class S3ConnectionError(StorageException):
|
||||
"""S3/MinIO connection failed."""
|
||||
```
|
||||
|
||||
**B. Create/Update `src/models.py` (or extend existing):**
|
||||
```python
|
||||
# Request models
|
||||
class StorageCreateRequest(BaseModel):
|
||||
name: str = Field(..., min_length=1, max_length=255)
|
||||
description: typing.Optional[str] = None
|
||||
resource: dict # Should validate S3 structure
|
||||
|
||||
class StorageUpdateRequest(BaseModel):
|
||||
name: typing.Optional[str] = None
|
||||
description: typing.Optional[str] = None
|
||||
resource: typing.Optional[dict] = None
|
||||
active: typing.Optional[bool] = None
|
||||
|
||||
# Query models (convert Page/Filter to Pydantic)
|
||||
class StorageFilter(BaseModel):
|
||||
ids: typing.Optional[typing.List[str]] = None
|
||||
keyword: typing.Optional[str] = None
|
||||
collection_id: typing.Optional[str] = None
|
||||
enable: typing.Optional[bool] = None
|
||||
created_from: typing.Optional[datetime] = None
|
||||
created_to: typing.Optional[datetime] = None
|
||||
# ... other filters
|
||||
|
||||
class StorageQuery(BaseModel):
|
||||
page: int = 0
|
||||
size: int = Field(8, ge=1, le=100)
|
||||
asc: bool = True
|
||||
sortby: typing.Optional[Literal["name", "enable", "created", "modified"]] = None
|
||||
filter: StorageFilter = StorageFilter()
|
||||
|
||||
# Response models
|
||||
class StorageResponse(BaseModel):
|
||||
id: str
|
||||
name: str
|
||||
description: typing.Optional[str]
|
||||
resource: dict
|
||||
enable: bool
|
||||
created: datetime
|
||||
modified: datetime
|
||||
|
||||
class ErrorResponse(BaseModel):
|
||||
error_code: str
|
||||
http_status: int
|
||||
error_msg: str
|
||||
x_user: typing.Optional[str] = None
|
||||
details: typing.Optional[dict] = None
|
||||
```
|
||||
|
||||
**C. Refactor `filter_or_insert.py`:**
|
||||
- Replace try-except to catch custom exceptions
|
||||
- Validate request body using Pydantic in `make_insert_request`
|
||||
- Use Pydantic for query parsing in `make_filter_request`
|
||||
- Add helper function `handle_exception` to format error responses consistently
|
||||
- Extract SQL queries into separate functions for testability
|
||||
- Add comprehensive docstrings explaining each endpoint's behavior
|
||||
|
||||
**D. Refactor `update_or_delete.py`:**
|
||||
- Similar pattern: custom exceptions, Pydantic validation
|
||||
- Refactor `is_depended_on_storage` - this function does too much, split into smaller helpers
|
||||
- Add detailed comments for each database operation
|
||||
- Ensure proper error messages with appropriate HTTP status codes
|
||||
|
||||
**E. Update `helpers.py`:**
|
||||
- Add utility `get_user_from_header(request)` to extract x-user from various headers
|
||||
|
||||
---
|
||||
|
||||
### 2. py-eom-quota
|
||||
|
||||
**Current State:**
|
||||
- Already has extensive Pydantic models in `models.py` (QuotaPage, UserQuotaPage, ScheduleCreate, etc.)
|
||||
- But: `userquota_filter.py` imports from `quota_update_or_delete` which doesn't exist (broken import)
|
||||
- Need to expand models to cover all request/response scenarios
|
||||
- Endpoints: `/eom/admin/quotas` (filter), `/eom/admin/users/{UserId}/quotas` (filter/insert), `/eom/admin/users/{UserId}/quotas/{QuotaId}` (update/delete)
|
||||
|
||||
**Changes Needed:**
|
||||
|
||||
**A. Create `src/exceptions.py`:**
|
||||
```python
|
||||
class QuotaException(Exception):
|
||||
"""Base exception for quota management."""
|
||||
def __init__(self, error_code: str, http_status: int, error_msg: str, x_user: str = None):
|
||||
self.error_code = error_code
|
||||
self.http_status = http_status
|
||||
self.error_msg = error_msg
|
||||
self.x_user = x_user
|
||||
super().__init__(self.error_msg)
|
||||
|
||||
class QuotaNotFoundError(QuotaException):
|
||||
"""Quota does not exist."""
|
||||
class UserQuotaConflictError(QuotaException):
|
||||
"""User already has this type of quota."""
|
||||
class ValidationError(QuotaException):
|
||||
"""Invalid request data."""
|
||||
class DatabaseError(QuotaException):
|
||||
"""Database operation failed."""
|
||||
```
|
||||
|
||||
**B. Extend `src/models.py`:**
|
||||
The existing models mix schedule and quota models. Need to:
|
||||
- Separate or clearly document which are for quotas vs schedules
|
||||
- Add request models:
|
||||
```python
|
||||
class QuotaCreateRequest(BaseModel):
|
||||
name: str
|
||||
description: typing.Optional[str] = None
|
||||
type: QuotaType
|
||||
value: typing.Union[MaxSizeBody, MaxOrderTimesBody]
|
||||
expire: ExpireBody
|
||||
|
||||
class QuotaUpdateRequest(BaseModel):
|
||||
name: typing.Optional[str] = None
|
||||
description: typing.Optional[str] = None
|
||||
enable: typing.Optional[bool] = None
|
||||
type: typing.Optional[QuotaType] = None
|
||||
value: typing.Optional[typing.Union[MaxSizeBody, MaxOrderTimesBody]] = None
|
||||
expire: typing.Optional[ExpireBody] = None
|
||||
|
||||
class UserQuotaAssignRequest(BaseModel):
|
||||
quota_id: str
|
||||
```
|
||||
|
||||
- Ensure response models exist (QuotaResponse, UserQuotaResponse)
|
||||
|
||||
**C. Fix `userquota_filter.py`:**
|
||||
- Fix broken import: `from quota_update_or_delete import __get_by_id` → `from userquota_insert_or_delete import __get_by_id` (or better: move `__get_by_id` to a shared helpers module)
|
||||
- Refactor `make_filter_request`:
|
||||
- Use `UserQuotaPage` Pydantic model properly
|
||||
- Validate user_id header is present using Pydantic
|
||||
- Replace try-except with custom exceptions
|
||||
- Add comprehensive docstring
|
||||
- The function currently manually sets `paging.filter.user_ids = [user_id]` - this should be part of a validation layer
|
||||
|
||||
**D. Refactor `userquota_insert_or_delete.py`:**
|
||||
- Fix the same broken import pattern (it imports nothing but uses `__get_by_id` in filter)
|
||||
- Add proper request validation using Pydantic models
|
||||
- Replace generic exceptions with `UserQuotaConflictError`, `QuotaNotFoundError`, etc.
|
||||
- Refactor `__validate_user_quota_type` - currently SQL query is hardcoded, add comments explaining business logic
|
||||
- The insert SQL has wrong columns: `INSERT INTO eom_user_quota(id, name, description, type, value, expire)` but the table likely only has (id, user_id, quota_id). Need to check database schema but from the code it seems mismatched.
|
||||
|
||||
**E. Improve `helpers.py`:**
|
||||
- Add utility functions for extracting and validating user headers
|
||||
- Add consistent error handling helpers
|
||||
|
||||
---
|
||||
|
||||
### 3. py-ailbl-scheduler
|
||||
|
||||
**Current State:**
|
||||
- No HTTP endpoints (only time-triggered workers)
|
||||
- No Pydantic models needed per user's choice
|
||||
- Needs custom exceptions and code quality improvements
|
||||
- Workers: `worker_session_picker.py`, `worker_session_poller.py`, `worker_scheduler_scan.py`, `worker_schedule_auto_disable.py`
|
||||
- Common utilities in `common.py`, `helpers.py`
|
||||
|
||||
**Changes Needed:**
|
||||
|
||||
**A. Create `src/exceptions.py`:**
|
||||
```python
|
||||
class SchedulerException(Exception):
|
||||
"""Base exception for scheduler operations."""
|
||||
def __init__(self, error_code: str, error_msg: str, details: dict = None):
|
||||
self.error_code = error_code
|
||||
self.error_msg = error_msg
|
||||
self.details = details
|
||||
super().__init__(self.error_msg)
|
||||
|
||||
class ScheduleNotFoundError(SchedulerException):
|
||||
"""Schedule does not exist."""
|
||||
class SessionLockError(SchedulerError):
|
||||
"""Failed to acquire session lock."""
|
||||
class DagsterError(SchedulerError):
|
||||
"""Dagster pipeline execution failed."""
|
||||
class CronParseError(SchedulerError):
|
||||
"""Invalid cron expression."""
|
||||
class ConfigurationError(SchedulerError):
|
||||
"""Missing or invalid configuration."""
|
||||
```
|
||||
|
||||
**B. Refactor `worker_scheduler_scan.py`:**
|
||||
This is the most complex function (446 lines). Goals:
|
||||
- Extract helper functions:
|
||||
- `_normalize_cron_for_cronner` (already exists)
|
||||
- `_as_date`, `_as_time` (already exist)
|
||||
- `_within_active_window` (already exists)
|
||||
- `_is_due_by_cron` (already exists)
|
||||
- `_is_due_by_freq` (already exists)
|
||||
- Extract the schedule creation logic into `_create_session_for_schedule(cur, schedule, now, slot_start)`
|
||||
- Extract the candidate schedule selection into `_fetch_due_schedules(cur, now, slot_start, slot_end, limit=50)`
|
||||
- Add detailed docstrings explaining the overall algorithm: "Scan for schedules that are due in the current time slot and create sessions atomically"
|
||||
- Improve variable names (e.g., `s` → `schedule`, `cur` → `cursor`)
|
||||
- Add comments explaining the advisory lock strategy and why it's needed
|
||||
- Ensure proper exception handling with custom exceptions
|
||||
- The function currently catches generic Exception at the end - wrap specific operations with appropriate custom exceptions
|
||||
|
||||
**C. Refactor `worker_session_picker.py`:**
|
||||
- Similar breakdown: extract `_pick_and_claim_sessions(conn, limit=20)` helper
|
||||
- Extract `_process_kind5_session(session, ctx)` and `_process_kind1_session(session, ctx)` into separate functions
|
||||
- Add detailed docstring explaining the picking strategy (FOR UPDATE SKIP LOCKED)
|
||||
- Replace bare `except Exception` with specific exception types
|
||||
- Add comments explaining the kind handling logic (kind 5 vs kind 1)
|
||||
- The function `_build_run_config_kind5` is specific to that kind - could be moved to a separate module if needed
|
||||
|
||||
**D. Refactor `worker_session_poller.py`:**
|
||||
- Extract `_update_completed_session(cur, session_id, status_info, now)` helper
|
||||
- Extract `_update_started_session(cur, session_id, started_dt)` helper
|
||||
- Add docstring explaining polling strategy
|
||||
- Replace generic exception handling with `DagsterError` when Dagster calls fail
|
||||
- Add type hints for the row unpacking: `for sid, run_id, started, cron_description, created_by in rows:`
|
||||
|
||||
**E. Refactor `worker_schedule_auto_disable.py`:**
|
||||
- This is simple enough already but still add comprehensive docstring
|
||||
- Consider adding custom exception for database errors
|
||||
|
||||
**F. Improve `helpers.py` (in scheduler):**
|
||||
- The `GraphQL` class and related functions are specific to Dagster - add docstrings
|
||||
- `safe_notify` is good, add docstring
|
||||
- Consider creating a `SchedulerHelper` class to group related utilities
|
||||
|
||||
**G. Improve `common.py`:**
|
||||
- Already has good docstrings but could be expanded
|
||||
- Add type hints to function signatures
|
||||
- Break `launch_pipeline_execution` if too complex (handles multiple error cases)
|
||||
|
||||
---
|
||||
|
||||
## Common Patterns
|
||||
|
||||
### Exception Hierarchy
|
||||
|
||||
Each project will have:
|
||||
```python
|
||||
class BaseProjectException(Exception):
|
||||
"""Base with error_code, http_status (if applicable), message, metadata."""
|
||||
pass
|
||||
|
||||
# Specific exceptions inherit from base
|
||||
class NotFoundError(BaseProjectException): ...
|
||||
class ValidationError(BaseProjectException): ...
|
||||
class ConflictError(BaseProjectException): ...
|
||||
class DatabaseError(BaseProjectException): ...
|
||||
# Domain-specific: StorageNotFoundError, QuotaConflictError, ScheduleNotFoundError, etc.
|
||||
```
|
||||
|
||||
### Error Response Format
|
||||
|
||||
Standardized JSON response:
|
||||
```json
|
||||
{
|
||||
"error_code": "STORAGE_NOT_FOUND",
|
||||
"http_status": 404,
|
||||
"error_msg": "Storage with id 'xyz' does not exist",
|
||||
"x_user": "user123",
|
||||
"details": { /* optional additional context */ }
|
||||
}
|
||||
```
|
||||
|
||||
### Middleware Pattern
|
||||
|
||||
In each HTTP endpoint function:
|
||||
```python
|
||||
def main():
|
||||
try:
|
||||
# Extract user header
|
||||
x_user = request.headers.get("X-Fission-Params-UserId")
|
||||
# Route to handler
|
||||
return handler()
|
||||
except ValidationError as e:
|
||||
return error_response(e), 400
|
||||
except NotFoundError as e:
|
||||
return error_response(e), 404
|
||||
except ConflictError as e:
|
||||
return error_response(e), 409
|
||||
except StorageException as e:
|
||||
logger.error(f"Storage error: {e.error_code}: {e.error_msg}")
|
||||
return error_response(e), 500
|
||||
except Exception as e:
|
||||
logger.exception("Unexpected error")
|
||||
return {"error": "Internal server error"}, 500
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Order
|
||||
|
||||
1. **Phase 1**: Create exception modules for all three projects
|
||||
2. **Phase 2**: Add/expand Pydantic models (storage, then complete quota)
|
||||
3. **Phase 3**: Refactor endpoints to use exceptions and models
|
||||
4. **Phase 4**: Refactor complex functions in scheduler
|
||||
5. **Phase 5**: Documentation pass - ensure all functions have docstrings
|
||||
6. **Phase 6**: Test manually by running functions (no automated tests to update)
|
||||
|
||||
---
|
||||
|
||||
## Verification Steps
|
||||
|
||||
1. **Manual Testing**:
|
||||
- Deploy each function to local Fission or use test environment
|
||||
- Test error cases: invalid input, missing resources, database failures
|
||||
- Verify error response format matches specification
|
||||
- Check logs for proper error logging
|
||||
|
||||
2. **Code Review**:
|
||||
- All functions have docstrings with Args, Returns, Raises sections
|
||||
- No function exceeds ~50 lines (extracted helpers where needed)
|
||||
- All exceptions are specific, not generic `Exception`
|
||||
- Request validation happens before business logic
|
||||
|
||||
3. **Import Verification**:
|
||||
- Fix broken imports (especially in py-eom-quota's userquota_filter.py)
|
||||
- Ensure circular dependencies are avoided
|
||||
|
||||
4. **Type Safety**:
|
||||
- Run static type checker if available (mypy/pyright)
|
||||
- Ensure all functions have return type hints
|
||||
|
||||
---
|
||||
|
||||
## Critical Files to Modify
|
||||
|
||||
**py-eom-storage:**
|
||||
- `src/exceptions.py` (new)
|
||||
- `src/models.py` (create/extend)
|
||||
- `src/filter_or_insert.py` (refactor)
|
||||
- `src/update_or_delete.py` (refactor)
|
||||
- `src/helpers.py` (add utilities)
|
||||
- `src/vault.py` (minor: improve docs)
|
||||
|
||||
**py-eom-quota:**
|
||||
- `src/exceptions.py` (new)
|
||||
- `src/models.py` (extend with request models)
|
||||
- `src/userquota_filter.py` (fix imports, refactor)
|
||||
- `src/userquota_insert_or_delete.py` (refactor, fix SQL if needed)
|
||||
- `src/helpers.py` (add utilities)
|
||||
|
||||
**py-ailbl-scheduler:**
|
||||
- `src/exceptions.py` (new)
|
||||
- `src/worker_scheduler_scan.py` (major refactor)
|
||||
- `src/worker_session_picker.py` (refactor)
|
||||
- `src/worker_session_poller.py` (refactor)
|
||||
- `src/worker_schedule_auto_disable.py` (docs)
|
||||
- `src/common.py` (docs, type hints)
|
||||
- `src/helpers.py` (docs, maybe extract class)
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- All changes are in `/workspaces/claude-marketplace/data/examples/`
|
||||
- Preserve existing API contracts (URLs, HTTP methods)
|
||||
- Do not change database schema
|
||||
- Maintain backward compatibility with existing clients
|
||||
- Focus on internal improvements: error handling, validation, documentation
|
||||
- Use consistent patterns across all three projects
|
||||
Reference in New Issue
Block a user