This commit is contained in:
Duc Nguyen
2026-03-18 20:21:56 +07:00
commit 29667cd92f
58 changed files with 8459 additions and 0 deletions

View File

@@ -0,0 +1,421 @@
# Plan: Enhance Fission Python Projects with Exceptions, Pydantic Models, and Code Quality Improvements
## Context
Three Fission Python projects need systematic improvements to enhance error handling, data validation, and code maintainability:
- **py-eom-storage**: Storage management API (GET/POST /storages, GET/PUT/DELETE /storages/{id})
- **py-eom-quota**: Quota management API (GET/POST /quotas, POST/DELETE /users/{userId}/quotas/{quotaId})
- **py-ailbl-scheduler**: Background worker system for scheduled tasks
Currently, all projects use generic `Exception` with simple error messages returned as `{"error": str(err)}` with 500 status. There's no structured error handling, request validation, or consistent response formatting. Some projects have Pydantic models but not comprehensively used.
## Goals
1. **Custom Exceptions**: Implement domain-specific exception classes with:
- `error_code`: Machine-readable error identifier
- `http_status_code`: Appropriate HTTP status (400, 404, 409, 500, etc.)
- `error_msg`: Human-readable message
- `x_user`: User identifier from request header (X-Fission-Params-UserId or similar)
2. **Pydantic Models**: Add comprehensive request/response models for all endpoints:
- Request body validation (POST/PUT)
- Query parameter validation (GET)
- Structured response schemas
- Consistent error response format
3. **Code Quality**: Improve maintainability with:
- Detailed docstrings for all functions and classes
- Refactoring of complex, multi-responsibility functions
- Consistent error handling patterns
- Fix broken imports and type issues
## Project-Specific Plans
### 1. py-eom-storage
**Current State:**
- Has Pydantic models: `S3Resource`, `S3Credential` (unused)
- Uses dataclasses: `Page`, `Filter` (should be Pydantic)
- Endpoints: `/eom/admin/storages` (filter_or_insert.py), `/eom/admin/storages/{StorageId}` (update_or_delete.py)
**Changes Needed:**
**A. Create `src/exceptions.py`:**
```python
class StorageException(Exception):
"""Base exception for storage-related errors."""
def __init__(self, error_code: str, http_status: int, error_msg: str, x_user: str = None):
self.error_code = error_code
self.http_status = http_status
self.error_msg = error_msg
self.x_user = x_user
super().__init__(self.error_msg)
class ValidationError(StorageException):
"""Invalid input data."""
class NotFoundError(StorageException):
"""Resource not found."""
class ConflictError(StorageException):
"""Resource conflict (e.g., duplicate name)."""
class DatabaseError(StorageException):
"""Database operation failed."""
class S3ConnectionError(StorageException):
"""S3/MinIO connection failed."""
```
**B. Create/Update `src/models.py` (or extend existing):**
```python
# Request models
class StorageCreateRequest(BaseModel):
name: str = Field(..., min_length=1, max_length=255)
description: typing.Optional[str] = None
resource: dict # Should validate S3 structure
class StorageUpdateRequest(BaseModel):
name: typing.Optional[str] = None
description: typing.Optional[str] = None
resource: typing.Optional[dict] = None
active: typing.Optional[bool] = None
# Query models (convert Page/Filter to Pydantic)
class StorageFilter(BaseModel):
ids: typing.Optional[typing.List[str]] = None
keyword: typing.Optional[str] = None
collection_id: typing.Optional[str] = None
enable: typing.Optional[bool] = None
created_from: typing.Optional[datetime] = None
created_to: typing.Optional[datetime] = None
# ... other filters
class StorageQuery(BaseModel):
page: int = 0
size: int = Field(8, ge=1, le=100)
asc: bool = True
sortby: typing.Optional[Literal["name", "enable", "created", "modified"]] = None
filter: StorageFilter = StorageFilter()
# Response models
class StorageResponse(BaseModel):
id: str
name: str
description: typing.Optional[str]
resource: dict
enable: bool
created: datetime
modified: datetime
class ErrorResponse(BaseModel):
error_code: str
http_status: int
error_msg: str
x_user: typing.Optional[str] = None
details: typing.Optional[dict] = None
```
**C. Refactor `filter_or_insert.py`:**
- Replace try-except to catch custom exceptions
- Validate request body using Pydantic in `make_insert_request`
- Use Pydantic for query parsing in `make_filter_request`
- Add helper function `handle_exception` to format error responses consistently
- Extract SQL queries into separate functions for testability
- Add comprehensive docstrings explaining each endpoint's behavior
**D. Refactor `update_or_delete.py`:**
- Similar pattern: custom exceptions, Pydantic validation
- Refactor `is_depended_on_storage` - this function does too much, split into smaller helpers
- Add detailed comments for each database operation
- Ensure proper error messages with appropriate HTTP status codes
**E. Update `helpers.py`:**
- Add utility `get_user_from_header(request)` to extract x-user from various headers
---
### 2. py-eom-quota
**Current State:**
- Already has extensive Pydantic models in `models.py` (QuotaPage, UserQuotaPage, ScheduleCreate, etc.)
- But: `userquota_filter.py` imports from `quota_update_or_delete` which doesn't exist (broken import)
- Need to expand models to cover all request/response scenarios
- Endpoints: `/eom/admin/quotas` (filter), `/eom/admin/users/{UserId}/quotas` (filter/insert), `/eom/admin/users/{UserId}/quotas/{QuotaId}` (update/delete)
**Changes Needed:**
**A. Create `src/exceptions.py`:**
```python
class QuotaException(Exception):
"""Base exception for quota management."""
def __init__(self, error_code: str, http_status: int, error_msg: str, x_user: str = None):
self.error_code = error_code
self.http_status = http_status
self.error_msg = error_msg
self.x_user = x_user
super().__init__(self.error_msg)
class QuotaNotFoundError(QuotaException):
"""Quota does not exist."""
class UserQuotaConflictError(QuotaException):
"""User already has this type of quota."""
class ValidationError(QuotaException):
"""Invalid request data."""
class DatabaseError(QuotaException):
"""Database operation failed."""
```
**B. Extend `src/models.py`:**
The existing models mix schedule and quota models. Need to:
- Separate or clearly document which are for quotas vs schedules
- Add request models:
```python
class QuotaCreateRequest(BaseModel):
name: str
description: typing.Optional[str] = None
type: QuotaType
value: typing.Union[MaxSizeBody, MaxOrderTimesBody]
expire: ExpireBody
class QuotaUpdateRequest(BaseModel):
name: typing.Optional[str] = None
description: typing.Optional[str] = None
enable: typing.Optional[bool] = None
type: typing.Optional[QuotaType] = None
value: typing.Optional[typing.Union[MaxSizeBody, MaxOrderTimesBody]] = None
expire: typing.Optional[ExpireBody] = None
class UserQuotaAssignRequest(BaseModel):
quota_id: str
```
- Ensure response models exist (QuotaResponse, UserQuotaResponse)
**C. Fix `userquota_filter.py`:**
- Fix broken import: `from quota_update_or_delete import __get_by_id` → `from userquota_insert_or_delete import __get_by_id` (or better: move `__get_by_id` to a shared helpers module)
- Refactor `make_filter_request`:
- Use `UserQuotaPage` Pydantic model properly
- Validate user_id header is present using Pydantic
- Replace try-except with custom exceptions
- Add comprehensive docstring
- The function currently manually sets `paging.filter.user_ids = [user_id]` - this should be part of a validation layer
**D. Refactor `userquota_insert_or_delete.py`:**
- Fix the same broken import pattern (it imports nothing but uses `__get_by_id` in filter)
- Add proper request validation using Pydantic models
- Replace generic exceptions with `UserQuotaConflictError`, `QuotaNotFoundError`, etc.
- Refactor `__validate_user_quota_type` - currently SQL query is hardcoded, add comments explaining business logic
- The insert SQL has wrong columns: `INSERT INTO eom_user_quota(id, name, description, type, value, expire)` but the table likely only has (id, user_id, quota_id). Need to check database schema but from the code it seems mismatched.
**E. Improve `helpers.py`:**
- Add utility functions for extracting and validating user headers
- Add consistent error handling helpers
---
### 3. py-ailbl-scheduler
**Current State:**
- No HTTP endpoints (only time-triggered workers)
- No Pydantic models needed per user's choice
- Needs custom exceptions and code quality improvements
- Workers: `worker_session_picker.py`, `worker_session_poller.py`, `worker_scheduler_scan.py`, `worker_schedule_auto_disable.py`
- Common utilities in `common.py`, `helpers.py`
**Changes Needed:**
**A. Create `src/exceptions.py`:**
```python
class SchedulerException(Exception):
"""Base exception for scheduler operations."""
def __init__(self, error_code: str, error_msg: str, details: dict = None):
self.error_code = error_code
self.error_msg = error_msg
self.details = details
super().__init__(self.error_msg)
class ScheduleNotFoundError(SchedulerException):
"""Schedule does not exist."""
class SessionLockError(SchedulerError):
"""Failed to acquire session lock."""
class DagsterError(SchedulerError):
"""Dagster pipeline execution failed."""
class CronParseError(SchedulerError):
"""Invalid cron expression."""
class ConfigurationError(SchedulerError):
"""Missing or invalid configuration."""
```
**B. Refactor `worker_scheduler_scan.py`:**
This is the most complex function (446 lines). Goals:
- Extract helper functions:
- `_normalize_cron_for_cronner` (already exists)
- `_as_date`, `_as_time` (already exist)
- `_within_active_window` (already exists)
- `_is_due_by_cron` (already exists)
- `_is_due_by_freq` (already exists)
- Extract the schedule creation logic into `_create_session_for_schedule(cur, schedule, now, slot_start)`
- Extract the candidate schedule selection into `_fetch_due_schedules(cur, now, slot_start, slot_end, limit=50)`
- Add detailed docstrings explaining the overall algorithm: "Scan for schedules that are due in the current time slot and create sessions atomically"
- Improve variable names (e.g., `s` → `schedule`, `cur` → `cursor`)
- Add comments explaining the advisory lock strategy and why it's needed
- Ensure proper exception handling with custom exceptions
- The function currently catches generic Exception at the end - wrap specific operations with appropriate custom exceptions
**C. Refactor `worker_session_picker.py`:**
- Similar breakdown: extract `_pick_and_claim_sessions(conn, limit=20)` helper
- Extract `_process_kind5_session(session, ctx)` and `_process_kind1_session(session, ctx)` into separate functions
- Add detailed docstring explaining the picking strategy (FOR UPDATE SKIP LOCKED)
- Replace bare `except Exception` with specific exception types
- Add comments explaining the kind handling logic (kind 5 vs kind 1)
- The function `_build_run_config_kind5` is specific to that kind - could be moved to a separate module if needed
**D. Refactor `worker_session_poller.py`:**
- Extract `_update_completed_session(cur, session_id, status_info, now)` helper
- Extract `_update_started_session(cur, session_id, started_dt)` helper
- Add docstring explaining polling strategy
- Replace generic exception handling with `DagsterError` when Dagster calls fail
- Add type hints for the row unpacking: `for sid, run_id, started, cron_description, created_by in rows:`
**E. Refactor `worker_schedule_auto_disable.py`:**
- This is simple enough already but still add comprehensive docstring
- Consider adding custom exception for database errors
**F. Improve `helpers.py` (in scheduler):**
- The `GraphQL` class and related functions are specific to Dagster - add docstrings
- `safe_notify` is good, add docstring
- Consider creating a `SchedulerHelper` class to group related utilities
**G. Improve `common.py`:**
- Already has good docstrings but could be expanded
- Add type hints to function signatures
- Break `launch_pipeline_execution` if too complex (handles multiple error cases)
---
## Common Patterns
### Exception Hierarchy
Each project will have:
```python
class BaseProjectException(Exception):
"""Base with error_code, http_status (if applicable), message, metadata."""
pass
# Specific exceptions inherit from base
class NotFoundError(BaseProjectException): ...
class ValidationError(BaseProjectException): ...
class ConflictError(BaseProjectException): ...
class DatabaseError(BaseProjectException): ...
# Domain-specific: StorageNotFoundError, QuotaConflictError, ScheduleNotFoundError, etc.
```
### Error Response Format
Standardized JSON response:
```json
{
"error_code": "STORAGE_NOT_FOUND",
"http_status": 404,
"error_msg": "Storage with id 'xyz' does not exist",
"x_user": "user123",
"details": { /* optional additional context */ }
}
```
### Middleware Pattern
In each HTTP endpoint function:
```python
def main():
try:
# Extract user header
x_user = request.headers.get("X-Fission-Params-UserId")
# Route to handler
return handler()
except ValidationError as e:
return error_response(e), 400
except NotFoundError as e:
return error_response(e), 404
except ConflictError as e:
return error_response(e), 409
except StorageException as e:
logger.error(f"Storage error: {e.error_code}: {e.error_msg}")
return error_response(e), 500
except Exception as e:
logger.exception("Unexpected error")
return {"error": "Internal server error"}, 500
```
---
## Implementation Order
1. **Phase 1**: Create exception modules for all three projects
2. **Phase 2**: Add/expand Pydantic models (storage, then complete quota)
3. **Phase 3**: Refactor endpoints to use exceptions and models
4. **Phase 4**: Refactor complex functions in scheduler
5. **Phase 5**: Documentation pass - ensure all functions have docstrings
6. **Phase 6**: Test manually by running functions (no automated tests to update)
---
## Verification Steps
1. **Manual Testing**:
- Deploy each function to local Fission or use test environment
- Test error cases: invalid input, missing resources, database failures
- Verify error response format matches specification
- Check logs for proper error logging
2. **Code Review**:
- All functions have docstrings with Args, Returns, Raises sections
- No function exceeds ~50 lines (extracted helpers where needed)
- All exceptions are specific, not generic `Exception`
- Request validation happens before business logic
3. **Import Verification**:
- Fix broken imports (especially in py-eom-quota's userquota_filter.py)
- Ensure circular dependencies are avoided
4. **Type Safety**:
- Run static type checker if available (mypy/pyright)
- Ensure all functions have return type hints
---
## Critical Files to Modify
**py-eom-storage:**
- `src/exceptions.py` (new)
- `src/models.py` (create/extend)
- `src/filter_or_insert.py` (refactor)
- `src/update_or_delete.py` (refactor)
- `src/helpers.py` (add utilities)
- `src/vault.py` (minor: improve docs)
**py-eom-quota:**
- `src/exceptions.py` (new)
- `src/models.py` (extend with request models)
- `src/userquota_filter.py` (fix imports, refactor)
- `src/userquota_insert_or_delete.py` (refactor, fix SQL if needed)
- `src/helpers.py` (add utilities)
**py-ailbl-scheduler:**
- `src/exceptions.py` (new)
- `src/worker_scheduler_scan.py` (major refactor)
- `src/worker_session_picker.py` (refactor)
- `src/worker_session_poller.py` (refactor)
- `src/worker_schedule_auto_disable.py` (docs)
- `src/common.py` (docs, type hints)
- `src/helpers.py` (docs, maybe extract class)
---
## Notes
- All changes are in `/workspaces/claude-marketplace/data/examples/`
- Preserve existing API contracts (URLs, HTTP methods)
- Do not change database schema
- Maintain backward compatibility with existing clients
- Focus on internal improvements: error handling, validation, documentation
- Use consistent patterns across all three projects