refactor: comprehensive code review fixes across 31 files

Addresses all critical, high, medium, and low issues from full codebase
review. 494 tests pass, ruff clean, ty type-check clean.

Security:
- Add tool_error_handler context manager (exceptions.py) — standardised
  error handling, eliminates 11 bare except-reraise patterns
- Remove unused exception subclasses (ConfigurationError, UnraidAPIError,
  SubscriptionError, ValidationError, IdempotentOperationError)
- Harden GraphQL subscription query validator with allow-list and
  forbidden-keyword regex (diagnostics.py)
- Add input validation for rclone create_remote config_data: injection,
  path-traversal, and key-count limits (rclone.py)
- Validate notifications importance enum before GraphQL request (notifications.py)
- Sanitise HTTP/network/JSON error messages — no raw exception strings
  leaked to clients (client.py)
- Strip path/creds from displayed API URL via _safe_display_url (health.py)
- Enable Ruff S (bandit) rule category in pyproject.toml
- Harden container mutations to strict-only matching — no fuzzy/substring
  for destructive operations (docker.py)
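
A minimal sketch of what a tool_error_handler-style context manager might look like, to illustrate how one wrapper can replace repeated except-reraise blocks while keeping raw exception text out of client-facing messages. The signature, the log format, and the stand-in ToolError class are assumptions for illustration, not the code in exceptions.py:

```python
import contextlib
import logging
from collections.abc import Iterator

logger = logging.getLogger(__name__)


class ToolError(Exception):
    """Stand-in for the project's ToolError (assumed, for self-containment)."""


@contextlib.contextmanager
def tool_error_handler(tool: str, action: str) -> Iterator[None]:
    """Pass ToolError through unchanged; wrap anything else in a sanitised ToolError."""
    try:
        yield
    except ToolError:
        # Already a client-safe error: re-raise as-is.
        raise
    except Exception as exc:
        # Full detail goes to the server log only.
        logger.error("Error in %s action=%s: %s", tool, action, exc)
        # The client sees a generic message; the cause is chained for debugging.
        raise ToolError(f"Failed to execute {tool}/{action}") from exc
```

A handler body would then be wrapped as `with tool_error_handler("docker", "list"): ...` instead of carrying its own try/except.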

Performance:
- Token-bucket rate limiter (90 tokens, 9 req/s) with 429 retry backoff (client.py)
- Lazy asyncio.Lock init via _get_client_lock() — fixes event-loop
  module-load crash (client.py)
- Double-checked locking in get_http_client() for fast-path (client.py)
- Short hex container ID fast-path skips list fetch (docker.py)
- Cap resource_data log content to 1 MB / 5,000 lines (manager.py)
- Reset reconnect counter after 30 s stable connection (manager.py)
- Move tail_lines validation to module level; enforce 10,000 line cap
  (storage.py, docker.py)
- Remove force_terminal=True from logging RichHandler (logging.py)
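
One reading of "90 tokens, 9 req/s" is a bucket that holds up to 90 tokens and refills at 9 tokens per second. The sketch below assumes that reading; the class and method names are hypothetical, and the 429 retry backoff is not shown:

```python
import asyncio
import time


class TokenBucket:
    """Async token bucket: bursts up to `capacity`, sustained `refill_rate` per second."""

    def __init__(self, capacity: float = 90, refill_rate: float = 9.0) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # start with a full bucket
        self._last = time.monotonic()

    def _refill(self) -> None:
        # Credit tokens for the time elapsed since the last refill, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self._last) * self.refill_rate)
        self._last = now

    async def acquire(self) -> None:
        # Take one token, sleeping until enough time has passed if the bucket is empty.
        while True:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return
            await asyncio.sleep((1 - self.tokens) / self.refill_rate)
```

A client would call `await bucket.acquire()` before each request, so short bursts pass immediately and sustained traffic settles at the refill rate.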

Architecture:
- Register diagnostic tools in server startup (server.py)
- Move ALL_ACTIONS computation to module level in all tools
- Consolidate format_kb / format_bytes into shared core/utils.py
- Add _safe_get() helper in core/utils.py for nested dict traversal
- Extract _analyze_subscription_status() from health.py diagnose handler
- Validate required config at startup — fail fast with CRITICAL log (server.py)
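
For the nested dict traversal helper, a sketch of one plausible shape. The signature and the treatment of None values are assumptions, not the actual _safe_get() in core/utils.py:

```python
from typing import Any


def safe_get(data: Any, *keys: str, default: Any = None) -> Any:
    """Walk nested dicts by key; return `default` if any step is missing or None."""
    current = data
    for key in keys:
        if not isinstance(current, dict):
            # Hit a non-dict (including None) before the path was exhausted.
            return default
        current = current.get(key)
    return current if current is not None else default
```

This avoids chains like `data.get("a", {}).get("b")`, which fail when an intermediate value is explicitly None, a common case in GraphQL responses.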

Code quality:
- Remove ~90 lines of dead Rich formatting helpers from logging.py
- Remove dead self.websocket attribute from SubscriptionManager
- Remove dead setup_uvicorn_logging() wrapper
- Move _VALID_IMPORTANCE to module level (N806 fix)
- Add slots=True to all three dataclasses (SubscriptionData, SystemHealth, APIResponse)
- Fix None rendering as literal "None" string in info.py summaries
- Change fuzzy-match log messages from INFO to DEBUG (docker.py)
- UTC-aware datetimes throughout (manager.py, diagnostics.py)
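
A short illustration of why naive and UTC-aware datetimes should not be mixed. It uses timezone.utc so it also runs on Python older than 3.11; datetime.UTC is an alias for the same object on 3.11 and later:

```python
from datetime import datetime, timezone

naive = datetime.now()              # local wall-clock time, no tzinfo attached
aware = datetime.now(timezone.utc)  # carries an explicit UTC offset


def mixing_fails() -> bool:
    """Return True if subtracting a naive datetime from an aware one raises TypeError."""
    try:
        aware - naive
    except TypeError:
        return True
    return False
```

The aware value also serialises with its offset (its isoformat() output ends in "+00:00"), so consumers of the timestamp do not have to guess the zone.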

Infrastructure:
- Upgrade base image python:3.11-slim → python:3.12-slim (Dockerfile)
- Add non-root appuser (UID/GID 1000) with HEALTHCHECK (Dockerfile)
- Add read_only, cap_drop: ALL, tmpfs /tmp to docker-compose.yml
- Single-source version via importlib.metadata (pyproject.toml → __init__.py)
- Add open_timeout to all websockets.connect() calls
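
Single-sourcing the version through importlib.metadata generally looks like the sketch below. The helper name and the fallback string are assumptions, and the distribution name is left as a parameter because it is not stated here:

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name: str) -> str:
    """Read the installed distribution's version, as declared in pyproject.toml."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Running from a source tree that was never installed.
        return "0.0.0+unknown"
```

In __init__.py this would typically be assigned once, for example `__version__ = get_version("<distribution-name>")`.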

Tests:
- Update error message matchers to match sanitised messages (test_client.py)
- Fix patch targets for UNRAID_API_URL → utils module (test_subscriptions.py)
- Fix importance="info" → importance="normal" (test_notifications.py, http_layer)
- Fix naive datetime fixtures → UTC-aware (test_subscriptions.py)

Co-authored-by: Claude <claude@anthropic.com>
Jacob Magar
2026-02-18 01:02:13 -05:00
parent 5b6a728f45
commit 316193c04b
32 changed files with 995 additions and 622 deletions


@@ -6,8 +6,10 @@ development and debugging purposes.
"""
 import asyncio
+import contextlib
 import json
-from datetime import datetime
+import re
+from datetime import UTC, datetime
 from typing import Any

 import websockets
@@ -19,7 +21,58 @@ from ..config.settings import UNRAID_API_KEY, UNRAID_API_URL
 from ..core.exceptions import ToolError
 from .manager import subscription_manager
 from .resources import ensure_subscriptions_started
-from .utils import build_ws_ssl_context
+from .utils import build_ws_ssl_context, build_ws_url
+
+_ALLOWED_SUBSCRIPTION_NAMES = frozenset(
+    {
+        "logFileSubscription",
+        "containerStatsSubscription",
+        "cpuSubscription",
+        "memorySubscription",
+        "arraySubscription",
+        "networkSubscription",
+        "dockerSubscription",
+        "vmSubscription",
+    }
+)
+
+# Pattern: must start with "subscription", contain only a known subscription name,
+# and not contain mutation/query keywords or semicolons (prevents injection).
+_SUBSCRIPTION_NAME_PATTERN = re.compile(r"^\s*subscription\b[^{]*\{\s*(\w+)", re.IGNORECASE)
+_FORBIDDEN_KEYWORDS = re.compile(r"\b(mutation|query)\b", re.IGNORECASE)
+
+
+def _validate_subscription_query(query: str) -> str:
+    """Validate that a subscription query is safe to execute.
+
+    Only allows subscription operations targeting whitelisted subscription names.
+    Rejects any query containing mutation/query keywords.
+
+    Returns:
+        The extracted subscription name.
+
+    Raises:
+        ToolError: If the query fails validation.
+    """
+    if _FORBIDDEN_KEYWORDS.search(query):
+        raise ToolError("Query rejected: must be a subscription, not a mutation or query.")
+
+    match = _SUBSCRIPTION_NAME_PATTERN.match(query)
+    if not match:
+        raise ToolError(
+            "Query rejected: must start with 'subscription' and contain a valid "
+            "subscription operation. Example: subscription { logFileSubscription { ... } }"
+        )
+
+    sub_name = match.group(1)
+    if sub_name not in _ALLOWED_SUBSCRIPTION_NAMES:
+        raise ToolError(
+            f"Subscription '{sub_name}' is not allowed. "
+            f"Allowed subscriptions: {sorted(_ALLOWED_SUBSCRIPTION_NAMES)}"
+        )
+
+    return sub_name


 def register_diagnostic_tools(mcp: FastMCP) -> None:
@@ -34,6 +87,10 @@ def register_diagnostic_tools(mcp: FastMCP) -> None:
"""Test a GraphQL subscription query directly to debug schema issues.
         Use this to find working subscription field names and structure.
+        Only whitelisted subscriptions are allowed (logFileSubscription,
+        containerStatsSubscription, cpuSubscription, memorySubscription,
+        arraySubscription, networkSubscription, dockerSubscription,
+        vmSubscription).

         Args:
             subscription_query: The GraphQL subscription query to test
@@ -41,16 +98,16 @@ def register_diagnostic_tools(mcp: FastMCP) -> None:
         Returns:
             Dict containing test results and response data
         """
-        try:
-            logger.info(f"[TEST_SUBSCRIPTION] Testing query: {subscription_query}")
+        # Validate before any network I/O
+        sub_name = _validate_subscription_query(subscription_query)

-            # Build WebSocket URL
-            if not UNRAID_API_URL:
-                raise ToolError("UNRAID_API_URL is not configured")
-            ws_url = (
-                UNRAID_API_URL.replace("https://", "wss://").replace("http://", "ws://")
-                + "/graphql"
-            )
+        try:
+            logger.info(f"[TEST_SUBSCRIPTION] Testing validated subscription '{sub_name}'")
+
+            try:
+                ws_url = build_ws_url()
+            except ValueError as e:
+                raise ToolError(str(e)) from e

             ssl_context = build_ws_ssl_context(ws_url)
@@ -59,6 +116,7 @@ def register_diagnostic_tools(mcp: FastMCP) -> None:
                 ws_url,
                 subprotocols=[Subprotocol("graphql-transport-ws"), Subprotocol("graphql-ws")],
                 ssl=ssl_context,
+                open_timeout=10,
                 ping_interval=30,
                 ping_timeout=10,
             ) as websocket:
@@ -122,14 +180,14 @@ def register_diagnostic_tools(mcp: FastMCP) -> None:
             logger.info("[DIAGNOSTIC] Running subscription diagnostics...")

             # Get comprehensive status
-            status = subscription_manager.get_subscription_status()
+            status = await subscription_manager.get_subscription_status()

             # Initialize connection issues list with proper type
             connection_issues: list[dict[str, Any]] = []

             # Add environment info with explicit typing
             diagnostic_info: dict[str, Any] = {
-                "timestamp": datetime.now().isoformat(),
+                "timestamp": datetime.now(UTC).isoformat(),
                 "environment": {
                     "auto_start_enabled": subscription_manager.auto_start_enabled,
                     "max_reconnect_attempts": subscription_manager.max_reconnect_attempts,
@@ -152,17 +210,9 @@ def register_diagnostic_tools(mcp: FastMCP) -> None:
                 },
             }

-            # Calculate WebSocket URL
-            if UNRAID_API_URL:
-                if UNRAID_API_URL.startswith("https://"):
-                    ws_url = "wss://" + UNRAID_API_URL[len("https://") :]
-                elif UNRAID_API_URL.startswith("http://"):
-                    ws_url = "ws://" + UNRAID_API_URL[len("http://") :]
-                else:
-                    ws_url = UNRAID_API_URL
-                if not ws_url.endswith("/graphql"):
-                    ws_url = ws_url.rstrip("/") + "/graphql"
-                diagnostic_info["environment"]["websocket_url"] = ws_url
+            # Calculate WebSocket URL (stays None if UNRAID_API_URL not configured)
+            with contextlib.suppress(ValueError):
+                diagnostic_info["environment"]["websocket_url"] = build_ws_url()

             # Analyze issues
             for sub_name, sub_status in status.items():