Renamer Engineering Guide
Version: 0.7.0-dev
Last Updated: 2026-01-01
Python: 3.11+
Status: Active Development
This is the comprehensive technical reference for the Renamer project. It contains all architectural information, implementation details, development workflows, and AI assistant instructions.
Table of Contents
- Project Overview
- Architecture
- Core Components
- Development Setup
- Testing Strategy
- Code Standards
- AI Assistant Instructions
- Release Process
Project Overview
Purpose
Renamer is a Terminal User Interface (TUI) application for managing media files, viewing their metadata, and renaming them. It is built with Python and the Textual framework.
Dual-Mode Operation:
- Technical Mode: Detailed technical metadata (video tracks, audio streams, codecs, bitrates)
- Catalog Mode: Media library catalog view with TMDB integration (posters, ratings, descriptions)
Current Version
- Version: 0.7.0-dev (in development)
- Python: 3.11+
- License: Not specified
- Repository: /home/sha/bin/renamer
Technology Stack
Core Dependencies
- textual (≥6.11.0): TUI framework
- pymediainfo (≥6.0.0): Media track analysis
- mutagen (≥1.47.0): Embedded metadata
- python-magic (≥0.4.27): MIME detection
- langcodes (≥3.5.1): Language code handling
- requests (≥2.31.0): HTTP for TMDB API
- rich-pixels (≥1.0.0): Terminal image display
- pytest (≥7.0.0): Testing framework
Dev Dependencies
- mypy (≥1.0.0): Type checking
System Requirements
- Python 3.11 or higher
- UV package manager (recommended)
- MediaInfo library (system dependency)
Architecture
Architectural Layers
┌─────────────────────────────────────────┐
│ TUI Layer (Textual) │
│ app.py, screens.py │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Service Layer │
│ FileTreeService, MetadataService, │
│ RenameService │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Extractor Layer │
│ MediaExtractor coordinates: │
│ - FilenameExtractor │
│ - MediaInfoExtractor │
│ - MetadataExtractor │
│ - FileInfoExtractor │
│ - TMDBExtractor │
│ - DefaultExtractor │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Formatter Layer │
│ FormatterApplier coordinates: │
│ - DataFormatters (size, duration) │
│ - TextFormatters (case, style) │
│ - MarkupFormatters (colors, bold) │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Utility & Cache Layer │
│ - PatternExtractor │
│ - LanguageCodeExtractor │
│ - FrameClassMatcher │
│ - Unified Cache Subsystem │
└─────────────────────────────────────────┘
Design Patterns
- Protocol-Based Architecture: DataExtractorProtocol defines extractor interface
- Coordinator Pattern: MediaExtractor coordinates multiple extractors with priority system
- Strategy Pattern: Cache key strategies for different data types
- Decorator Pattern: @cached_method() for method-level caching
- Service Layer: Business logic separated from UI
- Dependency Injection: Services receive extractors/formatters as dependencies
Core Components
1. Main Application (renamer/app.py)
Class: RenamerApp(App)
Responsibilities:
- TUI layout management (split view: file tree + details panel)
- Keyboard/mouse navigation
- Command palette integration (Ctrl+P)
- File operation coordination
- Efficient tree updates
Key Features:
- Two command providers: AppCommandProvider, CacheCommandProvider
- Dual-mode support (technical/catalog)
- Real-time metadata display
2. Service Layer (renamer/services/)
FileTreeService (file_tree_service.py)
- Directory scanning and validation
- Recursive tree building with filtering
- Media file detection (based on MEDIA_TYPES)
- Permission error handling
- Tree node searching by path
- Directory statistics
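The scanning behaviour listed above can be illustrated with a minimal sketch. It is not the FileTreeService implementation: the `MEDIA_TYPES` set below is a hypothetical subset (the real constant lives in media_constants.py and may have a different shape), and `find_media_files` is a name invented for this example.

```python
import tempfile
from pathlib import Path

# Hypothetical subset of extensions; the real MEDIA_TYPES constant may differ.
MEDIA_TYPES = {".mkv", ".mp4", ".avi"}


def find_media_files(root: Path) -> list[Path]:
    """Recursively collect media files, skipping unreadable directories."""
    found: list[Path] = []
    try:
        entries = sorted(root.iterdir())
    except PermissionError:
        # Unreadable directory: skip it instead of aborting the whole scan.
        return found
    for entry in entries:
        if entry.is_dir() and not entry.is_symlink():
            found.extend(find_media_files(entry))
        elif entry.suffix.lower() in MEDIA_TYPES:
            found.append(entry)
    return found


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.mkv").touch()
    (root / "sub" / "b.MP4").touch()
    (root / "notes.txt").touch()
    names = [p.name for p in find_media_files(root)]
```

Symlinked directories are skipped here to avoid cycles; whether the real service follows them is not stated in this guide.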
MetadataService (metadata_service.py)
- Thread pool management (ThreadPoolExecutor, configurable workers)
- Thread-safe operations with Lock
- Concurrent metadata extraction
- Active extraction tracking and cancellation
- Cache integration via decorators
- Synchronous and asynchronous modes
- Formatter coordination
- Error handling with callbacks
- Context manager support
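As a rough illustration of the thread-pool and lock combination described above, the sketch below runs a caller-supplied extraction function over several paths and collects results and errors in shared dictionaries. The names `extract_all` and `fake_extract` are hypothetical and the real MetadataService API is likely richer (cancellation, callbacks, cache integration).

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from threading import Lock
from typing import Callable


def extract_all(
    paths: list[Path],
    extract: Callable[[Path], dict],
    max_workers: int = 4,
) -> tuple[dict[Path, dict], dict[Path, Exception]]:
    """Run `extract` concurrently; return (results, errors) keyed by path."""
    results: dict[Path, dict] = {}
    errors: dict[Path, Exception] = {}
    lock = Lock()  # guards both dictionaries, which worker threads write to

    def worker(path: Path) -> None:
        try:
            value = extract(path)
        except (OSError, ValueError) as exc:
            with lock:
                errors[path] = exc
            return
        with lock:
            results[path] = value

    # The context manager waits for all submitted work before returning.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(worker, paths))
    return results, errors


def fake_extract(path: Path) -> dict:
    """Stand-in for a real extractor call (assumption for this example)."""
    if path.suffix != ".mkv":
        raise ValueError(f"unsupported file: {path.name}")
    return {"title": path.stem}


results, errors = extract_all([Path("a.mkv"), Path("b.txt")], fake_extract)
```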
RenameService (rename_service.py)
- Proposed name generation from metadata
- Filename validation and sanitization
- Invalid character removal (cross-platform)
- Reserved name checking (Windows compatibility)
- File conflict detection
- Atomic rename operations
- Dry-run mode
- Callback-based rename with success/error handlers
- Markup tag stripping
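The sanitization steps above (markup stripping, invalid character removal, reserved name checking) might look roughly like the following. This is a sketch under assumptions: the markup tag list, the underscore prefix for reserved names, and the function name `sanitize_filename` are all hypothetical, not taken from rename_service.py.

```python
import re

# Characters rejected by Windows, plus ASCII control characters.
INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')
# Hypothetical set of markup tags; real tags depend on the formatters in use.
MARKUP_TAG = re.compile(r"\[/?(?:bold|dim|italic|underline)\]")
# Device names reserved on Windows, compared case-insensitively.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def sanitize_filename(name: str) -> str:
    """Return a filename that should be acceptable on common platforms."""
    name = MARKUP_TAG.sub("", name)
    name = INVALID_CHARS.sub("", name)
    # Windows also rejects trailing dots and spaces.
    name = name.strip().rstrip(" .")
    stem = name.split(".", 1)[0].upper()
    if stem in RESERVED:
        name = f"_{name}"  # assumed escape strategy for reserved names
    return name


clean = sanitize_filename("[bold]What: If?[/bold] (2020).mkv")
reserved = sanitize_filename("con.mkv")
untouched = sanitize_filename("Movie (2020) [1080p BluRay].mkv")
```

Only known markup tags are stripped, so bracketed name parts such as `[1080p BluRay]` survive.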
3. Extractor System (renamer/extractors/)
Base Protocol (base.py)
from typing import Optional, Protocol

class DataExtractor(Protocol):
    """Defines the standard interface for all extractors."""

    def extract_title(self) -> Optional[str]: ...
    def extract_year(self) -> Optional[str]: ...
    # ... 21 methods total
MediaExtractor (extractor.py)
Coordinator class managing priority-based extraction:
Priority Order Examples:
- Title: TMDB → Metadata → Filename → Default
- Year: Filename → Default
- Technical info: MediaInfo → Default
- File info: FileInfo → Default
Usage:
extractor = MediaExtractor(Path("movie.mkv"))
title = extractor.get("title") # Tries sources in priority order
year = extractor.get("year", source="Filename") # Force specific source
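The priority-based lookup described above can be sketched in a few lines. This is an illustration of the coordinator idea only, not the MediaExtractor code: the protocol is reduced to a single method, and `FakeTMDB`, `FakeFilename`, `FakeDefault`, and `first_available` are hypothetical names.

```python
from typing import Optional, Protocol


class TitleSource(Protocol):
    """Hypothetical one-method protocol (the guide's protocol has 21 methods)."""

    def extract_title(self) -> Optional[str]: ...


class FakeTMDB:
    def extract_title(self) -> Optional[str]:
        return None  # pretend TMDB found no match


class FakeFilename:
    def extract_title(self) -> Optional[str]:
        return "Movie"


class FakeDefault:
    def extract_title(self) -> Optional[str]:
        return None


def first_available(sources: dict[str, TitleSource], order: list[str]) -> Optional[str]:
    """Return the first non-None title, trying sources in priority order."""
    for name in order:
        value = sources[name].extract_title()
        if value is not None:
            return value
    return None


sources = {"TMDB": FakeTMDB(), "Filename": FakeFilename(), "Default": FakeDefault()}
title = first_available(sources, ["TMDB", "Filename", "Default"])
```

Because the sources satisfy the protocol structurally, none of them needs to inherit from a common base class.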
Specialized Extractors
- FilenameExtractor (filename_extractor.py)
  - Parses metadata from filename patterns
  - Detects year, resolution, source, codecs, edition
  - Uses regex patterns and utility classes
  - Handles Cyrillic normalization
  - Extracts language codes with counts (e.g., "2xUKR_ENG")
- MediaInfoExtractor (mediainfo_extractor.py)
  - Uses PyMediaInfo library
  - Extracts detailed track information
  - Provides codec, bitrate, frame rate, resolution
  - Frame class matching with tolerances
- MetadataExtractor (metadata_extractor.py)
  - Uses Mutagen library for embedded tags
  - Extracts title, artist, duration
  - Falls back to MIME type detection
  - Handles multiple container formats
- FileInfoExtractor (fileinfo_extractor.py)
  - Basic file system information
  - Size, modification time, paths
  - Extension extraction
  - Fast, no external dependencies
- TMDBExtractor (tmdb_extractor.py)
  - The Movie Database API integration
  - Fetches title, year, ratings, overview, genres
  - Downloads and caches posters
  - Supports movies and TV shows
  - Rate limiting and error handling
- DefaultExtractor (default_extractor.py)
  - Fallback extractor providing default values
  - Returns None or empty collections
  - Safe final fallback in extractor chain
4. Formatter System (renamer/formatters/)
Base Classes (base.py)
- Formatter: Base ABC with abstract format() method
- DataFormatter: For data transformations (sizes, durations, dates)
- TextFormatter: For text transformations (case changes)
- MarkupFormatter: For visual styling (colors, bold, links)
- CompositeFormatter: For chaining multiple formatters
FormatterApplier (formatter.py)
Coordinator ensuring correct formatter order:
Order: Data → Text → Markup
Global Ordering:
- Data formatters (size, duration, date, track info)
- Text formatters (uppercase, lowercase, camelcase)
- Markup formatters (bold, colors, dim, underline)
Usage:
formatters = [SizeFormatter.format_size, TextFormatter.bold]
result = FormatterApplier.apply_formatters(1024, formatters)
# Result: bold("1.00 KB")
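To make the Data → Text → Markup ordering concrete, here is a minimal sketch that sorts formatters by category before applying them. The category registry, the three formatter functions, and the markup syntax are assumptions for illustration; FormatterApplier may determine categories differently.

```python
from typing import Any, Callable

# Hypothetical category ranks: data first, then text, then markup.
ORDER = {"data": 0, "text": 1, "markup": 2}


def format_size(num_bytes: int) -> str:
    return f"{num_bytes / 1024:.2f} KB"


def uppercase(text: str) -> str:
    return text.upper()


def bold(text: str) -> str:
    return f"[bold]{text}[/bold]"


# Hypothetical registry mapping each formatter to its category.
CATEGORY: dict[Callable[..., str], str] = {
    format_size: "data",
    uppercase: "text",
    bold: "markup",
}


def apply_formatters(value: Any, formatters: list[Callable[..., str]]) -> Any:
    """Apply formatters in category order, regardless of the order given."""
    # sorted() is stable, so formatters within one category keep their order.
    for fmt in sorted(formatters, key=lambda f: ORDER[CATEGORY[f]]):
        value = fmt(value)
    return value


# Deliberately passed out of order: markup, data, text.
result = apply_formatters(1024, [bold, format_size, uppercase])
```

Sorting first means a markup formatter never receives a raw number, even when callers list it before the data formatter.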
Specialized Formatters
- MediaFormatter: Main coordinator, mode-aware (technical/catalog)
- CatalogFormatter: TMDB data, ratings, genres, poster display
- TrackFormatter: Video/audio/subtitle track formatting with colors
- ProposedNameFormatter: Intelligent rename suggestions
- SizeFormatter: Human-readable file sizes
- DurationFormatter: Duration in HH:MM:SS
- DateFormatter: Timestamp formatting
- ResolutionFormatter: Resolution display
- ExtensionFormatter: File extension handling
- SpecialInfoFormatter: Edition/source formatting
- TextFormatter: Text styling utilities
5. Utility Modules (renamer/utils/)
PatternExtractor (pattern_utils.py)
Centralized regex pattern matching:
- Movie database ID extraction (TMDB, IMDB, Trakt, TVDB)
- Year extraction and validation
- Quality indicator detection
- Source indicator detection
- Bracketed content manipulation
- Position finding for year/quality/source
Example:
extractor = PatternExtractor()
db_info = extractor.extract_movie_db_ids("[tmdbid-12345]")
# Returns: {'type': 'tmdb', 'id': '12345'}
LanguageCodeExtractor (language_utils.py)
Language code processing:
- Extract from brackets: [UKR_ENG] → ['ukr', 'eng']
- Extract standalone codes from filename
- Handle count patterns: [2xUKR_ENG]
- Convert to ISO 639-3 codes
- Skip quality indicators and file extensions
- Format as language counts: "2ukr,eng"
Example:
extractor = LanguageCodeExtractor()
langs = extractor.extract_from_brackets("[2xUKR_ENG]")
# Returns: ['ukr', 'ukr', 'eng']
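The bracket and count handling can be sketched as follows. This simplified version only lowercases the codes; it does not convert them to ISO 639-3 or consult SKIP_WORDS, and both function names are hypothetical stand-ins for the LanguageCodeExtractor methods.

```python
import re
from collections import Counter

# One token: optional "<count>x" prefix followed by a 2-3 letter code.
_TOKEN = re.compile(r"(?:(\d+)x)?([A-Za-z]{2,3})")


def extract_from_brackets(text: str) -> list[str]:
    """Expand codes such as [2xUKR_ENG] into ['ukr', 'ukr', 'eng']."""
    codes: list[str] = []
    for group in re.findall(r"\[([^\]]+)\]", text):
        for token in group.split("_"):
            match = _TOKEN.fullmatch(token)
            if match is None:
                continue  # not a language token, e.g. "1080p"
            count = int(match.group(1) or 1)
            codes.extend([match.group(2).lower()] * count)
    return codes


def format_counts(codes: list[str]) -> str:
    """Collapse repeated codes into the compact '2ukr,eng' form."""
    counts = Counter(codes)  # keeps first-seen order
    return ",".join(f"{n}{code}" if n > 1 else code for code, n in counts.items())


langs = extract_from_brackets("Movie [2xUKR_ENG] [1080p].mkv")
summary = format_counts(langs)
```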
FrameClassMatcher (frame_utils.py)
Resolution/frame class matching:
- Multi-step matching algorithm
- Height and width tolerance
- Aspect ratio calculation
- Scan type detection (progressive/interlaced)
- Standard resolution checking
- Nominal height/typical widths lookup
Matching Strategy:
- Exact height + width match
- Height match with aspect ratio validation
- Closest height match
- Non-standard quality indicator detection
6. Constants (renamer/constants/)
Modular organization (8 files):
- media_constants.py: MEDIA_TYPES - Supported video formats
- source_constants.py: SOURCE_DICT - Video source types
- frame_constants.py: FRAME_CLASSES, NON_STANDARD_QUALITY_INDICATORS
- moviedb_constants.py: MOVIE_DB_DICT - Database identifiers
- edition_constants.py: SPECIAL_EDITIONS - Edition types
- lang_constants.py: SKIP_WORDS - Words to skip in language detection
- year_constants.py: is_valid_year(), dynamic year validation
- cyrillic_constants.py: CYRILLIC_TO_ENGLISH - Character mappings
Backward Compatibility: All constants exported via __init__.py
7. Cache Subsystem (renamer/cache/)
Unified, modular architecture:
renamer/cache/
├── __init__.py # Exports and convenience functions
├── core.py # Core Cache class (thread-safe with RLock)
├── types.py # CacheEntry, CacheStats TypedDicts
├── strategies.py # Cache key generation strategies
├── managers.py # CacheManager for operations
└── decorators.py # Enhanced cache decorators
Cache Key Strategies
- FilepathMethodStrategy: For extractor methods
- APIRequestStrategy: For API responses
- SimpleKeyStrategy: For simple prefix+id patterns
- CustomStrategy: User-defined key generation
Cache Decorators
@cached_method(ttl=3600)  # Method caching
def extract_title(self):
    ...

@cached_api(service="tmdb", ttl=21600)  # API caching
def fetch_movie_data(self, movie_id):
    ...
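For readers unfamiliar with TTL-based caching decorators, the following sketch shows the general technique. It is not the renamer/cache/decorators.py implementation: it caches in memory only, keys on the positional arguments instead of a key strategy, and the name `ttl_cached` is hypothetical.

```python
import functools
import threading
import time


def ttl_cached(ttl: float):
    """Cache results per argument tuple for `ttl` seconds (in memory only)."""

    def decorator(func):
        cache: dict[tuple, tuple[float, object]] = {}
        lock = threading.RLock()  # re-entrant, so cached calls may nest

        @functools.wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            with lock:
                hit = cache.get(args)
                if hit is not None and now - hit[0] < ttl:
                    return hit[1]  # fresh entry: skip the real call
                value = func(*args)
                cache[args] = (now, value)
                return value

        return wrapper

    return decorator


calls: list[str] = []


@ttl_cached(ttl=3600)
def slow_title(path: str) -> str:
    calls.append(path)  # record real invocations for the demonstration
    return path.upper()


first = slow_title("movie.mkv")
second = slow_title("movie.mkv")  # served from the cache
```

Holding the lock during the wrapped call keeps the sketch simple but serializes concurrent misses; a production cache would more likely lock only around the dictionary access.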
Cache Manager Operations
- clear_all(): Remove all cache entries
- clear_by_prefix(prefix): Clear specific cache type
- clear_expired(): Remove expired entries
- get_stats(): Comprehensive statistics
- clear_file_cache(file_path): Clear cache for specific file
- compact_cache(): Remove empty directories
Command Palette Integration
Access via Ctrl+P:
- Cache: View Statistics
- Cache: Clear All
- Cache: Clear Extractors / TMDB / Posters
- Cache: Clear Expired / Compact
Thread Safety
- All operations protected by threading.RLock
- Safe for concurrent extractor access
- Memory cache synchronized with file cache
8. UI Screens (renamer/screens.py)
- OpenScreen: Directory selection dialog with validation
- HelpScreen: Comprehensive help with key bindings
- RenameConfirmScreen: File rename confirmation with error handling
- SettingsScreen: Settings configuration interface
9. Settings System (renamer/settings.py)
Configuration: ~/.config/renamer/config.json
Options:
{
"mode": "technical", // or "catalog"
"cache_ttl_extractors": 21600, // 6 hours
"cache_ttl_tmdb": 21600, // 6 hours
"cache_ttl_posters": 2592000 // 30 days
}
Automatic save/load with defaults.
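Loading with defaults can be sketched as below. Note that standard JSON does not permit `//` comments, so the annotations in the options listing above are explanatory only and would not parse if written to the file. The function names and the merge strategy are assumptions, not the Settings class API.

```python
import json
import tempfile
from pathlib import Path

DEFAULTS = {
    "mode": "technical",
    "cache_ttl_extractors": 21600,  # 6 hours
    "cache_ttl_tmdb": 21600,        # 6 hours
    "cache_ttl_posters": 2592000,   # 30 days
}


def load_settings(path: Path) -> dict:
    """Read the config file, falling back to defaults for missing keys."""
    try:
        stored = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        stored = {}
    if not isinstance(stored, dict):
        stored = {}  # ignore a config file that is valid JSON but not an object
    return {**DEFAULTS, **stored}


def save_settings(path: Path, settings: dict) -> None:
    """Write settings as JSON, creating the config directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")


# Demonstration against a temporary directory instead of ~/.config/renamer.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "renamer" / "config.json"
    before = load_settings(config_path)            # file does not exist yet
    save_settings(config_path, {"mode": "catalog"})
    after = load_settings(config_path)             # stored keys override defaults
```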
Development Setup
Installation
# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and sync
cd /home/sha/bin/renamer
uv sync
# Install dev dependencies
uv sync --extra dev
# Run from source
uv run python renamer/main.py [directory]
Development Commands
# Run installed version
uv run renamer [directory]
# Run tests
uv run pytest
# Run tests with coverage
uv run pytest --cov=renamer
# Type checking
uv run mypy renamer/extractors/default_extractor.py
# Version management
uv run bump-version # Increment patch version
uv run release # Bump + sync + build
# Build distribution
uv build # Create wheel and tarball
# Install as global tool
uv tool install .
Debugging
# Enable formatter logging
FORMATTER_LOG=1 uv run renamer /path/to/directory
# Creates formatter.log with detailed call traces
Testing Strategy
Test Organization
renamer/test/
├── datasets/ # Test data
│ ├── filenames/
│ │ ├── filename_patterns.json # 46 test cases
│ │ └── sample_files/ # Legacy reference
│ ├── mediainfo/
│ │ └── frame_class_tests.json # 25 test cases
│ └── sample_mediafiles/ # Generated (in .gitignore)
├── conftest.py # Fixtures and dataset loaders
├── test_cache_subsystem.py # 18 cache tests
├── test_services.py # 30+ service tests
├── test_utils.py # 70+ utility tests
├── test_formatters.py # 40+ formatter tests
├── test_filename_detection.py # Comprehensive filename parsing
├── test_filename_extractor.py # 368 extractor tests
├── test_mediainfo_*.py # MediaInfo tests
├── test_fileinfo_extractor.py # File info tests
└── test_metadata_extractor.py # Metadata tests
Test Statistics
- Total Tests: 560 (1 skipped)
- Service Layer: 30+ tests
- Utilities: 70+ tests
- Formatters: 40+ tests
- Extractors: 400+ tests
- Cache: 18 tests
Sample File Generation
# Generate 46 test files from filename_patterns.json
uv run python renamer/test/fill_sample_mediafiles.py
Test Fixtures
# Load test datasets
patterns = load_filename_patterns()
frame_tests = load_frame_class_tests()
dataset = load_dataset("custom_name")
file_path = get_test_file_path("movie.mkv")
Running Tests
# All tests
uv run pytest
# Specific test file
uv run pytest renamer/test/test_services.py
# Verbose output, stop on first failure, show print output (-x -v -s)
uv run pytest -xvs
# With coverage
uv run pytest --cov=renamer --cov-report=html
Code Standards
Python Standards
- Version: Python 3.11+
- Style: PEP 8 guidelines
- Type Hints: Encouraged for all public APIs
- Docstrings: Google-style format
- Pathlib: For all file operations
- Exception Handling: Specific exceptions (no bare except:)
Docstring Format
def example_function(param1: int, param2: str) -> bool:
    """Brief description of function.

    Longer description if needed, explaining behavior,
    edge cases, or important details.

    Args:
        param1: Description of param1
        param2: Description of param2

    Returns:
        Description of return value

    Raises:
        ValueError: When param1 is negative

    Example:
        >>> example_function(5, "test")
        True
    """
    pass
Type Hints
from typing import Optional

# Function type hints
def extract_title(self) -> Optional[str]:
    ...

# Union types (Python 3.10+)
def extract_movie_db(self) -> list[str] | None:
    ...

# Generic types
def extract_tracks(self) -> list[dict]:
    ...
Logging Strategy
Levels:
- Debug: Language code conversions, metadata reads, MIME detection
- Warning: Network failures, API errors, MediaInfo parse failures
- Error: Formatter application failures
Usage:
import logging
logger = logging.getLogger(__name__)
logger.debug(f"Converted {lang_code} to {iso3_code}")
logger.warning(f"TMDB API request failed: {e}")
logger.error(f"Error applying {formatter.__name__}: {e}")
Error Handling
Guidelines:
- Catch specific exceptions: (LookupError, ValueError, AttributeError)
- Log all caught exceptions with context
- Network errors: (requests.RequestException, ValueError)
- Always close file handles (use context managers)
Example:
try:
    lang_obj = langcodes.Language.get(lang_code.lower())
    return lang_obj.to_alpha3()
except (LookupError, ValueError, AttributeError) as e:
    logger.debug(f"Invalid language code '{lang_code}': {e}")
    return None
Architecture Patterns
- Extractor Pattern: Each extractor focuses on one data source
- Formatter Pattern: Formatters handle display logic, extractors handle data
- Separation of Concerns: Data extraction → formatting → display
- Dependency Injection: Extractors and formatters are modular
- Configuration Management: Settings class for all config
Best Practices
- Simplicity: Avoid over-engineering, keep solutions simple
- Minimal Changes: Only modify what's explicitly requested
- Validation: Only at system boundaries (user input, external APIs)
- Trust Internal Code: Don't add unnecessary error handling
- Delete Unused Code: No backwards-compatibility hacks
- No Premature Abstraction: Three similar lines > premature abstraction
AI Assistant Instructions
Core Principles
- Read Before Modify: Always read files before suggesting modifications
- Follow Existing Patterns: Understand established architecture before changes
- Test Everything: Run uv run pytest after all changes
- Simplicity First: Avoid over-engineering solutions
- Document Changes: Update relevant documentation
When Adding Features
- Read existing code and understand architecture
- Check REFACTORING_PROGRESS.md for pending tasks
- Implement features incrementally
- Test with real media files
- Ensure backward compatibility
- Update documentation
- Update tests as needed
- Run uv run release before committing
When Debugging
- Enable formatter logging: FORMATTER_LOG=1
- Check cache state (clear if stale data suspected)
- Verify file permissions
- Test with sample filenames first
- Check logs in formatter.log
When Refactoring
- Maintain backward compatibility unless explicitly breaking
- Update tests to reflect refactored code
- Check all formatters (formatting is centralized)
- Verify extractor chain (ensure data flow intact)
- Run full test suite
Common Pitfalls to Avoid
- ❌ Don't create new files unless absolutely necessary
- ❌ Don't add features beyond what's requested
- ❌ Don't skip testing with real files
- ❌ Don't forget to update version number for releases
- ❌ Don't commit secrets or API keys
- ❌ Don't use deprecated Textual APIs
- ❌ Don't use bare except: clauses
- ❌ Don't use command-line tools when specialized tools exist
Tool Usage
- Read files: Use Read tool, not cat
- Edit files: Use Edit tool, not sed
- Write files: Use Write tool, not echo >>
- Search files: Use Glob tool, not find
- Search content: Use Grep tool, not grep
- Run commands: Use Bash tool for terminal operations only
Git Workflow
Commit Standards:
- Clear, descriptive messages
- Focus on "why" not "what"
- One logical change per commit
Commit Message Format:
type: Brief description (imperative mood)
Longer explanation if needed.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Safety Protocol:
- ❌ NEVER update git config
- ❌ NEVER run destructive commands without explicit request
- ❌ NEVER skip hooks (--no-verify, --no-gpg-sign)
- ❌ NEVER force push to main/master
- ❌ Avoid git commit --amend unless conditions met
Creating Pull Requests
- Run git status, git diff, git log to understand changes
- Analyze ALL commits that will be included
- Draft comprehensive PR summary
- Create PR using:
gh pr create --title "Title" --body "$(cat <<'EOF'
## Summary
- Bullet points of changes

## Test plan
- Testing checklist

🤖 Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"
Release Process
Version Management
Version Scheme: SemVer (MAJOR.MINOR.PATCH)
Commands:
# Bump patch version (0.6.0 → 0.6.1)
uv run bump-version
# Full release process
uv run release # Bump + sync + build
Release Checklist
- All tests passing: uv run pytest
- Type checking passes: uv run mypy renamer/
- Documentation updated (CHANGELOG.md, README.md)
- Version bumped in pyproject.toml
- Dependencies synced: uv sync
- Build successful: uv build
- Install test: uv tool install .
- Manual testing with real media files
Build Artifacts
dist/
├── renamer-0.7.0-py3-none-any.whl # Wheel distribution
└── renamer-0.7.0.tar.gz # Source distribution
API Integration
TMDB API
Configuration:
- API key stored in renamer/secrets.py
- Base URL: https://api.themoviedb.org/3/
- Image base URL for poster downloads
Endpoints Used:
- Search: /search/movie
- Movie details: /movie/{id}
Rate Limiting: Handled gracefully with error fallback
Caching:
- API responses cached for 6 hours
- Posters cached for 30 days
- Cache location: ~/.cache/renamer/tmdb/, ~/.cache/renamer/posters/
File Operations
Directory Scanning
- Recursive search for supported video formats
- File tree representation with hierarchical structure
- Efficient tree updates on file operations
- Permission error handling
File Renaming
Process:
- Select file in tree
- Press r to initiate rename
- Review proposed name (current vs proposed)
- Confirm with y or cancel with n
- Tree updates in-place without full reload
Proposed Name Format:
Title (Year) [Resolution Source Edition].ext
Sanitization:
- Invalid characters removed (cross-platform)
- Reserved names checked (Windows compatibility)
- Markup tags stripped
- Length validation
Metadata Caching
- First extraction cached for 6 hours
- TMDB data cached for 6 hours
- Posters cached for 30 days
- Force refresh with f command
- Cache invalidated on file rename
Keyboard Commands
| Key | Action |
|---|---|
| q | Quit application |
| o | Open directory |
| s | Scan/rescan directory |
| f | Refresh metadata for selected file |
| r | Rename file with proposed name |
| p | Toggle tree expansion |
| m | Toggle mode (technical/catalog) |
| h | Show help screen |
| Ctrl+S | Open settings |
| Ctrl+P | Open command palette |
Known Issues & Limitations
Current Limitations
- TMDB API requires internet connection
- Poster display requires terminal with image support
- Some special characters in filenames need sanitization
- Large directories may have initial scan delay
Performance Notes
- In-memory cache reduces repeated extraction overhead
- File cache persists across sessions
- Tree updates optimized for rename operations
- TMDB requests throttled to respect API limits
- Large directory scans use async/await patterns
Security Considerations
- Input sanitization for filenames (see ProposedNameFormatter)
- No shell command injection risks
- Safe file operations (pathlib, proper error handling)
- TMDB API key should not be committed (stored in secrets.py)
- Cache directory permissions should be user-only
Project History
Evolution
- Started as simple file renamer
- Added metadata extraction (MediaInfo, Mutagen)
- Expanded to TUI with Textual framework
- Added filename parsing intelligence
- Integrated TMDB for catalog mode
- Added settings and caching system
- Implemented poster display with rich-pixels
- Added dual-mode interface (technical/catalog)
- Phase 1-3 refactoring (2025-12-31 to 2026-01-01)
Version Milestones
- 0.2.x: Initial TUI with basic metadata
- 0.3.x: Enhanced extractors and formatters
- 0.4.x: Added TMDB integration
- 0.5.x: Settings, caching, catalog mode, poster display
- 0.6.0: Cache subsystem, service layer, protocols
- 0.7.0-dev: Complete refactoring (in progress)
Resources
External Documentation
- Textual Documentation
- PyMediaInfo Documentation
- Mutagen Documentation
- TMDB API Documentation
- UV Documentation
- Python Type Hints
- Mypy Documentation
Internal Documentation
- README.md: User guide and quick start
- INSTALL.md: Installation methods
- DEVELOP.md: Developer setup and debugging
- CHANGELOG.md: Version history and changes
- REFACTORING_PROGRESS.md: Future refactoring plans
- ToDo.md: Current task list
Last Updated: 2026-01-01
Maintainer: sha
For: AI Assistants and Developers
Repository: /home/sha/bin/renamer