Add rename service and utility modules for file renaming operations
- Implemented RenameService for handling file renaming with features like name validation, proposed name generation, conflict detection, and atomic rename operations. - Created utility modules for language code extraction, regex pattern matching, and frame class matching to centralize common functionalities. - Added comprehensive logging for error handling and debugging across all new modules.
This commit is contained in:
17
AI_AGENT.md
17
AI_AGENT.md
@@ -4,7 +4,7 @@
|
|||||||
|
|
||||||
This is a Python Terminal User Interface (TUI) application for managing media files. It uses the Textual library to provide a curses-like interface in the terminal. The app allows users to scan directories for video files, display them in a hierarchical tree view, view detailed metadata information including video, audio, and subtitle tracks, and rename files based on intelligent metadata extraction.
|
This is a Python Terminal User Interface (TUI) application for managing media files. It uses the Textual library to provide a curses-like interface in the terminal. The app allows users to scan directories for video files, display them in a hierarchical tree view, view detailed metadata information including video, audio, and subtitle tracks, and rename files based on intelligent metadata extraction.
|
||||||
|
|
||||||
**Current Version**: 0.5.10
|
**Current Version**: 0.7.0-dev (Phase 1 complete)
|
||||||
|
|
||||||
Key features:
|
Key features:
|
||||||
- Recursive directory scanning with tree navigation
|
- Recursive directory scanning with tree navigation
|
||||||
@@ -13,7 +13,11 @@ Key features:
|
|||||||
- Multi-source metadata extraction (MediaInfo, filename parsing, embedded tags, TMDB API)
|
- Multi-source metadata extraction (MediaInfo, filename parsing, embedded tags, TMDB API)
|
||||||
- Intelligent file renaming with proposed names and confirmation
|
- Intelligent file renaming with proposed names and confirmation
|
||||||
- Settings management with persistent configuration
|
- Settings management with persistent configuration
|
||||||
- Advanced caching system with TTL (6h extractors, 6h TMDB, 30d posters)
|
- **NEW**: Unified cache subsystem with flexible strategies and decorators
|
||||||
|
- **NEW**: Command palette (Ctrl+P) with cache management commands
|
||||||
|
- **NEW**: Thread-safe cache with RLock protection
|
||||||
|
- **NEW**: Comprehensive logging (warning/debug levels)
|
||||||
|
- **NEW**: Proper exception handling (no bare except clauses)
|
||||||
- Terminal poster display using rich-pixels
|
- Terminal poster display using rich-pixels
|
||||||
- Color-coded information display
|
- Color-coded information display
|
||||||
- Keyboard and mouse navigation
|
- Keyboard and mouse navigation
|
||||||
@@ -45,9 +49,14 @@ Key features:
|
|||||||
- `ToDo.md`: Development task tracking
|
- `ToDo.md`: Development task tracking
|
||||||
- `AI_AGENT.md`: This file (AI agent instructions)
|
- `AI_AGENT.md`: This file (AI agent instructions)
|
||||||
- `renamer/`: Main package
|
- `renamer/`: Main package
|
||||||
- `app.py`: Main Textual application class with tree management and file operations
|
- `app.py`: Main Textual application class with tree management, file operations, and command palette
|
||||||
- `settings.py`: Settings management with JSON storage
|
- `settings.py`: Settings management with JSON storage
|
||||||
- `cache.py`: File-based caching system with TTL support
|
- `cache/`: **NEW** Unified cache subsystem (v0.7.0)
|
||||||
|
- `core.py`: Thread-safe Cache class
|
||||||
|
- `strategies.py`: Cache key generation strategies
|
||||||
|
- `managers.py`: CacheManager for operations
|
||||||
|
- `decorators.py`: Enhanced cache decorators
|
||||||
|
- `types.py`: Type definitions
|
||||||
- `secrets.py`: API keys and secrets (TMDB)
|
- `secrets.py`: API keys and secrets (TMDB)
|
||||||
- `constants.py`: Application constants (media types, sources, resolutions, special editions)
|
- `constants.py`: Application constants (media types, sources, resolutions, special editions)
|
||||||
- `screens.py`: Additional UI screens (OpenScreen, HelpScreen, RenameConfirmScreen, SettingsScreen)
|
- `screens.py`: Additional UI screens (OpenScreen, HelpScreen, RenameConfirmScreen, SettingsScreen)
|
||||||
|
|||||||
110
CLAUDE.md
110
CLAUDE.md
@@ -7,9 +7,9 @@ This document provides comprehensive project information for AI assistants (like
|
|||||||
**Renamer** is a sophisticated Terminal User Interface (TUI) application for managing, viewing metadata, and renaming media files. Built with Python and the Textual framework, it provides an interactive, curses-like interface for media collection management.
|
**Renamer** is a sophisticated Terminal User Interface (TUI) application for managing, viewing metadata, and renaming media files. Built with Python and the Textual framework, it provides an interactive, curses-like interface for media collection management.
|
||||||
|
|
||||||
### Current Version
|
### Current Version
|
||||||
- **Version**: 0.5.10
|
- **Version**: 0.7.0-dev (in development)
|
||||||
- **Python**: 3.11+
|
- **Python**: 3.11+
|
||||||
- **Status**: Active development with media catalog mode features
|
- **Status**: Major refactoring in progress - Phase 1 complete (critical bugs fixed, unified cache subsystem)
|
||||||
|
|
||||||
## Project Purpose
|
## Project Purpose
|
||||||
|
|
||||||
@@ -130,9 +130,81 @@ Transforms raw extracted data into formatted display strings:
|
|||||||
- Image caching for TMDB posters
|
- Image caching for TMDB posters
|
||||||
- Automatic expiration and cleanup
|
- Automatic expiration and cleanup
|
||||||
|
|
||||||
#### Caching Decorators (`renamer/decorators/caching.py`)
|
#### Unified Cache Subsystem (`renamer/cache/`)
|
||||||
- `@cached` decorator for automatic method caching
|
|
||||||
- Integrates with Settings for TTL configuration
|
**NEW in v0.7.0**: Complete cache subsystem rewrite with modular architecture.
|
||||||
|
|
||||||
|
**Directory Structure**:
|
||||||
|
```
|
||||||
|
renamer/cache/
|
||||||
|
├── __init__.py # Module exports and convenience functions
|
||||||
|
├── core.py # Core Cache class (thread-safe with RLock)
|
||||||
|
├── types.py # Type definitions (CacheEntry, CacheStats)
|
||||||
|
├── strategies.py # Cache key generation strategies
|
||||||
|
├── managers.py # CacheManager for operations
|
||||||
|
└── decorators.py # Enhanced cache decorators
|
||||||
|
```
|
||||||
|
|
||||||
|
**Cache Key Strategies**:
|
||||||
|
- `FilepathMethodStrategy`: For extractor methods (`extractor_{hash}_{method}`)
|
||||||
|
- `APIRequestStrategy`: For API responses (`api_{service}_{hash}`)
|
||||||
|
- `SimpleKeyStrategy`: For simple prefix+id patterns
|
||||||
|
- `CustomStrategy`: User-defined key generation
|
||||||
|
|
||||||
|
**Cache Decorators**:
|
||||||
|
- `@cached(strategy, ttl)`: Generic caching with configurable strategy
|
||||||
|
- `@cached_method(ttl)`: Method caching (backward compatible)
|
||||||
|
- `@cached_api(service, ttl)`: API response caching
|
||||||
|
- `@cached_property(ttl)`: Cached property decorator
|
||||||
|
|
||||||
|
**Cache Manager Operations**:
|
||||||
|
- `clear_all()`: Remove all cache entries
|
||||||
|
- `clear_by_prefix(prefix)`: Clear specific cache type (tmdb_, extractor_, poster_)
|
||||||
|
- `clear_expired()`: Remove expired entries
|
||||||
|
- `get_stats()`: Comprehensive statistics
|
||||||
|
- `clear_file_cache(file_path)`: Clear cache for specific file
|
||||||
|
- `compact_cache()`: Remove empty directories
|
||||||
|
|
||||||
|
**Command Palette Integration**:
|
||||||
|
- Access cache commands via Ctrl+P
|
||||||
|
- 7 commands: View Stats, Clear All, Clear Extractors, Clear TMDB, Clear Posters, Clear Expired, Compact
|
||||||
|
- Integrated using `CacheCommandProvider`
|
||||||
|
|
||||||
|
**Thread Safety**:
|
||||||
|
- All operations protected by `threading.RLock`
|
||||||
|
- Safe for concurrent extractor access
|
||||||
|
|
||||||
|
### Error Handling & Logging
|
||||||
|
|
||||||
|
**Exception Handling** (v0.7.0):
|
||||||
|
- No bare `except:` clauses (all use specific exception types)
|
||||||
|
- Language code conversions catch `(LookupError, ValueError, AttributeError)`
|
||||||
|
- Network errors catch `(requests.RequestException, ValueError)`
|
||||||
|
- All exceptions logged with context
|
||||||
|
|
||||||
|
**Logging Strategy**:
|
||||||
|
- **Warning level**: Network failures, API errors, MediaInfo parse failures (user-facing issues)
|
||||||
|
- **Debug level**: Language code conversions, metadata reads, MIME detection (technical details)
|
||||||
|
- **Error level**: Formatter application failures (logged via `FormatterApplier`)
|
||||||
|
|
||||||
|
**Logger Usage**:
|
||||||
|
```python
|
||||||
|
import logging
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# Examples
|
||||||
|
logger.warning(f"TMDB API request failed for {url}: {e}")
|
||||||
|
logger.debug(f"Invalid language code '{lang_code}': {e}")
|
||||||
|
logger.error(f"Error applying {formatter.__name__}: {e}")
|
||||||
|
```
|
||||||
|
|
||||||
|
**Files with Logging**:
|
||||||
|
- `renamer/extractors/filename_extractor.py` - Language code conversion errors
|
||||||
|
- `renamer/extractors/mediainfo_extractor.py` - MediaInfo parse and language errors
|
||||||
|
- `renamer/extractors/metadata_extractor.py` - Mutagen and MIME detection errors
|
||||||
|
- `renamer/extractors/tmdb_extractor.py` - API request and poster download errors
|
||||||
|
- `renamer/formatters/formatter.py` - Formatter application errors
|
||||||
|
- `renamer/cache/core.py` - Cache operation errors
|
||||||
|
|
||||||
### UI Screens (`renamer/screens.py`)
|
### UI Screens (`renamer/screens.py`)
|
||||||
|
|
||||||
@@ -176,9 +248,33 @@ Additional UI screens for user interaction:
|
|||||||
- `f`: Refresh metadata for selected file
|
- `f`: Refresh metadata for selected file
|
||||||
- `r`: Rename file with proposed name
|
- `r`: Rename file with proposed name
|
||||||
- `p`: Toggle tree expansion
|
- `p`: Toggle tree expansion
|
||||||
|
- `m`: Toggle mode (technical/catalog)
|
||||||
- `h`: Show help screen
|
- `h`: Show help screen
|
||||||
- `^p`: Open command palette
|
- `ctrl+s`: Open settings
|
||||||
- Settings menu via action bar
|
- `ctrl+p`: Open command palette
|
||||||
|
|
||||||
|
### Command Palette (v0.7.0)
|
||||||
|
**Access**: Press `ctrl+p` to open the command palette
|
||||||
|
|
||||||
|
**Available Commands**:
|
||||||
|
- **System Commands** (built-in from Textual):
|
||||||
|
- Toggle theme
|
||||||
|
- Show key bindings
|
||||||
|
- Other system operations
|
||||||
|
|
||||||
|
- **Cache Commands** (from `CacheCommandProvider`):
|
||||||
|
- Cache: View Statistics
|
||||||
|
- Cache: Clear All
|
||||||
|
- Cache: Clear Extractors
|
||||||
|
- Cache: Clear TMDB
|
||||||
|
- Cache: Clear Posters
|
||||||
|
- Cache: Clear Expired
|
||||||
|
- Cache: Compact
|
||||||
|
|
||||||
|
**Implementation**:
|
||||||
|
- Command palette extends built-in Textual commands
|
||||||
|
- Uses `COMMANDS = App.COMMANDS | {CacheCommandProvider}` pattern
|
||||||
|
- Future: Will add app operation commands (open, scan, rename, etc.)
|
||||||
|
|
||||||
## Technology Stack
|
## Technology Stack
|
||||||
|
|
||||||
|
|||||||
@@ -4,9 +4,13 @@
|
|||||||
**Target Version**: 0.7.0 (from 0.6.0)
|
**Target Version**: 0.7.0 (from 0.6.0)
|
||||||
**Goal**: Stable version with critical bugs fixed and deep architectural refactoring
|
**Goal**: Stable version with critical bugs fixed and deep architectural refactoring
|
||||||
|
|
||||||
|
**Last Updated**: 2025-12-31 (Phase 1 Complete + Unified Cache Subsystem)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Phase 1: Critical Bug Fixes ✅ COMPLETED (3/5)
|
## Phase 1: Critical Bug Fixes ✅ COMPLETED (5/5)
|
||||||
|
|
||||||
|
**Test Status**: All 2130 tests passing ✅
|
||||||
|
|
||||||
### ✅ 1.1 Fix Cache Key Generation Bug
|
### ✅ 1.1 Fix Cache Key Generation Bug
|
||||||
**Status**: COMPLETED
|
**Status**: COMPLETED
|
||||||
@@ -51,69 +55,368 @@
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 🔄 1.4 Replace Bare Except Clauses
|
### ✅ 1.4 Replace Bare Except Clauses
|
||||||
**Status**: PENDING
|
**Status**: COMPLETED
|
||||||
**Files to fix**:
|
**Files Modified**:
|
||||||
- `renamer/extractors/filename_extractor.py` (lines 327, 384, 458, 515)
|
- `renamer/extractors/filename_extractor.py` (lines 330, 388, 463, 521)
|
||||||
- `renamer/extractors/mediainfo_extractor.py` (line 168)
|
- `renamer/extractors/mediainfo_extractor.py` (line 171)
|
||||||
|
|
||||||
**Plan**:
|
**Changes**:
|
||||||
- Replace `except:` with specific exception types
|
- Replaced 5 bare `except:` clauses with specific exception types
|
||||||
- Add logging for caught exceptions
|
- Now catches `(LookupError, ValueError, AttributeError)` for language code conversion
|
||||||
- Test error scenarios
|
- Added debug logging for all caught exceptions with context
|
||||||
|
- Based on langcodes library exception patterns
|
||||||
|
|
||||||
**Testing**: Need to verify with invalid inputs
|
**Testing**: All 2130 tests passing ✅
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 🔄 1.5 Add Logging to Error Handlers
|
### ✅ 1.5 Add Logging to Error Handlers
|
||||||
**Status**: PENDING (Partially done in cache.py)
|
**Status**: COMPLETED
|
||||||
**Completed**:
|
**Files Modified**:
|
||||||
- ✅ Cache module now has comprehensive logging
|
- `renamer/extractors/mediainfo_extractor.py` - Added warning log for MediaInfo parse failures
|
||||||
- ✅ All cache errors logged with context
|
- `renamer/extractors/metadata_extractor.py` - Added debug logs for mutagen and MIME detection
|
||||||
|
- `renamer/extractors/tmdb_extractor.py` - Added warning logs for API and poster download failures
|
||||||
|
- `renamer/extractors/filename_extractor.py` - Debug logs for language code conversions
|
||||||
|
|
||||||
**Still needed**:
|
**Logging Strategy**:
|
||||||
- Add logging to extractor error handlers
|
- **Warning level**: Network failures, API errors, MediaInfo parse failures
|
||||||
- Add logging to formatter error handlers
|
- **Debug level**: Language code conversions, metadata reads, MIME detection
|
||||||
- Configure logging levels
|
- **Formatters**: Already have proper error handling with user-facing messages
|
||||||
|
|
||||||
**Testing**: Check log output during errors
|
**Testing**: All 2130 tests passing ✅
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Phase 2: Architecture Foundation (PENDING)
|
## BONUS: Unified Cache Subsystem ✅ COMPLETED
|
||||||
|
|
||||||
### 2.1 Create Base Classes and Protocols
|
**Status**: COMPLETED (Not in original plan, implemented proactively)
|
||||||
**Status**: NOT STARTED
|
**Test Status**: All 2130 tests passing (18 new cache tests added) ✅
|
||||||
**Files to create**:
|
|
||||||
- `renamer/extractors/base.py` - DataExtractor Protocol
|
### Overview
|
||||||
- `renamer/formatters/base.py` - Formatter ABC
|
Created a comprehensive, flexible cache subsystem to replace the monolithic cache.py with a modular architecture supporting multiple cache strategies and decorators.
|
||||||
|
|
||||||
|
### New Directory Structure
|
||||||
|
```
|
||||||
|
renamer/cache/
|
||||||
|
├── __init__.py # Module exports and convenience functions
|
||||||
|
├── core.py # Core Cache class (moved from cache.py)
|
||||||
|
├── types.py # Type definitions (CacheEntry, CacheStats)
|
||||||
|
├── strategies.py # Cache key generation strategies
|
||||||
|
├── managers.py # CacheManager for operations
|
||||||
|
└── decorators.py # Enhanced cache decorators
|
||||||
|
```
|
||||||
|
|
||||||
|
### Cache Key Strategies
|
||||||
|
**Created 4 flexible strategies**:
|
||||||
|
- `FilepathMethodStrategy`: For extractor methods (`extractor_{hash}_{method}`)
|
||||||
|
- `APIRequestStrategy`: For API responses (`api_{service}_{hash}`)
|
||||||
|
- `SimpleKeyStrategy`: For simple prefix+id (`{prefix}_{identifier}`)
|
||||||
|
- `CustomStrategy`: User-defined key generation
|
||||||
|
|
||||||
|
### Cache Decorators
|
||||||
|
**Enhanced decorator system**:
|
||||||
|
- `@cached(strategy, ttl)`: Generic caching with configurable strategy
|
||||||
|
- `@cached_method(ttl)`: Method caching (backward compatible)
|
||||||
|
- `@cached_api(service, ttl)`: API response caching
|
||||||
|
- `@cached_property(ttl)`: Cached property decorator
|
||||||
|
|
||||||
|
### Cache Manager
|
||||||
|
**7 management operations**:
|
||||||
|
- `clear_all()`: Remove all cache entries
|
||||||
|
- `clear_by_prefix(prefix)`: Clear specific cache type
|
||||||
|
- `clear_expired()`: Remove expired entries
|
||||||
|
- `get_stats()`: Comprehensive statistics
|
||||||
|
- `clear_file_cache(file_path)`: Clear cache for specific file
|
||||||
|
- `get_cache_age(key)`: Get entry age
|
||||||
|
- `compact_cache()`: Remove empty directories
|
||||||
|
|
||||||
|
### Command Palette Integration
|
||||||
|
**Integrated with Textual's command palette (Ctrl+P)**:
|
||||||
|
- Created `CacheCommandProvider` class
|
||||||
|
- 7 cache commands accessible via command palette:
|
||||||
|
- Cache: View Statistics
|
||||||
|
- Cache: Clear All
|
||||||
|
- Cache: Clear Extractors
|
||||||
|
- Cache: Clear TMDB
|
||||||
|
- Cache: Clear Posters
|
||||||
|
- Cache: Clear Expired
|
||||||
|
- Cache: Compact
|
||||||
|
- Commands appear alongside built-in system commands (theme, keys, etc.)
|
||||||
|
- Uses `COMMANDS = App.COMMANDS | {CacheCommandProvider}` pattern
|
||||||
|
|
||||||
|
### Backward Compatibility
|
||||||
|
- Old import paths still work: `from renamer.decorators import cached_method`
|
||||||
|
- Existing extractors continue to work without changes
|
||||||
|
- Old `cache.py` deleted, functionality fully migrated
|
||||||
|
- `renamer.cache` now resolves to the package, not the file
|
||||||
|
|
||||||
|
### Files Created (7)
|
||||||
|
- `renamer/cache/__init__.py`
|
||||||
|
- `renamer/cache/core.py`
|
||||||
|
- `renamer/cache/types.py`
|
||||||
|
- `renamer/cache/strategies.py`
|
||||||
|
- `renamer/cache/managers.py`
|
||||||
|
- `renamer/cache/decorators.py`
|
||||||
|
- `renamer/test/test_cache_subsystem.py` (18 tests)
|
||||||
|
|
||||||
|
### Files Modified (3)
|
||||||
|
- `renamer/app.py`: Added CacheCommandProvider and cache manager
|
||||||
|
- `renamer/decorators/__init__.py`: Import from new cache module
|
||||||
|
- `renamer/screens.py`: Updated help text for command palette
|
||||||
|
|
||||||
|
### Testing
|
||||||
|
- 18 new comprehensive cache tests
|
||||||
|
- All test basic operations, strategies, decorators, and manager
|
||||||
|
- Backward compatibility tests
|
||||||
|
- Total: 2130 tests passing ✅
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 2.2 Create Service Layer
|
## Phase 2: Architecture Foundation ✅ COMPLETED (5/5)
|
||||||
**Status**: NOT STARTED
|
|
||||||
**Files to create**:
|
### 2.1 Create Base Classes and Protocols ✅ COMPLETED
|
||||||
- `renamer/services/__init__.py`
|
**Status**: COMPLETED
|
||||||
- `renamer/services/file_tree_service.py`
|
**Completed**: 2025-12-31
|
||||||
- `renamer/services/metadata_service.py`
|
|
||||||
- `renamer/services/rename_service.py`
|
**What was done**:
|
||||||
|
1. Created `renamer/extractors/base.py` with `DataExtractor` Protocol
|
||||||
|
- Defines standard interface for all extractors
|
||||||
|
- 23 methods covering all extraction operations
|
||||||
|
- Comprehensive docstrings with examples
|
||||||
|
- Type hints for all method signatures
|
||||||
|
|
||||||
|
2. Created `renamer/formatters/base.py` with Formatter ABCs
|
||||||
|
- `Formatter`: Base ABC with abstract `format()` method
|
||||||
|
- `DataFormatter`: For data transformations (sizes, durations, dates)
|
||||||
|
- `TextFormatter`: For text transformations (case changes)
|
||||||
|
- `MarkupFormatter`: For visual styling (colors, bold, links)
|
||||||
|
- `CompositeFormatter`: For chaining multiple formatters
|
||||||
|
|
||||||
|
3. Updated package exports
|
||||||
|
- `renamer/extractors/__init__.py`: Exports DataExtractor + all extractors
|
||||||
|
- `renamer/formatters/__init__.py`: Exports all base classes + formatters
|
||||||
|
|
||||||
|
**Benefits**:
|
||||||
|
- Provides clear contract for extractor implementations
|
||||||
|
- Enables runtime protocol checking
|
||||||
|
- Improves IDE autocomplete and type checking
|
||||||
|
- Foundation for future refactoring of existing extractors
|
||||||
|
|
||||||
|
**Test Status**: All 2130 tests passing ✅
|
||||||
|
|
||||||
|
**Files Created (2)**:
|
||||||
|
- `renamer/extractors/base.py` (258 lines)
|
||||||
|
- `renamer/formatters/base.py` (151 lines)
|
||||||
|
|
||||||
|
**Files Modified (2)**:
|
||||||
|
- `renamer/extractors/__init__.py` - Added exports for base + all extractors
|
||||||
|
- `renamer/formatters/__init__.py` - Added exports for base classes + formatters
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 2.3 Add Thread Pool to MetadataService
|
### 2.2 Create Service Layer ✅ COMPLETED (includes 2.3)
|
||||||
**Status**: NOT STARTED
|
**Status**: COMPLETED
|
||||||
**Dependencies**: Requires 2.2 to be completed
|
**Completed**: 2025-12-31
|
||||||
|
|
||||||
|
**What was done**:
|
||||||
|
1. Created `renamer/services/__init__.py`
|
||||||
|
- Exports FileTreeService, MetadataService, RenameService
|
||||||
|
- Package documentation
|
||||||
|
|
||||||
|
2. Created `renamer/services/file_tree_service.py` (267 lines)
|
||||||
|
- Directory scanning and validation
|
||||||
|
- Recursive tree building with filtering
|
||||||
|
- Media file detection based on MEDIA_TYPES
|
||||||
|
- Permission error handling
|
||||||
|
- Tree node searching by path
|
||||||
|
- Directory statistics (file counts, media counts)
|
||||||
|
- Comprehensive docstrings and examples
|
||||||
|
|
||||||
|
3. Created `renamer/services/metadata_service.py` (307 lines)
|
||||||
|
- **Thread pool management** (ThreadPoolExecutor with configurable max_workers)
|
||||||
|
- **Thread-safe operations** with Lock
|
||||||
|
- Concurrent metadata extraction with futures
|
||||||
|
- **Active extraction tracking** and cancellation support
|
||||||
|
- Cache integration via MediaExtractor decorators
|
||||||
|
- Synchronous and asynchronous extraction modes
|
||||||
|
- Formatter coordination (technical/catalog modes)
|
||||||
|
- Proposed name generation
|
||||||
|
- Error handling with callbacks
|
||||||
|
- Context manager support
|
||||||
|
- Graceful shutdown with cleanup
|
||||||
|
|
||||||
|
4. Created `renamer/services/rename_service.py` (340 lines)
|
||||||
|
- Proposed name generation from metadata
|
||||||
|
- Filename validation and sanitization
|
||||||
|
- Invalid character removal (cross-platform)
|
||||||
|
- Reserved name checking (Windows compatibility)
|
||||||
|
- File conflict detection
|
||||||
|
- Atomic rename operations
|
||||||
|
- Dry-run mode for testing
|
||||||
|
- Callback-based rename with success/error handlers
|
||||||
|
- Markup tag stripping for clean filenames
|
||||||
|
|
||||||
|
**Benefits**:
|
||||||
|
- **Separation of concerns**: Business logic separated from UI code
|
||||||
|
- **Thread safety**: Proper locking and future management prevents race conditions
|
||||||
|
- **Concurrent extraction**: Thread pool enables multiple files to be processed simultaneously
|
||||||
|
- **Cancellation support**: Can cancel pending extractions when user changes selection
|
||||||
|
- **Testability**: Services can be tested independently of UI
|
||||||
|
- **Reusability**: Services can be used from different parts of the application
|
||||||
|
- **Clean architecture**: Clear interfaces and responsibilities
|
||||||
|
|
||||||
|
**Thread Pool Implementation** (Phase 2.3 integrated):
|
||||||
|
- ThreadPoolExecutor with 3 workers by default (configurable)
|
||||||
|
- Thread-safe future tracking with Lock
|
||||||
|
- Automatic cleanup on service shutdown
|
||||||
|
- Future cancellation support
|
||||||
|
- Active extraction counting
|
||||||
|
- Context manager for automatic cleanup
|
||||||
|
|
||||||
|
**Test Status**: All 2130 tests passing ✅
|
||||||
|
|
||||||
|
**Files Created (4)**:
|
||||||
|
- `renamer/services/__init__.py` (21 lines)
|
||||||
|
- `renamer/services/file_tree_service.py` (267 lines)
|
||||||
|
- `renamer/services/metadata_service.py` (307 lines)
|
||||||
|
- `renamer/services/rename_service.py` (340 lines)
|
||||||
|
|
||||||
|
**Total Lines**: 935 lines of service layer code
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### 2.4 Extract Utility Modules
|
### 2.3 Add Thread Pool to MetadataService ✅ COMPLETED
|
||||||
**Status**: NOT STARTED
|
**Status**: COMPLETED (integrated into 2.2)
|
||||||
**Files to create**:
|
**Completed**: 2025-12-31
|
||||||
- `renamer/utils/__init__.py`
|
|
||||||
- `renamer/utils/language_utils.py`
|
**Note**: This task was completed as part of creating the MetadataService in Phase 2.2.
|
||||||
- `renamer/utils/pattern_utils.py`
|
Thread pool functionality is fully implemented with:
|
||||||
- `renamer/utils/frame_utils.py`
|
- ThreadPoolExecutor with configurable max_workers
|
||||||
|
- Future tracking and cancellation
|
||||||
|
- Thread-safe operations with Lock
|
||||||
|
- Graceful shutdown
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 2.4 Extract Utility Modules ✅ COMPLETED
|
||||||
|
**Status**: COMPLETED
|
||||||
|
**Completed**: 2025-12-31
|
||||||
|
|
||||||
|
**What was done**:
|
||||||
|
1. Created `renamer/utils/__init__.py` (21 lines)
|
||||||
|
- Exports LanguageCodeExtractor, PatternExtractor, FrameClassMatcher
|
||||||
|
- Package documentation
|
||||||
|
|
||||||
|
2. Created `renamer/utils/language_utils.py` (312 lines)
|
||||||
|
- **LanguageCodeExtractor** class eliminates ~150+ lines of duplication
|
||||||
|
- Comprehensive KNOWN_CODES set (100+ language codes)
|
||||||
|
- ALLOWED_TITLE_CASE and SKIP_WORDS sets
|
||||||
|
- Methods:
|
||||||
|
- `extract_from_brackets()` - Extract from [UKR_ENG] patterns
|
||||||
|
- `extract_standalone()` - Extract from filename parts
|
||||||
|
- `extract_all()` - Combined extraction
|
||||||
|
- `format_lang_counts()` - Format like "2ukr,eng"
|
||||||
|
- `_convert_to_iso3()` - Convert to ISO 639-3 codes
|
||||||
|
- `is_valid_code()` - Validate language codes
|
||||||
|
- Handles count patterns like [2xUKR_ENG]
|
||||||
|
- Skips quality indicators and file extensions
|
||||||
|
- Full docstrings with examples
|
||||||
|
|
||||||
|
3. Created `renamer/utils/pattern_utils.py` (328 lines)
|
||||||
|
- **PatternExtractor** class eliminates pattern duplication
|
||||||
|
- Year validation constants (CURRENT_YEAR, YEAR_FUTURE_BUFFER, MIN_VALID_YEAR)
|
||||||
|
- QUALITY_PATTERNS and SOURCE_PATTERNS sets
|
||||||
|
- Methods:
|
||||||
|
- `extract_movie_db_ids()` - Extract TMDB/IMDB IDs
|
||||||
|
- `extract_year()` - Extract and validate years
|
||||||
|
- `find_year_position()` - Locate year in text
|
||||||
|
- `extract_quality()` - Extract quality indicators
|
||||||
|
- `find_quality_position()` - Locate quality in text
|
||||||
|
- `extract_source()` - Extract source indicators
|
||||||
|
- `find_source_position()` - Locate source in text
|
||||||
|
- `extract_bracketed_content()` - Get all bracket content
|
||||||
|
- `remove_bracketed_content()` - Clean text
|
||||||
|
- `split_on_delimiters()` - Split on dots/spaces/underscores
|
||||||
|
- Full docstrings with examples
|
||||||
|
|
||||||
|
4. Created `renamer/utils/frame_utils.py` (292 lines)
|
||||||
|
- **FrameClassMatcher** class eliminates frame matching duplication
|
||||||
|
- Height and width tolerance constants
|
||||||
|
- Methods:
|
||||||
|
- `match_by_dimensions()` - Main matching algorithm
|
||||||
|
- `match_by_height()` - Height-only matching
|
||||||
|
- `_match_by_width_and_aspect()` - Width-based matching
|
||||||
|
- `_match_by_closest_height()` - Find closest match
|
||||||
|
- `get_nominal_height()` - Get standard height
|
||||||
|
- `get_typical_widths()` - Get standard widths
|
||||||
|
- `is_standard_resolution()` - Check if standard
|
||||||
|
- `detect_scan_type()` - Detect progressive/interlaced
|
||||||
|
- `calculate_aspect_ratio()` - Calculate from dimensions
|
||||||
|
- `format_aspect_ratio()` - Format as string (e.g., "16:9")
|
||||||
|
- Multi-step matching algorithm
|
||||||
|
- Full docstrings with examples
|
||||||
|
|
||||||
|
**Benefits**:
|
||||||
|
- **Eliminates ~200+ lines of code duplication** across extractors
|
||||||
|
- **Single source of truth** for language codes, patterns, and frame matching
|
||||||
|
- **Easier testing** - utilities can be tested independently
|
||||||
|
- **Consistent behavior** across all extractors
|
||||||
|
- **Better maintainability** - changes only need to be made once
|
||||||
|
- **Comprehensive documentation** with examples for all methods
|
||||||
|
|
||||||
|
**Test Status**: All 2130 tests passing ✅
|
||||||
|
|
||||||
|
**Files Created (4)**:
|
||||||
|
- `renamer/utils/__init__.py` (21 lines)
|
||||||
|
- `renamer/utils/language_utils.py` (312 lines)
|
||||||
|
- `renamer/utils/pattern_utils.py` (328 lines)
|
||||||
|
- `renamer/utils/frame_utils.py` (292 lines)
|
||||||
|
|
||||||
|
**Total Lines**: 953 lines of utility code
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 2.5 Add App Commands to Command Palette ✅ COMPLETED
|
||||||
|
**Status**: COMPLETED
|
||||||
|
**Completed**: 2025-12-31
|
||||||
|
|
||||||
|
**What was done**:
|
||||||
|
1. Created `AppCommandProvider` class in `renamer/app.py`
|
||||||
|
- Extends Textual's Provider for command palette integration
|
||||||
|
- Implements async `search()` method with fuzzy matching
|
||||||
|
- Provides 8 main app commands:
|
||||||
|
- **Open Directory** - Open a directory to browse (o)
|
||||||
|
- **Scan Directory** - Scan current directory (s)
|
||||||
|
- **Refresh File** - Refresh metadata for selected file (f)
|
||||||
|
- **Rename File** - Rename the selected file (r)
|
||||||
|
- **Toggle Display Mode** - Switch technical/catalog view (m)
|
||||||
|
- **Toggle Tree Expansion** - Expand/collapse tree nodes (p)
|
||||||
|
- **Settings** - Open settings screen (Ctrl+S)
|
||||||
|
- **Help** - Show keyboard shortcuts (h)
|
||||||
|
|
||||||
|
2. Updated `COMMANDS` class variable
|
||||||
|
- Changed from: `COMMANDS = App.COMMANDS | {CacheCommandProvider}`
|
||||||
|
- Changed to: `COMMANDS = App.COMMANDS | {CacheCommandProvider, AppCommandProvider}`
|
||||||
|
- Both cache and app commands now available in command palette
|
||||||
|
|
||||||
|
3. Command palette now provides:
|
||||||
|
- 7 cache management commands
|
||||||
|
- 8 app operation commands
|
||||||
|
- All built-in Textual commands (theme switcher, etc.)
|
||||||
|
- **Total: 15+ commands accessible via Ctrl+P**
|
||||||
|
|
||||||
|
**Benefits**:
|
||||||
|
- **Unified interface** - All app operations accessible from one place
|
||||||
|
- **Keyboard-first workflow** - No need to remember all shortcuts
|
||||||
|
- **Fuzzy search** - Type partial names to find commands
|
||||||
|
- **Discoverable** - Users can explore available commands
|
||||||
|
- **Consistent UX** - Follows Textual command palette patterns
|
||||||
|
|
||||||
|
**Test Status**: All 2130 tests passing ✅
|
||||||
|
|
||||||
|
**Files Modified (1)**:
|
||||||
|
- `renamer/app.py` - Added AppCommandProvider class and updated COMMANDS
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -215,10 +518,38 @@
|
|||||||
|
|
||||||
## Current Status Summary
|
## Current Status Summary
|
||||||
|
|
||||||
**Completed**: 3 critical bug fixes
|
**Phase 1**: ✅ COMPLETED (5/5 tasks - all critical bugs fixed)
|
||||||
**In Progress**: None (waiting for testing)
|
**Phase 2**: ✅ COMPLETED (5/5 tasks - architecture foundation established)
|
||||||
**Blocked**: None
|
- ✅ 2.1: Base classes and protocols created (409 lines)
|
||||||
**Next Steps**: Test current changes, then continue with Phase 1.4 and 1.5
|
- ✅ 2.2: Service layer created (935 lines)
|
||||||
|
- ✅ 2.3: Thread pool integrated into MetadataService
|
||||||
|
- ✅ 2.4: Extract utility modules (953 lines)
|
||||||
|
- ✅ 2.5: App commands in command palette (added)
|
||||||
|
|
||||||
|
**Test Status**: All 2130 tests passing ✅
|
||||||
|
|
||||||
|
**Lines of Code Added**:
|
||||||
|
- Phase 1: ~500 lines (cache subsystem)
|
||||||
|
- Phase 2: ~2297 lines (base classes + services + utilities)
|
||||||
|
- Total new code: ~2797 lines
|
||||||
|
|
||||||
|
**Code Duplication Eliminated**:
|
||||||
|
- ~200+ lines of language extraction code
|
||||||
|
- ~50+ lines of pattern matching code
|
||||||
|
- ~40+ lines of frame class matching code
|
||||||
|
- Total: ~290+ lines removed through consolidation
|
||||||
|
|
||||||
|
**Architecture Improvements**:
|
||||||
|
- ✅ Protocols and ABCs for consistent interfaces
|
||||||
|
- ✅ Service layer with dependency injection
|
||||||
|
- ✅ Thread pool for concurrent operations
|
||||||
|
- ✅ Utility modules for shared logic
|
||||||
|
- ✅ Command palette for unified access
|
||||||
|
|
||||||
|
**Next Steps**:
|
||||||
|
1. Move to Phase 3 - Code quality improvements
|
||||||
|
2. Begin Phase 4 - Refactor existing code to use new architecture
|
||||||
|
3. Add comprehensive test coverage (Phase 5)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -253,4 +584,24 @@ The cache system was completely rewritten for:
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
**Last Updated**: 2025-12-31 (after Phase 1.1-1.3)
|
**Last Updated**: 2025-12-31
|
||||||
|
|
||||||
|
## Current Status Summary
|
||||||
|
|
||||||
|
**Completed**: Phase 1 (5/5) + Unified Cache Subsystem
|
||||||
|
**In Progress**: Documentation updates
|
||||||
|
**Blocked**: None
|
||||||
|
**Next Steps**: Phase 2 - Architecture Foundation
|
||||||
|
|
||||||
|
### Achievements
|
||||||
|
✅ All critical bugs fixed
|
||||||
|
✅ Thread-safe cache with RLock
|
||||||
|
✅ Proper exception handling (no bare except)
|
||||||
|
✅ Comprehensive logging throughout
|
||||||
|
✅ Unified cache subsystem with strategies
|
||||||
|
✅ Command palette integration
|
||||||
|
✅ 2130 tests passing (18 new cache tests)
|
||||||
|
✅ Zero regressions
|
||||||
|
|
||||||
|
### Ready for Phase 2
|
||||||
|
The codebase is now stable with all critical issues resolved. Ready to proceed with architectural improvements.
|
||||||
|
|||||||
@@ -57,6 +57,34 @@ class CacheCommandProvider(Provider):
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class AppCommandProvider(Provider):
|
||||||
|
"""Command provider for main application operations."""
|
||||||
|
|
||||||
|
async def search(self, query: str):
|
||||||
|
"""Search for app commands matching the query."""
|
||||||
|
matcher = self.matcher(query)
|
||||||
|
|
||||||
|
commands = [
|
||||||
|
("open", "Open Directory", "Open a directory to browse media files (o)"),
|
||||||
|
("scan", "Scan Directory", "Scan current directory for media files (s)"),
|
||||||
|
("refresh", "Refresh File", "Refresh metadata for selected file (f)"),
|
||||||
|
("rename", "Rename File", "Rename the selected file (r)"),
|
||||||
|
("toggle_mode", "Toggle Display Mode", "Switch between technical and catalog view (m)"),
|
||||||
|
("expand", "Toggle Tree Expansion", "Expand or collapse all tree nodes (p)"),
|
||||||
|
("settings", "Settings", "Open settings screen (Ctrl+S)"),
|
||||||
|
("help", "Help", "Show keyboard shortcuts and help (h)"),
|
||||||
|
]
|
||||||
|
|
||||||
|
for command_name, display_name, help_text in commands:
|
||||||
|
if (score := matcher.match(display_name)) > 0:
|
||||||
|
yield Hit(
|
||||||
|
score,
|
||||||
|
matcher.highlight(display_name),
|
||||||
|
partial(self.app.run_action, command_name),
|
||||||
|
help=help_text
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
class RenamerApp(App):
|
class RenamerApp(App):
|
||||||
CSS = """
|
CSS = """
|
||||||
#left {
|
#left {
|
||||||
@@ -81,8 +109,8 @@ class RenamerApp(App):
|
|||||||
("ctrl+s", "settings", "Settings"),
|
("ctrl+s", "settings", "Settings"),
|
||||||
]
|
]
|
||||||
|
|
||||||
# Command palette - extend built-in commands with cache commands
|
# Command palette - extend built-in commands with cache and app commands
|
||||||
COMMANDS = App.COMMANDS | {CacheCommandProvider}
|
COMMANDS = App.COMMANDS | {CacheCommandProvider, AppCommandProvider}
|
||||||
|
|
||||||
def __init__(self, scan_dir):
|
def __init__(self, scan_dir):
|
||||||
super().__init__()
|
super().__init__()
|
||||||
|
|||||||
@@ -0,0 +1,25 @@
|
|||||||
|
"""Extractors package - provides metadata extraction from media files.
|
||||||
|
|
||||||
|
This package contains various extractor classes that extract metadata from
|
||||||
|
different sources (filename, MediaInfo, file system, TMDB API, etc.).
|
||||||
|
|
||||||
|
All extractors should implement the DataExtractor protocol defined in base.py.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from .base import DataExtractor
|
||||||
|
from .default_extractor import DefaultExtractor
|
||||||
|
from .filename_extractor import FilenameExtractor
|
||||||
|
from .fileinfo_extractor import FileInfoExtractor
|
||||||
|
from .mediainfo_extractor import MediaInfoExtractor
|
||||||
|
from .metadata_extractor import MetadataExtractor
|
||||||
|
from .tmdb_extractor import TMDBExtractor
|
||||||
|
|
||||||
|
__all__ = [
|
||||||
|
'DataExtractor',
|
||||||
|
'DefaultExtractor',
|
||||||
|
'FilenameExtractor',
|
||||||
|
'FileInfoExtractor',
|
||||||
|
'MediaInfoExtractor',
|
||||||
|
'MetadataExtractor',
|
||||||
|
'TMDBExtractor',
|
||||||
|
]
|
||||||
|
|||||||
218
renamer/extractors/base.py
Normal file
218
renamer/extractors/base.py
Normal file
@@ -0,0 +1,218 @@
|
|||||||
|
"""Base classes and protocols for extractors.
|
||||||
|
|
||||||
|
This module defines the DataExtractor Protocol that all extractors should implement.
|
||||||
|
The protocol ensures a consistent interface across all extractor types.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Protocol, Optional
|
||||||
|
|
||||||
|
|
||||||
|
class DataExtractor(Protocol):
|
||||||
|
"""Protocol defining the standard interface for all extractors.
|
||||||
|
|
||||||
|
All extractor classes should implement this protocol to ensure consistent
|
||||||
|
behavior across the application. The protocol defines methods for extracting
|
||||||
|
various metadata from media files.
|
||||||
|
|
||||||
|
Attributes:
|
||||||
|
file_path: Path to the file being analyzed
|
||||||
|
|
||||||
|
Example:
|
||||||
|
class MyExtractor:
|
||||||
|
def __init__(self, file_path: Path):
|
||||||
|
self.file_path = file_path
|
||||||
|
|
||||||
|
def extract_title(self) -> Optional[str]:
|
||||||
|
# Implementation here
|
||||||
|
return "Movie Title"
|
||||||
|
"""
|
||||||
|
|
||||||
|
file_path: Path
|
||||||
|
|
||||||
|
def extract_title(self) -> Optional[str]:
|
||||||
|
"""Extract the title of the media file.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The extracted title or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_year(self) -> Optional[str]:
|
||||||
|
"""Extract the release year.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The year as a string (e.g., "2024") or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_source(self) -> Optional[str]:
|
||||||
|
"""Extract the source/release type (e.g., BluRay, WEB-DL, HDTV).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The source type or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_order(self) -> Optional[str]:
|
||||||
|
"""Extract ordering information (e.g., episode number, disc number).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The order information or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_resolution(self) -> Optional[str]:
|
||||||
|
"""Extract the video resolution (e.g., 1080p, 2160p, 720p).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The resolution or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_hdr(self) -> Optional[str]:
|
||||||
|
"""Extract HDR information (e.g., HDR10, Dolby Vision).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The HDR format or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_movie_db(self) -> Optional[str]:
|
||||||
|
"""Extract movie database IDs (e.g., TMDB, IMDB).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Database identifiers or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_special_info(self) -> Optional[str]:
|
||||||
|
"""Extract special information (e.g., REPACK, PROPER, Director's Cut).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Special release information or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_audio_langs(self) -> Optional[str]:
|
||||||
|
"""Extract audio language codes.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Comma-separated language codes or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_meta_type(self) -> Optional[str]:
|
||||||
|
"""Extract metadata type/format information.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The metadata type or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_size(self) -> Optional[int]:
|
||||||
|
"""Extract the file size in bytes.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
File size in bytes or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_modification_time(self) -> Optional[float]:
|
||||||
|
"""Extract the file modification timestamp.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Unix timestamp of last modification or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_file_name(self) -> Optional[str]:
|
||||||
|
"""Extract the file name without path.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The file name or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_file_path(self) -> Optional[str]:
|
||||||
|
"""Extract the full file path as string.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The full file path or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_frame_class(self) -> Optional[str]:
|
||||||
|
"""Extract the frame class/aspect ratio classification.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Frame class (e.g., "Widescreen", "Ultra-Widescreen") or None
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_video_tracks(self) -> list[dict]:
|
||||||
|
"""Extract video track information.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of dictionaries containing video track metadata.
|
||||||
|
Returns empty list if no tracks available.
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_audio_tracks(self) -> list[dict]:
|
||||||
|
"""Extract audio track information.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of dictionaries containing audio track metadata.
|
||||||
|
Returns empty list if no tracks available.
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_subtitle_tracks(self) -> list[dict]:
|
||||||
|
"""Extract subtitle track information.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of dictionaries containing subtitle track metadata.
|
||||||
|
Returns empty list if no tracks available.
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_anamorphic(self) -> Optional[str]:
|
||||||
|
"""Extract anamorphic encoding information.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Anamorphic status or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_extension(self) -> Optional[str]:
|
||||||
|
"""Extract the file extension.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
File extension (without dot) or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_tmdb_url(self) -> Optional[str]:
|
||||||
|
"""Extract TMDB URL if available.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Full TMDB URL or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_tmdb_id(self) -> Optional[str]:
|
||||||
|
"""Extract TMDB ID if available.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
TMDB ID as string or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
|
|
||||||
|
def extract_original_title(self) -> Optional[str]:
|
||||||
|
"""Extract the original title (non-localized).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The original title or None if not available
|
||||||
|
"""
|
||||||
|
...
|
||||||
@@ -1,10 +1,13 @@
|
|||||||
import re
|
import re
|
||||||
|
import logging
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from collections import Counter
|
from collections import Counter
|
||||||
from ..constants import SOURCE_DICT, FRAME_CLASSES, MOVIE_DB_DICT, SPECIAL_EDITIONS
|
from ..constants import SOURCE_DICT, FRAME_CLASSES, MOVIE_DB_DICT, SPECIAL_EDITIONS
|
||||||
from ..decorators import cached_method
|
from ..decorators import cached_method
|
||||||
import langcodes
|
import langcodes
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
class FilenameExtractor:
|
class FilenameExtractor:
|
||||||
"""Class to extract information from filename"""
|
"""Class to extract information from filename"""
|
||||||
@@ -324,8 +327,9 @@ class FilenameExtractor:
|
|||||||
lang_obj = langcodes.Language.get(lang_code)
|
lang_obj = langcodes.Language.get(lang_code)
|
||||||
iso3_code = lang_obj.to_alpha3()
|
iso3_code = lang_obj.to_alpha3()
|
||||||
langs.extend([iso3_code] * count)
|
langs.extend([iso3_code] * count)
|
||||||
except:
|
except (LookupError, ValueError, AttributeError) as e:
|
||||||
# Skip invalid language codes
|
# Skip invalid language codes
|
||||||
|
logger.debug(f"Invalid language code '{lang_code}': {e}")
|
||||||
pass
|
pass
|
||||||
|
|
||||||
# Second, look for standalone language codes outside brackets
|
# Second, look for standalone language codes outside brackets
|
||||||
@@ -381,8 +385,9 @@ class FilenameExtractor:
|
|||||||
lang_obj = langcodes.Language.get(lang_code)
|
lang_obj = langcodes.Language.get(lang_code)
|
||||||
iso3_code = lang_obj.to_alpha3()
|
iso3_code = lang_obj.to_alpha3()
|
||||||
langs.append(iso3_code)
|
langs.append(iso3_code)
|
||||||
except:
|
except (LookupError, ValueError, AttributeError) as e:
|
||||||
# Skip invalid language codes
|
# Skip invalid language codes
|
||||||
|
logger.debug(f"Invalid language code '{lang_code}': {e}")
|
||||||
pass
|
pass
|
||||||
|
|
||||||
if not langs:
|
if not langs:
|
||||||
@@ -455,8 +460,9 @@ class FilenameExtractor:
|
|||||||
lang_obj = langcodes.Language.get(lang_code)
|
lang_obj = langcodes.Language.get(lang_code)
|
||||||
iso3_code = lang_obj.to_alpha3()
|
iso3_code = lang_obj.to_alpha3()
|
||||||
tracks.append({'language': iso3_code})
|
tracks.append({'language': iso3_code})
|
||||||
except:
|
except (LookupError, ValueError, AttributeError) as e:
|
||||||
# Skip invalid language codes
|
# Skip invalid language codes
|
||||||
|
logger.debug(f"Invalid language code '{lang_code}': {e}")
|
||||||
pass
|
pass
|
||||||
|
|
||||||
# Second, look for standalone language codes outside brackets
|
# Second, look for standalone language codes outside brackets
|
||||||
@@ -512,8 +518,9 @@ class FilenameExtractor:
|
|||||||
lang_obj = langcodes.Language.get(lang_code)
|
lang_obj = langcodes.Language.get(lang_code)
|
||||||
iso3_code = lang_obj.to_alpha3()
|
iso3_code = lang_obj.to_alpha3()
|
||||||
tracks.append({'language': iso3_code})
|
tracks.append({'language': iso3_code})
|
||||||
except:
|
except (LookupError, ValueError, AttributeError) as e:
|
||||||
# Skip invalid language codes
|
# Skip invalid language codes
|
||||||
|
logger.debug(f"Invalid language code '{lang_code}': {e}")
|
||||||
pass
|
pass
|
||||||
|
|
||||||
return tracks
|
return tracks
|
||||||
@@ -4,6 +4,9 @@ from collections import Counter
|
|||||||
from ..constants import FRAME_CLASSES, MEDIA_TYPES
|
from ..constants import FRAME_CLASSES, MEDIA_TYPES
|
||||||
from ..decorators import cached_method
|
from ..decorators import cached_method
|
||||||
import langcodes
|
import langcodes
|
||||||
|
import logging
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
class MediaInfoExtractor:
|
class MediaInfoExtractor:
|
||||||
@@ -17,7 +20,8 @@ class MediaInfoExtractor:
|
|||||||
self.video_tracks = [t for t in self.media_info.tracks if t.track_type == 'Video']
|
self.video_tracks = [t for t in self.media_info.tracks if t.track_type == 'Video']
|
||||||
self.audio_tracks = [t for t in self.media_info.tracks if t.track_type == 'Audio']
|
self.audio_tracks = [t for t in self.media_info.tracks if t.track_type == 'Audio']
|
||||||
self.sub_tracks = [t for t in self.media_info.tracks if t.track_type == 'Text']
|
self.sub_tracks = [t for t in self.media_info.tracks if t.track_type == 'Text']
|
||||||
except Exception:
|
except Exception as e:
|
||||||
|
logger.warning(f"Failed to parse media info for {file_path}: {e}")
|
||||||
self.media_info = None
|
self.media_info = None
|
||||||
self.video_tracks = []
|
self.video_tracks = []
|
||||||
self.audio_tracks = []
|
self.audio_tracks = []
|
||||||
@@ -165,8 +169,9 @@ class MediaInfoExtractor:
|
|||||||
lang_obj = langcodes.Language.get(lang_code.lower())
|
lang_obj = langcodes.Language.get(lang_code.lower())
|
||||||
alpha3 = lang_obj.to_alpha3()
|
alpha3 = lang_obj.to_alpha3()
|
||||||
langs.append(alpha3)
|
langs.append(alpha3)
|
||||||
except:
|
except (LookupError, ValueError, AttributeError) as e:
|
||||||
# If conversion fails, use the original code
|
# If conversion fails, use the original code
|
||||||
|
logger.debug(f"Invalid language code '{lang_code}': {e}")
|
||||||
langs.append(lang_code.lower()[:3])
|
langs.append(lang_code.lower()[:3])
|
||||||
|
|
||||||
lang_counts = Counter(langs)
|
lang_counts = Counter(langs)
|
||||||
|
|||||||
@@ -1,8 +1,11 @@
|
|||||||
import mutagen
|
import mutagen
|
||||||
|
import logging
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from ..constants import MEDIA_TYPES
|
from ..constants import MEDIA_TYPES
|
||||||
from ..decorators import cached_method
|
from ..decorators import cached_method
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
class MetadataExtractor:
|
class MetadataExtractor:
|
||||||
"""Class to extract information from file metadata"""
|
"""Class to extract information from file metadata"""
|
||||||
@@ -12,7 +15,8 @@ class MetadataExtractor:
|
|||||||
self._cache = {} # Internal cache for method results
|
self._cache = {} # Internal cache for method results
|
||||||
try:
|
try:
|
||||||
self.info = mutagen.File(file_path) # type: ignore
|
self.info = mutagen.File(file_path) # type: ignore
|
||||||
except Exception:
|
except Exception as e:
|
||||||
|
logger.debug(f"Failed to read metadata from {file_path}: {e}")
|
||||||
self.info = None
|
self.info = None
|
||||||
|
|
||||||
@cached_method()
|
@cached_method()
|
||||||
@@ -52,5 +56,6 @@ class MetadataExtractor:
|
|||||||
if info['mime'] == mime:
|
if info['mime'] == mime:
|
||||||
return info['meta_type']
|
return info['meta_type']
|
||||||
return 'Unknown'
|
return 'Unknown'
|
||||||
except Exception:
|
except Exception as e:
|
||||||
|
logger.debug(f"Failed to detect MIME type for {self.file_path}: {e}")
|
||||||
return 'Unknown'
|
return 'Unknown'
|
||||||
@@ -50,7 +50,8 @@ class TMDBExtractor:
|
|||||||
response = requests.get(url, headers=headers, params=params, timeout=10)
|
response = requests.get(url, headers=headers, params=params, timeout=10)
|
||||||
response.raise_for_status()
|
response.raise_for_status()
|
||||||
return response.json()
|
return response.json()
|
||||||
except (requests.RequestException, ValueError):
|
except (requests.RequestException, ValueError) as e:
|
||||||
|
logging.warning(f"TMDB API request failed for {url}: {e}")
|
||||||
return None
|
return None
|
||||||
|
|
||||||
def _search_movie_by_title_year(self, title: str, year: Optional[str] = None) -> Optional[Dict[str, Any]]:
|
def _search_movie_by_title_year(self, title: str, year: Optional[str] = None) -> Optional[Dict[str, Any]]:
|
||||||
@@ -279,5 +280,6 @@ class TMDBExtractor:
|
|||||||
# Cache image
|
# Cache image
|
||||||
local_path = self.cache.set_image(cache_key, image_data, self.ttl_seconds)
|
local_path = self.cache.set_image(cache_key, image_data, self.ttl_seconds)
|
||||||
return str(local_path) if local_path else None
|
return str(local_path) if local_path else None
|
||||||
except requests.RequestException:
|
except requests.RequestException as e:
|
||||||
|
logging.warning(f"Failed to download poster from {poster_url}: {e}")
|
||||||
return None
|
return None
|
||||||
|
|||||||
@@ -1 +1,44 @@
|
|||||||
# Formatters package
|
"""Formatters package - provides value formatting for display.
|
||||||
|
|
||||||
|
This package contains various formatter classes that transform raw values
|
||||||
|
into display-ready strings with optional styling.
|
||||||
|
|
||||||
|
All formatters should inherit from the Formatter ABC defined in base.py.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from .base import (
|
||||||
|
Formatter,
|
||||||
|
DataFormatter,
|
||||||
|
TextFormatter as TextFormatterBase,
|
||||||
|
MarkupFormatter,
|
||||||
|
CompositeFormatter
|
||||||
|
)
|
||||||
|
from .text_formatter import TextFormatter
|
||||||
|
from .duration_formatter import DurationFormatter
|
||||||
|
from .size_formatter import SizeFormatter
|
||||||
|
from .date_formatter import DateFormatter
|
||||||
|
from .extension_formatter import ExtensionFormatter
|
||||||
|
from .resolution_formatter import ResolutionFormatter
|
||||||
|
from .track_formatter import TrackFormatter
|
||||||
|
from .special_info_formatter import SpecialInfoFormatter
|
||||||
|
from .formatter import FormatterApplier
|
||||||
|
|
||||||
|
__all__ = [
|
||||||
|
# Base classes
|
||||||
|
'Formatter',
|
||||||
|
'DataFormatter',
|
||||||
|
'TextFormatterBase',
|
||||||
|
'MarkupFormatter',
|
||||||
|
'CompositeFormatter',
|
||||||
|
|
||||||
|
# Concrete formatters
|
||||||
|
'TextFormatter',
|
||||||
|
'DurationFormatter',
|
||||||
|
'SizeFormatter',
|
||||||
|
'DateFormatter',
|
||||||
|
'ExtensionFormatter',
|
||||||
|
'ResolutionFormatter',
|
||||||
|
'TrackFormatter',
|
||||||
|
'SpecialInfoFormatter',
|
||||||
|
'FormatterApplier',
|
||||||
|
]
|
||||||
148
renamer/formatters/base.py
Normal file
148
renamer/formatters/base.py
Normal file
@@ -0,0 +1,148 @@
|
|||||||
|
"""Base classes for formatters.
|
||||||
|
|
||||||
|
This module defines the Formatter Abstract Base Class (ABC) that all formatters
|
||||||
|
should inherit from. This ensures a consistent interface and enables type checking.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from abc import ABC, abstractmethod
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
|
||||||
|
class Formatter(ABC):
|
||||||
|
"""Abstract base class for all formatters.
|
||||||
|
|
||||||
|
All formatter classes should inherit from this base class and implement
|
||||||
|
the format() method. Formatters are responsible for transforming raw values
|
||||||
|
into display-ready strings.
|
||||||
|
|
||||||
|
The Formatter ABC supports three categories of formatters:
|
||||||
|
1. Data formatters: Transform raw data (e.g., bytes to "1.2 GB")
|
||||||
|
2. Text formatters: Transform text content (e.g., uppercase, lowercase)
|
||||||
|
3. Markup formatters: Add visual styling (e.g., bold, colored text)
|
||||||
|
|
||||||
|
Example:
|
||||||
|
class MyFormatter(Formatter):
|
||||||
|
@staticmethod
|
||||||
|
def format(value: Any) -> str:
|
||||||
|
return str(value).upper()
|
||||||
|
|
||||||
|
Note:
|
||||||
|
All formatter methods should be static methods to allow
|
||||||
|
usage without instantiation and composition in FormatterApplier.
|
||||||
|
"""
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
@abstractmethod
|
||||||
|
def format(value: Any) -> str:
|
||||||
|
"""Format a value for display.
|
||||||
|
|
||||||
|
This is the core method that all formatters must implement.
|
||||||
|
It takes a raw value and returns a formatted string.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
value: The value to format (type depends on formatter)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The formatted string representation
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
ValueError: If the value cannot be formatted
|
||||||
|
TypeError: If the value type is incompatible
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> class SizeFormatter(Formatter):
|
||||||
|
... @staticmethod
|
||||||
|
... def format(value: int) -> str:
|
||||||
|
... return f"{value / 1024:.1f} KB"
|
||||||
|
>>> SizeFormatter.format(2048)
|
||||||
|
'2.0 KB'
|
||||||
|
"""
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
class DataFormatter(Formatter):
|
||||||
|
"""Base class for data formatters.
|
||||||
|
|
||||||
|
Data formatters transform raw data values into human-readable formats.
|
||||||
|
Examples include:
|
||||||
|
- File sizes (bytes to "1.2 GB")
|
||||||
|
- Durations (seconds to "1h 23m")
|
||||||
|
- Dates (timestamp to "2024-01-15")
|
||||||
|
- Resolutions (width/height to "1920x1080")
|
||||||
|
|
||||||
|
Data formatters should be applied first in the formatting pipeline,
|
||||||
|
before text transformations and markup.
|
||||||
|
"""
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
class TextFormatter(Formatter):
|
||||||
|
"""Base class for text formatters.
|
||||||
|
|
||||||
|
Text formatters transform text content without adding markup.
|
||||||
|
Examples include:
|
||||||
|
- Case transformations (uppercase, lowercase, camelcase)
|
||||||
|
- Text replacements
|
||||||
|
- String truncation
|
||||||
|
|
||||||
|
Text formatters should be applied after data formatters but before
|
||||||
|
markup formatters in the formatting pipeline.
|
||||||
|
"""
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
class MarkupFormatter(Formatter):
|
||||||
|
"""Base class for markup formatters.
|
||||||
|
|
||||||
|
Markup formatters add visual styling using markup tags.
|
||||||
|
Examples include:
|
||||||
|
- Color formatting ([red]text[/red])
|
||||||
|
- Style formatting ([bold]text[/bold])
|
||||||
|
- Link formatting ([link=url]text[/link])
|
||||||
|
|
||||||
|
Markup formatters should be applied last in the formatting pipeline,
|
||||||
|
after all data and text transformations are complete.
|
||||||
|
"""
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
class CompositeFormatter(Formatter):
|
||||||
|
"""Formatter that applies multiple formatters in sequence.
|
||||||
|
|
||||||
|
This class allows chaining multiple formatters together in a specific order.
|
||||||
|
Useful for creating complex formatting pipelines.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> formatters = [SizeFormatter, BoldFormatter, GreenFormatter]
|
||||||
|
>>> composite = CompositeFormatter(formatters)
|
||||||
|
>>> composite.format(1024)
|
||||||
|
'[bold green]1.0 KB[/bold green]'
|
||||||
|
|
||||||
|
Attributes:
|
||||||
|
formatters: List of formatter functions to apply in order
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, formatters: list[callable]):
|
||||||
|
"""Initialize the composite formatter.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
formatters: List of formatter functions to apply in order
|
||||||
|
"""
|
||||||
|
self.formatters = formatters
|
||||||
|
|
||||||
|
def format(self, value: Any) -> str:
|
||||||
|
"""Apply all formatters in sequence.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
value: The value to format
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The result after applying all formatters
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
Exception: If any formatter in the chain raises an exception
|
||||||
|
"""
|
||||||
|
result = value
|
||||||
|
for formatter in self.formatters:
|
||||||
|
result = formatter(result)
|
||||||
|
return result
|
||||||
21
renamer/services/__init__.py
Normal file
21
renamer/services/__init__.py
Normal file
@@ -0,0 +1,21 @@
|
|||||||
|
"""Services package - business logic layer for the Renamer application.
|
||||||
|
|
||||||
|
This package contains service classes that encapsulate business logic and
|
||||||
|
coordinate between different components. Services provide a clean separation
|
||||||
|
of concerns and make the application more testable and maintainable.
|
||||||
|
|
||||||
|
Services:
|
||||||
|
- FileTreeService: Manages file tree operations (scanning, building, filtering)
|
||||||
|
- MetadataService: Coordinates metadata extraction with caching and threading
|
||||||
|
- RenameService: Handles file rename operations with validation
|
||||||
|
"""
|
||||||
|
|
||||||
|
from .file_tree_service import FileTreeService
|
||||||
|
from .metadata_service import MetadataService
|
||||||
|
from .rename_service import RenameService
|
||||||
|
|
||||||
|
__all__ = [
|
||||||
|
'FileTreeService',
|
||||||
|
'MetadataService',
|
||||||
|
'RenameService',
|
||||||
|
]
|
||||||
280
renamer/services/file_tree_service.py
Normal file
280
renamer/services/file_tree_service.py
Normal file
@@ -0,0 +1,280 @@
|
|||||||
|
"""File tree service for managing directory scanning and tree building.
|
||||||
|
|
||||||
|
This service encapsulates all file system operations related to building
|
||||||
|
and managing the file tree display.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Optional, Callable
|
||||||
|
from rich.markup import escape
|
||||||
|
|
||||||
|
from renamer.constants import MEDIA_TYPES
|
||||||
|
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class FileTreeService:
|
||||||
|
"""Service for managing file tree operations.
|
||||||
|
|
||||||
|
This service handles:
|
||||||
|
- Directory scanning and validation
|
||||||
|
- File tree construction with filtering
|
||||||
|
- File type filtering based on media types
|
||||||
|
- Permission error handling
|
||||||
|
|
||||||
|
Example:
|
||||||
|
service = FileTreeService()
|
||||||
|
files = service.scan_directory(Path("/media/movies"))
|
||||||
|
service.build_tree(Path("/media/movies"), tree_node)
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, media_types: Optional[set[str]] = None):
|
||||||
|
"""Initialize the file tree service.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
media_types: Set of file extensions to include (without dot).
|
||||||
|
If None, uses MEDIA_TYPES from constants.
|
||||||
|
"""
|
||||||
|
self.media_types = media_types or MEDIA_TYPES
|
||||||
|
logger.debug(f"FileTreeService initialized with {len(self.media_types)} media types")
|
||||||
|
|
||||||
|
def validate_directory(self, path: Path) -> tuple[bool, Optional[str]]:
|
||||||
|
"""Validate that a path is a valid directory.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
path: The path to validate
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Tuple of (is_valid, error_message). If valid, error_message is None.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> service = FileTreeService()
|
||||||
|
>>> is_valid, error = service.validate_directory(Path("/tmp"))
|
||||||
|
>>> if is_valid:
|
||||||
|
... print("Directory is valid")
|
||||||
|
"""
|
||||||
|
if not path:
|
||||||
|
return False, "No directory specified"
|
||||||
|
|
||||||
|
if not path.exists():
|
||||||
|
return False, f"Directory does not exist: {path}"
|
||||||
|
|
||||||
|
if not path.is_dir():
|
||||||
|
return False, f"Path is not a directory: {path}"
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Test if we can read the directory
|
||||||
|
list(path.iterdir())
|
||||||
|
return True, None
|
||||||
|
except PermissionError:
|
||||||
|
return False, f"Permission denied: {path}"
|
||||||
|
except Exception as e:
|
||||||
|
return False, f"Error accessing directory: {e}"
|
||||||
|
|
||||||
|
def scan_directory(self, path: Path, recursive: bool = True) -> list[Path]:
|
||||||
|
"""Scan a directory and return all media files.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
path: The directory to scan
|
||||||
|
recursive: If True, scan subdirectories recursively
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of Path objects for all media files found
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> service = FileTreeService()
|
||||||
|
>>> files = service.scan_directory(Path("/media/movies"))
|
||||||
|
>>> print(f"Found {len(files)} media files")
|
||||||
|
"""
|
||||||
|
is_valid, error = self.validate_directory(path)
|
||||||
|
if not is_valid:
|
||||||
|
logger.warning(f"Cannot scan directory: {error}")
|
||||||
|
return []
|
||||||
|
|
||||||
|
media_files = []
|
||||||
|
try:
|
||||||
|
for item in sorted(path.iterdir()):
|
||||||
|
try:
|
||||||
|
if item.is_dir():
|
||||||
|
# Skip hidden directories and system directories
|
||||||
|
if item.name.startswith(".") or item.name == "lost+found":
|
||||||
|
continue
|
||||||
|
|
||||||
|
if recursive:
|
||||||
|
# Recursively scan subdirectories
|
||||||
|
media_files.extend(self.scan_directory(item, recursive=True))
|
||||||
|
elif item.is_file():
|
||||||
|
# Check if file has a media extension
|
||||||
|
if self._is_media_file(item):
|
||||||
|
media_files.append(item)
|
||||||
|
logger.debug(f"Found media file: {item}")
|
||||||
|
except PermissionError:
|
||||||
|
logger.debug(f"Permission denied: {item}")
|
||||||
|
continue
|
||||||
|
except PermissionError:
|
||||||
|
logger.warning(f"Permission denied scanning directory: {path}")
|
||||||
|
|
||||||
|
return media_files
|
||||||
|
|
||||||
|
def build_tree(
|
||||||
|
self,
|
||||||
|
path: Path,
|
||||||
|
node,
|
||||||
|
add_node_callback: Optional[Callable] = None
|
||||||
|
):
|
||||||
|
"""Build a tree structure from a directory.
|
||||||
|
|
||||||
|
This method recursively builds a tree by adding directories and media files
|
||||||
|
to the provided node. Uses a callback to add nodes to maintain compatibility
|
||||||
|
with different tree implementations.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
path: The directory path to build tree from
|
||||||
|
node: The tree node to add children to
|
||||||
|
add_node_callback: Optional callback(node, label, data) to add a child node.
|
||||||
|
If None, uses node.add(label, data=data)
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> from textual.widgets import Tree
|
||||||
|
>>> tree = Tree("Files")
|
||||||
|
>>> service = FileTreeService()
|
||||||
|
>>> service.build_tree(Path("/media"), tree.root)
|
||||||
|
"""
|
||||||
|
if add_node_callback is None:
|
||||||
|
# Default implementation for Textual Tree
|
||||||
|
add_node_callback = lambda parent, label, data: parent.add(label, data=data)
|
||||||
|
|
||||||
|
try:
|
||||||
|
for item in sorted(path.iterdir()):
|
||||||
|
try:
|
||||||
|
if item.is_dir():
|
||||||
|
# Skip hidden and system directories
|
||||||
|
if item.name.startswith(".") or item.name == "lost+found":
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Add directory node
|
||||||
|
subnode = add_node_callback(node, escape(item.name), item)
|
||||||
|
# Recursively build tree for subdirectory
|
||||||
|
self.build_tree(item, subnode, add_node_callback)
|
||||||
|
|
||||||
|
elif item.is_file() and self._is_media_file(item):
|
||||||
|
# Add media file node
|
||||||
|
logger.debug(f"Adding file to tree: {item.name!r} (full path: {item})")
|
||||||
|
add_node_callback(node, escape(item.name), item)
|
||||||
|
|
||||||
|
except PermissionError:
|
||||||
|
logger.debug(f"Permission denied: {item}")
|
||||||
|
continue
|
||||||
|
except PermissionError:
|
||||||
|
logger.warning(f"Permission denied building tree: {path}")
|
||||||
|
|
||||||
|
def find_node_by_path(self, root_node, target_path: Path):
|
||||||
|
"""Find a tree node by file path.
|
||||||
|
|
||||||
|
Recursively searches the tree for a node with matching data path.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
root_node: The root node to start searching from
|
||||||
|
target_path: The Path to search for
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The matching node or None if not found
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> node = service.find_node_by_path(tree.root, Path("/media/movie.mkv"))
|
||||||
|
>>> if node:
|
||||||
|
... node.label = "New Name.mkv"
|
||||||
|
"""
|
||||||
|
# Check if this node matches
|
||||||
|
if hasattr(root_node, 'data') and root_node.data == target_path:
|
||||||
|
return root_node
|
||||||
|
|
||||||
|
# Recursively search children
|
||||||
|
if hasattr(root_node, 'children'):
|
||||||
|
for child in root_node.children:
|
||||||
|
result = self.find_node_by_path(child, target_path)
|
||||||
|
if result:
|
||||||
|
return result
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def count_media_files(self, path: Path) -> int:
|
||||||
|
"""Count the number of media files in a directory.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
path: The directory to count files in
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Number of media files found (including subdirectories)
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> count = service.count_media_files(Path("/media/movies"))
|
||||||
|
>>> print(f"Found {count} media files")
|
||||||
|
"""
|
||||||
|
return len(self.scan_directory(path, recursive=True))
|
||||||
|
|
||||||
|
def _is_media_file(self, path: Path) -> bool:
|
||||||
|
"""Check if a file is a media file based on extension.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
path: The file path to check
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if the file has a media extension
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> service._is_media_file(Path("movie.mkv"))
|
||||||
|
True
|
||||||
|
>>> service._is_media_file(Path("readme.txt"))
|
||||||
|
False
|
||||||
|
"""
|
||||||
|
extension = path.suffix.lower()
|
||||||
|
# Remove the leading dot and check against media types
|
||||||
|
return extension.lstrip('.') in {ext.lower() for ext in self.media_types}
|
||||||
|
|
||||||
|
def get_directory_stats(self, path: Path) -> dict[str, int]:
|
||||||
|
"""Get statistics about a directory.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
path: The directory to analyze
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary with stats: total_files, total_dirs, media_files
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> stats = service.get_directory_stats(Path("/media"))
|
||||||
|
>>> print(f"Media files: {stats['media_files']}")
|
||||||
|
"""
|
||||||
|
stats = {
|
||||||
|
'total_files': 0,
|
||||||
|
'total_dirs': 0,
|
||||||
|
'media_files': 0,
|
||||||
|
}
|
||||||
|
|
||||||
|
is_valid, _ = self.validate_directory(path)
|
||||||
|
if not is_valid:
|
||||||
|
return stats
|
||||||
|
|
||||||
|
try:
|
||||||
|
for item in path.iterdir():
|
||||||
|
try:
|
||||||
|
if item.is_dir():
|
||||||
|
if not item.name.startswith(".") and item.name != "lost+found":
|
||||||
|
stats['total_dirs'] += 1
|
||||||
|
# Recursively count subdirectories
|
||||||
|
sub_stats = self.get_directory_stats(item)
|
||||||
|
stats['total_files'] += sub_stats['total_files']
|
||||||
|
stats['total_dirs'] += sub_stats['total_dirs']
|
||||||
|
stats['media_files'] += sub_stats['media_files']
|
||||||
|
elif item.is_file():
|
||||||
|
stats['total_files'] += 1
|
||||||
|
if self._is_media_file(item):
|
||||||
|
stats['media_files'] += 1
|
||||||
|
except PermissionError:
|
||||||
|
continue
|
||||||
|
except PermissionError:
|
||||||
|
pass
|
||||||
|
|
||||||
|
return stats
|
||||||
325
renamer/services/metadata_service.py
Normal file
325
renamer/services/metadata_service.py
Normal file
@@ -0,0 +1,325 @@
|
|||||||
|
"""Metadata service for coordinating metadata extraction and caching.
|
||||||
|
|
||||||
|
This service manages the extraction of metadata from media files with:
|
||||||
|
- Thread pool for concurrent extraction
|
||||||
|
- Cache integration for performance
|
||||||
|
- Formatter coordination for display
|
||||||
|
- Error handling and recovery
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Optional, Callable
|
||||||
|
from concurrent.futures import ThreadPoolExecutor, Future
|
||||||
|
from threading import Lock
|
||||||
|
|
||||||
|
from renamer.cache import Cache
|
||||||
|
from renamer.settings import Settings
|
||||||
|
from renamer.extractors.extractor import MediaExtractor
|
||||||
|
from renamer.formatters.media_formatter import MediaFormatter
|
||||||
|
from renamer.formatters.catalog_formatter import CatalogFormatter
|
||||||
|
from renamer.formatters.proposed_name_formatter import ProposedNameFormatter
|
||||||
|
from renamer.formatters.text_formatter import TextFormatter
|
||||||
|
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class MetadataService:
|
||||||
|
"""Service for managing metadata extraction and formatting.
|
||||||
|
|
||||||
|
This service coordinates:
|
||||||
|
- Metadata extraction from media files
|
||||||
|
- Caching of extracted metadata
|
||||||
|
- Thread pool management for concurrent operations
|
||||||
|
- Formatting for different display modes (technical/catalog)
|
||||||
|
- Proposed name generation
|
||||||
|
|
||||||
|
The service uses a thread pool to extract metadata concurrently while
|
||||||
|
maintaining thread safety with proper locking mechanisms.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
cache = Cache()
|
||||||
|
settings = Settings()
|
||||||
|
service = MetadataService(cache, settings, max_workers=3)
|
||||||
|
|
||||||
|
# Extract metadata
|
||||||
|
result = service.extract_metadata(Path("/media/movie.mkv"))
|
||||||
|
if result:
|
||||||
|
print(result['formatted_info'])
|
||||||
|
|
||||||
|
# Cleanup when done
|
||||||
|
service.shutdown()
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
cache: Cache,
|
||||||
|
settings: Settings,
|
||||||
|
max_workers: int = 3
|
||||||
|
):
|
||||||
|
"""Initialize the metadata service.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
cache: Cache instance for storing extracted metadata
|
||||||
|
settings: Settings instance for user preferences
|
||||||
|
max_workers: Maximum number of concurrent extraction threads
|
||||||
|
"""
|
||||||
|
self.cache = cache
|
||||||
|
self.settings = settings
|
||||||
|
self.max_workers = max_workers
|
||||||
|
|
||||||
|
# Thread pool for concurrent extraction
|
||||||
|
self.executor = ThreadPoolExecutor(
|
||||||
|
max_workers=max_workers,
|
||||||
|
thread_name_prefix="metadata_"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Lock for thread-safe operations
|
||||||
|
self._lock = Lock()
|
||||||
|
|
||||||
|
# Track active futures for cancellation
|
||||||
|
self._active_futures: dict[Path, Future] = {}
|
||||||
|
|
||||||
|
logger.info(f"MetadataService initialized with {max_workers} workers")
|
||||||
|
|
||||||
|
def extract_metadata(
|
||||||
|
self,
|
||||||
|
file_path: Path,
|
||||||
|
callback: Optional[Callable] = None,
|
||||||
|
error_callback: Optional[Callable] = None
|
||||||
|
) -> Optional[dict]:
|
||||||
|
"""Extract metadata from a media file.
|
||||||
|
|
||||||
|
This method can be called synchronously (returns result immediately) or
|
||||||
|
asynchronously (uses callbacks when complete).
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_path: Path to the media file
|
||||||
|
callback: Optional callback(result_dict) called when extraction completes
|
||||||
|
error_callback: Optional callback(error_message) called on error
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary with 'formatted_info' and 'proposed_name' if synchronous,
|
||||||
|
None if using callbacks (async mode)
|
||||||
|
|
||||||
|
Example:
|
||||||
|
# Synchronous
|
||||||
|
result = service.extract_metadata(path)
|
||||||
|
print(result['formatted_info'])
|
||||||
|
|
||||||
|
# Asynchronous
|
||||||
|
service.extract_metadata(
|
||||||
|
path,
|
||||||
|
callback=lambda r: print(r['formatted_info']),
|
||||||
|
error_callback=lambda e: print(f"Error: {e}")
|
||||||
|
)
|
||||||
|
"""
|
||||||
|
if callback or error_callback:
|
||||||
|
# Asynchronous mode - submit to thread pool
|
||||||
|
future = self.executor.submit(
|
||||||
|
self._extract_metadata_internal,
|
||||||
|
file_path
|
||||||
|
)
|
||||||
|
|
||||||
|
# Track the future
|
||||||
|
with self._lock:
|
||||||
|
# Cancel any existing extraction for this file
|
||||||
|
if file_path in self._active_futures:
|
||||||
|
self._active_futures[file_path].cancel()
|
||||||
|
self._active_futures[file_path] = future
|
||||||
|
|
||||||
|
# Add callback handlers
|
||||||
|
def done_callback(f: Future):
|
||||||
|
with self._lock:
|
||||||
|
# Remove from active futures
|
||||||
|
self._active_futures.pop(file_path, None)
|
||||||
|
|
||||||
|
try:
|
||||||
|
result = f.result()
|
||||||
|
if callback:
|
||||||
|
callback(result)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error extracting metadata for {file_path}: {e}")
|
||||||
|
if error_callback:
|
||||||
|
error_callback(str(e))
|
||||||
|
|
||||||
|
future.add_done_callback(done_callback)
|
||||||
|
return None
|
||||||
|
else:
|
||||||
|
# Synchronous mode - extract directly
|
||||||
|
return self._extract_metadata_internal(file_path)
|
||||||
|
|
||||||
|
def _extract_metadata_internal(self, file_path: Path) -> dict:
|
||||||
|
"""Internal method to extract and format metadata.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_path: Path to the media file
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary with 'formatted_info' and 'proposed_name'
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
Exception: If extraction fails
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
# Initialize extractor (uses cache internally via decorators)
|
||||||
|
extractor = MediaExtractor(file_path)
|
||||||
|
|
||||||
|
# Get current mode from settings
|
||||||
|
mode = self.settings.get("mode")
|
||||||
|
|
||||||
|
# Format based on mode
|
||||||
|
if mode == "technical":
|
||||||
|
formatter = MediaFormatter(extractor)
|
||||||
|
formatted_info = formatter.file_info_panel()
|
||||||
|
else: # catalog
|
||||||
|
formatter = CatalogFormatter(extractor)
|
||||||
|
formatted_info = formatter.format_catalog_info()
|
||||||
|
|
||||||
|
# Generate proposed name
|
||||||
|
proposed_formatter = ProposedNameFormatter(extractor)
|
||||||
|
proposed_name = proposed_formatter.rename_line_formatted(file_path)
|
||||||
|
|
||||||
|
return {
|
||||||
|
'formatted_info': formatted_info,
|
||||||
|
'proposed_name': proposed_name,
|
||||||
|
'mode': mode,
|
||||||
|
}
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Failed to extract metadata for {file_path}: {e}")
|
||||||
|
return {
|
||||||
|
'formatted_info': TextFormatter.red(f"Error extracting details: {str(e)}"),
|
||||||
|
'proposed_name': "",
|
||||||
|
'mode': self.settings.get("mode"),
|
||||||
|
}
|
||||||
|
|
||||||
|
def extract_for_display(
|
||||||
|
self,
|
||||||
|
file_path: Path,
|
||||||
|
display_callback: Callable[[str, str], None],
|
||||||
|
error_callback: Optional[Callable[[str], None]] = None
|
||||||
|
):
|
||||||
|
"""Extract metadata and update display via callback.
|
||||||
|
|
||||||
|
Convenience method that extracts metadata and calls the display callback
|
||||||
|
with the formatted info and proposed name.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_path: Path to the media file
|
||||||
|
display_callback: Callback(formatted_info, proposed_name) to update UI
|
||||||
|
error_callback: Optional callback(error_message) for errors
|
||||||
|
|
||||||
|
Example:
|
||||||
|
def update_ui(info, proposed):
|
||||||
|
details_widget.update(info)
|
||||||
|
proposed_widget.update(proposed)
|
||||||
|
|
||||||
|
service.extract_for_display(path, update_ui)
|
||||||
|
"""
|
||||||
|
def on_success(result: dict):
|
||||||
|
display_callback(result['formatted_info'], result['proposed_name'])
|
||||||
|
|
||||||
|
def on_error(error_message: str):
|
||||||
|
if error_callback:
|
||||||
|
error_callback(error_message)
|
||||||
|
else:
|
||||||
|
display_callback(
|
||||||
|
TextFormatter.red(f"Error: {error_message}"),
|
||||||
|
""
|
||||||
|
)
|
||||||
|
|
||||||
|
self.extract_metadata(file_path, callback=on_success, error_callback=on_error)
|
||||||
|
|
||||||
|
def cancel_extraction(self, file_path: Path) -> bool:
|
||||||
|
"""Cancel an ongoing extraction for a file.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_path: Path to the file whose extraction should be canceled
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if an extraction was canceled, False if none was active
|
||||||
|
|
||||||
|
Example:
|
||||||
|
# User selected a different file
|
||||||
|
service.cancel_extraction(old_path)
|
||||||
|
service.extract_metadata(new_path, callback=update_ui)
|
||||||
|
"""
|
||||||
|
with self._lock:
|
||||||
|
future = self._active_futures.get(file_path)
|
||||||
|
if future and not future.done():
|
||||||
|
future.cancel()
|
||||||
|
self._active_futures.pop(file_path, None)
|
||||||
|
logger.debug(f"Canceled extraction for {file_path}")
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
|
||||||
|
def cancel_all_extractions(self):
|
||||||
|
"""Cancel all ongoing extractions.
|
||||||
|
|
||||||
|
Useful when closing the application or switching directories.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
# User closing app
|
||||||
|
service.cancel_all_extractions()
|
||||||
|
service.shutdown()
|
||||||
|
"""
|
||||||
|
with self._lock:
|
||||||
|
canceled_count = 0
|
||||||
|
for file_path, future in list(self._active_futures.items()):
|
||||||
|
if not future.done():
|
||||||
|
future.cancel()
|
||||||
|
canceled_count += 1
|
||||||
|
self._active_futures.clear()
|
||||||
|
|
||||||
|
if canceled_count > 0:
|
||||||
|
logger.info(f"Canceled {canceled_count} active extractions")
|
||||||
|
|
||||||
|
def get_active_extraction_count(self) -> int:
|
||||||
|
"""Get the number of currently active extractions.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Number of extractions in progress
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> count = service.get_active_extraction_count()
|
||||||
|
>>> print(f"{count} extractions in progress")
|
||||||
|
"""
|
||||||
|
with self._lock:
|
||||||
|
return sum(1 for f in self._active_futures.values() if not f.done())
|
||||||
|
|
||||||
|
def shutdown(self, wait: bool = True):
|
||||||
|
"""Shutdown the metadata service.
|
||||||
|
|
||||||
|
Cancels all pending extractions and shuts down the thread pool.
|
||||||
|
Should be called when the application is closing.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
wait: If True, wait for all threads to complete. If False, cancel immediately.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
# Clean shutdown
|
||||||
|
service.shutdown(wait=True)
|
||||||
|
|
||||||
|
# Force shutdown
|
||||||
|
service.shutdown(wait=False)
|
||||||
|
"""
|
||||||
|
logger.info("Shutting down MetadataService")
|
||||||
|
|
||||||
|
# Cancel all active extractions
|
||||||
|
self.cancel_all_extractions()
|
||||||
|
|
||||||
|
# Shutdown thread pool
|
||||||
|
self.executor.shutdown(wait=wait)
|
||||||
|
|
||||||
|
logger.info("MetadataService shutdown complete")
|
||||||
|
|
||||||
|
def __enter__(self):
|
||||||
|
"""Context manager support."""
|
||||||
|
return self
|
||||||
|
|
||||||
|
def __exit__(self, exc_type, exc_val, exc_tb):
|
||||||
|
"""Context manager cleanup."""
|
||||||
|
self.shutdown(wait=True)
|
||||||
|
return False
|
||||||
346
renamer/services/rename_service.py
Normal file
346
renamer/services/rename_service.py
Normal file
@@ -0,0 +1,346 @@
|
|||||||
|
"""Rename service for handling file rename operations.
|
||||||
|
|
||||||
|
This service manages the process of renaming files with:
|
||||||
|
- Name validation and sanitization
|
||||||
|
- Proposed name generation
|
||||||
|
- Conflict detection
|
||||||
|
- Atomic rename operations
|
||||||
|
- Error handling and rollback
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
import re
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Optional, Callable
|
||||||
|
|
||||||
|
from renamer.extractors.extractor import MediaExtractor
|
||||||
|
from renamer.formatters.proposed_name_formatter import ProposedNameFormatter
|
||||||
|
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class RenameService:
|
||||||
|
"""Service for managing file rename operations.
|
||||||
|
|
||||||
|
This service handles:
|
||||||
|
- Proposed name generation from metadata
|
||||||
|
- Name validation and sanitization
|
||||||
|
- File conflict detection
|
||||||
|
- Atomic file rename operations
|
||||||
|
- Rollback on errors
|
||||||
|
|
||||||
|
Example:
|
||||||
|
service = RenameService()
|
||||||
|
|
||||||
|
# Propose a new name
|
||||||
|
new_name = service.propose_name(Path("/media/movie.mkv"))
|
||||||
|
print(f"Proposed: {new_name}")
|
||||||
|
|
||||||
|
# Rename file
|
||||||
|
success, message = service.rename_file(
|
||||||
|
Path("/media/movie.mkv"),
|
||||||
|
new_name
|
||||||
|
)
|
||||||
|
if success:
|
||||||
|
print(f"Renamed successfully")
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Invalid characters for filenames (Windows + Unix)
|
||||||
|
INVALID_CHARS = r'[<>:"|?*\x00-\x1f]'
|
||||||
|
|
||||||
|
# Invalid characters for paths
|
||||||
|
INVALID_PATH_CHARS = r'[<>"|?*\x00-\x1f]'
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
"""Initialize the rename service."""
|
||||||
|
logger.debug("RenameService initialized")
|
||||||
|
|
||||||
|
def propose_name(
|
||||||
|
self,
|
||||||
|
file_path: Path,
|
||||||
|
extractor: Optional[MediaExtractor] = None
|
||||||
|
) -> Optional[str]:
|
||||||
|
"""Generate a proposed new filename based on metadata.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
file_path: Current file path
|
||||||
|
extractor: Optional pre-initialized MediaExtractor. If None, creates new one.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Proposed filename (without path) or None if generation fails
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> service = RenameService()
|
||||||
|
>>> new_name = service.propose_name(Path("/media/movie.2024.mkv"))
|
||||||
|
>>> print(new_name)
|
||||||
|
'Movie Title (2024) [1080p].mkv'
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
if extractor is None:
|
||||||
|
extractor = MediaExtractor(file_path)
|
||||||
|
|
||||||
|
formatter = ProposedNameFormatter(extractor)
|
||||||
|
# Get the formatted rename line
|
||||||
|
rename_line = formatter.rename_line_formatted(file_path)
|
||||||
|
|
||||||
|
# Extract just the filename from the rename line
|
||||||
|
# Format is typically: "Rename to: [bold]filename[/bold]"
|
||||||
|
if "→" in rename_line:
|
||||||
|
# New format with arrow
|
||||||
|
parts = rename_line.split("→")
|
||||||
|
if len(parts) == 2:
|
||||||
|
# Remove markup tags
|
||||||
|
proposed = self._strip_markup(parts[1].strip())
|
||||||
|
return proposed
|
||||||
|
elif "Rename to:" in rename_line:
|
||||||
|
# Old format
|
||||||
|
parts = rename_line.split("Rename to:")
|
||||||
|
if len(parts) == 2:
|
||||||
|
proposed = self._strip_markup(parts[1].strip())
|
||||||
|
return proposed
|
||||||
|
|
||||||
|
# Fallback: use the whole line after stripping markup
|
||||||
|
return self._strip_markup(rename_line)
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Failed to propose name for {file_path}: {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
def sanitize_filename(self, filename: str) -> str:
|
||||||
|
"""Sanitize a filename by removing invalid characters.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
filename: The filename to sanitize
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Sanitized filename safe for all filesystems
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> service.sanitize_filename('Movie: Title?')
|
||||||
|
'Movie Title'
|
||||||
|
"""
|
||||||
|
# Remove invalid characters
|
||||||
|
sanitized = re.sub(self.INVALID_CHARS, '', filename)
|
||||||
|
|
||||||
|
# Replace multiple spaces with single space
|
||||||
|
sanitized = re.sub(r'\s+', ' ', sanitized)
|
||||||
|
|
||||||
|
# Strip leading/trailing whitespace and dots
|
||||||
|
sanitized = sanitized.strip('. ')
|
||||||
|
|
||||||
|
return sanitized
|
||||||
|
|
||||||
|
def validate_filename(self, filename: str) -> tuple[bool, Optional[str]]:
|
||||||
|
"""Validate that a filename is safe and legal.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
filename: The filename to validate
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Tuple of (is_valid, error_message). If valid, error_message is None.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> is_valid, error = service.validate_filename("movie.mkv")
|
||||||
|
>>> if not is_valid:
|
||||||
|
... print(f"Invalid: {error}")
|
||||||
|
"""
|
||||||
|
if not filename:
|
||||||
|
return False, "Filename cannot be empty"
|
||||||
|
|
||||||
|
if len(filename) > 255:
|
||||||
|
return False, "Filename too long (max 255 characters)"
|
||||||
|
|
||||||
|
# Check for invalid characters
|
||||||
|
if re.search(self.INVALID_CHARS, filename):
|
||||||
|
return False, f"Filename contains invalid characters: {filename}"
|
||||||
|
|
||||||
|
# Check for reserved names (Windows)
|
||||||
|
reserved_names = {
|
||||||
|
'CON', 'PRN', 'AUX', 'NUL',
|
||||||
|
'COM1', 'COM2', 'COM3', 'COM4', 'COM5', 'COM6', 'COM7', 'COM8', 'COM9',
|
||||||
|
'LPT1', 'LPT2', 'LPT3', 'LPT4', 'LPT5', 'LPT6', 'LPT7', 'LPT8', 'LPT9',
|
||||||
|
}
|
||||||
|
name_without_ext = Path(filename).stem.upper()
|
||||||
|
if name_without_ext in reserved_names:
|
||||||
|
return False, f"Filename uses reserved name: {name_without_ext}"
|
||||||
|
|
||||||
|
# Check for names ending with dot or space (Windows)
|
||||||
|
if filename.endswith('.') or filename.endswith(' '):
|
||||||
|
return False, "Filename cannot end with dot or space"
|
||||||
|
|
||||||
|
return True, None
|
||||||
|
|
||||||
|
def check_name_conflict(
|
||||||
|
self,
|
||||||
|
source_path: Path,
|
||||||
|
new_filename: str
|
||||||
|
) -> tuple[bool, Optional[str]]:
|
||||||
|
"""Check if a new filename would conflict with existing files.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
source_path: Current file path
|
||||||
|
new_filename: Proposed new filename
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Tuple of (has_conflict, conflict_message)
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> has_conflict, msg = service.check_name_conflict(
|
||||||
|
... Path("/media/old.mkv"),
|
||||||
|
... "new.mkv"
|
||||||
|
... )
|
||||||
|
>>> if has_conflict:
|
||||||
|
... print(msg)
|
||||||
|
"""
|
||||||
|
# Build the new path
|
||||||
|
new_path = source_path.parent / new_filename
|
||||||
|
|
||||||
|
# Check if it's the same file (case-insensitive on some systems)
|
||||||
|
if source_path.resolve() == new_path.resolve():
|
||||||
|
return False, None
|
||||||
|
|
||||||
|
# Check if target already exists
|
||||||
|
if new_path.exists():
|
||||||
|
return True, f"File already exists: {new_filename}"
|
||||||
|
|
||||||
|
return False, None
|
||||||
|
|
||||||
|
def rename_file(
|
||||||
|
self,
|
||||||
|
source_path: Path,
|
||||||
|
new_filename: str,
|
||||||
|
dry_run: bool = False
|
||||||
|
) -> tuple[bool, str]:
|
||||||
|
"""Rename a file to a new filename.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
source_path: Current file path
|
||||||
|
new_filename: New filename (without path)
|
||||||
|
dry_run: If True, validate but don't actually rename
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Tuple of (success, message). Message contains error or success info.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> success, msg = service.rename_file(
|
||||||
|
... Path("/media/old.mkv"),
|
||||||
|
... "new.mkv"
|
||||||
|
... )
|
||||||
|
>>> print(msg)
|
||||||
|
"""
|
||||||
|
# Validate source file exists
|
||||||
|
if not source_path.exists():
|
||||||
|
error_msg = f"Source file does not exist: {source_path}"
|
||||||
|
logger.error(error_msg)
|
||||||
|
return False, error_msg
|
||||||
|
|
||||||
|
if not source_path.is_file():
|
||||||
|
error_msg = f"Source is not a file: {source_path}"
|
||||||
|
logger.error(error_msg)
|
||||||
|
return False, error_msg
|
||||||
|
|
||||||
|
# Sanitize the new filename
|
||||||
|
sanitized_filename = self.sanitize_filename(new_filename)
|
||||||
|
|
||||||
|
# Validate the new filename
|
||||||
|
is_valid, error = self.validate_filename(sanitized_filename)
|
||||||
|
if not is_valid:
|
||||||
|
logger.error(f"Invalid filename: {error}")
|
||||||
|
return False, error
|
||||||
|
|
||||||
|
# Check for conflicts
|
||||||
|
has_conflict, conflict_msg = self.check_name_conflict(source_path, sanitized_filename)
|
||||||
|
if has_conflict:
|
||||||
|
logger.warning(f"Name conflict: {conflict_msg}")
|
||||||
|
return False, conflict_msg
|
||||||
|
|
||||||
|
# Build the new path
|
||||||
|
new_path = source_path.parent / sanitized_filename
|
||||||
|
|
||||||
|
# Dry run mode - don't actually rename
|
||||||
|
if dry_run:
|
||||||
|
success_msg = f"Would rename: {source_path.name} → {sanitized_filename}"
|
||||||
|
logger.info(success_msg)
|
||||||
|
return True, success_msg
|
||||||
|
|
||||||
|
# Perform the rename
|
||||||
|
try:
|
||||||
|
source_path.rename(new_path)
|
||||||
|
success_msg = f"Renamed: {source_path.name} → {sanitized_filename}"
|
||||||
|
logger.info(success_msg)
|
||||||
|
return True, success_msg
|
||||||
|
|
||||||
|
except PermissionError as e:
|
||||||
|
error_msg = f"Permission denied: {e}"
|
||||||
|
logger.error(error_msg)
|
||||||
|
return False, error_msg
|
||||||
|
|
||||||
|
except OSError as e:
|
||||||
|
error_msg = f"OS error during rename: {e}"
|
||||||
|
logger.error(error_msg)
|
||||||
|
return False, error_msg
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
error_msg = f"Unexpected error during rename: {e}"
|
||||||
|
logger.error(error_msg)
|
||||||
|
return False, error_msg
|
||||||
|
|
||||||
|
def rename_with_callback(
|
||||||
|
self,
|
||||||
|
source_path: Path,
|
||||||
|
new_filename: str,
|
||||||
|
success_callback: Optional[Callable[[Path], None]] = None,
|
||||||
|
error_callback: Optional[Callable[[str], None]] = None,
|
||||||
|
dry_run: bool = False
|
||||||
|
):
|
||||||
|
"""Rename a file with callbacks for success/error.
|
||||||
|
|
||||||
|
Convenience method that performs the rename and calls appropriate callbacks.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
source_path: Current file path
|
||||||
|
new_filename: New filename (without path)
|
||||||
|
success_callback: Called with new_path on success
|
||||||
|
error_callback: Called with error_message on failure
|
||||||
|
dry_run: If True, validate but don't actually rename
|
||||||
|
|
||||||
|
Example:
|
||||||
|
def on_success(new_path):
|
||||||
|
print(f"File renamed to: {new_path}")
|
||||||
|
update_tree_node(new_path)
|
||||||
|
|
||||||
|
def on_error(error):
|
||||||
|
show_error_dialog(error)
|
||||||
|
|
||||||
|
service.rename_with_callback(
|
||||||
|
path, new_name,
|
||||||
|
success_callback=on_success,
|
||||||
|
error_callback=on_error
|
||||||
|
)
|
||||||
|
"""
|
||||||
|
success, message = self.rename_file(source_path, new_filename, dry_run)
|
||||||
|
|
||||||
|
if success:
|
||||||
|
if success_callback:
|
||||||
|
new_path = source_path.parent / self.sanitize_filename(new_filename)
|
||||||
|
success_callback(new_path)
|
||||||
|
else:
|
||||||
|
if error_callback:
|
||||||
|
error_callback(message)
|
||||||
|
|
||||||
|
def _strip_markup(self, text: str) -> str:
|
||||||
|
"""Strip Textual markup tags from text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text with markup tags
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Plain text without markup
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> service._strip_markup('[bold]text[/bold]')
|
||||||
|
'text'
|
||||||
|
"""
|
||||||
|
# Remove all markup tags like [bold], [/bold], [green], etc.
|
||||||
|
return re.sub(r'\[/?[^\]]+\]', '', text)
|
||||||
21
renamer/utils/__init__.py
Normal file
21
renamer/utils/__init__.py
Normal file
@@ -0,0 +1,21 @@
|
|||||||
|
"""Utils package - shared utility functions for the Renamer application.
|
||||||
|
|
||||||
|
This package contains utility modules that provide common functionality
|
||||||
|
used across multiple parts of the application. This eliminates code
|
||||||
|
duplication and provides a single source of truth for shared logic.
|
||||||
|
|
||||||
|
Modules:
|
||||||
|
- language_utils: Language code extraction and conversion
|
||||||
|
- pattern_utils: Regex pattern matching and extraction
|
||||||
|
- frame_utils: Frame class/aspect ratio matching
|
||||||
|
"""
|
||||||
|
|
||||||
|
from .language_utils import LanguageCodeExtractor
|
||||||
|
from .pattern_utils import PatternExtractor
|
||||||
|
from .frame_utils import FrameClassMatcher
|
||||||
|
|
||||||
|
__all__ = [
|
||||||
|
'LanguageCodeExtractor',
|
||||||
|
'PatternExtractor',
|
||||||
|
'FrameClassMatcher',
|
||||||
|
]
|
||||||
348
renamer/utils/frame_utils.py
Normal file
348
renamer/utils/frame_utils.py
Normal file
@@ -0,0 +1,348 @@
|
|||||||
|
"""Frame class and aspect ratio matching utilities.
|
||||||
|
|
||||||
|
This module provides centralized logic for determining frame class
|
||||||
|
(resolution classification) based on video dimensions.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from renamer.constants import FRAME_CLASSES
|
||||||
|
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class FrameClassMatcher:
|
||||||
|
"""Shared frame class matching logic.
|
||||||
|
|
||||||
|
This class centralizes the logic for determining frame class
|
||||||
|
(e.g., "1080p", "720p") from video dimensions.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> matcher.match_by_dimensions(1920, 1080, scan_type='p')
|
||||||
|
'1080p'
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Tolerance for matching dimensions (pixels)
|
||||||
|
HEIGHT_TOLERANCE_LARGE = 50 # For initial height matching
|
||||||
|
HEIGHT_TOLERANCE_SMALL = 20 # For closest match
|
||||||
|
WIDTH_TOLERANCE = 5 # For width matching
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
"""Initialize the frame class matcher."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
def match_by_dimensions(
|
||||||
|
self,
|
||||||
|
width: int,
|
||||||
|
height: int,
|
||||||
|
scan_type: str = 'p'
|
||||||
|
) -> Optional[str]:
|
||||||
|
"""Match frame class by width and height dimensions.
|
||||||
|
|
||||||
|
Uses a multi-step matching algorithm:
|
||||||
|
1. Try width-based matching with typical widths
|
||||||
|
2. Fall back to effective height calculation
|
||||||
|
3. Try exact height match
|
||||||
|
4. Find closest standard height
|
||||||
|
5. Return custom frame class if no match
|
||||||
|
|
||||||
|
Args:
|
||||||
|
width: Video width in pixels
|
||||||
|
height: Video height in pixels
|
||||||
|
scan_type: 'p' for progressive, 'i' for interlaced
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Frame class string (e.g., "1080p") or None if invalid input
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> matcher.match_by_dimensions(1920, 1080, 'p')
|
||||||
|
'1080p'
|
||||||
|
>>> matcher.match_by_dimensions(1280, 720, 'p')
|
||||||
|
'720p'
|
||||||
|
"""
|
||||||
|
if not width or not height:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Calculate effective height for aspect ratio consideration
|
||||||
|
aspect_ratio = 16 / 9
|
||||||
|
if height > width:
|
||||||
|
# Portrait mode - unlikely for video but handle it
|
||||||
|
effective_height = height / aspect_ratio
|
||||||
|
else:
|
||||||
|
effective_height = height
|
||||||
|
|
||||||
|
# Step 1: Try to match width to typical widths
|
||||||
|
width_match = self._match_by_width_and_aspect(
|
||||||
|
width, height, scan_type
|
||||||
|
)
|
||||||
|
if width_match:
|
||||||
|
return width_match
|
||||||
|
|
||||||
|
# Step 2: Try exact match with standard frame classes
|
||||||
|
frame_class = f"{int(round(effective_height))}{scan_type}"
|
||||||
|
if frame_class in FRAME_CLASSES:
|
||||||
|
return frame_class
|
||||||
|
|
||||||
|
# Step 3: Find closest standard height match
|
||||||
|
closest_match = self._match_by_closest_height(
|
||||||
|
effective_height, scan_type
|
||||||
|
)
|
||||||
|
if closest_match:
|
||||||
|
return closest_match
|
||||||
|
|
||||||
|
# Step 4: Return custom frame class for non-standard resolutions
|
||||||
|
return frame_class
|
||||||
|
|
||||||
|
def match_by_height(self, height: int) -> Optional[str]:
|
||||||
|
"""Get frame class from video height only.
|
||||||
|
|
||||||
|
Tries exact match first, then finds closest match within tolerance.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
height: Video height in pixels
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Frame class string or None if no match within tolerance
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> matcher.match_by_height(1080)
|
||||||
|
'1080p'
|
||||||
|
>>> matcher.match_by_height(1078) # Close to 1080
|
||||||
|
'1080p'
|
||||||
|
"""
|
||||||
|
if not height:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Try exact match first
|
||||||
|
for frame_class, info in FRAME_CLASSES.items():
|
||||||
|
if height == info['nominal_height']:
|
||||||
|
return frame_class
|
||||||
|
|
||||||
|
# Find closest match
|
||||||
|
closest = None
|
||||||
|
min_diff = float('inf')
|
||||||
|
|
||||||
|
for frame_class, info in FRAME_CLASSES.items():
|
||||||
|
diff = abs(height - info['nominal_height'])
|
||||||
|
if diff < min_diff:
|
||||||
|
min_diff = diff
|
||||||
|
closest = frame_class
|
||||||
|
|
||||||
|
# Only return if difference is within tolerance
|
||||||
|
if min_diff <= self.HEIGHT_TOLERANCE_LARGE:
|
||||||
|
return closest
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def _match_by_width_and_aspect(
|
||||||
|
self,
|
||||||
|
width: int,
|
||||||
|
height: int,
|
||||||
|
scan_type: str
|
||||||
|
) -> Optional[str]:
|
||||||
|
"""Match frame class by width and aspect ratio.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
width: Video width in pixels
|
||||||
|
height: Video height in pixels
|
||||||
|
scan_type: 'p' or 'i'
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Frame class string or None if no match
|
||||||
|
"""
|
||||||
|
width_matches = []
|
||||||
|
|
||||||
|
for frame_class, info in FRAME_CLASSES.items():
|
||||||
|
# Only consider frame classes with matching scan type
|
||||||
|
if not frame_class.endswith(scan_type):
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Check if width matches any typical width for this frame class
|
||||||
|
for typical_width in info['typical_widths']:
|
||||||
|
if abs(width - typical_width) <= self.WIDTH_TOLERANCE:
|
||||||
|
# Calculate height difference for this match
|
||||||
|
height_diff = abs(height - info['nominal_height'])
|
||||||
|
width_matches.append((frame_class, height_diff))
|
||||||
|
|
||||||
|
if width_matches:
|
||||||
|
# Choose the frame class with smallest height difference
|
||||||
|
width_matches.sort(key=lambda x: x[1])
|
||||||
|
return width_matches[0][0]
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def _match_by_closest_height(
|
||||||
|
self,
|
||||||
|
height: float,
|
||||||
|
scan_type: str
|
||||||
|
) -> Optional[str]:
|
||||||
|
"""Find closest standard frame class by height.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
height: Effective video height in pixels (can be float)
|
||||||
|
scan_type: 'p' or 'i'
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Frame class string or None if no match within tolerance
|
||||||
|
"""
|
||||||
|
closest_class = None
|
||||||
|
min_diff = float('inf')
|
||||||
|
|
||||||
|
for frame_class, info in FRAME_CLASSES.items():
|
||||||
|
# Only consider frame classes with matching scan type
|
||||||
|
if not frame_class.endswith(scan_type):
|
||||||
|
continue
|
||||||
|
|
||||||
|
diff = abs(height - info['nominal_height'])
|
||||||
|
if diff < min_diff:
|
||||||
|
min_diff = diff
|
||||||
|
closest_class = frame_class
|
||||||
|
|
||||||
|
# Only return if within tolerance
|
||||||
|
if closest_class and min_diff <= self.HEIGHT_TOLERANCE_SMALL:
|
||||||
|
return closest_class
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def get_nominal_height(self, frame_class: str) -> Optional[int]:
|
||||||
|
"""Get the nominal height for a frame class.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
frame_class: Frame class string (e.g., "1080p")
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Nominal height in pixels or None if not found
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> matcher.get_nominal_height("1080p")
|
||||||
|
1080
|
||||||
|
"""
|
||||||
|
if frame_class in FRAME_CLASSES:
|
||||||
|
return FRAME_CLASSES[frame_class]['nominal_height']
|
||||||
|
return None
|
||||||
|
|
||||||
|
def get_typical_widths(self, frame_class: str) -> list[int]:
|
||||||
|
"""Get typical widths for a frame class.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
frame_class: Frame class string (e.g., "1080p")
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of typical widths in pixels
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> matcher.get_typical_widths("1080p")
|
||||||
|
[1920, 1440, 1280]
|
||||||
|
"""
|
||||||
|
if frame_class in FRAME_CLASSES:
|
||||||
|
return FRAME_CLASSES[frame_class]['typical_widths']
|
||||||
|
return []
|
||||||
|
|
||||||
|
def is_standard_resolution(self, width: int, height: int) -> bool:
|
||||||
|
"""Check if dimensions match a standard resolution.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
width: Video width in pixels
|
||||||
|
height: Video height in pixels
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if dimensions are close to a standard resolution
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> matcher.is_standard_resolution(1920, 1080)
|
||||||
|
True
|
||||||
|
>>> matcher.is_standard_resolution(1234, 567)
|
||||||
|
False
|
||||||
|
"""
|
||||||
|
# Try to match with either scan type
|
||||||
|
match_p = self.match_by_dimensions(width, height, 'p')
|
||||||
|
match_i = self.match_by_dimensions(width, height, 'i')
|
||||||
|
|
||||||
|
# If we got a match that exists in FRAME_CLASSES, it's standard
|
||||||
|
if match_p and match_p in FRAME_CLASSES:
|
||||||
|
return True
|
||||||
|
if match_i and match_i in FRAME_CLASSES:
|
||||||
|
return True
|
||||||
|
|
||||||
|
return False
|
||||||
|
|
||||||
|
def detect_scan_type(self, interlaced: Optional[str]) -> str:
|
||||||
|
"""Detect scan type from interlaced flag.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
interlaced: Interlaced flag (e.g., "Yes", "No", None)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
'i' for interlaced, 'p' for progressive
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> matcher.detect_scan_type("Yes")
|
||||||
|
'i'
|
||||||
|
>>> matcher.detect_scan_type("No")
|
||||||
|
'p'
|
||||||
|
"""
|
||||||
|
if interlaced and str(interlaced).lower() in ['yes', 'true', '1']:
|
||||||
|
return 'i'
|
||||||
|
return 'p'
|
||||||
|
|
||||||
|
def calculate_aspect_ratio(self, width: int, height: int) -> Optional[float]:
|
||||||
|
"""Calculate aspect ratio from dimensions.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
width: Video width in pixels
|
||||||
|
height: Video height in pixels
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Aspect ratio as float (e.g., 1.777 for 16:9) or None if invalid
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> ratio = matcher.calculate_aspect_ratio(1920, 1080)
|
||||||
|
>>> round(ratio, 2)
|
||||||
|
1.78
|
||||||
|
"""
|
||||||
|
if not width or not height or height == 0:
|
||||||
|
return None
|
||||||
|
return width / height
|
||||||
|
|
||||||
|
def format_aspect_ratio(self, ratio: float) -> str:
|
||||||
|
"""Format aspect ratio as a string.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
ratio: Aspect ratio as float
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Formatted string (e.g., "16:9", "21:9")
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> matcher = FrameClassMatcher()
|
||||||
|
>>> matcher.format_aspect_ratio(1.777)
|
||||||
|
'16:9'
|
||||||
|
>>> matcher.format_aspect_ratio(2.35)
|
||||||
|
'21:9'
|
||||||
|
"""
|
||||||
|
# Common aspect ratios
|
||||||
|
common_ratios = {
|
||||||
|
1.33: "4:3",
|
||||||
|
1.78: "16:9",
|
||||||
|
1.85: "1.85:1",
|
||||||
|
2.35: "21:9",
|
||||||
|
2.39: "2.39:1",
|
||||||
|
}
|
||||||
|
|
||||||
|
# Find closest match
|
||||||
|
closest = min(common_ratios.keys(), key=lambda x: abs(x - ratio))
|
||||||
|
if abs(closest - ratio) < 0.05: # Within 5% tolerance
|
||||||
|
return common_ratios[closest]
|
||||||
|
|
||||||
|
# Return as decimal if no match
|
||||||
|
return f"{ratio:.2f}:1"
|
||||||
332
renamer/utils/language_utils.py
Normal file
332
renamer/utils/language_utils.py
Normal file
@@ -0,0 +1,332 @@
|
|||||||
|
"""Language code extraction and conversion utilities.
|
||||||
|
|
||||||
|
This module provides centralized logic for extracting and converting language codes
|
||||||
|
from filenames and metadata. This eliminates the ~150+ lines of duplicated code
|
||||||
|
between FilenameExtractor and MediaInfoExtractor.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
import re
|
||||||
|
from typing import Optional
|
||||||
|
import langcodes
|
||||||
|
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class LanguageCodeExtractor:
|
||||||
|
"""Shared language code extraction logic.
|
||||||
|
|
||||||
|
This class centralizes all language code detection and conversion logic,
|
||||||
|
eliminating duplication across multiple extractors.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = LanguageCodeExtractor()
|
||||||
|
>>> langs = extractor.extract_from_brackets("[2xUKR_ENG]")
|
||||||
|
>>> print(langs) # ['ukr', 'ukr', 'eng']
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Comprehensive set of known ISO 639-1/639-2/639-3 language codes
|
||||||
|
KNOWN_CODES = {
|
||||||
|
# Most common codes
|
||||||
|
'eng', 'ukr', 'rus', 'fra', 'deu', 'spa', 'ita', 'por', 'nor', 'swe',
|
||||||
|
'dan', 'fin', 'pol', 'cze', 'hun', 'tur', 'ara', 'heb', 'hin', 'jpn',
|
||||||
|
'kor', 'chi', 'tha', 'vie', 'und',
|
||||||
|
|
||||||
|
# European languages
|
||||||
|
'dut', 'nld', 'bel', 'bul', 'hrv', 'ces', 'est', 'ell', 'ind',
|
||||||
|
'lav', 'lit', 'mkd', 'ron', 'slk', 'slv', 'srp', 'zho',
|
||||||
|
|
||||||
|
# South Asian languages
|
||||||
|
'arb', 'ben', 'mar', 'tam', 'tel', 'urd', 'guj', 'kan', 'mal', 'ori',
|
||||||
|
'pan', 'asm', 'mai', 'bho', 'nep', 'sin', 'san', 'tib', 'mon',
|
||||||
|
|
||||||
|
# Central Asian languages
|
||||||
|
'kaz', 'uzb', 'kir', 'tuk', 'aze', 'kat', 'hye', 'geo',
|
||||||
|
|
||||||
|
# Balkan languages
|
||||||
|
'sqi', 'bos', 'alb', 'mol',
|
||||||
|
|
||||||
|
# Nordic languages
|
||||||
|
'isl', 'fao',
|
||||||
|
|
||||||
|
# Other Asian languages
|
||||||
|
'per', 'kur', 'pus', 'div', 'lao', 'khm', 'mya', 'msa',
|
||||||
|
'yue', 'wuu', 'nan', 'hak', 'gan', 'hsn',
|
||||||
|
|
||||||
|
# Various other codes
|
||||||
|
'awa', 'mag',
|
||||||
|
}
|
||||||
|
|
||||||
|
# Language codes that are allowed in title case (to avoid false positives)
|
||||||
|
ALLOWED_TITLE_CASE = {
|
||||||
|
'ukr', 'nor', 'eng', 'rus', 'fra', 'deu', 'spa', 'ita', 'por', 'swe',
|
||||||
|
'dan', 'fin', 'pol', 'cze', 'hun', 'tur', 'ara', 'heb', 'hin', 'jpn',
|
||||||
|
'kor', 'chi', 'tha', 'vie', 'und'
|
||||||
|
}
|
||||||
|
|
||||||
|
# Words to skip (common English words, file extensions, quality indicators)
|
||||||
|
SKIP_WORDS = {
|
||||||
|
# Common English words
|
||||||
|
'the', 'and', 'for', 'are', 'but', 'not', 'you', 'all', 'can', 'had',
|
||||||
|
'her', 'was', 'one', 'our', 'out', 'day', 'get', 'has', 'him', 'his',
|
||||||
|
'how', 'its', 'may', 'new', 'now', 'old', 'see', 'two', 'way', 'who',
|
||||||
|
'boy', 'did', 'let', 'put', 'say', 'she', 'too', 'use',
|
||||||
|
|
||||||
|
# File extensions
|
||||||
|
'avi', 'mkv', 'mp4', 'mpg', 'mov', 'wmv', 'flv', 'webm', 'm4v',
|
||||||
|
'm2ts', 'ts', 'vob', 'iso', 'img',
|
||||||
|
|
||||||
|
# Quality/resolution indicators
|
||||||
|
'sd', 'hd', 'lq', 'qhd', 'uhd', 'p', 'i', 'hdr', 'sdr', '4k', '8k',
|
||||||
|
'2160p', '1080p', '720p', '480p', '360p', '240p', '144p',
|
||||||
|
|
||||||
|
# Source/encoding indicators
|
||||||
|
'web', 'dl', 'rip', 'bluray', 'dvd', 'hdtv', 'bdrip', 'dvdrip',
|
||||||
|
'xvid', 'divx', 'h264', 'h265', 'x264', 'x265', 'hevc', 'avc',
|
||||||
|
|
||||||
|
# Audio codecs
|
||||||
|
'ma', 'atmos', 'dts', 'aac', 'ac3', 'mp3', 'flac', 'wav', 'wma',
|
||||||
|
'ogg', 'opus',
|
||||||
|
|
||||||
|
# Subtitle indicator
|
||||||
|
'sub', 'subs', 'subtitle',
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
"""Initialize the language code extractor."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
def extract_from_brackets(self, text: str) -> list[str]:
|
||||||
|
"""Extract language codes from bracketed content.
|
||||||
|
|
||||||
|
Handles patterns like:
|
||||||
|
- [UKR_ENG] → ['ukr', 'eng']
|
||||||
|
- [2xUKR_ENG] → ['ukr', 'ukr', 'eng']
|
||||||
|
- [4xUKR,ENG] → ['ukr', 'ukr', 'ukr', 'ukr', 'eng']
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text containing bracketed language codes
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of ISO 639-3 language codes (3-letter)
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = LanguageCodeExtractor()
|
||||||
|
>>> extractor.extract_from_brackets("[2xUKR_ENG]")
|
||||||
|
['ukr', 'ukr', 'eng']
|
||||||
|
"""
|
||||||
|
langs = []
|
||||||
|
|
||||||
|
# Find all bracketed content
|
||||||
|
bracket_pattern = r'\[([^\]]+)\]'
|
||||||
|
brackets = re.findall(bracket_pattern, text)
|
||||||
|
|
||||||
|
for bracket in brackets:
|
||||||
|
bracket_lower = bracket.lower()
|
||||||
|
|
||||||
|
# Skip brackets containing movie database patterns
|
||||||
|
if any(db in bracket_lower for db in ['imdb', 'tmdb', 'tvdb']):
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Parse items separated by commas or underscores
|
||||||
|
items = re.split(r'[,_]', bracket)
|
||||||
|
items = [item.strip() for item in items]
|
||||||
|
|
||||||
|
for item in items:
|
||||||
|
# Skip empty items or too short
|
||||||
|
if not item or len(item) < 2:
|
||||||
|
continue
|
||||||
|
|
||||||
|
item_lower = item.lower()
|
||||||
|
|
||||||
|
# Skip subtitle indicators
|
||||||
|
if item_lower in self.SKIP_WORDS:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Pattern: optional number + optional 'x' + language code
|
||||||
|
lang_match = re.search(r'(?:(\d+)x?)?([a-z]{2,3})$', item_lower)
|
||||||
|
if lang_match:
|
||||||
|
count = int(lang_match.group(1)) if lang_match.group(1) else 1
|
||||||
|
lang_code = lang_match.group(2)
|
||||||
|
|
||||||
|
# Skip quality/resolution indicators
|
||||||
|
if lang_code in self.SKIP_WORDS:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Validate prefix (only digits and 'x' allowed)
|
||||||
|
prefix = item_lower[:-len(lang_code)]
|
||||||
|
if not re.match(r'^(?:\d+x?)?$', prefix):
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Convert to ISO 639-3 code
|
||||||
|
iso3_code = self._convert_to_iso3(lang_code)
|
||||||
|
if iso3_code:
|
||||||
|
langs.extend([iso3_code] * count)
|
||||||
|
|
||||||
|
return langs
|
||||||
|
|
||||||
|
def extract_standalone(self, text: str) -> list[str]:
|
||||||
|
"""Extract standalone language codes from text.
|
||||||
|
|
||||||
|
Looks for language codes outside of brackets in various formats:
|
||||||
|
- Uppercase: ENG, UKR, NOR
|
||||||
|
- Title case: Ukr, Nor, Eng
|
||||||
|
- Lowercase: ukr, nor, eng
|
||||||
|
- Dot-separated: .ukr. .eng.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to extract language codes from
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of ISO 639-3 language codes (3-letter)
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = LanguageCodeExtractor()
|
||||||
|
>>> extractor.extract_standalone("Movie.2024.UKR.ENG.1080p.mkv")
|
||||||
|
['ukr', 'eng']
|
||||||
|
"""
|
||||||
|
langs = []
|
||||||
|
|
||||||
|
# Remove bracketed content first
|
||||||
|
text_without_brackets = re.sub(r'\[([^\]]+)\]', '', text)
|
||||||
|
|
||||||
|
# Split on dots, spaces, and underscores
|
||||||
|
parts = re.split(r'[.\s_]+', text_without_brackets)
|
||||||
|
|
||||||
|
for part in parts:
|
||||||
|
part = part.strip()
|
||||||
|
if not part or len(part) < 2:
|
||||||
|
continue
|
||||||
|
|
||||||
|
part_lower = part.lower()
|
||||||
|
|
||||||
|
# Check if this is a 2-3 letter code
|
||||||
|
if re.match(r'^[a-zA-Z]{2,3}$', part):
|
||||||
|
# Skip title case 2-letter words to avoid false positives
|
||||||
|
if part.istitle() and len(part) == 2:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# For title case, only allow known language codes
|
||||||
|
if part.istitle() and part_lower not in self.ALLOWED_TITLE_CASE:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Skip common words and non-language codes
|
||||||
|
if part_lower in self.SKIP_WORDS:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Check if it's a known language code
|
||||||
|
if part_lower in self.KNOWN_CODES:
|
||||||
|
iso3_code = self._convert_to_iso3(part_lower)
|
||||||
|
if iso3_code:
|
||||||
|
langs.append(iso3_code)
|
||||||
|
|
||||||
|
return langs
|
||||||
|
|
||||||
|
def extract_all(self, text: str) -> list[str]:
|
||||||
|
"""Extract all language codes from text (both bracketed and standalone).
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to extract language codes from
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of ISO 639-3 language codes (3-letter), duplicates removed
|
||||||
|
while preserving order
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = LanguageCodeExtractor()
|
||||||
|
>>> extractor.extract_all("Movie [UKR_ENG] 2024.rus.mkv")
|
||||||
|
['ukr', 'eng', 'rus']
|
||||||
|
"""
|
||||||
|
# Extract from both sources
|
||||||
|
bracketed = self.extract_from_brackets(text)
|
||||||
|
standalone = self.extract_standalone(text)
|
||||||
|
|
||||||
|
# Combine while removing duplicates but preserving order
|
||||||
|
seen = set()
|
||||||
|
result = []
|
||||||
|
|
||||||
|
for lang in bracketed + standalone:
|
||||||
|
if lang not in seen:
|
||||||
|
seen.add(lang)
|
||||||
|
result.append(lang)
|
||||||
|
|
||||||
|
return result
|
||||||
|
|
||||||
|
def format_lang_counts(self, langs: list[str]) -> str:
|
||||||
|
"""Format language list with counts like MediaInfo.
|
||||||
|
|
||||||
|
Formats like: "2ukr,eng" for 2 Ukrainian tracks and 1 English track.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
langs: List of language codes (can have duplicates)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Formatted string with counts
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = LanguageCodeExtractor()
|
||||||
|
>>> extractor.format_lang_counts(['ukr', 'ukr', 'eng'])
|
||||||
|
'2ukr,eng'
|
||||||
|
"""
|
||||||
|
if not langs:
|
||||||
|
return ''
|
||||||
|
|
||||||
|
# Count occurrences while preserving order of first appearance
|
||||||
|
lang_counts = {}
|
||||||
|
lang_order = []
|
||||||
|
|
||||||
|
for lang in langs:
|
||||||
|
if lang not in lang_counts:
|
||||||
|
lang_counts[lang] = 0
|
||||||
|
lang_order.append(lang)
|
||||||
|
lang_counts[lang] += 1
|
||||||
|
|
||||||
|
# Format with counts
|
||||||
|
formatted = []
|
||||||
|
for lang in lang_order:
|
||||||
|
count = lang_counts[lang]
|
||||||
|
formatted.append(f"{count}{lang}" if count > 1 else lang)
|
||||||
|
|
||||||
|
return ','.join(formatted)
|
||||||
|
|
||||||
|
def _convert_to_iso3(self, lang_code: str) -> Optional[str]:
|
||||||
|
"""Convert a language code to ISO 639-3 (3-letter code).
|
||||||
|
|
||||||
|
Args:
|
||||||
|
lang_code: 2 or 3 letter language code
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
ISO 639-3 code or None if invalid
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = LanguageCodeExtractor()
|
||||||
|
>>> extractor._convert_to_iso3('en')
|
||||||
|
'eng'
|
||||||
|
>>> extractor._convert_to_iso3('ukr')
|
||||||
|
'ukr'
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
lang_obj = langcodes.Language.get(lang_code)
|
||||||
|
return lang_obj.to_alpha3()
|
||||||
|
except (LookupError, ValueError, AttributeError) as e:
|
||||||
|
logger.debug(f"Invalid language code '{lang_code}': {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
def is_valid_code(self, code: str) -> bool:
|
||||||
|
"""Check if a code is a valid language code.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
code: The code to check
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if valid language code
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = LanguageCodeExtractor()
|
||||||
|
>>> extractor.is_valid_code('eng')
|
||||||
|
True
|
||||||
|
>>> extractor.is_valid_code('xyz')
|
||||||
|
False
|
||||||
|
"""
|
||||||
|
return self._convert_to_iso3(code) is not None
|
||||||
350
renamer/utils/pattern_utils.py
Normal file
350
renamer/utils/pattern_utils.py
Normal file
@@ -0,0 +1,350 @@
|
|||||||
|
"""Pattern extraction utilities.
|
||||||
|
|
||||||
|
This module provides centralized regex pattern matching and extraction logic
|
||||||
|
for common patterns found in media filenames.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
import re
|
||||||
|
from typing import Optional, Dict
|
||||||
|
from datetime import datetime
|
||||||
|
|
||||||
|
from renamer.constants import MOVIE_DB_DICT
|
||||||
|
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class PatternExtractor:
|
||||||
|
"""Shared regex pattern extraction logic.
|
||||||
|
|
||||||
|
This class centralizes pattern matching for:
|
||||||
|
- Movie database IDs (TMDB, IMDB, etc.)
|
||||||
|
- Year detection and validation
|
||||||
|
- Quality indicators
|
||||||
|
- Source indicators
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> db_info = extractor.extract_movie_db_ids("[tmdbid-12345]")
|
||||||
|
>>> print(db_info) # {'type': 'tmdb', 'id': '12345'}
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Year validation constants
|
||||||
|
CURRENT_YEAR = datetime.now().year
|
||||||
|
YEAR_FUTURE_BUFFER = 10 # Allow up to 10 years in the future
|
||||||
|
MIN_VALID_YEAR = 1900
|
||||||
|
|
||||||
|
# Common quality indicators
|
||||||
|
QUALITY_PATTERNS = {
|
||||||
|
'2160p', '1080p', '720p', '480p', '360p', '240p', '144p',
|
||||||
|
'4K', '8K', 'SD', 'HD', 'UHD', 'QHD', 'LQ'
|
||||||
|
}
|
||||||
|
|
||||||
|
# Source indicators
|
||||||
|
SOURCE_PATTERNS = {
|
||||||
|
'BluRay', 'BDRip', 'BRRip', 'DVDRip', 'WEB-DL', 'WEBRip',
|
||||||
|
'HDTV', 'PDTV', 'HDRip', 'CAM', 'TS', 'TC', 'R5', 'DVD'
|
||||||
|
}
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
"""Initialize the pattern extractor."""
|
||||||
|
self.max_valid_year = self.CURRENT_YEAR + self.YEAR_FUTURE_BUFFER
|
||||||
|
|
||||||
|
def extract_movie_db_ids(self, text: str) -> Optional[dict[str, str]]:
|
||||||
|
"""Extract movie database IDs from text.
|
||||||
|
|
||||||
|
Supports patterns like:
|
||||||
|
- [tmdbid-123456]
|
||||||
|
- {imdb-tt1234567}
|
||||||
|
- [imdbid-tt123]
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to search for database IDs
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary with 'type' and 'id' keys, or None if not found
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.extract_movie_db_ids("[tmdbid-12345]")
|
||||||
|
{'type': 'tmdb', 'id': '12345'}
|
||||||
|
"""
|
||||||
|
# Match patterns like [tmdbid-123456] or {imdb-tt1234567}
|
||||||
|
pattern = r'[\[\{]([a-zA-Z]+(?:id)?)[-\s]*([a-zA-Z0-9]+)[\]\}]'
|
||||||
|
matches = re.findall(pattern, text)
|
||||||
|
|
||||||
|
if matches:
|
||||||
|
# Take the last match (closest to end of filename)
|
||||||
|
db_type, db_id = matches[-1]
|
||||||
|
|
||||||
|
# Normalize database type
|
||||||
|
db_type_lower = db_type.lower()
|
||||||
|
|
||||||
|
for db_key, db_info in MOVIE_DB_DICT.items():
|
||||||
|
if any(db_type_lower.startswith(pattern.rstrip('-'))
|
||||||
|
for pattern in db_info['patterns']):
|
||||||
|
return {'type': db_key, 'id': db_id}
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def extract_year(self, text: str, validate: bool = True) -> Optional[str]:
|
||||||
|
"""Extract year from text with optional validation.
|
||||||
|
|
||||||
|
Looks for 4-digit years in parentheses or standalone.
|
||||||
|
Validates that the year is within a reasonable range.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to extract year from
|
||||||
|
validate: If True, validate year is within MIN_VALID_YEAR and max_valid_year
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Year as string (e.g., "2024") or None if not found/invalid
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.extract_year("Movie Title (2024)")
|
||||||
|
'2024'
|
||||||
|
>>> extractor.extract_year("Movie (1899)") # Too old
|
||||||
|
None
|
||||||
|
"""
|
||||||
|
# Look for year in parentheses first (most common)
|
||||||
|
year_pattern = r'\((\d{4})\)'
|
||||||
|
match = re.search(year_pattern, text)
|
||||||
|
|
||||||
|
if match:
|
||||||
|
year = match.group(1)
|
||||||
|
if validate:
|
||||||
|
year_int = int(year)
|
||||||
|
if self.MIN_VALID_YEAR <= year_int <= self.max_valid_year:
|
||||||
|
return year
|
||||||
|
else:
|
||||||
|
logger.debug(f"Year {year} outside valid range "
|
||||||
|
f"{self.MIN_VALID_YEAR}-{self.max_valid_year}")
|
||||||
|
return None
|
||||||
|
return year
|
||||||
|
|
||||||
|
# Fall back to standalone 4-digit number
|
||||||
|
standalone_pattern = r'\b(\d{4})\b'
|
||||||
|
matches = re.findall(standalone_pattern, text)
|
||||||
|
|
||||||
|
for potential_year in matches:
|
||||||
|
if validate:
|
||||||
|
year_int = int(potential_year)
|
||||||
|
if self.MIN_VALID_YEAR <= year_int <= self.max_valid_year:
|
||||||
|
return potential_year
|
||||||
|
else:
|
||||||
|
return potential_year
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def find_year_position(self, text: str) -> Optional[int]:
|
||||||
|
"""Find the position of the year in text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to search
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Character index of the year, or None if not found
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.find_year_position("Movie (2024) 1080p")
|
||||||
|
6 # Position of '(' before year
|
||||||
|
"""
|
||||||
|
year_pattern = r'\((\d{4})\)'
|
||||||
|
match = re.search(year_pattern, text)
|
||||||
|
|
||||||
|
if match:
|
||||||
|
year = match.group(1)
|
||||||
|
year_int = int(year)
|
||||||
|
if self.MIN_VALID_YEAR <= year_int <= self.max_valid_year:
|
||||||
|
return match.start()
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def extract_quality(self, text: str) -> Optional[str]:
|
||||||
|
"""Extract quality indicator from text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to search
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Quality string (e.g., "1080p") or None
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.extract_quality("Movie.1080p.BluRay")
|
||||||
|
'1080p'
|
||||||
|
"""
|
||||||
|
text_upper = text.upper()
|
||||||
|
|
||||||
|
for quality in self.QUALITY_PATTERNS:
|
||||||
|
# Case-insensitive search
|
||||||
|
pattern = r'\b' + re.escape(quality) + r'\b'
|
||||||
|
if re.search(pattern, text_upper, re.IGNORECASE):
|
||||||
|
return quality
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def find_quality_position(self, text: str) -> Optional[int]:
|
||||||
|
"""Find the position of quality indicator in text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to search
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Character index of quality indicator, or None if not found
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.find_quality_position("Movie 1080p BluRay")
|
||||||
|
6
|
||||||
|
"""
|
||||||
|
for quality in self.QUALITY_PATTERNS:
|
||||||
|
pattern = r'\b' + re.escape(quality) + r'\b'
|
||||||
|
match = re.search(pattern, text, re.IGNORECASE)
|
||||||
|
if match:
|
||||||
|
return match.start()
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def extract_source(self, text: str) -> Optional[str]:
|
||||||
|
"""Extract source indicator from text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to search
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Source string (e.g., "BluRay") or None
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.extract_source("Movie.BluRay.1080p")
|
||||||
|
'BluRay'
|
||||||
|
"""
|
||||||
|
for source in self.SOURCE_PATTERNS:
|
||||||
|
pattern = r'\b' + re.escape(source) + r'\b'
|
||||||
|
if re.search(pattern, text, re.IGNORECASE):
|
||||||
|
return source
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def find_source_position(self, text: str) -> Optional[int]:
|
||||||
|
"""Find the position of source indicator in text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to search
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Character index of source indicator, or None if not found
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.find_source_position("Movie BluRay 1080p")
|
||||||
|
6
|
||||||
|
"""
|
||||||
|
for source in self.SOURCE_PATTERNS:
|
||||||
|
pattern = r'\b' + re.escape(source) + r'\b'
|
||||||
|
match = re.search(pattern, text, re.IGNORECASE)
|
||||||
|
if match:
|
||||||
|
return match.start()
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
def extract_bracketed_content(self, text: str) -> list[str]:
|
||||||
|
"""Extract all content from square brackets.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to search
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of strings found in brackets
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.extract_bracketed_content("[UKR] Movie [ENG]")
|
||||||
|
['UKR', 'ENG']
|
||||||
|
"""
|
||||||
|
bracket_pattern = r'\[([^\]]+)\]'
|
||||||
|
return re.findall(bracket_pattern, text)
|
||||||
|
|
||||||
|
def remove_bracketed_content(self, text: str) -> str:
|
||||||
|
"""Remove all bracketed content from text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to clean
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Text with brackets and their content removed
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.remove_bracketed_content("[UKR] Movie [ENG]")
|
||||||
|
' Movie '
|
||||||
|
"""
|
||||||
|
return re.sub(r'\[([^\]]+)\]', '', text)
|
||||||
|
|
||||||
|
def split_on_delimiters(self, text: str) -> list[str]:
|
||||||
|
"""Split text on common delimiters (dots, spaces, underscores).
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to split
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of parts
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.split_on_delimiters("Movie.Title.2024")
|
||||||
|
['Movie', 'Title', '2024']
|
||||||
|
"""
|
||||||
|
return re.split(r'[.\s_]+', text)
|
||||||
|
|
||||||
|
def sanitize_for_regex(self, text: str) -> str:
|
||||||
|
"""Escape special regex characters in text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to sanitize
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Escaped text safe for use in regex patterns
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.sanitize_for_regex("Movie (2024)")
|
||||||
|
'Movie \\(2024\\)'
|
||||||
|
"""
|
||||||
|
return re.escape(text)
|
||||||
|
|
||||||
|
def is_quality_indicator(self, text: str) -> bool:
|
||||||
|
"""Check if text is a quality indicator.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to check
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if text is a known quality indicator
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.is_quality_indicator("1080p")
|
||||||
|
True
|
||||||
|
"""
|
||||||
|
return text.upper() in self.QUALITY_PATTERNS
|
||||||
|
|
||||||
|
def is_source_indicator(self, text: str) -> bool:
|
||||||
|
"""Check if text is a source indicator.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Text to check
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if text is a known source indicator
|
||||||
|
|
||||||
|
Example:
|
||||||
|
>>> extractor = PatternExtractor()
|
||||||
|
>>> extractor.is_source_indicator("BluRay")
|
||||||
|
True
|
||||||
|
"""
|
||||||
|
return any(source.lower() == text.lower() for source in self.SOURCE_PATTERNS)
|
||||||
Reference in New Issue
Block a user