TorchCodec 0.14: HDR Video Decoding for CPU and CUDA, and Fast Wav Decoder
TorchCodec v0.14: High-Performance HDR Video and Rapid Wav Decoding
Released on June 3rd by NicolasHug, the v0.14.0 update to torchcodec introduces significant enhancements to how audio and video are handled, specifically targeting speed and color precision.
🚀 Core Feature Highlights
The latest release focuses on two primary additions: a streamlined audio decoder and expanded capabilities for high-dynamic-range video.
1. Accelerated Wav Decoding
The new WavDecoder is designed for maximum efficiency. Unlike the existing decoders, which rely on FFmpeg, it bypasses FFmpeg completely and reads WAV data directly from the source.
- Versatility: It can process data from files, raw bytes, or file-like objects.
- Format Support: Compatible with various sample formats, including int16, int32, and float32.
Implementation Example:
from torchcodec.decoders import WavDecoder
# Initialize the decoder with a wav file
decoder = WavDecoder("audio.wav")
# Extract all samples (returns AudioSamples object with data and sample_rate)
samples = decoder.get_all_samples()
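To make "reading WAV data directly" concrete, here is a minimal standard-library sketch of parsing a WAV container without FFmpeg. It is not torchcodec's implementation: the helper names `read_wav_pcm16` and `make_test_wav` are hypothetical, and the sketch handles only 16-bit PCM. It accepts a path, raw bytes, or a file-like object, mirroring the input types listed above.

```python
import io
import struct
import wave


def read_wav_pcm16(source):
    """Parse a 16-bit PCM WAV from a path, raw bytes, or a file-like object.

    Hypothetical helper for illustration; not torchcodec's WavDecoder.
    Returns (sample_rate, num_channels, interleaved_samples).
    """
    if isinstance(source, (bytes, bytearray)):
        data = bytes(source)
    elif hasattr(source, "read"):
        data = source.read()
    else:
        with open(source, "rb") as f:
            data = f.read()

    # RIFF header: "RIFF", total size, "WAVE"
    riff, _, wave_id = struct.unpack_from("<4sI4s", data, 0)
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    fmt = None
    offset = 12
    while offset + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, offset)
        body = offset + 8
        if chunk_id == b"fmt ":
            # format tag, channels, sample rate, byte rate, block align, bit depth
            audio_format, channels, sample_rate, _, _, bits = struct.unpack_from(
                "<HHIIHH", data, body
            )
            fmt = (audio_format, channels, sample_rate, bits)
        elif chunk_id == b"data":
            if fmt is None:
                raise ValueError("data chunk appeared before fmt chunk")
            audio_format, channels, sample_rate, bits = fmt
            if audio_format != 1 or bits != 16:
                raise ValueError("this sketch handles 16-bit PCM only")
            count = chunk_size // 2
            samples = struct.unpack_from(f"<{count}h", data, body)
            return sample_rate, channels, list(samples)
        # Chunks are padded to an even number of bytes.
        offset = body + chunk_size + (chunk_size & 1)
    raise ValueError("no data chunk found")


def make_test_wav(samples, sample_rate=8000):
    """Build a small mono 16-bit WAV in memory (hypothetical test helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()
```

Because WAV stores uncompressed samples behind a small header, a direct reader of this kind needs no codec library, which is presumably where the speedup comes from.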
2. HDR Video Support
The VideoDecoder has been upgraded to support High Dynamic Range (HDR) video. This ensures that the rich color detail of HDR content is not lost during the decoding process.
To maintain full precision, users must set the output_dtype to torch.float32. This results in RGB frames mapped to the mathematical range:
Implementation Example:
import torch
from torchcodec.decoders import VideoDecoder
# Load HDR video ensuring float32 precision
decoder = VideoDecoder("hdr_video.mp4", output_dtype=torch.float32)
# Access the first frame with full HDR precision
frame = decoder[0]
⚠️ Note: This HDR functionality is currently in the beta stage. Expect potential behavioral adjustments as the team incorporates user feedback.
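The following sketch shows why a floating-point output matters for precision. HDR content is commonly stored at 10 bits per channel, so squeezing it into 8-bit integers merges neighboring code values. The functions `to_uint8` and `to_float` are hypothetical, and dividing by 1023 is an assumption made for illustration, not a statement about how torchcodec scales its output.

```python
def to_uint8(code10):
    """Quantize a 10-bit code value (0..1023) down to 8 bits (0..255)."""
    return code10 >> 2


def to_float(code10):
    """Map a 10-bit code value to a float (scaling by 1023 is an assumption)."""
    return code10 / 1023.0


codes = range(1024)

# Count how many distinct output levels survive each conversion.
levels_uint8 = len({to_uint8(c) for c in codes})
levels_float = len({to_float(c) for c in codes})

print(levels_uint8, levels_float)  # prints "256 1024"
```

Every group of four adjacent 10-bit values collapses to one 8-bit value, while the floating-point mapping keeps all 1024 levels distinct.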
🛠️ Technical Improvements & Bug Fixes
The update also streamlines the installation process and resolves critical stability issues.
System Changes
| Category | Change | Impact |
|---|---|---|
| Audio | Optimized seeking in AudioDecoder | Significantly faster navigation (#1449) |
| Dependencies | Simplified CUDA installation/usage | NPP is no longer a dependency |
Resolved Issues
- CUDA Stability: Fixed a rare crash that occurred during the process teardown phase of the CUDA decoder (#1441).
- Dimension Handling: Resolved an issue where CUDA decoding failed for videos with odd-numbered dimensions (#1462).
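The root cause of the odd-dimension failure is not described here. As general background only, one reason odd sizes are awkward for video decoders is chroma subsampling: in 4:2:0 formats the chroma planes are half the luma size, and the halved dimensions must round up, not down. The function `yuv420_plane_shapes` below is a hypothetical helper and makes no claim about torchcodec's internals.

```python
def yuv420_plane_shapes(width, height):
    """Return (rows, cols) for each plane of a planar 4:2:0 frame.

    Chroma dimensions use ceiling division so that the last luma
    column and row of an odd-sized frame still have chroma samples.
    """
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return {
        "Y": (height, width),
        "U": (chroma_h, chroma_w),
        "V": (chroma_h, chroma_w),
    }


shapes = yuv420_plane_shapes(641, 361)
print(shapes["U"])  # prints "(181, 321)"
```

Truncating instead (641 // 2 = 320) would leave the final luma column without chroma data, a classic off-by-one in code that assumes even dimensions.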
📊 Summary Checklist & Workflow
Release Progress:
- Implement WavDecoder (FFmpeg-free)
- Enable HDR float32 decoding
- Optimize AudioDecoder seeking
- Remove NPP dependency
- Patch CUDA odd-dimension bug
Decoding Logic Flow: