A fast, memory-efficient Rust tool that converts comic book archives (CBR, CBZ, CBT) to PDF files with parallel processing.
- Multi-format support: CBR (RAR), CBZ (ZIP), and CBT (TAR.GZ) archives
- Intelligent file detection: Uses magic bytes with extension fallback
- Directory grouping: Automatically groups files by directory and processes only the first file (others treated as parts)
- Parallel processing: Converts multiple volumes simultaneously (default: 4 threads, configurable)
- Memory efficient: Minimal memory footprint with streaming image processing
- Quality preservation:
- No lossy re-encoding of JPEG files
- Preserves PNG transparency (alpha channels)
- Direct embedding of original image data
- Automatic PDF naming: Creates PDFs named after the source directory
- macOS compatibility: Filters out `._` and `__MACOSX` metadata files
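The macOS filtering above can be sketched as a small predicate over archive entry paths. This is an illustrative, std-only sketch; the function name is an assumption, not the tool's actual API:

```rust
use std::path::Path;

/// Returns true for macOS metadata entries (`__MACOSX` directories and
/// AppleDouble `._` files) that should be skipped during extraction.
fn is_macos_metadata(entry: &str) -> bool {
    let path = Path::new(entry);
    // `__MACOSX` directories can appear anywhere in the archive path.
    if path.components().any(|c| c.as_os_str() == "__MACOSX") {
        return true;
    }
    // AppleDouble resource-fork files are prefixed with `._`.
    path.file_name()
        .and_then(|n| n.to_str())
        .map_or(false, |n| n.starts_with("._"))
}
```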
- Rust 1.70+ with Cargo
- System dependencies for archive formats:
  - `unrar` library for CBR files
  - `zlib` for ZIP/CBZ files
```sh
git clone <repository-url>
cd cb2pdf
cargo build --release
```
The project includes comprehensive GitHub Actions workflows:
- Multi-platform builds: Linux, Windows, macOS
- Code quality: `cargo fmt`, `cargo clippy`
- Security audit: `cargo audit`
- Code coverage: Using `cargo-tarpaulin`
- Artifact uploads: Build binaries for each platform
- Tagged releases: Triggered on version tags (e.g., `v1.0.0`)
- Cross-platform binaries: Automatic GitHub releases with prebuilt binaries
- Installation instructions: Included in release notes
- Dependabot: Weekly dependency updates
- Security monitoring: Automated vulnerability scanning
```sh
cb2pdf [OPTIONS] <PATTERN>
```
- `-t, --threads <NUM>`: Number of threads to use (1 to number of CPU cores)
- `-h, --help`: Show help information
Process all comic book archives in subdirectories:

```sh
cb2pdf "comics/*/*.cbr"
```

Process specific volume directories:

```sh
cb2pdf "files/Volume*/*.cbr"
```

Use 8 threads for faster processing:

```sh
cb2pdf -t 8 "files/*/*.cbr"
```

Single-threaded processing:

```sh
cb2pdf --threads 1 "files/*/*.cbr"
```
The program will:
- Find all CBR/CBZ/CBT files matching the pattern
- Group them by directory name
- Process only the first file from each directory (treating others as parts)
- Create PDFs named after each directory (e.g., `Volume 50.pdf`)
- Use parallel threads for processing (default: 4 threads or number of CPU cores, whichever is smaller)
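The grouping step above can be sketched with the standard library alone: collect matched paths by parent directory, then keep only the first file (sorted) from each group. The function and return type here are illustrative assumptions, not the tool's actual API:

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

/// Group matched archive paths by parent directory and keep only the first
/// file (in sorted order) from each group; the rest are treated as parts.
fn first_file_per_directory(paths: &[&str]) -> BTreeMap<PathBuf, PathBuf> {
    let mut groups: BTreeMap<PathBuf, Vec<PathBuf>> = BTreeMap::new();
    for p in paths {
        let path = Path::new(p);
        if let Some(dir) = path.parent() {
            groups.entry(dir.to_path_buf()).or_default().push(path.to_path_buf());
        }
    }
    groups
        .into_iter()
        .map(|(dir, mut files)| {
            files.sort(); // deterministic choice of the "first" file
            (dir, files.remove(0))
        })
        .collect()
}
```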
- Magic bytes: Primary detection using file headers
  - RAR: `Rar!`
  - ZIP: `PK`
  - TAR.GZ: `\x1f\x8b`
- Extension fallback: Falls back to file extension if magic bytes fail
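The detection logic described above can be sketched as a header check with an extension fallback. This is a std-only sketch (case-sensitive extensions, for brevity); the enum and function names are assumptions:

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
enum ArchiveFormat { Rar, Zip, TarGz }

/// Detect the archive format from the file header's magic bytes, falling
/// back to the file extension when the header is not recognized.
fn detect_format(header: &[u8], path: &Path) -> Option<ArchiveFormat> {
    match header {
        h if h.starts_with(b"Rar!") => Some(ArchiveFormat::Rar),
        h if h.starts_with(b"PK") => Some(ArchiveFormat::Zip),
        h if h.starts_with(&[0x1f, 0x8b]) => Some(ArchiveFormat::TarGz),
        _ => match path.extension().and_then(|e| e.to_str()) {
            Some("cbr") => Some(ArchiveFormat::Rar),
            Some("cbz") => Some(ArchiveFormat::Zip),
            Some("cbt") => Some(ArchiveFormat::TarGz),
            _ => None,
        },
    }
}
```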
- Uses `ImageReader::into_dimensions()` to get image size without full decoding
- Processes images one at a time rather than loading all into memory
- Direct file data embedding with `RawImage::decode_from_bytes()`
- JPEG files: Uses original file bytes directly (no re-compression)
- PNG files: Preserves transparency and original quality
- Other formats: Converts to high-quality JPEG (95% quality) only when necessary
- Creates pages sized to match image dimensions (96 DPI)
- Uses `XObjectTransform` for proper image scaling and positioning
- Embeds images as XObjects for efficient PDF structure
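The 96 DPI page-sizing rule above amounts to a unit conversion from pixels to physical page dimensions (1 inch = 25.4 mm). A minimal sketch, with a helper name that is an assumption:

```rust
/// Convert pixel dimensions to a PDF page size in millimeters, assuming
/// the 96 DPI mapping described above (page mm = px / 96 * 25.4).
fn page_size_mm(width_px: u32, height_px: u32) -> (f32, f32) {
    const DPI: f32 = 96.0;
    const MM_PER_INCH: f32 = 25.4;
    (
        width_px as f32 / DPI * MM_PER_INCH,
        height_px as f32 / DPI * MM_PER_INCH,
    )
}
```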
- Parallel processing: Configurable threads (1 to number of CPU cores)
- Memory usage: ~50-100MB per thread regardless of archive size
- Speed: Processes typical 20-page volumes in 10-30 seconds
- Output size: Reasonable expansion (2.7MB CBR → 7.2MB PDF is typical)
- Thread safety: Validates thread count and warns about excessive values
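The thread-count validation mentioned above can be sketched as a clamp with a warning. Std-only sketch; the real tool detects cores with `num_cpus`, while here the core count is passed in for illustration:

```rust
/// Clamp a requested thread count to the valid range [1, cores], warning
/// on stderr when the request exceeds the number of CPU cores.
fn validate_threads(requested: usize, cores: usize) -> usize {
    if requested == 0 {
        eprintln!("Thread count must be at least 1; using 1");
        return 1;
    }
    if requested > cores {
        eprintln!(
            "Warning: {} threads requested but only {} CPU cores available; using {}",
            requested, cores, cores
        );
        return cores;
    }
    requested
}
```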
```toml
[dependencies]
glob = "0.3.2"                                        # File pattern matching
image = "0.25.6"                                      # Image processing
printpdf = { version = "0.8.2", features = ["jpeg"] } # PDF generation
tempfile = "3.20.0"                                   # Temporary directories
unrar = "0.5.8"                                       # RAR extraction
zip = "2.2.0"                                         # ZIP extraction
tar = "0.4.41"                                        # TAR extraction
flate2 = "1.0.34"                                     # GZIP decompression
rayon = "1.7.0"                                       # Parallel processing
clap = { version = "4.4.0", features = ["derive"] }   # Command-line argument parsing
num_cpus = "1.16.0"                                   # CPU core detection
```
```
src/
├── main.rs          # Main application logic
│   ├── Archive format detection
│   ├── Multi-format extraction (CBR/CBZ/CBT)
│   ├── Directory grouping and parallel processing
│   └── Memory-efficient PDF creation
└── Cargo.toml       # Dependencies and build config
```
- Graceful failures: Individual volume failures don't stop batch processing
- Detailed logging: Shows progress and warnings for each volume
- Format validation: Validates archive formats before processing
- Image filtering: Skips invalid images and macOS metadata
- RAR dependency: Requires `unrar` library for CBR files
- Memory usage: Still loads full image data during PDF creation (printpdf limitation)
- Single archive per directory: Assumes first file contains all content
- No streaming PDF: PDF must be fully constructed before writing (printpdf limitation)
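The graceful-failure behavior described under error handling can be sketched as a loop that records per-volume results instead of aborting. Std-only sketch; the real tool runs the equivalent loop on a rayon thread pool, and the function name here is an assumption:

```rust
/// Process each volume independently, logging and skipping failures so a
/// single bad archive never stops the batch. Returns (ok, failed) counts.
fn process_all<F>(volumes: &[&str], mut convert: F) -> (usize, usize)
where
    F: FnMut(&str) -> Result<(), String>,
{
    let (mut ok, mut failed) = (0, 0);
    for vol in volumes {
        match convert(vol) {
            Ok(()) => ok += 1,
            Err(e) => {
                // A failing volume is reported and skipped, not fatal.
                eprintln!("Failed to process {}: {}", vol, e);
                failed += 1;
            }
        }
    }
    (ok, failed)
}
```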
```
$ cb2pdf -t 8 "files/*/*.cbr"
Using 8 threads for parallel processing (CPU cores: 32)
Found 44 volumes to process
Volume 28: "files/Volume 28/245.cbr"
Volume 29: "files/Volume 29/254.cbr"
...
Processing Volume 50: "files/Volume 50/464.cbr"
Detected format: Zip
Found 17 images
Creating PDF with 17 images, memory-efficient processing
✅ Successfully processed Volume 50
🎉 All volumes processed!
```
This project is licensed under the MIT License - see the LICENSE file for details.