CB2PDF - Comic Book Archive to PDF Converter

A fast, memory-efficient Rust tool that converts comic book archives (CBR, CBZ, CBT) to PDF files with parallel processing.

Features

Multi-format support: CBR (RAR), CBZ (ZIP), and CBT (TAR.GZ) archives
Intelligent file detection: Uses magic bytes with extension fallback
Directory grouping: Automatically groups files by directory and processes only the first file (others treated as parts)
Parallel processing: Converts several volumes at once, using up to 4 threads by default (configurable with -t/--threads)
Memory efficient: Processes images one at a time instead of loading a whole archive into memory
Quality preservation:
- No lossy re-encoding of JPEG files
- Preserves PNG transparency (alpha channels)
- Direct embedding of original image data
Automatic PDF naming: Creates PDFs named after the source directory
macOS compatibility: Filters out ._ and __MACOSX metadata files

Installation

Prerequisites

Rust 1.70+ with Cargo
System dependencies for archive formats:
- unrar library for CBR files
- zlib for ZIP/CBZ files

Build from source

git clone <repository-url>
cd cb2pdf
cargo build --release

CI/CD

The project includes comprehensive GitHub Actions workflows:

Continuous Integration (`.github/workflows/ci.yml`)

Multi-platform builds: Linux, Windows, macOS
Code quality: cargo fmt, cargo clippy
Security audit: cargo audit
Code coverage: Using cargo-tarpaulin
Artifact uploads: Build binaries for each platform

Release Automation (`.github/workflows/release.yml`)

Tagged releases: Triggered on version tags (e.g., v1.0.0)
Cross-platform binaries: Automatic GitHub releases with binaries
Installation instructions: Included in release notes

Dependency Management

Dependabot: Weekly dependency updates
Security monitoring: Automated vulnerability scanning

Usage

cb2pdf [OPTIONS] <PATTERN>

Options

-t, --threads <NUM>: Number of threads to use, from 1 to the number of CPU cores (default: 4 or the number of CPU cores, whichever is smaller)
-h, --help: Show help information
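
The thread-count rules can be sketched as follows. This is a minimal illustration rather than the tool's actual code: `resolve_threads`, `detected_cores`, and `DEFAULT_MAX_THREADS` are hypothetical names, clamping out-of-range values is an assumption (the tool is only documented to validate the count and warn), and the standard library's `available_parallelism` stands in for the `num_cpus` crate.

```rust
use std::thread;

/// Default cap on worker threads when no `-t` value is given (4 per this README).
const DEFAULT_MAX_THREADS: usize = 4;

/// Resolve the effective thread count from an optional user request.
/// `cpu_cores` is a parameter so the rule can be checked deterministically.
fn resolve_threads(requested: Option<usize>, cpu_cores: usize) -> usize {
    let cores = cpu_cores.max(1);
    match requested {
        // No flag: 4 threads or the core count, whichever is smaller.
        None => DEFAULT_MAX_THREADS.min(cores),
        // Explicit value: clamp into 1..=cores (assumed behaviour).
        Some(n) => n.clamp(1, cores),
    }
}

/// Number of logical cores reported by the OS, falling back to 1 on error.
fn detected_cores() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}
```

Calling `resolve_threads(None, detected_cores())` gives the default described in this README; passing `Some(8)` models `-t 8`.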

Examples

Process all comic book archives in subdirectories:

cb2pdf "comics/*/*.cbr"

Process specific volume directories:

cb2pdf "files/Volume*/*.cbr"

Use 8 threads for faster processing:

cb2pdf -t 8 "files/*/*.cbr"

Single-threaded processing:

cb2pdf --threads 1 "files/*/*.cbr"

The program will:

Find all CBR/CBZ/CBT files matching the pattern
Group them by directory name
Process only the first file from each directory (treating others as parts)
Create PDFs named after each directory (e.g., Volume 50.pdf)
Use parallel threads for processing (default: 4 threads or number of CPU cores, whichever is smaller)
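
The grouping and naming steps above can be sketched with the standard library alone. The function names `first_archive_per_dir` and `pdf_name` are hypothetical, and taking "first" to mean first in sorted path order is an assumption; the tool may rely on the order returned by the glob match instead.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

/// Group archive paths by the name of their parent directory and keep only
/// the first path (in sorted order) for each directory.
fn first_archive_per_dir(paths: &[PathBuf]) -> BTreeMap<String, PathBuf> {
    let mut sorted: Vec<&PathBuf> = paths.iter().collect();
    sorted.sort();

    let mut volumes = BTreeMap::new();
    for path in sorted {
        let dir_name = path
            .parent()
            .and_then(Path::file_name)
            .map(|name| name.to_string_lossy().into_owned())
            .unwrap_or_else(|| String::from("."));
        // `or_insert_with` keeps the first entry seen for each directory,
        // so later files in the same directory are ignored.
        volumes.entry(dir_name).or_insert_with(|| path.clone());
    }
    volumes
}

/// PDF file name derived from the directory name, e.g. "Volume 50.pdf".
fn pdf_name(dir_name: &str) -> String {
    format!("{}.pdf", dir_name)
}
```

A `BTreeMap` keeps the volumes in a stable, sorted order, which makes the resulting list of work items reproducible between runs.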

Architecture

Archive Format Detection

Magic bytes: Primary detection using file headers
- RAR: Rar!
- ZIP: PK
- TAR.GZ: \x1f\x8b
Extension fallback: Falls back to file extension if magic bytes fail
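
A minimal sketch of this detection order, using only the signatures and extensions listed in this README. The type and function names (`ArchiveFormat`, `detect_from_magic`, `detect_from_extension`, `detect_format`) are hypothetical and may not match the names used in `main.rs`.

```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArchiveFormat {
    Rar,
    Zip,
    TarGz,
}

/// Primary detection: compare the first bytes of the file with known signatures.
fn detect_from_magic(header: &[u8]) -> Option<ArchiveFormat> {
    if header.starts_with(b"Rar!") {
        Some(ArchiveFormat::Rar)
    } else if header.starts_with(b"PK") {
        Some(ArchiveFormat::Zip)
    } else if header.starts_with(&[0x1f, 0x8b]) {
        Some(ArchiveFormat::TarGz)
    } else {
        None
    }
}

/// Fallback: map the file extension (case-insensitively) to a format.
fn detect_from_extension(path: &Path) -> Option<ArchiveFormat> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "cbr" => Some(ArchiveFormat::Rar),
        "cbz" => Some(ArchiveFormat::Zip),
        "cbt" => Some(ArchiveFormat::TarGz),
        _ => None,
    }
}

/// Magic bytes win; the extension is consulted only if they are unrecognised.
fn detect_format(path: &Path, header: &[u8]) -> Option<ArchiveFormat> {
    detect_from_magic(header).or_else(|| detect_from_extension(path))
}
```

Checking the header first means a ZIP archive that was renamed to `.cbr` is still opened as ZIP.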

Memory Efficiency

Uses ImageReader::into_dimensions() to get image size without full decoding
Processes images one at a time rather than loading all into memory
Direct file data embedding with RawImage::decode_from_bytes()
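
To illustrate why image dimensions can be read without a full decode, here is a standard-library-only sketch for PNG files. It is not how the `image` crate implements `ImageReader::into_dimensions()`; it only shows the underlying idea that width and height live in a fixed position in the file header. The function name `png_dimensions` is hypothetical.

```rust
/// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0d, 0x0a, 0x1a, 0x0a];

/// Read width and height from the first 24 bytes of a PNG file.
/// Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
/// then big-endian width and height. No pixel data is touched.
fn png_dimensions(header: &[u8]) -> Option<(u32, u32)> {
    if header.len() < 24 || header[..8] != PNG_SIGNATURE || header[12..16] != *b"IHDR" {
        return None;
    }
    let width = u32::from_be_bytes([header[16], header[17], header[18], header[19]]);
    let height = u32::from_be_bytes([header[20], header[21], header[22], header[23]]);
    Some((width, height))
}
```

Only 24 bytes are inspected per image, so the cost is independent of the image's resolution.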

Quality Preservation

JPEG files: Uses original file bytes directly (no re-compression)
PNG files: Preserves transparency and original quality
Other formats: Converts to high-quality JPEG (95% quality) only when necessary
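
The three rules above amount to a small decision function. The sketch below assumes the choice is made from the image file's leading bytes, which this README does not confirm; `EmbedStrategy` and `choose_strategy` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq)]
enum EmbedStrategy {
    /// JPEG: embed the original file bytes, no re-compression.
    JpegPassthrough,
    /// PNG: keep the image lossless, including any alpha channel.
    PngLossless,
    /// Anything else: decode and re-encode as JPEG at the given quality.
    ReencodeJpeg { quality: u8 },
}

/// Pick a strategy from the leading bytes of the image file.
fn choose_strategy(data: &[u8]) -> EmbedStrategy {
    const JPEG_MAGIC: [u8; 3] = [0xff, 0xd8, 0xff];
    const PNG_MAGIC: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0d, 0x0a, 0x1a, 0x0a];

    if data.starts_with(&JPEG_MAGIC) {
        EmbedStrategy::JpegPassthrough
    } else if data.starts_with(&PNG_MAGIC) {
        EmbedStrategy::PngLossless
    } else {
        // 95% quality, as stated in this README.
        EmbedStrategy::ReencodeJpeg { quality: 95 }
    }
}
```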

PDF Creation

Sizes each page to match its image's pixel dimensions, converted at 96 DPI
Uses XObjectTransform for proper image scaling and positioning
Embeds images as XObjects for efficient PDF structure
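
The 96 DPI page sizing is plain unit conversion: 96 pixels make one inch, and one inch is 25.4 mm or 72 PDF points. A minimal sketch, with hypothetical helper names that are not taken from the tool's source:

```rust
const DPI: f64 = 96.0;
const MM_PER_INCH: f64 = 25.4;
const PT_PER_INCH: f64 = 72.0;

/// Convert a pixel length to millimetres at 96 DPI.
fn px_to_mm(px: u32) -> f64 {
    f64::from(px) / DPI * MM_PER_INCH
}

/// Convert a pixel length to PDF points at 96 DPI.
fn px_to_pt(px: u32) -> f64 {
    f64::from(px) / DPI * PT_PER_INCH
}

/// Page size in millimetres for an image of the given pixel dimensions.
fn page_size_mm(width_px: u32, height_px: u32) -> (f64, f64) {
    (px_to_mm(width_px), px_to_mm(height_px))
}
```

For example, a 960 × 1440 pixel scan is 10 × 15 inches at 96 DPI, which is 254 × 381 mm.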

Performance

Parallel processing: Configurable threads (1 to number of CPU cores)
Memory usage: ~50-100MB per thread regardless of archive size
Speed: Processes typical 20-page volumes in 10-30 seconds
Output size: PDFs are typically larger than the source archive (2.7MB CBR → 7.2MB PDF is typical)
Thread count validation: Validates the requested thread count and warns about excessive values

Dependencies

[dependencies]
glob = "0.3.2"                                    # File pattern matching
image = "0.25.6"                                  # Image processing
printpdf = { version = "0.8.2", features = ["jpeg"] } # PDF generation
tempfile = "3.20.0"                               # Temporary directories
unrar = "0.5.8"                                   # RAR extraction
zip = "2.2.0"                                     # ZIP extraction  
tar = "0.4.41"                                    # TAR extraction
flate2 = "1.0.34"                                 # GZIP decompression
rayon = "1.7.0"                                   # Parallel processing
clap = { version = "4.4.0", features = ["derive"] } # Command-line argument parsing
num_cpus = "1.16.0"                               # CPU core detection

File Structure

cb2pdf/
├── src/
│   └── main.rs             # Main application logic
│       ├── Archive format detection
│       ├── Multi-format extraction (CBR/CBZ/CBT)
│       ├── Directory grouping and parallel processing
│       └── Memory-efficient PDF creation
└── Cargo.toml              # Dependencies and build config

Error Handling

Graceful failures: Individual volume failures don't stop batch processing
Detailed logging: Shows progress and warnings for each volume
Format validation: Validates archive formats before processing
Image filtering: Skips invalid images and macOS metadata
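
Two of these behaviours are easy to sketch with the standard library. The names `is_macos_metadata` and `run_batch` are hypothetical, and the loop here is sequential for clarity, whereas the tool processes volumes in parallel.

```rust
use std::path::{Component, Path};

/// True if any path component is a macOS metadata entry:
/// a `__MACOSX` folder or an AppleDouble file starting with `._`.
fn is_macos_metadata(path: &Path) -> bool {
    path.components().any(|component| match component {
        Component::Normal(name) => {
            let name = name.to_string_lossy();
            name == "__MACOSX" || name.starts_with("._")
        }
        _ => false,
    })
}

/// Convert every volume, collecting failures instead of stopping at the first one.
/// Returns `(volume, error message)` pairs for the volumes that failed.
fn run_batch<F>(volumes: &[String], mut convert: F) -> Vec<(String, String)>
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut failures = Vec::new();
    for volume in volumes {
        if let Err(err) = convert(volume.as_str()) {
            // Log and continue: one bad volume should not abort the batch.
            eprintln!("warning: {} failed: {}", volume, err);
            failures.push((volume.clone(), err));
        }
    }
    failures
}
```

Returning the failures, rather than only logging them, lets the caller print a summary or set a non-zero exit code once the batch has finished.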

Limitations

RAR dependency: Requires unrar library for CBR files
Memory usage: Still loads full image data during PDF creation (printpdf limitation)
Single archive per directory: Assumes first file contains all content
No streaming PDF: PDF must be fully constructed before writing (printpdf limitation)

Example Output

$ cb2pdf -t 8 "files/*/*.cbr"
Using 8 threads for parallel processing (CPU cores: 32)
Found 44 volumes to process
  Volume 28: "files/Volume 28/245.cbr"
  Volume 29: "files/Volume 29/254.cbr"
  ...
Processing Volume 50: "files/Volume 50/464.cbr"
  Detected format: Zip
  Found 17 images
Creating PDF with 17 images, memory-efficient processing
✅ Successfully processed Volume 50
🎉 All volumes processed!

License

This project is licensed under the MIT License - see the LICENSE file for details.
