Quick Start

This page gives minimal end-to-end examples to get a first run working. For a detailed explanation of each step, see the Pipeline Guide; for the full list of CLI options, see the CLI Reference.

Prerequisites

  • A GPU with at least 16 GB VRAM is recommended (see Resource Requirements)
  • Raw FASTA or FASTQ files from your sequencing run
  • A reference genome FASTA for alignment
  • A barcode whitelist TSV for the whitelist-based workflow (not needed for the whitelist-free workflow)
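
The prerequisites above can be checked before starting a long run. The following is a minimal sketch, not part of Tranquillyzer: the function names check_inputs and gpu_memory_mib are hypothetical, and the GPU query assumes an NVIDIA card with nvidia-smi installed.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; these are NOT Tranquillyzer commands.

# Return 0 if every argument is an existing directory or a non-empty file.
# Each problem path is reported on stderr.
check_inputs() {
    local rc=0
    local path
    for path in "$@"; do
        if [ -d "$path" ]; then
            continue    # directories (e.g. the FASTQ directory) only need to exist
        elif [ ! -s "$path" ]; then
            echo "missing or empty: $path" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Print the total memory in MiB of each visible NVIDIA GPU, one per line.
# Assumes nvidia-smi is on PATH; returns 1 otherwise.
gpu_memory_mib() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
    else
        echo "nvidia-smi not found; cannot query GPU memory" >&2
        return 1
    fi
}
```

For example, check_inputs /path/to/fastq_dir /path/to/reference.fa /path/to/whitelist.tsv && gpu_memory_mib reports any missing input, then prints the memory of each GPU. The recommended 16 GB corresponds to 16384 MiB.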

Available Models

Tranquillyzer ships with pre-trained models for common 10x Genomics protocols:

Parameter             10x5p_sc_ont    10x3p_sc_ont
Batch Size            128             128
Training Fraction     0.8             0.8
Vocab Size            5               5
Embedding Dimension   128             128
Conv Layers           4               3
Conv Filters          128             128
Conv Kernel Size      25              25
Dilation Rates        [1, 1, 1, 1]    [1, 3, 5]
LSTM Layers           1               1
LSTM Units            96              96
Bidirectional         True            True
CRF Layer             True            True
Attention Heads       0               0
Dropout Rate          0.35            0.35
Regularization        0.01            0.01
Learning Rate         0.001           0.001
Epochs                5               5

Use tranquillyzer availablemodels to see all installed models and their configurations.

Whitelist-Based Pipeline

This is the most common workflow: you already have a barcode whitelist from your library prep kit.

# 1. Preprocess raw reads into length-binned Parquets
tranquillyzer preprocess \
    --threads 12 \
    /path/to/fastq_dir \
    /path/to/output

# 2. Annotate reads + correct barcodes + demultiplex (single pass)
tranquillyzer annotate-reads \
    --model-name 10x3p_sc_ont_013 \
    --gpu-mem 48 \
    --threads 12 \
    --run-barcode-correction \
    --run-demux \
    --output-fmt fasta \
    /path/to/output \
    /path/to/whitelist.tsv

# 3. Align to reference genome
tranquillyzer align \
    --threads 12 \
    /path/to/output \
    /path/to/reference.fa \
    /path/to/output

# 4. Mark PCR duplicates
tranquillyzer dedup \
    --threads 12 \
    /path/to/output

# 5. Generate QC report
tranquillyzer qc-metrics \
    --threads 4 \
    --bam /path/to/output/aligned_files/dup_marked.bam \
    /path/to/output
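
The five steps above can be wrapped in a single function so that a failed step stops the run. This is a sketch under stated assumptions: the names run, whitelist_pipeline, and DRY_RUN are illustrative and not part of Tranquillyzer, and the flags are copied unchanged from the commands above.

```shell
#!/usr/bin/env bash
# Illustrative wrapper around the five commands shown above.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: whitelist_pipeline FASTQ_DIR OUT_DIR REFERENCE_FA WHITELIST_TSV
# The && chain stops at the first step that exits non-zero.
whitelist_pipeline() {
    local fastq_dir="$1"
    local out_dir="$2"
    local reference="$3"
    local whitelist="$4"

    run tranquillyzer preprocess --threads 12 "$fastq_dir" "$out_dir" &&
    run tranquillyzer annotate-reads \
        --model-name 10x3p_sc_ont_013 --gpu-mem 48 --threads 12 \
        --run-barcode-correction --run-demux --output-fmt fasta \
        "$out_dir" "$whitelist" &&
    run tranquillyzer align --threads 12 "$out_dir" "$reference" "$out_dir" &&
    run tranquillyzer dedup --threads 12 "$out_dir" &&
    run tranquillyzer qc-metrics --threads 4 \
        --bam "$out_dir/aligned_files/dup_marked.bam" "$out_dir"
}
```

Setting DRY_RUN=1 before calling whitelist_pipeline prints the five commands without running them, which is a cheap way to confirm the paths before committing GPU time.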

Whitelist-Free Pipeline

Use this workflow when you do not have a barcode whitelist. Tranquillyzer discovers cell barcodes from the data.

# 1. Preprocess
tranquillyzer preprocess \
    --threads 12 \
    /path/to/fastq_dir \
    /path/to/output

# 2. Annotate reads (no whitelist)
tranquillyzer annotate-reads \
    --model-name 10x3p_sc_ont_013 \
    --gpu-mem 48 \
    --threads 12 \
    /path/to/output

# 3. Discover cell barcodes via knee-point detection
tranquillyzer generate-whitelist \
    --model-name 10x3p_sc_ont_013 \
    --expected-cells 5000 \
    /path/to/output

# 4. Correct barcodes using the discovered whitelist + demultiplex
tranquillyzer barcode-correct \
    --run-demux \
    --output-fmt fasta \
    --threads 12 \
    /path/to/output \
    /path/to/output/annotation_metadata/discovered_whitelist.tsv

# 5. Align, dedup, and QC (same as whitelist-based)
tranquillyzer align --threads 12 /path/to/output /path/to/reference.fa /path/to/output
tranquillyzer dedup --threads 12 /path/to/output
tranquillyzer qc-metrics --threads 4 --bam /path/to/output/aligned_files/dup_marked.bam /path/to/output
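
Step 4 depends on the whitelist file produced by step 3, so it can help to confirm that the file exists and is not empty before running barcode correction. The helper below is a hypothetical sketch, not a Tranquillyzer command. It counts non-blank lines only and makes no assumption about whether the file has a header row.

```shell
#!/usr/bin/env bash
# Hypothetical guard to run between generate-whitelist and barcode-correct.

# Print the number of non-blank lines in the whitelist file.
# Returns 0 only if the file exists and has at least one non-blank line.
whitelist_entries() {
    local file="$1"
    local n
    if [ ! -f "$file" ]; then
        echo "no whitelist at $file" >&2
        return 1
    fi
    # grep -c exits non-zero when the count is 0, so mask its status
    # and test the printed count instead.
    n=$(grep -c '[^[:space:]]' "$file" || true)
    echo "$n"
    [ "$n" -gt 0 ]
}
```

For example, whitelist_entries /path/to/output/annotation_metadata/discovered_whitelist.tsv prints the line count, which can be compared against the value passed to --expected-cells as a rough sanity check.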

Software Dependencies

  • Docker (recommended): handles all dependencies and provides portability across systems. Singularity and Apptainer are also supported.
  • Manual install: requires mamba or conda. Dependencies are listed in environment.yml.
  • TensorFlow: has its own GPU requirements; see TensorFlow’s documentation.
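
To see which of the options above is already available on a machine, a small PATH check is enough. This is a minimal sketch; first_available is a hypothetical helper name, and the command names passed to it are the usual executable names for these tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first command from the argument list
# that is found on PATH, or return 1 if none is installed.
first_available() {
    local cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            return 0
        fi
    done
    return 1
}
```

For example, first_available docker apptainer singularity reports the container runtime to use, and first_available mamba conda does the same for a manual install.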

See the Install page for detailed setup instructions.