Alignment and Deduplication
After demultiplexing, reads are aligned to a reference genome and PCR duplicates are marked.
Alignment
The align command runs minimap2 for alignment and samtools for sorting and indexing, producing a coordinate-sorted BAM:
tranquillyzer align \
--threads 12 \
INPUT_DIR \
REFERENCE \
OUTPUT_DIR

Here, INPUT_DIR is the directory containing demuxed_fasta/demuxed.fasta (typically the same as your annotation output directory), and REFERENCE is the reference genome FASTA.
Alignment Options
| Option | Default | Description | When to change |
|---|---|---|---|
| `INPUT_DIR` | required | Directory with demuxed reads | |
| `REFERENCE` | required | Reference genome FASTA | |
| `OUTPUT_DIR` | required | Where to write BAM output | |
| `--preset` | `splice` | minimap2 preset (`-ax <preset>`) | Use `map-ont` for non-spliced alignment |
| `--filt-flag` | `260` | samtools filter flag (`-F`) | Default filters secondary + unmapped |
| `--mapq` | `0` | Minimum MAPQ threshold | Increase for stricter filtering |
| `--threads` | `12` | CPU threads for minimap2/samtools | Match your available cores |
| `--add-minimap-args` | None | Additional minimap2 arguments | For protocol-specific settings |
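The default `--filt-flag` value of 260 is a sum of SAM flag bits. A small decoder makes it easier to see which read classes a given value excludes. The snippet below is an illustrative sketch only; `SAM_FLAGS` and `decode_flag` are hypothetical names, not part of tranquillyzer. The bit values themselves come from the SAM specification.

```python
# SAM FLAG bits as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(value):
    """Return the names of all SAM flag bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(SAM_FLAGS.items()) if value & bit]


print(decode_flag(260))          # ['unmapped', 'secondary']
print(decode_flag(260 | 0x800))  # adds 'supplementary' (value 2308)
```

If you also wanted to drop supplementary alignments, the same arithmetic suggests passing 2308 (260 + 2048) as the filter flag.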
Output
- aligned_files/demuxed_aligned.bam: coordinate-sorted BAM
- aligned_files/demuxed_aligned.bam.bai: BAM index
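To see how the options map onto the underlying tools, here is a rough sketch of a minimap2/samtools pipeline built from the defaults in the table. It is an assumption about the general shape of such a pipeline, not the exact commands tranquillyzer runs. `build_alignment_commands` is a hypothetical helper, and the output location under OUTPUT_DIR is inferred from the file names listed above.

```python
import shlex
from pathlib import Path


def build_alignment_commands(input_dir, reference, output_dir, preset="splice",
                             filt_flag=260, mapq=0, threads=12,
                             extra_minimap_args=()):
    """Build argument lists for an align -> filter -> sort -> index pipeline.

    Assumption: the first three commands are connected with pipes, and the
    BAM is written to OUTPUT_DIR/aligned_files/demuxed_aligned.bam.
    """
    fasta = Path(input_dir) / "demuxed_fasta" / "demuxed.fasta"
    bam = Path(output_dir) / "aligned_files" / "demuxed_aligned.bam"
    align = ["minimap2", "-ax", preset, "-t", str(threads),
             *extra_minimap_args, str(reference), str(fasta)]
    # -F drops reads with any of the given flag bits set; -q sets minimum MAPQ.
    filt = ["samtools", "view", "-b", "-F", str(filt_flag),
            "-q", str(mapq), "-"]
    sort = ["samtools", "sort", "-@", str(threads), "-o", str(bam), "-"]
    index = ["samtools", "index", str(bam)]
    return [align, filt, sort, index]


# "run1" and "ref.fa" are placeholder paths for illustration.
commands = build_alignment_commands("run1", "ref.fa", "run1")
print(" | ".join(shlex.join(cmd) for cmd in commands[:3]))
print(shlex.join(commands[3]))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without minimap2 or samtools installed.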
Duplicate Marking
The dedup command marks PCR duplicates in the aligned BAM. Reads are considered PCR duplicates of one another if all of the following are true:
- Their start and end positions fall within a defined window of each other
- They have the same strand orientation
- They have the same corrected cell barcode
- Their UMIs match within a Levenshtein edit distance threshold
One read from each duplicate set is kept as the “original”; the others are flagged as PCR/optical duplicates using standard SAM flags and auxiliary tags.
tranquillyzer dedup \
--threads 12 \
INPUT_DIR

Dedup Options
| Option | Default | Description | When to change |
|---|---|---|---|
| `INPUT_DIR` | required | Directory containing `aligned_files/demuxed_aligned.bam` | |
| `--lv-threshold` | `2` | Levenshtein distance for UMI similarity | Increase for noisier UMIs |
| `--stranded` / `--no-stranded` | `--stranded` | Directional library | Use `--no-stranded` for non-directional libraries |
| `--per-cell` / `--no-per-cell` | `--per-cell` | Deduplicate per cell barcode | Disable for bulk experiments |
| `--threads` | `12` | CPU threads | Match your available cores |
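The duplicate criteria and the options above can be sketched as a pairwise test. This is a minimal illustration under stated assumptions, not tranquillyzer's implementation: the `Read` class and `is_duplicate` function are hypothetical, the size of the position window is not documented here so `pos_window=10` is a made-up value, and the UMI comparison is assumed to be inclusive (distance less than or equal to the threshold).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    start: int     # alignment start position
    end: int       # alignment end position
    strand: str    # "+" or "-"
    barcode: str   # corrected cell barcode
    umi: str


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def is_duplicate(r1, r2, pos_window=10, lv_threshold=2,
                 stranded=True, per_cell=True):
    """Apply the four duplicate criteria to a pair of reads."""
    if abs(r1.start - r2.start) > pos_window:
        return False
    if abs(r1.end - r2.end) > pos_window:
        return False
    if stranded and r1.strand != r2.strand:
        return False
    if per_cell and r1.barcode != r2.barcode:
        return False
    return levenshtein(r1.umi, r2.umi) <= lv_threshold


a = Read(100, 900, "+", "ACGT", "AAAAAAAA")
b = Read(102, 898, "+", "ACGT", "AAAAAATT")  # UMI differs by two substitutions
print(is_duplicate(a, b))                    # True
print(is_duplicate(a, b, lv_threshold=1))    # False
```

In this sketch, `stranded=False` and `per_cell=False` simply skip the strand and barcode checks, which is one plausible reading of `--no-stranded` and `--no-per-cell`.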
Output
- aligned_files/dup_marked.bam: BAM with PCR duplicates flagged
- aligned_files/dup_marked.bam.bai: BAM index
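Because duplicates are flagged rather than described in a separate report, a quick way to check the result is to count records with the standard SAM duplicate bit (0x400) set. The sketch below works on SAM text lines, for example the output of `samtools view aligned_files/dup_marked.bam`. `count_duplicates` is a hypothetical helper and the example records are invented for illustration.

```python
DUPLICATE_FLAG = 0x400  # SAM flag bit for PCR/optical duplicates


def count_duplicates(sam_lines):
    """Return (total_records, duplicate_records) for an iterable of SAM lines."""
    total = 0
    duplicates = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        total += 1
        if flag & DUPLICATE_FLAG:
            duplicates += 1
    return total, duplicates


# Invented records: one original, one duplicate, one reverse-strand duplicate.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t1024\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t1040\tchr1\t101\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
print(count_duplicates(example))  # (3, 2)
```

To use it on real data, you could pass `sys.stdin` to `count_duplicates` and pipe the `samtools view` output into the script.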
Both alignment and deduplication can run on CPU-only machines.