The biscuitSifter Pipeline
The biscuitSifter pipeline combines BISCUIT, dupsifter, and samtools to align reads, mark duplicates, sort, and index the aligned reads in an easy two-step process.
biscuit align -@ NTHREADS -R "my_rg" /path/to/my_reference.fa read1.fq.gz read2.fq.gz | \
dupsifter /path/to/my_reference.fa | \
samtools sort -@ NTHREADS -o my_output.bam -O BAM -
samtools index my_output.bam
where NTHREADS
is the number of threads, "my_rg"
is the read group (if applicable) to be used, /path/to/my_reference.fa
is the FASTA file for the reference genome, read*.fq.gz
are the read1 and read2 FASTQ files from the sequencing run, and my_output.bam
is the name of the output BAM file.