
BISCUIT - Understand Sequencing Data with Bisulfite Conversion



BISulfite-seq CUI Toolkit (BISCUIT) is a utility suite for analyzing sodium bisulfite- or enzyme-based DNA methylation/modification data. It was written to perform read alignment, DNA methylation and mutation calling, and allele-specific methylation analysis on bisulfite or bisulfite-like sequencing data.

BISCUIT was developed by Wanding Zhou while he was a member of the Shen Lab at Van Andel Institute. He now holds a faculty position at the University of Pennsylvania and Children’s Hospital of Philadelphia. BISCUIT is currently maintained by Jacob Morrison (who also developed the User’s Guide website) in the Shen Lab. BISCUIT was originally developed at https://github.com/zhou-lab/biscuit and is now maintained at https://github.com/huishenlab/biscuit.

Quick Start

To get started right away, download a precompiled binary from the BISCUIT release page. Binaries are available only for Linux and macOS (see Download and Install for more information about downloading and installing BISCUIT).

The basic workflow to align and extract methylation information using BISCUIT is:

  1. Create an index of the reference genome (only needs to be done once for each reference).
  2. Align sequencing reads to the reference.
  3. Create a pileup VCF of DNA methylation and genetic information.
  4. Extract DNA methylation into BED format.

Practically, the commands to run are:

# Create index of the reference genome (only needs to be run once for each reference)
# Gzipped FASTA references can also be used
biscuit index my_reference.fa

# Align sequencing reads to the reference
# Gzipped FASTQ files can also be used
biscuit align -R "my_rg" /path/to/my_reference.fa read1.fastq read2.fastq |
    samblaster | samtools sort --write-index -o my_output.bam -O BAM -

# Create a pileup VCF of DNA methylation and genetic information
# Also compresses and indexes the VCF
biscuit pileup -o my_pileup.vcf /path/to/my_reference.fa my_output.bam
bgzip my_pileup.vcf
tabix -p vcf my_pileup.vcf.gz

# Extract DNA methylation into BED format
# Also compresses and indexes the BED
biscuit vcf2bed my_pileup.vcf.gz > my_methylation_data.bed
bgzip my_methylation_data.bed
tabix -p bed my_methylation_data.bed.gz

Running these commands in order produces all the files needed to read the data into R using the R/Bioconductor companion package, biscuiteer.
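
The Quick Start commands call several external programs besides biscuit (samblaster, samtools, bgzip, and tabix). As a minimal sketch, the helper below reports which of them are missing from PATH before a run is started. The function name missing_tools is made up for this illustration and is not part of BISCUIT.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all of them are found.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Programs called by the Quick Start commands.
missing=$(missing_tools biscuit samblaster samtools bgzip tabix)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing from PATH:" $missing >&2
else
    echo "All Quick Start tools found."
fi
```

Running the check first avoids a pipeline that fails partway through, for example after alignment has already started but before samblaster is found.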

An overview of all available functionalities can be found below in the Overview of Functionalities section.

Download and Install

BISCUIT is available as a precompiled binary (for macOS and Linux), as source code for compilation on your own machine, as a conda recipe, or as a Docker container.

Download Precompiled Binaries

Precompiled binaries can be found on the latest release page on GitHub. Currently, there are only precompiled binaries for the latest versions of Linux and macOS. You can also download the binaries directly from the terminal using the following one-liner:

On macOS,

curl -OL $(curl -s https://api.github.com/repos/huishenlab/biscuit/releases/latest |
    grep browser_download_url | grep darwin_amd64 | cut -d '"' -f 4)
mv biscuit_* biscuit
chmod +x biscuit

On Linux,

curl -OL $(curl -s https://api.github.com/repos/huishenlab/biscuit/releases/latest |
    grep browser_download_url | grep linux_amd64 | cut -d '"' -f 4)
mv biscuit_* biscuit
chmod +x biscuit
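
The two download commands differ only in the asset-name pattern passed to grep. The sketch below factors that difference out so one script can serve both platforms. The helper names asset_pattern and asset_url are hypothetical (not part of BISCUIT), and the mapping covers only the two platforms named above.

```shell
# Map a kernel name (as printed by `uname -s`) to the asset-name pattern
# used in the download commands above.
asset_pattern() {
    case "$1" in
        Darwin) echo "darwin_amd64" ;;
        Linux)  echo "linux_amd64" ;;
        *)      echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}

# Read release JSON on stdin and print the download URLs on lines that
# match the given pattern (the same grep/cut steps as the one-liners).
asset_url() {
    grep browser_download_url | grep "$1" | cut -d '"' -f 4
}

# Example use (needs network access, so it is left commented out):
# pattern=$(asset_pattern "$(uname -s)") &&
#     curl -OL "$(curl -s https://api.github.com/repos/huishenlab/biscuit/releases/latest |
#         asset_url "$pattern")"
```

The cut step works because the URL is the fourth field when a `"browser_download_url": "..."` line is split on double quotes.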

To download the scripts to generate the QC asset and data files, run

# QC asset build
curl -OL $(curl -s https://api.github.com/repos/huishenlab/biscuit/releases/latest |
    grep browser_download_url | grep build_biscuit_QC_assets.pl | cut -d '"' -f 4)

# QC bash script
curl -OL $(curl -s https://api.github.com/repos/huishenlab/biscuit/releases/latest |
    grep browser_download_url | grep QC.sh | cut -d '"' -f 4)

These commands work on both macOS and Linux.

Download Source Code and Compile

The source code for BISCUIT can be downloaded using either git or curl. Compilation requires that the zlib and ncurses libraries are installed where the compiler and linker can find them.
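
One way to check for the two libraries before running make is to try linking a trivial program against each. This is only a sketch: the helper name can_link is made up here, it assumes a C compiler is available as `cc` (or via the CC variable), and it tests linking only, not whether the header files are present.

```shell
# Return 0 if a trivial C program links against -l<name>, non-zero otherwise.
# Uses $CC if set, else `cc` (an assumption; the BISCUIT Makefile may use a
# different compiler).
can_link() {
    workdir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$workdir/probe.c"
    if "${CC:-cc}" "$workdir/probe.c" -l"$1" -o "$workdir/probe" >/dev/null 2>&1; then
        status=0
    else
        status=1
    fi
    rm -rf "$workdir"
    return "$status"
}

# zlib links as -lz, ncurses as -lncurses.
for lib in z ncurses; do
    if can_link "$lib"; then
        echo "lib$lib: found"
    else
        echo "lib$lib: not found (or no C compiler available)"
    fi
done
```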

Using git,

git clone --recursive git@github.com:huishenlab/biscuit.git
cd biscuit
make

Note, after v0.2.0, if downloading via git, make sure to use the --recursive flag to get the submodules. If an SSH key has not been set up, and you receive a “permission denied” error, replace the first line with

git clone --recursive https://github.com/huishenlab/biscuit.git
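
If a clone was made without --recursive, the submodules can usually be fetched afterwards rather than cloning again. The sketch below relies on standard git behavior: `git submodule status` prefixes uninitialized submodules with "-", and `git submodule update --init --recursive` fetches them. The two function names are hypothetical helpers, not part of BISCUIT.

```shell
# Read `git submodule status` output on stdin and print the path of every
# submodule that has not been initialized (status lines starting with "-").
uninitialized_submodules() {
    awk '/^-/ { print $2 }'
}

# Inside a clone, fetch any submodules that are still missing.
fetch_missing_submodules() {
    if [ -n "$(git submodule status | uninitialized_submodules)" ]; then
        git submodule update --init --recursive
    fi
}

# Example use from the directory that contains the clone (commented out
# because it needs an existing checkout):
# cd biscuit && fetch_missing_submodules && make
```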

Using curl,

curl -OL $(curl -s https://api.github.com/repos/huishenlab/biscuit/releases/latest |
    grep browser_download_url | grep release-source.zip | cut -d '"' -f 4)
unzip release-source.zip
cd biscuit-release
make

The QC related scripts can be found in the scripts/ directory.

Download with Conda

Note, this requires that conda has been installed. To download with conda, run:

conda install -c bioconda biscuit

This will also install both QC.sh and build_biscuit_QC_assets.pl.

Download the Docker Container

The Docker container can be downloaded from GitHub via:

git clone git@github.com:huishenlab/sv_calling_docker.git

For more information about the Docker container, see Structural Variant Calling.

Overview of Functionalities

The following list provides an overview of the different subcommands and the various functionalities provided by biscuit. You can also find much of this by typing biscuit in the terminal. Help for each subcommand can be found on the BISCUIT Subcommands page or by typing biscuit (subcommand) in the terminal.

Read Mapping

  • index Index reference genome (see Read Mapping)
  • align Map bisulfite converted short reads to reference (see Read Mapping)

BAM Operation

  • tview View read mapping in terminal with bisulfite coloring (see Visualization under the Read Mapping tab)
  • bsstrand Investigate bisulfite conversion strand label (see Quality Control under the Read Mapping tab)
  • bsconv Investigate bisulfite conversion rate (see Quality Control under the Read Mapping tab)
  • cinread Print cytosine-read pair in a long form (see Quality Control under the Read Mapping tab)

Methylation and SNP Extraction

Epi-read & Epi-allele

Other

  • version Print biscuit and library versions
  • qc Generate QC files from BAM (see Quality Control)

About the project

This package was made by the folks at Van Andel Institute, with help from existing code bases available online.

Acknowledgement

  • lib/aln was adapted from Heng Li’s BWA-mem code.
  • lib/htslib was submoduled from the htslib library.
  • lib/klib was submoduled from Heng Li’s klib.
  • This work is supported by NIH/NCI R37CA230748 and U24CA210969.

Reference

In preparation