Quickstart

Quick start example

Run whole pipeline:

tb-profiler profile -1 /path/to/reads_1.fastq.gz -2 /path/to/reads_2.fastq.gz -p prefix

The -p argument allows you to provide a prefix to the resulting output files. This is useful when you need to run more that one sample. This will store BAM, VCF and result files in respective directories. Results are output in json and text format.

Example run

mkdir test_run; cd test_run
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR166/009/ERR1664619/ERR1664619_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR166/009/ERR1664619/ERR1664619_2.fastq.gz
tb-profiler profile -1 ERR1664619_1.fastq.gz -2 ERR1664619_2.fastq.gz -t 4 -p ERR1664619
cat results/ERR1664619.results.json

Running with an existing BAM file

By using the -a option you can specify to use an existing BAM file instead of fastq files.

tb-profiler profile -a /path/to/bam -p test

Warning

The BAM files must have been created using the version of the genome as the database which can be downloaded here. Confusingly, this genome has multiple accession numbers (ASM19595v2, NC_000962.3, GCF_000195955.2, etc...). If you believe your reference to be the exact same sequence (length should be 4411532) then you can create a database with the same sequence name as used in your BAM file. For example if your sequence name is "NC_000962.3" you can do this by executing the following:

tb-profiler update_tbdb --match_ref /path/to/your/reference.fasta