# Find structural variants
The snp module identifies single nucleotide polymorphisms (SNPs) and small indels, but you may want to (and should!) leverage the linked-read data to identify larger structural variants (SVs) such as large deletions, duplications, and inversions. Harpy provides two linked-read variant callers to do exactly that:
# LEVIATHAN
LEVIATHAN relies on split-read information in the sequence alignments to call variants. It requires less preprocessing work to get it up and running, so it's a great place to start.
# NAIBR
While our testing shows that NAIBR tends to find known inversions that LEVIATHAN misses, the program requires haplotype-phased BAM files as input. That means the alignments must have a PS or HP tag that indicates which haplotype the read/alignment belongs to. If your alignments don't have phasing tags (none of the current aligners in Harpy add them), you will need to use the phase module to phase your SNPs into haplotypes so that the sv naibr module can use them to phase your input alignments and proceed as planned.
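
If you are unsure whether your alignments already carry phasing tags, a quick check along the following lines may help. This is a minimal sketch, not part of Harpy: the helper names `has_phase_tag` and `fraction_phased` are hypothetical, the example records are made up, and it assumes the alignments are available as SAM-formatted text (for instance, as produced by `samtools view -h`). It relies only on the SAM convention that optional `TAG:TYPE:VALUE` fields follow the 11 mandatory columns.

```python
# Hypothetical helpers (not part of Harpy) for spotting PS/HP phasing tags
# in SAM-formatted text records.
PHASE_TAGS = {"PS", "HP"}


def has_phase_tag(sam_line: str) -> bool:
    """Return True if a SAM alignment record carries a PS or HP tag."""
    fields = sam_line.rstrip("\n").split("\t")
    # Columns 1-11 are mandatory; optional TAG:TYPE:VALUE fields come after.
    return any(field.split(":", 1)[0] in PHASE_TAGS for field in fields[11:])


def fraction_phased(sam_lines) -> float:
    """Fraction of alignment records (header lines skipped) with a phasing tag."""
    total = 0
    phased = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        total += 1
        phased += has_phase_tag(line)
    return phased / total if total else 0.0


# Made-up records for illustration: one phased read and one unphased read.
records = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\tBX:Z:A01C02B03D04\tHP:i:1\tPS:i:100",
    "read2\t0\tchr1\t200\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\tBX:Z:A01C02B03D05",
]
print(fraction_phased(records))  # 0.5
```

A result of `0.0` suggests the alignments are unphased, in which case running the phase module first is the route described above.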