# Resolve barcodes shared by different molecules
- paired-end reads from an Illumina sequencer in FASTQ format (gzip recommended)
- sample name: a-z 0-9 . _ - case insensitive
- forward: _F .F .1 _1 _R1_001 .R1_001 _R1 .R1
- reverse: _R .R .2 _2 _R2_001 .R2_001 _R2 .R2
- fastq extension: .fq .fastq case insensitive
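
The naming conventions above can be checked with a small script before running the workflow. The following is a minimal sketch, not Harpy's own parser: the function name `parse_fastq_name` is hypothetical, and the optional `.gz` ending is an assumption based on the gzip recommendation.

```python
import re

# Suffixes copied from the list above; longer alternatives come first so
# that "_R1_001" is tried before "_R1".
FORWARD = r"(?:[_.]R1_001|[_.]R1|[_.]F|[_.]1)"
REVERSE = r"(?:[_.]R2_001|[_.]R2|[_.]R|[_.]2)"
# Assumption: gzipped files simply append ".gz" to the FASTQ extension.
EXTENSION = r"\.(?:fq|fastq)(?:\.gz)?"

PATTERN = re.compile(
    rf"^(?P<sample>[a-z0-9._-]+?)(?:(?P<fwd>{FORWARD})|(?P<rev>{REVERSE})){EXTENSION}$",
    re.IGNORECASE,
)

def parse_fastq_name(filename):
    """Return (sample, direction), or None if the name does not match."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    direction = "forward" if match.group("fwd") else "reverse"
    return match.group("sample"), direction

print(parse_fastq_name("sample1_R1_001.fastq.gz"))
print(parse_fastq_name("sample.R2.fq"))
```

A name that does not follow the conventions, such as `sample.txt`, yields `None`.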
Running `deconvolve` is optional. In the alignment workflows (`align bwa`, `align strobe`), Harpy already uses a distance-based approach to deconvolve barcodes and assign `MI` tags (Molecular Identifier). This workflow uses a reference-free method, QuickDeconvolution, which uses k-mers to look at "read clouds" (all reads with the same linked-read barcode) and decide which ones likely originate from different molecules. Regardless of whether you run this workflow or not, `harpy align` will still perform its own deconvolution.
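
To make the term "read cloud" concrete, here is a minimal sketch that groups reads by their barcode. It only illustrates the grouping step and does not attempt the k-mer comparison that QuickDeconvolution performs; the function name `group_read_clouds` and the example read names and barcodes are hypothetical.

```python
from collections import defaultdict

def group_read_clouds(reads):
    """Group read names by barcode; each group is one read cloud."""
    clouds = defaultdict(list)
    for read_name, barcode in reads:
        clouds[barcode].append(read_name)
    return dict(clouds)

# Hypothetical (read name, barcode) pairs for illustration only.
reads = [
    ("read1", "A01C33B41D93"),
    ("read2", "A02C10B05D77"),
    ("read3", "A01C33B41D93"),
]
clouds = group_read_clouds(reads)
for barcode, names in clouds.items():
    print(barcode, names)
```

Deconvolution then asks, for each cloud, whether its reads come from one molecule or from several that happen to share a barcode.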
Also in `harpy qc`: this method of deconvolution is also available as an option in the `qc` workflow.

Usage: `harpy deconvolve OPTIONS... INPUTS...`
## Running Options
## Resulting Barcodes
After deconvolution, some barcodes may have a hyphenated suffix like `-1` or `-2` (e.g. `A01C33B41D93-1`). This is how deconvolution methods create unique variants of barcodes to denote that identical barcodes do not come from the same original molecules. QuickDeconvolution adds the `-0` suffix to barcodes it was unable to deconvolve.
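
The suffix convention described above can be read programmatically. This is a minimal sketch with hypothetical helper names (`split_deconvolved_barcode`, `was_deconvolved`); it assumes the suffix is always a hyphen followed by digits, as in the examples above.

```python
def split_deconvolved_barcode(barcode):
    """Split 'A01C33B41D93-1' into ('A01C33B41D93', 1).

    Barcodes without a numeric hyphenated suffix return (barcode, None).
    """
    base, sep, suffix = barcode.rpartition("-")
    if sep and suffix.isdigit():
        return base, int(suffix)
    return barcode, None

def was_deconvolved(barcode):
    """True only for suffixes other than -0, which marks a barcode
    QuickDeconvolution was unable to deconvolve."""
    _, suffix = split_deconvolved_barcode(barcode)
    return suffix is not None and suffix != 0

for bc in ("A01C33B41D93-1", "A01C33B41D93-0", "A01C33B41D93"):
    print(bc, split_deconvolved_barcode(bc), was_deconvolved(bc))
```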
## Harpy Deconvolution Nuances
Some of the downstream linked-read tools Harpy uses expect linked-read barcodes to look like either the 16-base 10X variety or a standard haplotag (AxxCxxBxxDxx). Their pattern-matching would not recognize deconvolved barcodes with hyphenated suffixes. To remedy this, `MI` assignment in `align bwa` and `align strobe` will assign the deconvolved (hyphenated) barcode to a `DX:Z` tag and restore the original barcode as the `BX:Z` tag.
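
The tag handling described above can be pictured with a plain dictionary of tags. This is an illustrative sketch only, not Harpy's implementation: the function name `restore_barcode_tags` is hypothetical, and tags are modeled as a simple `dict` rather than real alignment records.

```python
def restore_barcode_tags(tags):
    """Move a hyphenated BX barcode to DX and restore the original in BX.

    `tags` maps tag names (e.g. "BX") to string values. A new dict is
    returned; the input is left unchanged.
    """
    updated = dict(tags)
    barcode = updated.get("BX")
    if barcode is None:
        return updated
    base, sep, suffix = barcode.rpartition("-")
    if sep and suffix.isdigit():
        updated["DX"] = barcode  # deconvolved (hyphenated) barcode
        updated["BX"] = base     # original barcode
    return updated

print(restore_barcode_tags({"BX": "A01C33B41D93-1"}))
```

Records whose `BX` value has no hyphenated suffix pass through unchanged in this sketch.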