Convert between FASTQ formats

Convert between linked-read FASTQ data formats

In the event you need your linked-read data converted into a different linked-read format, djinn has you covered. We might disagree on the fragmented format landscape, but that doesn't mean you shouldn't be able to use your data how and where you want to. This command converts a paired-end read set of FASTQ files between the common linked-read FASTQ types.

usage
djinn fastq convert PREFIX TARGET FQ1 FQ2
example | tellseq → stlfr
djinn fastq convert data/orcs_stlfr stlfr data/orcs.R1.fq.gz data/orcs.R2.fq.gz

djinn auto-detects the input format as one of 10X, haplotagging, TELLseq, or stLFR, and converts it to the format given as the TARGET positional argument. If the input data is in 10X format, where the barcode is the first (usually 16) bases of the R1 sequence, you will need to provide a --barcodes file, with one barcode per line, that tells djinn which barcodes to look for. In all cases, a file containing the barcode conversion map is also created.

example | barcodes file
ATGGAAGCCGTAGTTA
ACGGAAGCCGTAGTTC
ATGGAAGAAATAGTTA
ATGTTTGCCGTAGTTT
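
To make the role of the barcodes file concrete, here is a minimal Python sketch of how an inline 10X barcode could be pulled off the start of an R1 sequence and checked against a barcode list. This is not djinn's implementation: the function names `load_barcodes` and `split_inline_barcode` are hypothetical, and the 16-base length is only the usual default mentioned above.

```python
def load_barcodes(lines):
    """Return the set of barcodes from an iterable of lines (one barcode per line)."""
    return {line.strip().upper() for line in lines if line.strip()}


def split_inline_barcode(r1_sequence, barcodes, length=16):
    """Split an R1 sequence into (barcode, remainder).

    Returns (None, r1_sequence) when the leading bases are not a known barcode.
    The barcode length is an assumption (16 is typical for 10X data).
    """
    candidate = r1_sequence[:length].upper()
    if candidate in barcodes:
        return candidate, r1_sequence[length:]
    return None, r1_sequence


# The four barcodes from the example file above, as they would be read line by line.
barcode_lines = [
    "ATGGAAGCCGTAGTTA\n",
    "ACGGAAGCCGTAGTTC\n",
    "ATGGAAGAAATAGTTA\n",
    "ATGTTTGCCGTAGTTT\n",
]
known = load_barcodes(barcode_lines)

barcode, insert = split_inline_barcode("ATGGAAGCCGTAGTTA" + "GATTACA", known)
print(barcode, insert)
```

A real converter would also need to trim the matching quality scores and decide what to do with reads whose leading bases are not in the list. Both are left out here.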

Conversion targets

| TARGET | barcode format | example |
|---|---|---|
| 10x | the first N base pairs of R1, given --barcodes | |
| haplotagging | a BX:Z:ACBD SAM tag in the sequence header | @SEQID BX:Z:A01C93B56D11 |
| stlfr | #1_2_3 format appended to the sequence ID | @SEQID#1_2_3 |
| tellseq | :ATCG format appended to the sequence ID | @SEQID:GGCAAATATCGAGAAGTC |
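
The header-based formats in the table can be told apart with a few regular expressions. The Python sketch below is based only on the example headers shown here, not on djinn's actual detection code. The patterns, the optional `/1` or `/2` suffix handling, and the name `detect_header_format` are all assumptions. 10X data carries no barcode in the header, so it falls through as undetected.

```python
import re

# Illustrative patterns derived from the example headers in the table;
# real data may need looser or stricter rules.
PATTERNS = {
    "haplotagging": re.compile(r"\sBX:Z:(A\d{2}C\d{2}B\d{2}D\d{2})(?:\s|$)"),
    "stlfr": re.compile(r"^@\S+?#(\d+_\d+_\d+)(?:/[12])?(?:\s|$)"),
    "tellseq": re.compile(r"^@\S+?:([ACGTN]+)(?:\s|$)"),
}


def detect_header_format(header):
    """Return (format_name, barcode) for a FASTQ header line, or (None, None)."""
    for name, pattern in PATTERNS.items():
        match = pattern.search(header)
        if match:
            return name, match.group(1)
    return None, None


for example in ("@SEQID BX:Z:A01C93B56D11", "@SEQID#1_2_3", "@SEQID:GGCAAATATCGAGAAGTC"):
    print(example, "->", detect_header_format(example))
```

Checking haplotagging first is a deliberate choice in this sketch: the BX:Z: tag is the most specific signal, so it is tested before the looser suffix patterns.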

Running Options

| argument | description |
|---|---|
| PREFIX (required) | output filename prefix |
| TARGET (required) | target format for output FASTQ files |
| FQ1 (required) | forward reads of FASTQ pair |
| FQ2 (required) | reverse reads of FASTQ pair |
| -b --barcodes (conditional) | file of nucleotide barcodes (one per line) to identify inline barcodes in input 10X data |
| -c --cache-size (hidden) | number of reads to store before writing (bigger is faster, default: 10000) |
| -t --threads | number of threads to use for writing compressed output FASTQ files (default: 2) |
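
The idea behind --cache-size is that batching reads in memory means fewer, larger writes. The Python sketch below illustrates that general technique only. `BufferedFastqWriter` is a hypothetical class, not part of djinn, and it writes to an in-memory text handle rather than a compressed file.

```python
import io


class BufferedFastqWriter:
    """Collect FASTQ records in memory and write them out in batches."""

    def __init__(self, handle, cache_size=10000):
        self.handle = handle
        self.cache_size = cache_size
        self.cache = []
        self.flushes = 0  # counts how many batch writes were issued

    def add(self, header, sequence, quality):
        """Store one record; write the whole batch once the cache is full."""
        self.cache.append(f"{header}\n{sequence}\n+\n{quality}\n")
        if len(self.cache) >= self.cache_size:
            self.flush()

    def flush(self):
        """Write any cached records in a single call and empty the cache."""
        if self.cache:
            self.handle.write("".join(self.cache))
            self.cache.clear()
            self.flushes += 1


out = io.StringIO()
writer = BufferedFastqWriter(out, cache_size=2)
for i in range(5):
    writer.add(f"@read{i}", "ACGT", "IIII")
writer.flush()  # write the final partial batch
print(writer.flushes)
```

A larger cache trades memory for fewer write calls, which matches the "bigger is faster" note in the table.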