#
Other Harpy modules
On this page you'll find Harpy functions that aren't standalone workflows. These may create necessary inputs, continue where you left off, or view important workflow files.
#
Other modules
#
imputeparams
Create a template parameter file for the impute module. The file is formatted correctly and serves as a starting point for using parameters that make sense for your study.
harpy imputeparams -o OUTPUTFILE
harpy imputeparams -o params.stitch
#
arguments
Typically, one runs STITCH multiple times, exploring how results vary with different model parameters. Harpy handles this by having you provide a tab-delimited file in which each row is one set of parameters: the first column is a name for the parameter set and the remaining columns are the 6 STITCH model parameters. To make formatting easier, a template file is generated for you; replace the values and add or remove rows as necessary. See the section for the impute module for details on these parameters. The template file will look like:
name model usebx bxlimit k s ngen
k10_ng50 diploid TRUE 50000 3 2 10
k1_ng30 diploid TRUE 50000 3 1 5
high_ngen diploid TRUE 50000 15 1 100
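Because the parameter file is plain tab-delimited text, it can also be written from a script when you want to explore many parameter sets. The following is a minimal Python sketch, not part of Harpy: the write_params helper and the example rows are hypothetical, and the column order is simply copied from the template above.

```python
import csv
import io

# Column order copied from the template shown above.
COLUMNS = ["name", "model", "usebx", "bxlimit", "k", "s", "ngen"]


def write_params(rows, handle):
    """Write parameter sets as tab-delimited text, one row per STITCH run."""
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Refuse rows that would leave a column blank.
        missing = [col for col in COLUMNS if col not in row]
        if missing:
            raise ValueError(f"row {row.get('name', '?')} is missing {missing}")
        writer.writerow(row)


# Hypothetical parameter sets, for illustration only.
rows = [
    {"name": "k3_s2", "model": "diploid", "usebx": "TRUE",
     "bxlimit": 50000, "k": 3, "s": 2, "ngen": 10},
    {"name": "k15_s1", "model": "diploid", "usebx": "TRUE",
     "bxlimit": 50000, "k": 15, "s": 1, "ngen": 100},
]

buffer = io.StringIO()
write_params(rows, buffer)
params_text = buffer.getvalue()
print(params_text, end="")
```

To save the result for the impute module, pass an open file (for example open("params.stitch", "w", newline="")) instead of the in-memory buffer.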
#
resume
When calling a workflow (e.g. qc), Harpy performs various file checks and validations, sets up the Snakemake command, output folder(s), etc. In the event you want to continue a failed or manually terminated workflow without overwriting the workflow files (e.g. config.yaml), you can use harpy resume. Using resume also skips all input validations.
harpy resume [--conda] DIRECTORY
#
arguments
The DIRECTORY is the output directory of a previous harpy-invoked workflow, which must have the workflow/config.yaml file. For example, if you previously ran harpy align bwa -o align-bwa ..., then you would use harpy resume align-bwa, which would have the necessary workflow/config.yaml (and other necessary things) required to successfully continue the workflow.
Using resume does not overwrite any preprocessing files in the target directory (whereas rerunning the workflow would), which means you can also manually modify the config.yaml file (advanced, not recommended unless you are confident with what you're doing).
resume also requires an existing and populated workdir/envs/ directory in the target directory, like the kind all main harpy workflows would create. If one is not present, you can use --conda to create one.
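As a rough illustration of these requirements, the sketch below checks a directory for the two paths named in this section before you call harpy resume. It is a hypothetical helper (resume_problems is not a Harpy function) and assumes the paths exactly as they are written here; Harpy's own checks may differ.

```python
import tempfile
from pathlib import Path


def resume_problems(directory):
    """Return reasons the directory may not be resumable (empty list if none found)."""
    root = Path(directory)
    problems = []
    # The section above says workflow/config.yaml must exist.
    if not (root / "workflow" / "config.yaml").is_file():
        problems.append("missing workflow/config.yaml")
    # The section above also asks for a populated workdir/envs/ directory.
    envs = root / "workdir" / "envs"
    if not envs.is_dir() or not any(envs.iterdir()):
        problems.append("missing or empty workdir/envs/ (consider --conda)")
    return problems


# Demonstration on an empty temporary directory, which fails both checks.
with tempfile.TemporaryDirectory() as tmp:
    for problem in resume_problems(tmp):
        print(problem)
```

An empty list from resume_problems only means these two paths were found; it does not guarantee that the workflow itself will resume successfully.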
#
popgroup
Creates a sample grouping file for variant calling.
harpy popgroup -o OUTPUTFILE INPUTS
harpy popgroup -o samples.groups data/
#
arguments
This optional file is useful if you want SNP variant calling to happen on a per-population level via harpy snp or on samples pooled-as-populations via harpy sv.
- takes the format of sample<tab>group
- all the samples will be assigned to group pop1, since file names don't always provide grouping information, so make sure to edit the second column to reflect your data correctly
- after editing, the file will look like:
sample1 pop1
sample2 pop1
sample3 pop2
sample4 pop1
sample5 pop3
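Editing the second column by hand is fine for a few samples; for larger projects the same edit can be scripted. The sketch below is one possible approach in Python and is not part of Harpy: assign_groups and the sample-to-population mapping are hypothetical, and the input is assumed to be the two-column, tab-delimited layout described above.

```python
def assign_groups(text, groups):
    """Replace the group column for samples listed in `groups`; keep the rest unchanged."""
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        sample, tab, group = line.partition("\t")
        if not tab:
            raise ValueError(f"expected sample<tab>group, got: {line!r}")
        out.append(f"{sample}\t{groups.get(sample, group)}")
    return "\n".join(out) + "\n"


# A freshly generated file assigns every sample to pop1.
template = "sample1\tpop1\nsample2\tpop1\nsample3\tpop1\n"

# Hypothetical mapping of samples to their real populations.
edited = assign_groups(template, {"sample3": "pop2"})
print(edited, end="")
```

Samples missing from the mapping keep their existing group, so any you forget remain in pop1 rather than being dropped.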
#
view
This convenience command lets you view the latest workflow log file of a Harpy output directory. Use --snakefile or --config to view the workflow snakefile or config.yaml file instead, respectively. Output is printed to the screen via less and accepts the typical keyboard shortcuts to navigate the output.
harpy view [-s] [-c] DIRECTORY
harpy view Align/bwa