strandedness
- description
- Derives the experimental strandedness protocol used to generate the input RNA-Seq BAM file. Reports evidence supporting final results.
- outputs
- strandedness_file: TSV file containing the ngsderive strandedness report
- strandedness_string: The derived strandedness, in string format
Inputs
Required
_runtime (Any, required)
bam (File, required): Input BAM format file to derive strandedness for
bam_index (File, required): BAM index file corresponding to the input BAM
gene_model (File, required): Gene model as a GFF/GTF file
Defaults
min_mapq (Int, default=30): Minimum MAPQ to consider for supporting reads (common: true)
min_reads_per_gene (Int, default=10): Filter out any genes that don't have at least min_reads_per_gene reads mapping to them (common: true)
modify_disk_size_gb (Int, default=0): Add to or subtract from dynamic disk space allocation. Default disk size is determined by the size of the inputs. Specified in GB.
num_genes (Int, default=1000): How many genes to sample (common: true)
outfile_name (String, default=basename(bam,".bam") + ".strandedness.tsv"): Name for the strandedness TSV file
split_by_rg (Boolean, default=false): Include one entry in the output TSV per read group, in addition to an overall entry (common: true)
Outputs
strandedness_file (File)
strandedness_string (String)
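The outfile_name default for this task is the WDL expression basename(bam,".bam") + ".strandedness.tsv". As a rough illustration of what that expression evaluates to, here is a minimal Python sketch. The function name default_outfile_name and the example path are hypothetical and are not part of the task itself.

```python
import os


def default_outfile_name(bam_path: str, extension: str = ".strandedness.tsv") -> str:
    """Mimic WDL's basename(bam, ".bam") + extension, for illustration only."""
    name = os.path.basename(bam_path)  # drop any leading directories
    if name.endswith(".bam"):
        name = name[: -len(".bam")]  # strip the ".bam" suffix when present
    return name + extension


# Hypothetical input path, used only to show the shape of the result.
print(default_outfile_name("/data/sample1.bam"))
```

The other tasks in this file that default outfile_name follow the same pattern with a different extension (for example ".instrument.tsv").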
instrument
- description
- Derives the instrument used to sequence the input BAM file. Reports evidence supporting final results.
- outputs
- instrument_file: TSV file containing the ngsderive instrument report for the input BAM file
- instrument_string: The derived instrument, in string format
Inputs
Required
_runtime (Any, required)
bam (File, required): Input BAM format file to derive instrument for
Defaults
modify_disk_size_gb (Int, default=0): Add to or subtract from dynamic disk space allocation. Default disk size is determined by the size of the inputs. Specified in GB.
num_reads (Int, default=10000): How many reads to analyze from the start of the file. Any n < 1 parses the whole file. (common: true)
outfile_name (String, default=basename(bam,".bam") + ".instrument.tsv"): Name for the instrument TSV file
Outputs
instrument_file (File)
instrument_string (String)
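The num_reads convention (a positive value limits analysis to the first n reads, while any n < 1 means the whole file) recurs across several tasks in this file. The following Python sketch shows one way to express that convention. It is an illustration of the documented semantics only, not ngsderive's implementation, and the helper name limit_reads is hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def limit_reads(reads: Iterable[T], num_reads: int) -> Iterator[T]:
    """Return an iterator over the first num_reads items, or all items when num_reads < 1."""
    if num_reads < 1:
        # Zero and negative values both mean "no limit".
        return iter(reads)
    return islice(reads, num_reads)


# Stand-in data: integers in place of real alignment records.
print(list(limit_reads(range(5), 2)))
print(list(limit_reads(range(5), -1)))
```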
read_length
- description
- Derives the original experimental read length of the input BAM. Reports evidence supporting final results.
- outputs
- read_length_file: TSV file containing the ngsderive readlen report for the input BAM file
Inputs
Required
_runtime (Any, required)
bam (File, required): Input BAM format file to derive read length for
bam_index (File, required): BAM index file corresponding to the input BAM
Defaults
majority_vote_cutoff (Float, default=0.7): To call a majority readlen, the maximum read length must have at least majority-vote-cutoff % of reads in support (common: true)
modify_disk_size_gb (Int, default=0): Add to or subtract from dynamic disk space allocation. Default disk size is determined by the size of the inputs. Specified in GB.
num_reads (Int, default=-1): How many reads to analyze from the start of the file. Any n < 1 parses the whole file. (common: true)
outfile_name (String, default=basename(bam,".bam") + ".readlength.tsv"): Name for the readlen TSV file
Outputs
read_length_file (File)
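The majority_vote_cutoff rule can be read as: take the longest observed read length and accept it only if enough reads share it. The Python sketch below illustrates that reading. It assumes the cutoff is a fraction between 0 and 1 (consistent with the default of 0.7), and returning None when no length reaches the cutoff is an assumption made for this example; the function name majority_read_length is hypothetical and this is not ngsderive's own code.

```python
from collections import Counter
from typing import Iterable, Optional


def majority_read_length(lengths: Iterable[int], cutoff: float = 0.7) -> Optional[int]:
    """Return the maximum read length if at least `cutoff` of reads have it, else None."""
    counts = Counter(lengths)
    total = sum(counts.values())
    if total == 0:
        return None  # no reads, so nothing to call
    longest = max(counts)  # the maximum observed read length
    if counts[longest] / total >= cutoff:
        return longest
    return None  # assumption: signal "no consensus" with None


# Toy data: 8 reads of length 100 and 2 trimmed reads of length 75.
print(majority_read_length([100] * 8 + [75] * 2))
```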
encoding
- description
- Derives the encoding of the input NGS file(s). Reports evidence supporting final results.
- outputs
- inferred_encoding: The most permissive encoding found among the input files, in string format
- encoding_file: TSV file containing the ngsderive encoding report for all input files
Inputs
Required
_runtime (Any, required)
ngs_files (Array[File], required): An array of FASTQs and/or BAMs for which to derive encoding
outfile_name (String, required): Name for the encoding TSV file
Defaults
modify_disk_size_gb (Int, default=0): Add to or subtract from dynamic disk space allocation. Default disk size is determined by the size of the inputs. Specified in GB.
num_reads (Int, default=1000000): How many reads to analyze from the start of the file(s). Any n < 1 parses the whole file(s). (common: true)
Outputs
inferred_encoding (String)
encoding_file (File)
junction_annotation
- description
- Annotates junctions found in an RNA-Seq BAM as known, novel, or partially novel
- external_help
- https://stjudecloud.github.io/ngsderive/subcommands/junction_annotation/
- outputs
- junction_summary: TSV file containing the ngsderive junction-annotation summary
- junctions: TSV file containing a detailed list of annotated junctions
Inputs
Required
_runtime (Any, required)
bam (File, required): Input BAM format file to annotate junctions for
bam_index (File, required): BAM index file corresponding to the input BAM
gene_model (File, required): Gene model as a GFF/GTF file
Defaults
fuzzy_junction_match_range (Int, default=0): Consider found splices within +-k bases of a known splice event as annotated (common: true)
min_intron (Int, default=50): Minimum size of intron to be considered a splice (common: true)
min_mapq (Int, default=30): Minimum MAPQ to consider for supporting reads (common: true)
min_reads (Int, default=2): Filter out any junctions that don't have at least min_reads reads supporting them (common: true)
modify_disk_size_gb (Int, default=0): Add to or subtract from dynamic disk space allocation. Default disk size is determined by the size of the inputs. Specified in GB.
prefix (String, default=basename(bam,".bam")): Prefix for the summary TSV and junction files. The extensions .junction_summary.tsv and .junctions.tsv will be added.
Outputs
junction_summary (File)
junctions (File)
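To make fuzzy_junction_match_range concrete: with k = 0 a splice position must match a known position exactly, while a larger k tolerates small offsets. The Python sketch below shows one way to test whether a single position falls within +-k bases of any known position. It is a simplified illustration under that interpretation, not the logic ngsderive uses; the function name is_annotated and the coordinates are hypothetical.

```python
from bisect import bisect_left
from typing import Sequence


def is_annotated(position: int, known_sorted: Sequence[int], k: int = 0) -> bool:
    """Return True if any known position lies within +-k bases of `position`.

    `known_sorted` must be sorted in ascending order.
    """
    # Find the first known position that is >= position - k ...
    i = bisect_left(known_sorted, position - k)
    # ... and check that it does not exceed position + k.
    return i < len(known_sorted) and known_sorted[i] <= position + k


known = [100, 200, 300]  # made-up splice coordinates
print(is_annotated(202, known))       # exact matching only
print(is_annotated(202, known, k=2))  # tolerate offsets of up to 2 bases
```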
endedness
- description
- Derives the endedness of the input BAM file. Reports evidence for final result.
- outputs
- endedness_file: TSV file containing the ngsderive endedness report
Inputs
Required
_runtime (Any, required)
bam (File, required): Input BAM format file to derive endedness from
Defaults
calc_rpt (Boolean, default=false): Calculate and output Reads-Per-Template. This will produce a more sophisticated estimate for endedness, but uses substantially more memory (can reach up to 200% of BAM size in memory consumption for some inputs). (common: true)
lenient (Boolean, default=false): Return a zero exit code on unknown results (common: true)
modify_disk_size_gb (Int, default=0): Add to or subtract from dynamic disk space allocation. Default disk size is determined by the size of the inputs. Specified in GB.
modify_memory_gb (Int, default=0): Add to or subtract from dynamic memory allocation. Default memory is determined by the value of calc_rpt and the size of the input. Specified in GB.
num_reads (Int, default=-1): How many reads to analyze from the start of the file. Any n < 1 parses the whole file. (common: true)
outfile_name (String, default=basename(bam,".bam") + ".endedness.tsv"): Name for the endedness TSV file
paired_deviance (Float, default=0.0): Allowed deviation from an even (0.5) split between the number of f+l- reads and f-l+ reads for the file to still be called 'Paired-End'. The default of 0.0 is only appropriate if the whole file is being processed. (common: true)
round_rpt (Boolean, default=false): Round RPT to the nearest INT before comparing to expected values. Appropriate if using --num-reads > 0. (common: true)
split_by_rg (Boolean, default=false): Include one entry per read group (common: true)
Outputs
endedness_file (File)
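The paired_deviance parameter describes a tolerance around an even split of f+l- and f-l+ reads. The Python sketch below shows the arithmetic under the assumption that f+l- counts reads flagged as first but not last in their template, and f-l+ the reverse. It covers only this one check, ignores the other evidence ngsderive reports, and the function name within_paired_deviance is hypothetical.

```python
def within_paired_deviance(first_only: int, last_only: int, paired_deviance: float = 0.0) -> bool:
    """Return True if the f+l- / f-l+ split is within `paired_deviance` of 0.5."""
    total = first_only + last_only
    if total == 0:
        return False  # no first/last reads at all, so the split is undefined
    first_fraction = first_only / total
    return abs(first_fraction - 0.5) <= paired_deviance


# A truncated file can leave some mates unseen, skewing the split slightly.
print(within_paired_deviance(52, 48))                        # default tolerance of 0.0
print(within_paired_deviance(52, 48, paired_deviance=0.05))  # relaxed tolerance
```

This is why the default of 0.0 suits whole-file runs: only when every read is seen can the two counts be expected to match exactly.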