librarian

description
Runs the librarian tool to derive the likely Illumina library preparation protocol used to generate a pair of FASTQ files.
help
WARNING: this tool is not guaranteed to work on all data and may produce nonsensical results. librarian was trained on a limited set of gene-expression-oriented read data from GEO (Gene Expression Omnibus). Input data should therefore be Paired-End, of mouse or human origin, have a read length > 50 bp, and be derived from a library prep kit that is in the librarian database. This version of librarian was trained on the "read one" files of Paired-End sequencing data. It is not intended for use with Single-End data, even though it accepts only a single FASTQ.
outputs
  • report: A tar archive containing the librarian report and raw data.
  • raw_data: The raw data that can be processed by MultiQC.

Inputs

Required

  • _runtime (Any, required)
  • read_one_fastq (File, required): Read one FASTQ of a Paired-End sample to analyze. May be uncompressed or gzipped.
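
Because the help text asks for reads longer than 50 bp and read_one_fastq may be uncompressed or gzipped, a quick pre-flight check can be useful before running the task. The following is a minimal sketch and not part of the task itself: the function names open_fastq and short_read_fraction are hypothetical, it assumes standard four-line FASTQ records, and it only samples the first reads. It cannot verify species, Paired-End origin, or whether the library prep kit is in the librarian database.

```python
import gzip
from itertools import islice

# Threshold taken from the help text: read length should be > 50 bp.
MIN_READ_LENGTH = 50


def open_fastq(path):
    """Open a FASTQ as text, detecting gzip by its two magic bytes."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def short_read_fraction(path, max_reads=1000):
    """Fraction of the first `max_reads` reads that are <= MIN_READ_LENGTH.

    Assumes four lines per record (header, sequence, '+', qualities),
    so sequence lines sit at line indices 1, 5, 9, ...
    """
    total = 0
    short = 0
    with open_fastq(path) as handle:
        for seq in islice(handle, 1, max_reads * 4, 4):
            total += 1
            if len(seq.strip()) <= MIN_READ_LENGTH:
                short += 1
    if total == 0:
        raise ValueError(f"no reads found in {path}")
    return short / total
```

A result near 0.0 suggests the sampled reads meet the length guideline; a high fraction suggests the librarian output should be treated with extra caution.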

Defaults

  • modify_disk_size_gb (Int, default=0): Add to or subtract from dynamic disk space allocation. Default disk size is determined by the size of the inputs. Specified in GB.
  • prefix (String, default=sub(basename(read_one_fastq),"([_\.][rR][12])?(\.subsampled)?\.(fastq|fq)(\.gz)?$","") + ".librarian"): Name of the output tar archive. The extension .tar.gz will be added.
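
The default prefix expression can be hard to read at a glance. As an illustration only, the sketch below mirrors it in Python: the pattern is copied from the default above, while the function name default_prefix is hypothetical. It assumes that Python's re.sub and PurePosixPath(...).name behave like WDL's sub and basename for this pattern, which should hold for ordinary file names.

```python
import re
from pathlib import PurePosixPath

# Pattern copied from the `prefix` default: an optional read marker
# (_R1, .r2, ...), an optional ".subsampled", the FASTQ extension,
# and an optional ".gz", all anchored at the end of the name.
SUFFIX_PATTERN = r"([_\.][rR][12])?(\.subsampled)?\.(fastq|fq)(\.gz)?$"


def default_prefix(read_one_fastq: str) -> str:
    """Approximate the task's default `prefix` for a given input path."""
    name = PurePosixPath(read_one_fastq).name  # stands in for basename()
    return re.sub(SUFFIX_PATTERN, "", name) + ".librarian"


if __name__ == "__main__":
    for path in (
        "/data/sample_R1.fastq.gz",
        "sample.r2.subsampled.fq",
        "reads.fastq",
    ):
        print(path, "->", default_prefix(path))
```

Under these assumptions, an input named sample_R1.fastq.gz gives the prefix sample.librarian, so the output archive would be named sample.librarian.tar.gz. A file name that does not end in a recognized FASTQ extension is left unchanged apart from the appended .librarian.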

Outputs

  • report (File)
  • raw_data (File)