LENS Outputs

RAFT outputs are available in a project’s outputs/ directory. The outputs/ directory can contain a variety of subdirectories but generally includes at least qc/, reports/, and samples/.

Tool-specific outputs

LENS runs a multitude of tools to predict tumor antigens. The outputs from these tools are available in the outputs/samples/ directory (described below). Some of these outputs include:

  • Trimmed FASTQs (fastp)

  • BAMs (bwa-mem2 for DNA alignment, samblaster for duplicate marking, star for RNA alignment)

  • Transcript counts (salmon)

  • Somatic variants (mutect2, varscan2, strelka2)

  • Annotated somatic variants (snpeff)

  • Germline variants (deepvariant)

  • Phased variants (jacquard)

  • HLA calls (seq2hla)

  • HLA allele-specific expression (seq2hla)

  • Splice variants (snaf)

  • Expressed viruses (modified VirDetect)

  • Fusion variants (starfusion)

  • Tumor purity (sequenza)

  • Copy Number Alterations (cnvkit)

  • Expressed ERVs

  • Gene signatures (binfotron)

Hierarchical Output Directory Structure

Directories containing sample-level outputs (qc/ and samples/) follow this hierarchy:

samples/
  \__<DATASET>
        \__<PAT_NAME>
              \__<RUN_NAME>

Therefore, sample outputs are defined by the Dataset, Pat_Name, and Run_Name columns from the user-provided manifest.

Some steps, such as variant calling, involve multiple samples from the same patient. In this case, the <RUN_NAME>/ directory is named <RUN_NAME_1>_<RUN_NAME_2>/.
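
The naming scheme above can be sketched in a few lines of Python. This is an illustrative helper only, not part of LENS or RAFT: the function name sample_output_dir and its arguments are hypothetical, and the order in which run names are joined for multi-sample outputs is an assumption that should be checked against an actual outputs/ directory.

```python
from pathlib import Path


def sample_output_dir(outputs_root, dataset, pat_name, run_names, section="samples"):
    """Build the expected directory for sample-level outputs.

    dataset, pat_name and run_names correspond to the Dataset, Pat_Name and
    Run_Name manifest columns. run_names may be a single run name or a list
    of run names for steps that combine several samples from one patient.
    """
    if isinstance(run_names, str):
        run_names = [run_names]
    # Assumption: combined runs are joined with "_" in the order given.
    run_dir = "_".join(run_names)
    return Path(outputs_root) / section / dataset / pat_name / run_dir


# Single-sample directory for the example used in this document.
print(sample_output_dir("outputs", "Foo_2024", "Pt01", "ar-279").as_posix())
```

Passing section="qc" builds the matching path under qc/, since both directories share the same hierarchy.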

samples/

Sample-level outputs can be found within the samples/ directory. Outputs are partitioned by sample where appropriate. For example, the outputs for sample ar-279 for patient Pt01 in dataset Foo_2024 can be found in outputs/samples/Foo_2024/Pt01/ar-279/.

ls samples/Foo_2024/Pt01/ar-279/
fastp
salmon_aln_quant
samtools_coverage
seq2hla
seqtk_subseq
starfusion

Each subdirectory (e.g. starfusion/) contains symbolic links to the output files stored within the project’s work/ directory.
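
Because these entries are symbolic links, it can be useful to see which files they actually point to, for example before copying or archiving results. The following is a minimal sketch using only the Python standard library; the function name resolve_output_links is hypothetical and not provided by LENS.

```python
from pathlib import Path


def resolve_output_links(sample_dir):
    """Map each symbolic link under sample_dir to the file it points to."""
    links = {}
    for path in sorted(Path(sample_dir).rglob("*")):
        if path.is_symlink():
            # resolve() follows the link to its real location on disk.
            links[path] = path.resolve()
    return links


# Example: list link targets for one sample directory.
# Prints nothing if the directory does not exist or contains no links.
for link, target in resolve_output_links("outputs/samples/Foo_2024/Pt01/ar-279").items():
    print(f"{link} -> {target}")
```

A link whose target has been removed from work/ still appears in the result, so checking target.exists() on each value is one way to spot broken links.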

qc/

Quality control data for each sample can be found in the qc/ directory. These outputs are generally tool-level reported metrics. The funnel analyses for each tumor antigen source can also be found in the qc/ directory.

reports/

The reports/ directory contains reports generated by Nextflow:

  • dag.dot - A directed acyclic graph dot file that can be used to understand the flow among processes. The .dot file can be converted to a graph using graphviz.

  • report.html - A report with a wealth of information regarding each process run by the workflow. Useful for understanding resource limitations within the workflow.

  • timeline.html - A report showing a graphical representation of each process’s run time.

  • trace.txt - A text file containing information about each process run. Useful for troubleshooting as process names can be grepped. This file exists within the project’s log/ directory while the workflow is running.
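
As an alternative to grepping trace.txt, the file can be filtered programmatically. The sketch below assumes the trace file is tab-separated with a header row that includes a name column, as in Nextflow's default trace format; if the workflow configures different trace fields, the column name will need adjusting. The function name find_trace_rows is hypothetical.

```python
import csv


def find_trace_rows(trace_path, name_fragment):
    """Return trace rows whose process name contains name_fragment."""
    with open(trace_path, newline="") as handle:
        # Assumption: tab-separated file with a header row and a "name" column.
        reader = csv.DictReader(handle, delimiter="\t")
        return [row for row in reader if name_fragment in (row.get("name") or "")]


# Example usage (path and process name fragment are illustrative):
# for row in find_trace_rows("outputs/reports/trace.txt", "starfusion"):
#     print(row["name"], row.get("status"), row.get("exit"))
```

Each returned row is a dictionary keyed by the header fields, so other columns present in the file can be inspected in the same way.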