Interpreting the Report

The LENS workflow concludes by generating a report consisting of predicted pMHCs. The report is a tab-separated file (*.report.tsv) located in the project’s outputs/lens/<DATASET>/<PAT_NAME>/<RUN_NAME>/ directory.
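As a rough illustration of that layout, the sketch below builds a throwaway directory tree and lists every report file under it. The dataset, patient, and run names, and the helper name find_reports, are made up for this example and are not part of LENS or RAFT.

```shell
#!/bin/sh
# Build a throwaway project tree that mimics the documented layout.
# The dataset, patient, and run names below are hypothetical.
project=$(mktemp -d)
run_dir="$project/outputs/lens/example_dataset/example_patient/example_run"
mkdir -p "$run_dir"
: > "$run_dir/example.report.tsv"

# find_reports PROJECT_DIR: list every *.report.tsv under outputs/lens/.
find_reports() {
    find "$1/outputs/lens" -type f -name '*.report.tsv' | LC_ALL=C sort
}

find_reports "$project"
```

In a real project, pass the project directory instead of the temporary one.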

Tip

The interactive RAFT Output Viewer (launched with raft generate-reports) provides the best way to explore the LENS report — with pMHC cards, filtering, visualizations, and QC dashboards. See Accessing and Reviewing Outputs for details.

This page is a high-level guide to reading and using the LENS report. For an exhaustive column-by-column reference, see Report Column Reference.

What the report contains

Each row in the report represents a predicted pMHC (peptide-MHC complex). A single report combines candidates from all seven antigen sources:

  • Somatic Nucleotide Variants (SNVs)

  • Somatic Insertion and Deletion Variants (InDels)

  • Splice Variants

  • Fusion Events

  • Endogenous Retroviruses (ERVs)

  • Viruses

  • Cancer-testis Antigens (CTAs)

Because the report combines candidates from several antigen sources, many columns only apply to one source. For example, fusion breakpoint columns are populated only for fusion-derived pMHCs, and InDel allele columns are populated only for InDel-derived pMHCs.

Key columns

The following columns are the most important for initial review:

| Column | Description |
| --- | --- |
| candidate_id | Stable row identifier used to join report rows, review scores, and review evidence JSON. |
| allele | The HLA allele presenting the peptide. |
| peptide | The peptide sequence. |
| antigen_source | The source of the antigen (SNV, INDEL, SPLICE, FUSION, ERV, VIRUS, or CTA/SELF). |
| mhcflurry_2.1.1.aff | Binding affinity (nM) from MHCflurry. Lower values indicate stronger binding. |
| priority_score_recommended | Recommended priority score for cross-source ranking (higher is better). |
| priority_score_basis | Which scoring model was used for priority_score_recommended. |
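Because the report is wide, it can help to print each column name next to its 1-based position before filtering with awk or sort. This is a minimal sketch on a tiny made-up stand-in for a real report; the sample row and the helper name list_columns are illustrative only.

```shell
#!/bin/sh
# Build a tiny stand-in report; a real *.report.tsv has many more columns.
report=$(mktemp)
{
    printf 'candidate_id\tallele\tpeptide\tantigen_source\n'
    printf 'c1\tHLA-A*02:01\tSIINFEKL\tSNV\n'
} > "$report"

# list_columns FILE: print "<position><TAB><column name>" for each header field.
list_columns() {
    head -n 1 "$1" | tr '\t' '\n' | awk '{ print NR "\t" $0 }'
}

list_columns "$report"
```

The position printed for a column is the number to use as an awk field ($N) or a sort key (-kN,N).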

Prioritizing candidates

The priority_score_recommended column provides a single score for ranking pMHCs across all antigen sources. It is calculated from binding affinity, allele-specific transcript abundance, and cancer cell fraction (CCF, when available).

Use priority_score_recommended as the primary cross-source sorting score. The companion priority_score_basis column records which underlying score was selected for each row (e.g., MHCFLURRY_WITH_CCF, MHCFLURRY_NO_CCF, or NETMHCPAN_WITH_CCF).

Agretopicity and CCF-aware priority scores are most meaningful for variant sources where a matched wildtype sequence and tumor context are available. For non-variant sources, and for variant rows without a matched wildtype peptide or CCF input, these fields can be missing or marked as not computed. This is why priority_score_recommended exists — it selects the best available score for each row regardless of source.

For the prioritization formula, see the Frequently Asked Questions.
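To see which scoring basis was used for each antigen source, one option is to tally priority_score_basis per antigen_source. The sketch below runs on made-up rows; the helper name basis_by_source and the sample values are assumptions for illustration, and the columns are located by header name rather than by a fixed position.

```shell
#!/bin/sh
# Tiny made-up report with only the columns this example needs.
report=$(mktemp)
{
    printf 'candidate_id\tantigen_source\tpriority_score_basis\n'
    printf 'c1\tSNV\tMHCFLURRY_WITH_CCF\n'
    printf 'c2\tSNV\tMHCFLURRY_WITH_CCF\n'
    printf 'c3\tERV\tMHCFLURRY_NO_CCF\n'
} > "$report"

# basis_by_source FILE: count rows per (antigen_source, priority_score_basis).
basis_by_source() {
    awk -F'\t' '
        NR == 1 {                        # locate both columns by header name
            for (i = 1; i <= NF; i++) {
                if ($i == "antigen_source") s = i
                if ($i == "priority_score_basis") b = i
            }
            next
        }
        { n[$s "\t" $b]++ }              # tally each source/basis pair
        END { for (k in n) print k "\t" n[k] }
    ' "$1" | LC_ALL=C sort
}

basis_by_source "$report"
```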

Understanding missing values

LENS uses explicit values for missing or non-applicable data:

| Value | Meaning |
| --- | --- |
| NOT_APPLICABLE | The field does not apply to the candidate’s antigen source or context. |
| NOT_COMPUTED | The computation did not run or the required input was unavailable. |
| NO_DETECTED_MUTATION | A mutation scan ran and did not detect a mutation in the requested gene. |
| NO_MATCHED_WT | A matched wildtype peptide or sequence could not be identified. |
| NO_LOHHLA_RESULT_FOR_ALLELE | LOHHLA ran but did not report a p-value for the candidate’s presented allele. |
| NA | Legacy or tool-derived missing value. Check the column and antigen source before interpreting. |

LOHHLA p-values are allele-level annotations. Use lohhla_allele_loss_status to distinguish an available p-value from an allele that did not have a LOHHLA result. Missing p-values should not be interpreted as evidence against HLA loss.
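A quick way to see where these placeholder values occur is to count them per column. The following is a sketch under assumptions: the column names example_wt_peptide and example_ccf, the sample rows, and the helper name missing_summary are hypothetical, while the placeholder strings are the ones listed above.

```shell
#!/bin/sh
# Made-up report; the example_* column names are hypothetical.
report=$(mktemp)
{
    printf 'candidate_id\tantigen_source\texample_wt_peptide\texample_ccf\n'
    printf 'c1\tSNV\tSIINFEKL\t0.8\n'
    printf 'c2\tSNV\tNO_MATCHED_WT\tNOT_COMPUTED\n'
    printf 'c3\tERV\tNOT_APPLICABLE\tNOT_APPLICABLE\n'
} > "$report"

# missing_summary FILE: print "<column><TAB><placeholder><TAB><count>".
missing_summary() {
    awk -F'\t' '
        BEGIN {
            # Placeholder strings documented for the LENS report.
            split("NOT_APPLICABLE NOT_COMPUTED NO_DETECTED_MUTATION " \
                  "NO_MATCHED_WT NO_LOHHLA_RESULT_FOR_ALLELE NA", v, " ")
            for (i in v) sentinel[v[i]] = 1
        }
        NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
        {
            for (i = 1; i <= NF; i++)
                if ($i in sentinel) count[name[i] "\t" $i]++
        }
        END { for (k in count) print k "\t" count[k] }
    ' "$1" | LC_ALL=C sort
}

missing_summary "$report"
```

Columns that never contain a placeholder value do not appear in the output.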

Review columns

The final report includes review scoring columns produced by the review evidence workflow:

| Column | Description |
| --- | --- |
| candidate_id | Stable row identifier used to join report rows, review scores, and review evidence JSON. |
| alignment_confidence_score | Review score summarizing alignment and read-level confidence evidence. |
| final_tier | Review tier assigned to the candidate (PASS, REVIEW, or FAIL). |
| reasons | Semicolon-delimited review flags that contributed to the score or tier. Can be empty. |
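Since reasons packs several flags into one cell, splitting it on semicolons makes it easier to see which flags are most common. The sketch below assumes the flags are separated by a bare semicolon with no surrounding spaces; the flag names, sample rows, and helper name reason_counts are hypothetical.

```shell
#!/bin/sh
# Made-up review columns; the example_flag_* names are hypothetical.
review=$(mktemp)
{
    printf 'candidate_id\tfinal_tier\treasons\n'
    printf 'c1\tPASS\t\n'
    printf 'c2\tREVIEW\texample_flag_a;example_flag_b\n'
    printf 'c3\tFAIL\texample_flag_a\n'
} > "$review"

# reason_counts FILE: print "<flag><TAB><number of rows carrying it>".
reason_counts() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "reasons") r = i; next }
        {
            n = split($r, flags, ";")    # an empty cell yields n == 0
            for (j = 1; j <= n; j++)
                if (flags[j] != "") count[flags[j]]++
        }
        END { for (f in count) print f "\t" count[f] }
    ' "$1" | LC_ALL=C sort
}

reason_counts "$review"
```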

Common filtering strategies

Filter by antigen source

To focus on a specific antigen source:

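awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "antigen_source") c = i; print; next }
            $c == "SNV"' report.tsv

The command finds the antigen_source column by its header name, keeps the header row, and prints only the rows whose source is SNV. Replace "SNV" with another value to select a different source. Valid values: SNV, INDEL, SPLICE, FUSION, ERV, VIRUS, CTA/SELF.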

Filter by binding affinity

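By default, LENS keeps only pMHCs with a predicted binding affinity below 500 nM. To apply a stricter threshold (e.g., < 50 nM) to the mhcflurry_2.1.1.aff column:

awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "mhcflurry_2.1.1.aff") c = i; print; next }
            $c ~ /^[0-9]/ && $c + 0 < 50' report.tsv

The numeric check skips rows where the affinity cell holds a missing-value marker such as NA.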

Sort by priority score

To rank candidates across all antigen sources:

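col=$(head -n 1 report.tsv | tr '\t' '\n' | grep -nx 'priority_score_recommended' | cut -d: -f1)
{ head -n 1 report.tsv; tail -n +2 report.tsv | sort -t$'\t' -k"${col},${col}nr"; }

The first line looks up the column number of priority_score_recommended. The second keeps the header row in place and sorts the remaining rows by that column alone, in descending numeric order.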
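The filtering strategies in this section can also be chained into a single pass. The sketch below keeps rows for one antigen source under an affinity cutoff and ranks them by priority score; the helper name top_candidates and the sample rows are made up, and the column names are the ones listed under Key columns.

```shell
#!/bin/sh
# Made-up report with only the columns this example needs.
report=$(mktemp)
{
    printf 'candidate_id\tantigen_source\tmhcflurry_2.1.1.aff\tpriority_score_recommended\n'
    printf 'c1\tSNV\t12.5\t0.40\n'
    printf 'c2\tSNV\t30.0\t0.90\n'
    printf 'c3\tSNV\t420.0\t0.95\n'
    printf 'c4\tERV\t8.0\t0.70\n'
    printf 'c5\tSNV\tNA\t0.99\n'
} > "$report"

# top_candidates FILE SOURCE MAX_NM: header, then matching rows sorted by
# priority_score_recommended (highest first).
top_candidates() {
    head -n 1 "$1"
    awk -F'\t' -v src="$2" -v max="$3" '
        NR == 1 {                        # locate columns by header name
            for (i = 1; i <= NF; i++) {
                if ($i == "antigen_source") s = i
                if ($i == "mhcflurry_2.1.1.aff") a = i
                if ($i == "priority_score_recommended") p = i
            }
            next
        }
        # Keep numeric affinities below the cutoff; prepend the score as a sort key.
        $s == src && $a ~ /^[0-9]/ && $a + 0 < max + 0 { print $p "\t" $0 }
    ' "$1" | LC_ALL=C sort -t "$(printf '\t')" -k1,1nr | cut -f2-
}

top_candidates "$report" SNV 50
```

Prepending the score as a temporary first field avoids having to know its column number when calling sort; cut removes it again afterwards.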

Run QC summary

Next to the final report, LENS writes *.run_qc_summary.tsv — a compact, long-format TSV with section, metric, value, level, and detail columns. It summarizes:

  • Report row counts

  • Candidate ID integrity

  • Antigen-source counts

  • Missingness for important report columns

  • Final-report validation messages
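Because the summary is in long format, a simple tally of the level column gives a first impression of a run. This is a sketch only: the section, metric, and level values below are placeholders rather than real LENS output, and level_counts is a hypothetical helper name.

```shell
#!/bin/sh
# Made-up QC summary; all cell values are placeholders.
qc=$(mktemp)
{
    printf 'section\tmetric\tvalue\tlevel\tdetail\n'
    printf 'example_section\texample_metric_a\t10\tLEVEL_A\t\n'
    printf 'example_section\texample_metric_b\t0\tLEVEL_B\tplaceholder detail\n'
    printf 'other_section\texample_metric_c\t3\tLEVEL_A\t\n'
} > "$qc"

# level_counts FILE: print "<level><TAB><number of rows>".
level_counts() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "level") l = i; next }
        { n[$l]++ }
        END { for (k in n) print k "\t" n[k] }
    ' "$1" | LC_ALL=C sort
}

level_counts "$qc"
```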

Execution trace status is not included in this summary because Nextflow writes trace/report files after the workflow DAG finishes. Use RAFT project status or raft generate-reports to inspect task status counts and failed or retried tasks from outputs/reports/trace*.txt.

Further reading