Report Column Reference

Note

For a surface-level guide to reading and using the LENS report, see Interpreting the Report. This page provides exhaustive column-by-column documentation for every antigen source.

The LENS workflow concludes by generating a report consisting of predicted pMHCs. The report can be found in the project’s outputs/lens/<DATASET>/<PAT_NAME>/<RUN_NAME>/ directory.

By default, the LENS report includes information about each predicted pMHC from every antigen source (e.g. SNV, InDel, ERV). Because all antigen sources share one table, many columns are not relevant to a given pMHC (for example, a left_gene column intended for a fusion-derived pMHC is not relevant to an SNV-derived pMHC). The columns relevant to each antigen source are broken down below.

Note

Descriptive pMHC columns derived from an external tool are included only when that tool is specified within the workflow.

Final report semantics

The final *.report.tsv file is the reviewed LENS report. It includes a stable candidate_id for each row and the review scoring columns produced by the review evidence workflow. Intermediate unreviewed reports may be present in sample-level output directories, but the report under outputs/lens is the file intended for downstream review.

The same directory also contains *.run_qc_summary.tsv. This long-format summary records final report row counts, candidate ID integrity checks, candidate counts by antigen source, missingness for important columns, and final-report validation messages. It does not include Nextflow task status counts because trace files are produced after the workflow DAG finishes; use RAFT project status or raft generate-reports to inspect failed or retried tasks from outputs/reports/trace*.txt.

candidate_id values are unique within a report and are assigned in report row order when they are not already present. Evidence JSON files produced during review use the same identifiers, so candidate_id is the preferred key for auditing a report row back to detailed review evidence.
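To illustrate using candidate_id as a join key, the following Python sketch parses report text and indexes its rows by candidate_id, failing loudly on empty or duplicate identifiers. The function name index_by_candidate_id, the inline example rows, and the identifier values C1 and C2 are hypothetical; the real identifier format is not specified on this page.

```python
import csv
import io


def index_by_candidate_id(tsv_text):
    """Return {candidate_id: row}; raise on empty or duplicate identifiers."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    index = {}
    for line_number, row in enumerate(rows, start=2):  # line 1 is the header
        candidate_id = row.get("candidate_id") or ""
        if not candidate_id:
            raise ValueError(f"line {line_number}: empty candidate_id")
        if candidate_id in index:
            raise ValueError(f"line {line_number}: duplicate candidate_id {candidate_id!r}")
        index[candidate_id] = row
    return index


# Made-up two-row report; a real report has many more columns.
example = (
    "candidate_id\tpeptide\tallele\n"
    "C1\tSIINFEKL\tHLA-A*02:01\n"
    "C2\tGILGFVFTL\tHLA-A*02:01\n"
)
report = index_by_candidate_id(example)
print(report["C2"]["peptide"])
```

For a file on disk, the same function can be given the result of reading the *.report.tsv file as text. The resulting dictionary can then be used to look up the report row that corresponds to an identifier found in a review evidence JSON file.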

The final report includes these review columns:

Column Name

Description

candidate_id

Stable row identifier used to join report rows, review scores, and review evidence JSON.

alignment_confidence_score

Review score summarizing alignment and read-level confidence evidence.

final_tier

Review tier assigned to the candidate, such as PASS, REVIEW, or FAIL.

reasons

Semicolon-delimited review flags or explanations that contributed to the score or tier. This field can be empty for candidates without triggered review flags.
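As a minimal sketch of working with the review columns, the Python below selects rows by final_tier and splits the semicolon-delimited reasons field into a list. The helper names and the flag strings flag_a and flag_b are placeholders invented for illustration; they are not real LENS review flags.

```python
def split_reasons(reasons_field):
    """Split the semicolon-delimited reasons field; an empty field gives []."""
    return [flag.strip() for flag in (reasons_field or "").split(";") if flag.strip()]


def rows_with_tier(rows, tier):
    """Keep only the rows whose final_tier equals the requested tier."""
    return [row for row in rows if row.get("final_tier") == tier]


# Hypothetical rows; flag_a and flag_b are placeholder flag names.
rows = [
    {"candidate_id": "C1", "final_tier": "PASS", "reasons": ""},
    {"candidate_id": "C2", "final_tier": "REVIEW", "reasons": "flag_a; flag_b"},
]
for row in rows_with_tier(rows, "REVIEW"):
    print(row["candidate_id"], split_reasons(row["reasons"]))
```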

Missing and non-applicable values

Because the report combines candidates from several antigen sources, many columns only apply to one source. For example, fusion breakpoint columns are expected to be populated only for fusion-derived pMHCs, and InDel allele columns are expected to be populated only for InDel-derived pMHCs.

LENS uses explicit values where possible:

Value

Meaning

NOT_APPLICABLE

The field does not apply to the candidate’s antigen source or context.

NOT_COMPUTED

The computation did not run or the required input was unavailable.

NO_DETECTED_MUTATION

A mutation scan ran and did not detect a mutation in the requested gene.

NO_MATCHED_WT

A matched wildtype peptide or sequence could not be identified for the candidate.

NO_LOHHLA_RESULT_FOR_ALLELE

LOHHLA ran, but did not report a p-value for the candidate’s presented allele.

NA

Legacy or tool-derived missing value. Check the column and antigen source before interpreting it.
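One way to see which columns actually apply to each antigen source is to collect, per source, the columns that hold something other than a non-applicable or missing marker. The Python sketch below does this for rows already loaded as dictionaries. Treating empty cells and NA the same as NOT_APPLICABLE is an assumption made here for simplicity, and the function name and example rows are hypothetical.

```python
from collections import defaultdict

# Markers treated as "nothing to show" in this sketch. Status values such as
# NO_DETECTED_MUTATION are deliberately kept, because they report a real result.
NOT_POPULATED = {"NOT_APPLICABLE", "NA", ""}


def populated_columns_by_source(rows):
    """Map each antigen_source to the set of columns with at least one real value."""
    populated = defaultdict(set)
    for row in rows:
        source = row["antigen_source"]
        for column, value in row.items():
            if value not in NOT_POPULATED:
                populated[source].add(column)
    return dict(populated)


# Hypothetical rows with only a few columns each.
rows = [
    {"antigen_source": "SNV", "snv_ref_allele": "C", "fusion_left_gene": "NOT_APPLICABLE"},
    {"antigen_source": "FUSION", "snv_ref_allele": "NOT_APPLICABLE", "fusion_left_gene": "GENE_A"},
]
for source, columns in sorted(populated_columns_by_source(rows).items()):
    print(source, sorted(columns))
```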

Agretopicity and CCF-aware priority scores are most meaningful for variant sources where a matched wildtype sequence and tumor context are available. For non-variant sources, and for variant rows without a matched wildtype peptide or CCF input, these fields can be missing or marked as not computed. Use priority_score_recommended as the primary cross-source sorting score. The companion priority_score_basis column records which source score was selected for that row.
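The sorting advice above can be sketched in Python as follows. Rows whose priority_score_recommended is not numeric (for example NOT_COMPUTED) are placed after the scored rows rather than dropped. This page does not state the direction of the score, so the higher_is_better parameter is an assumption that should be checked against the scoring documentation; the function names and example values are also hypothetical.

```python
import math


def recommended_score(row):
    """Return priority_score_recommended as a float, or None if it is not numeric."""
    try:
        score = float(row.get("priority_score_recommended", ""))
    except (TypeError, ValueError):
        return None  # e.g. NOT_COMPUTED, NA, or an empty cell
    return None if math.isnan(score) else score


def rank_candidates(rows, higher_is_better=True):
    """Scored rows first (sorted), then rows without a numeric score in input order."""
    scored = [row for row in rows if recommended_score(row) is not None]
    unscored = [row for row in rows if recommended_score(row) is None]
    scored.sort(key=recommended_score, reverse=higher_is_better)  # direction is an assumption
    return scored + unscored


# Hypothetical rows; the score values are made up.
rows = [
    {"candidate_id": "C1", "priority_score_recommended": "0.12", "priority_score_basis": "MHCFLURRY_NO_CCF"},
    {"candidate_id": "C2", "priority_score_recommended": "NOT_COMPUTED", "priority_score_basis": "NOT_COMPUTED"},
    {"candidate_id": "C3", "priority_score_recommended": "0.87", "priority_score_basis": "MHCFLURRY_WITH_CCF"},
]
for row in rank_candidates(rows):
    print(row["candidate_id"], row["priority_score_recommended"], row["priority_score_basis"])
```

Printing priority_score_basis next to the score makes it visible when neighbouring rows were ranked on different underlying scores.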

LOHHLA p-values are allele-level annotations. Use lohhla_allele_loss_status to distinguish an available p-value from an allele that did not have a LOHHLA result. Missing p-values should not be interpreted as evidence against HLA loss.
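The distinction between an available p-value and a missing LOHHLA result can be made explicit in code. The Python sketch below reads the p-value only when lohhla_allele_loss_status is LOH_RESULT_AVAILABLE and otherwise reports the allele as unknown instead of as retained. The function names, the output labels, and the alpha threshold of 0.05 are illustrative assumptions, not LENS defaults.

```python
def lohhla_pvalue(row):
    """Return the LOHHLA p-value as a float, or None when no result is available."""
    if row.get("lohhla_allele_loss_status") != "LOH_RESULT_AVAILABLE":
        return None
    try:
        return float(row["lohhla_allele_loss_pval"])
    except (KeyError, TypeError, ValueError):
        return None


def describe_hla_loss(row, alpha=0.05):
    """Label a row; alpha is a hypothetical threshold chosen for this sketch."""
    p_value = lohhla_pvalue(row)
    if p_value is None:
        return "unknown"  # a missing p-value is not evidence against HLA loss
    return "loss supported" if p_value < alpha else "no significant loss"


# Hypothetical rows.
with_result = {"lohhla_allele_loss_status": "LOH_RESULT_AVAILABLE", "lohhla_allele_loss_pval": "0.003"}
without_result = {"lohhla_allele_loss_status": "NO_LOHHLA_RESULT_FOR_ALLELE", "lohhla_allele_loss_pval": "NA"}
print(describe_hla_loss(with_result))
print(describe_hla_loss(without_result))
```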

Somatic Single Nucleotide Variants (SNVs)

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“SNV” for somatic single nucleotide variant pMHCs)

variant_callers

Variant callers which detected variant

variant_effect

Variant’s effect on protein

mut_aa_pos

Mutated amino acid position index within peptide sequence (0-indexed)

nt_context

Nucleotide sequence context around the coding variant (includes coding variant)

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

snv_alt_allele

Coding variant allele

snv_ref_allele

Reference allele

snv_type

SNV type (“missense” for relevant SNVs)

transcript_id

Ensembl transcript identifier

variant_coords

Genomic coordinates of coding variant

variant_position_in_cds

Where, within the transcript coding sequence, the variant occurs (0-indexed)

n_flank

The N flank protein sequence upstream of peptide

c_flank

The C flank protein sequence downstream of peptide

hlapollo_train_allele

Whether the pMHC’s allele is within HLApollo’s training set or not

hlapollo_mhc_pred_0

Logit score for the input example, higher numbers indicate more likely binding/presentation

hlapollo_mhc_pred_0_rank

Percentile Rank for the input example, lower numbers indicate more likely binding/presentation

pepsickle_0.2.1_max_pep_aa_cleav_score

The maximum cleavage score among the amino acids within the peptide of interest

pepsickle_0.2.1_plus1_aa_cleav_score

The cleavage score of the amino acid immediately downstream of peptide of interest

pepsickle_0.2.1_score

pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score) – higher is better

rna_reads_covering_genomic_origin

Number of RNA tumor reads covering genomic origin of pMHC

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

proportion_rna_reads_covering_genomic_origin_with_peptide_cds

rna_reads_covering_genomic_origin_with_peptide_cds / rna_reads_covering_genomic_origin

mhcflurry_agretopicity

Agretopicity calculated by MHCflurry using BLASTP-derived nearest Wildtype sequence

mhcflurry_agretopicity_status

Status for MHCflurry agretopicity calculation: WT_MATCH_FOUND, NO_WT_MATCH, or WT_TOOL_SCORE_MISSING

vaf

Variant allele frequency

totcopynum

Total copy number of the variant allele

multiplicity

Multiplicity value used for cancer cell fraction (CCF) calculations

ccf

Cancer cell fraction (clonality)

gene_name

Gene symbol

gene_id

Ensembl gene identifier

mean_mtec_tpm

Mean expression (in TPM) of SNV-harboring transcript in mTEC cells

median_mtec_tpm

Median expression (in TPM) of SNV-harboring transcript in mTEC cells

stdev_mtec_tpm

Standard deviation of expression (in TPM) of SNV-harboring transcript in mTEC cells

mean_mtec_num_reads

Mean count of RNA-seq reads for SNV-harboring transcript in mTEC cells

median_mtec_num_reads

Median count of RNA-seq reads for SNV-harboring transcript in mTEC cells

stdev_mtec_num_reads

Standard deviation of count of RNA-seq reads for SNV-harboring transcript in mTEC cells

gene_detectable_normal_tissues

Normal tissues with detectable expression of SNV-harboring transcript (likely irrelevant for pMHC prioritization)

gene_main_subcellular_location

Subcellular location of SNV-harboring transcript

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

priority_score_mhcflurry

Prioritization score of pMHC using MHCflurry binding affinity

priority_score_mhcflurry_no_ccf

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity

priority_score_mhcflurry_prim_alns

Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification

priority_score_mhcflurry_no_ccf_prim_alns

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

priority_score_recommended

Recommended priority score for cross-source ranking, selected from the best available priority score column for the row

priority_score_basis

Basis for priority_score_recommended, such as MHCFLURRY_PRIMARY_ALIGNMENTS_WITH_CCF, MHCFLURRY_WITH_CCF, MHCFLURRY_PRIMARY_ALIGNMENTS_NO_CCF, MHCFLURRY_NO_CCF, NETMHCPAN_PRIMARY_ALIGNMENTS_WITH_CCF, NETMHCPAN_WITH_CCF, NETMHCPAN_PRIMARY_ALIGNMENTS_NO_CCF, NETMHCPAN_NO_CCF, MAXIMUM_AVAILABLE_PRIORITY_SCORE, or NOT_COMPUTED
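Because mut_aa_pos is 0-indexed, a value of 0 refers to the first residue of the peptide. The short Python sketch below uses that convention to bracket the mutated residue when displaying an SNV-derived peptide. The function name mark_mutation and the example peptide are hypothetical, and the sketch assumes mut_aa_pos holds a single integer.

```python
def mark_mutation(peptide, mut_aa_pos):
    """Return the peptide with the residue at the 0-indexed position in brackets."""
    position = int(mut_aa_pos)  # report cells are read as text
    if not 0 <= position < len(peptide):
        raise ValueError(f"mut_aa_pos {position} is outside peptide of length {len(peptide)}")
    return peptide[:position] + "[" + peptide[position] + "]" + peptide[position + 1:]


# Hypothetical example: position 3 is the fourth residue.
print(mark_mutation("SIINFEKL", "3"))
```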

Somatic Insertions and Deletions (InDels)

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“INDEL” for somatic insertion and deletion pMHCs)

variant_callers

Variant callers which detected variant

variant_effect

Variant’s effect on protein

indel_alt_allele

Coding variant allele

indel_ref_allele

Reference allele

indel_type

Insertion or deletion type

nt_context

Nucleotide sequence context around the coding variant (includes coding variant)

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

transcript_id

Ensembl transcript identifier

valid_ref_orf

Confirms translated reading frame is correct

variant_coords

Genomic coordinates of coding variant

n_flank

The N flank protein sequence upstream of peptide

c_flank

The C flank protein sequence downstream of peptide

hlapollo_train_allele

Whether the pMHC’s allele is within HLApollo’s training set or not

hlapollo_mhc_pred_0

Logit score for the input example, higher numbers indicate more likely binding/presentation

hlapollo_mhc_pred_0_rank

Percentile Rank for the input example, lower numbers indicate more likely binding/presentation

pepsickle_0.2.1_max_pep_aa_cleav_score

The maximum cleavage score among the amino acids within the peptide of interest

pepsickle_0.2.1_plus1_aa_cleav_score

The cleavage score of the amino acid immediately downstream of peptide of interest

pepsickle_0.2.1_score

pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score) – higher is better

rna_reads_covering_genomic_origin

Number of RNA tumor reads covering genomic origin of pMHC

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

proportion_rna_reads_covering_genomic_origin_with_peptide_cds

rna_reads_covering_genomic_origin_with_peptide_cds / rna_reads_covering_genomic_origin

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

mhcflurry_agretopicity

Agretopicity calculated by MHCflurry using BLASTP-derived nearest Wildtype sequence

mhcflurry_agretopicity_status

Status for MHCflurry agretopicity calculation: WT_MATCH_FOUND, NO_WT_MATCH, or WT_TOOL_SCORE_MISSING

vaf

Variant allele frequency

totcopynum

Total copy number of the variant allele

multiplicity

Multiplicity value used for cancer cell fraction (CCF) calculations

ccf

Cancer cell fraction (clonality)

gene_name

Gene symbol

gene_id

Ensembl gene identifier

mean_mtec_tpm

Mean expression (in TPM) of InDel-harboring transcript in mTEC cells

median_mtec_tpm

Median expression (in TPM) of InDel-harboring transcript in mTEC cells

stdev_mtec_tpm

Standard deviation of expression (in TPM) of InDel-harboring transcript in mTEC cells

mean_mtec_num_reads

Mean count of RNA-seq reads for InDel-harboring transcript in mTEC cells

median_mtec_num_reads

Median count of RNA-seq reads for InDel-harboring transcript in mTEC cells

stdev_mtec_num_reads

Standard deviation of count of RNA-seq reads for InDel-harboring transcript in mTEC cells

gene_detectable_normal_tissues

Normal tissues with detectable expression of InDel-harboring transcript (likely irrelevant for pMHC prioritization)

gene_main_subcellular_location

Subcellular location of InDel-harboring transcript

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

priority_score_mhcflurry

Prioritization score of pMHC using MHCflurry binding affinity

priority_score_mhcflurry_no_ccf

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity

priority_score_mhcflurry_prim_alns

Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification

priority_score_mhcflurry_no_ccf_prim_alns

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

Splice variants

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“SPLICE” for splice variant pMHCs)

coding_sequence

Nucleotide coding sequence for peptide of interest

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

splice_coords

Coordinates of the splice event

splice_description

Description of the splice event (see SNAF documentation)

n_flank

The N flank protein sequence upstream of peptide

c_flank

The C flank protein sequence downstream of peptide

hlapollo_train_allele

Whether the pMHC’s allele is within HLApollo’s training set or not

hlapollo_mhc_pred_0

Logit score for the input example, higher numbers indicate more likely binding/presentation

hlapollo_mhc_pred_0_rank

Percentile Rank for the input example, lower numbers indicate more likely binding/presentation

pepsickle_0.2.1_max_pep_aa_cleav_score

The maximum cleavage score among the amino acids within the peptide of interest

pepsickle_0.2.1_plus1_aa_cleav_score

The cleavage score of the amino acid immediately downstream of peptide of interest

pepsickle_0.2.1_score

pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score) – higher is better

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

gene_id

Ensembl gene identifier

origin_descriptor

Generic identifier

priority_score_mhcflurry

Prioritization score of pMHC using MHCflurry binding affinity

priority_score_mhcflurry_no_ccf

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity

priority_score_mhcflurry_prim_alns

Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification

priority_score_mhcflurry_no_ccf_prim_alns

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

Fusions

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“FUSION” for fusion-derived pMHCs)

fusion_annotation

Annotation information about the fusion event

fusion_id

Fusion identifier

fusion_left_breakpoint

Fusion’s left breakpoint

fusion_left_gene

Fusion’s left gene symbol

fusion_left_transcript

Fusion’s left Ensembl transcript identifier

fusion_right_breakpoint

Fusion’s right breakpoint

fusion_right_gene

Fusion’s right gene symbol

fusion_right_transcript

Fusion’s right Ensembl transcript identifier

fusion_type

Type of fusion

nt_context

Nucleotide sequence context around the coding variant (includes coding variant)

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

n_flank

The N flank protein sequence upstream of peptide

c_flank

The C flank protein sequence downstream of peptide

hlapollo_train_allele

Whether the pMHC’s allele is within HLApollo’s training set or not

hlapollo_mhc_pred_0

Logit score for the input example, higher numbers indicate more likely binding/presentation

hlapollo_mhc_pred_0_rank

Percentile Rank for the input example, lower numbers indicate more likely binding/presentation

pepsickle_0.2.1_max_pep_aa_cleav_score

The maximum cleavage score among the amino acids within the peptide of interest

pepsickle_0.2.1_plus1_aa_cleav_score

The cleavage score of the amino acid immediately downstream of peptide of interest

pepsickle_0.2.1_score

pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score) – higher is better

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

mhcflurry_agretopicity

Agretopicity calculated by MHCflurry using BLASTP-derived nearest Wildtype sequence

mhcflurry_agretopicity_status

Status for MHCflurry agretopicity calculation: WT_MATCH_FOUND, NO_WT_MATCH, or WT_TOOL_SCORE_MISSING

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default; TPMs listed separately for each original transcript)

priority_score_mhcflurry

Prioritization score of pMHC using MHCflurry binding affinity

priority_score_mhcflurry_no_ccf

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity

priority_score_mhcflurry_prim_alns

Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification

priority_score_mhcflurry_no_ccf_prim_alns

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

ERVs

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“ERV” for ERV-derived pMHCs)

erv_norm_cpm

ERV counts per million (normal RNA sample)

erv_orf_id

ERV open reading frame identifier (from gEVE)

erv_orf_raw_read_count

ERV open reading frame raw read count (tumor RNA sample)

erv_orf_tpm

ERV open reading frame transcripts per million

erv_tumor_cpm

ERV counts per million (tumor RNA sample)

erv_tumor_cpm_to_norm_cpm_delta

Difference between tumor CPM and normal CPM (log(tumor cpm + 1) - log(normal cpm + 1))

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

n_flank

The N flank protein sequence upstream of peptide

c_flank

The C flank protein sequence downstream of peptide

hlapollo_train_allele

Whether the pMHC’s allele is within HLApollo’s training set or not

hlapollo_mhc_pred_0

Logit score for the input example, higher numbers indicate more likely binding/presentation

hlapollo_mhc_pred_0_rank

Percentile Rank for the input example, lower numbers indicate more likely binding/presentation

pepsickle_0.2.1_max_pep_aa_cleav_score

The maximum cleavage score among the amino acids within the peptide of interest

pepsickle_0.2.1_plus1_aa_cleav_score

The cleavage score of the amino acid immediately downstream of peptide of interest

pepsickle_0.2.1_score

pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score) – higher is better

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

erv_hervq_region

ERV open reading frame’s hERVQuant region (if any)

erv_geve_annot

ERV open reading frame gEVE annotation

erv_ribo_cov_mean

ERV open reading frame ribo-seq mean coverage (external references)

erv_ribo_probe_count

ERV open reading frame ribo-seq probe count (external references)

erv_hervq_region_total_erv_orf_count

Total number of ERV open reading frames in corresponding hERVQuant region (if any) – more is better

erv_hervq_region_ribo_covd_erv_orf_count

Total number of ERV open reading frames in corresponding hERVQuant region with ribo-seq coverage (if any) – more is better

erv_mtec_exp_status

Whether the ERV open reading frame is expressed in mTEC cells (no expression is better)

erv_norm_exp_status

Whether the ERV open reading frame is expressed in normal cells (no expression is better)

erv_hervq_region_proteins_list

List of ERV proteins within ERV open reading frame’s corresponding hERVQuant region

erv_hervq_region_erv_uniq_proteins_count

Number of ERV proteins within ERV open reading frame’s corresponding hERVQuant region

erv_hervq_region_avg_exp_corr

Average expression correlation among ERV open reading frames within corresponding hERVQuant region (external references)

erv_hervq_region_pairwise_corr_count

Number of pairwise ERV open reading frame expression correlations used to calculate average expression correlation (external references)

erv_hervq_region_score

Score used for calculating ERV confidence score (1 if in hERVQuant region; 0 otherwise)

erv_annot_score

Score used for calculating ERV confidence score (1 if valid ERV protein; 0 otherwise)

erv_ribo_cov_mean_score

Score used for calculating ERV confidence score (1 if ribo-seq coverage; 0 otherwise)

erv_total_erv_count_score

Score used for calculating ERV confidence score (1 if >1 ERV ORFs in hERVQuant region; 0 otherwise)

erv_ribo_covd_erv_count_score

Score used for calculating ERV confidence score (1 if >1 ERV ORFs with ribo-seq coverage; 0 otherwise)

erv_mtec_exp_status_score

Score used for calculating ERV confidence score (1 if not expressed in mTECs; 0 otherwise)

erv_norm_exp_status_score

Score used for calculating ERV confidence score (1 if not expressed in normal tissues; 0 otherwise)

erv_uniq_proteins_count_score

Score used for calculating ERV confidence score (0 if no ERV proteins, 0.25 if one unique ERV protein in hERVQuant region; 0.5 if two unique ERV proteins in hERVQuant region; 0.75 if three unique ERV proteins in hERVQuant region; 1.0 if four unique ERV proteins in hERVQuant region)

erv_avg_exp_corr_within_hervq_region_score

Score used for calculating ERV confidence score (expression correlation value, bounded within [0, 1])

erv_raw_erv_orf_confidence_score

Raw ERV score

erv_normd_erv_orf_confidence_score

Normalized ERV score (bounded within [0, 1])

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

priority_score_mhcflurry

Prioritization score of pMHC using MHCflurry binding affinity

priority_score_mhcflurry_no_ccf

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity

priority_score_mhcflurry_prim_alns

Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification

priority_score_mhcflurry_no_ccf_prim_alns

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification
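Two of the ERV definitions above are simple enough to reproduce by hand when spot-checking a row: the tumor-to-normal CPM delta and the unique-protein component score. The Python sketch below follows the formulas as written in the table. The natural logarithm is an assumption, since the table does not state the log base, and capping the unique-protein score at 1.0 for counts above four is also an assumption. Both function names are hypothetical.

```python
import math


def cpm_delta(tumor_cpm, normal_cpm):
    """log(tumor cpm + 1) - log(normal cpm + 1); natural log is assumed here."""
    return math.log(tumor_cpm + 1) - math.log(normal_cpm + 1)


def uniq_proteins_score(unique_protein_count):
    """0.25 per unique ERV protein in the region, from 0 up to 1.0 at four proteins.

    Counts above four are capped at 1.0, which is an assumption of this sketch.
    """
    return min(max(unique_protein_count, 0), 4) * 0.25


# Made-up values for illustration.
print(cpm_delta(12.0, 0.5))
print(uniq_proteins_score(3))
```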

Viruses

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“VIRUS” for viral pMHCs)

pep_context

Sequence context around peptide of interest

virus_id

Viral identifier

rna_reads_covering_genomic_origin

Number of RNA tumor reads covering genomic origin of pMHC

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

origin_descriptor

Generic identifier

priority_score_mhcflurry

Prioritization score of pMHC using MHCflurry binding affinity

priority_score_mhcflurry_no_ccf

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity

priority_score_mhcflurry_prim_alns

Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification

priority_score_mhcflurry_no_ccf_prim_alns

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

Cancer-testis Antigens and Self-antigens

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“CTA/SELF” for CTA/Self-antigen pMHCs)

pep_context

Sequence context around peptide of interest

transcript_id

Ensembl transcript identifier

n_flank

The N flank protein sequence upstream of peptide

c_flank

The C flank protein sequence downstream of peptide

hlapollo_train_allele

Whether the pMHC’s allele is within HLApollo’s training set or not

hlapollo_mhc_pred_0

Logit score for the input example, higher numbers indicate more likely binding/presentation

hlapollo_mhc_pred_0_rank

Percentile Rank for the input example, lower numbers indicate more likely binding/presentation

pepsickle_0.2.1_max_pep_aa_cleav_score

The maximum cleavage score among the amino acids within the peptide of interest

pepsickle_0.2.1_plus1_aa_cleav_score

The cleavage score of the amino acid immediately downstream of peptide of interest

pepsickle_0.2.1_score

pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score) – higher is better

all_transcript_ids_encoding_peptide

All transcript identifiers encoding peptide of interest – useful for ensuring non-CTA transcripts are not encoding the peptide

all_gene_ids_encoding_peptide

All gene identifiers encoding peptide of interest – useful for ensuring non-CTA genes are not encoding the peptide

all_gene_names_encoding_peptide

All gene symbols encoding peptide of interest – useful for ensuring non-CTA genes are not encoding the peptide

rna_reads_covering_genomic_origin

Number of RNA tumor reads covering genomic origin of pMHC

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

proportion_rna_reads_covering_genomic_origin_with_peptide_cds

rna_reads_covering_genomic_origin_with_peptide_cds / rna_reads_covering_genomic_origin

mhcflurry_agretopicity

Agretopicity calculated by MHCflurry using BLASTP-derived nearest Wildtype sequence

mhcflurry_agretopicity_status

Status for MHCflurry agretopicity calculation: WT_MATCH_FOUND, NO_WT_MATCH, or WT_TOOL_SCORE_MISSING

mean_mtec_tpm

Mean expression (in TPM) of CTA transcript in mTEC cells

median_mtec_tpm

Median expression (in TPM) of CTA transcript in mTEC cells

stdev_mtec_tpm

Standard deviation of expression (in TPM) of CTA transcript in mTEC cells

mean_mtec_num_reads

Mean count of RNA-seq reads for CTA transcript in mTEC cells

median_mtec_num_reads

Median count of RNA-seq reads for CTA transcript in mTEC cells

stdev_mtec_num_reads

Standard deviation of count of RNA-seq reads for CTA transcript in mTEC cells

gene_detectable_normal_tissues

Normal tissues with detectable expression of CTA

gene_main_subcellular_location

Subcellular location of CTA

gene_name

Gene symbol

gene_id

Ensembl gene identifier

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

priority_score_mhcflurry

Prioritization score of pMHC using MHCflurry binding affinity

priority_score_mhcflurry_no_ccf

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity

priority_score_mhcflurry_prim_alns

Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification

priority_score_mhcflurry_no_ccf_prim_alns

Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification
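The all_gene_names_encoding_peptide column can be used to check that a peptide is not also encoded by a gene outside the expected CTA set. The Python sketch below lists any such genes. The delimiter used inside the column is not documented on this page, so the comma default is an assumption, and the function name and gene sets are illustrative only.

```python
def unexpected_encoding_genes(gene_names_field, expected_genes, delimiter=","):
    """Return encoding genes that are not in expected_genes, sorted by name.

    The delimiter is an assumption; inspect a real report before relying on it.
    """
    names = {name.strip() for name in gene_names_field.split(delimiter) if name.strip()}
    return sorted(names - set(expected_genes))


# Illustrative CTA set and cell values.
cta_genes = {"MAGEA4"}
print(unexpected_encoding_genes("MAGEA4", cta_genes))
print(unexpected_encoding_genes("MAGEA4,GAPDH", cta_genes))
```

An empty result means every gene listed for the peptide is in the expected set.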

Tumor-level Metrics

Column Name

Description

b2m_mutations

List of detected B2M mutations, or NO_DETECTED_MUTATION when the mutation scan ran and found none

b2m_tpm

B2M gene-level TPM in RNA tumor sample, or NOT_COMPUTED if expression could not be extracted

tap1_mutations

List of detected TAP1 mutations, or NO_DETECTED_MUTATION when the mutation scan ran and found none

tap1_tpm

TAP1 gene-level TPM in RNA tumor sample, or NOT_COMPUTED if expression could not be extracted

tap2_mutations

List of detected TAP2 mutations, or NO_DETECTED_MUTATION when the mutation scan ran and found none

tap2_tpm

TAP2 gene-level TPM in RNA tumor sample, or NOT_COMPUTED if expression could not be extracted
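The mutation columns above carry three distinct situations: a mutation was reported, the scan ran and found none, or the scan result is unavailable. The Python sketch below summarizes them per gene for a single report row. Treating NOT_COMPUTED, NA, empty, or absent cells as "not assessed" is an assumption based on the missing-value conventions described earlier, and the function name and output labels are hypothetical.

```python
GENES = ("b2m", "tap1", "tap2")


def mutation_scan_summary(row):
    """Classify each <gene>_mutations cell of one report row."""
    summary = {}
    for gene in GENES:
        value = row.get(f"{gene}_mutations", "NOT_COMPUTED")
        if value == "NO_DETECTED_MUTATION":
            summary[gene] = "scanned, none detected"
        elif value in ("NOT_COMPUTED", "NA", ""):
            summary[gene] = "not assessed"  # assumption: these mean no usable scan result
        else:
            summary[gene] = "mutation reported"
    return summary


# Hypothetical row; the mutation string is a placeholder, not a real LENS value.
row = {"b2m_mutations": "example_mutation", "tap1_mutations": "NO_DETECTED_MUTATION"}
print(mutation_scan_summary(row))
```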

HLA Allele-level Metrics

Column Name

Description

allele

HLA allele of interest

hla_allele_raw_read_aligned_count

Total number of HLA-aligned RNA tumor reads

hla_allele_proportion_rna_tumor_reads

Proportion of all RNA tumor reads assigned to HLA allele

hla_allele_proportion_hla_rna_tumor_reads

Proportion of HLA-aligned RNA tumor reads assigned to HLA allele

hla_allele_support

Tool-level support for HLA allele

lohhla_allele_loss_pval

LOHHLA allele-specific loss test p-value

lohhla_allele_loss_status

Status for LOHHLA p-value availability: LOH_RESULT_AVAILABLE or NO_LOHHLA_RESULT_FOR_ALLELE

Optional columns

The LENS report may contain additional columns depending on the tools and references used. For example, users who hold licenses for the NetMHC tool suite and add those tools to the workflow (see LINK) will see NetMHC-specific columns.

Optional columns are listed below.

Optional Columns

Column Name

Description

netmhcpan_4.1b.score_el

Elution score from NetMHCpan

netmhcpan_4.1b.perc_rank_el

Elution percent rank from NetMHCpan

netmhcpan_4.1b.score_ba

Binding affinity score from NetMHCpan

netmhcpan_4.1b.perc_rank_ba

Binding affinity percent rank from NetMHCpan

netmhcpan_4.1b.aff_nm

Binding affinity from NetMHCpan

netmhcstabpan_1.0.stab_pred_score

Binding stability score from NetMHCstabpan

netmhcstabpan_1.0.halflife_hours

Half life in hours from NetMHCstabpan

netmhcstabpan_1.0.perc_rank_stab

Binding stability percent rank from NetMHCstabpan

netmhcpan_agretopicity

Agretopicity calculated by NetMHCpan using BLASTP-derived nearest Wildtype sequence

netmhcpan_agretopicity_status

Status for NetMHCpan agretopicity calculation: WT_MATCH_FOUND, NO_WT_MATCH, or WT_TOOL_SCORE_MISSING

priority_score_netmhcpan

Prioritization score of pMHC using NetMHCpan binding affinity

priority_score_netmhcpan_no_ccf

Prioritization score of pMHC (calculated without CCF) using NetMHCpan binding affinity

priority_score_netmhcpan_prim_alns

Prioritization score of pMHC using NetMHCpan binding affinity and RNA tumor primary alignments for quantification

priority_score_netmhcpan_no_ccf_prim_alns

Prioritization score of pMHC (calculated without CCF) using NetMHCpan binding affinity and RNA tumor primary alignments for quantification

priority_score_recommended

Recommended priority score for cross-source ranking, selected from the best available priority score column for the row

priority_score_basis

Basis for priority_score_recommended, such as MHCFLURRY_PRIMARY_ALIGNMENTS_WITH_CCF, MHCFLURRY_WITH_CCF, MHCFLURRY_PRIMARY_ALIGNMENTS_NO_CCF, MHCFLURRY_NO_CCF, NETMHCPAN_PRIMARY_ALIGNMENTS_WITH_CCF, NETMHCPAN_WITH_CCF, NETMHCPAN_PRIMARY_ALIGNMENTS_NO_CCF, NETMHCPAN_NO_CCF, MAXIMUM_AVAILABLE_PRIORITY_SCORE, or NOT_COMPUTED
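Since optional columns appear only when the corresponding tools were run, downstream scripts may want to check a report header before using them. The Python sketch below matches on column-name prefixes rather than full names, because the names embed tool versions (for example 4.1b). The prefix list and the function name are assumptions drawn from the table above, not an exhaustive or official list.

```python
# Prefixes taken from the optional column names listed above.
OPTIONAL_PREFIXES = ("netmhcpan_", "netmhcstabpan_", "priority_score_netmhcpan")


def optional_columns_present(header):
    """Return the header columns that start with a known optional prefix."""
    return [column for column in header if column.startswith(OPTIONAL_PREFIXES)]


# Hypothetical header fragment.
header = ["candidate_id", "peptide", "mhcflurry_2.1.1.aff", "netmhcpan_4.1b.aff_nm", "priority_score_netmhcpan"]
print(optional_columns_present(header))
```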