LENS Report

The LENS workflow concludes by generating a report consisting of predicted pMHCs. The report can be found in the project’s outputs/lens/<PAT_NAME>/<RUN_NAME>/ directory.

By default, the LENS report includes information about each predicted pMHC from every antigen source (e.g. SNV, InDel, ERV). Because all antigen sources share one report, many columns are not relevant to a given pMHC (for example, the fusion_left_gene column describes fusion-derived pMHCs and is not relevant to an SNV-derived pMHC). The columns relevant to each antigen source are broken down below.

Note

Descriptive pMHC columns derived from an external tool are included only if that tool is specified within the workflow.
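Because the default report mixes rows from every antigen source, a common first step is to select one source and drop the columns that carry no data for it. The sketch below is illustrative only: it assumes the report is tab-delimited and that non-applicable cells are empty or set to NA, neither of which is stated here. The function name and the sample rows are hypothetical.

```python
import csv
import io

# Values treated as "not applicable": an assumption about the report format.
EMPTY_VALUES = {"", "NA", "NaN"}


def rows_for_source(report_text, source, delimiter="\t"):
    """Return rows for one antigen_source, keeping only columns with data."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter=delimiter)
    rows = [r for r in reader if r["antigen_source"] == source]
    # Keep a column only if at least one selected row has a value in it.
    keep = [
        c for c in (reader.fieldnames or [])
        if any(r[c] not in EMPTY_VALUES for r in rows)
    ]
    return [{c: r[c] for c in keep} for r in rows]


# Hypothetical two-row report, for illustration only.
sample = (
    "allele\tpeptide\tantigen_source\tsnv_type\tfusion_left_gene\n"
    "HLA-A*02:01\tSIINFEKL\tSNV\tmissense\tNA\n"
    "HLA-A*02:01\tGILGFVFTL\tFUSION\tNA\tGENE1\n"
)
snv_rows = rows_for_source(sample, "SNV")
print(list(snv_rows[0].keys()))
```

For a real report, read the file's contents into report_text and adjust the delimiter and EMPTY_VALUES to match the actual output.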

Somatic Single Nucleotide Variants (SNVs)

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“SNV” for somatic single nucleotide variant pMHCs)

variant_callers

Variant callers which detected variant

variant_effect

Variant’s effect on protein

mut_aa_pos

Mutated amino acid position index within peptide sequence (0-indexed)

nt_context

Nucleotide sequence context around the coding variant (includes coding variant)

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

snv_alt_allele

Coding variant allele

snv_ref_allele

Reference allele

snv_type

SNV type (“missense” for relevant SNVs)

transcript_id

Ensembl transcript identifier

variant_coords

Genomic coordinates of coding variant

variant_position_in_cds

Position of the variant within the transcript coding sequence (0-indexed)

rna_reads_covering_genomic_origin

Number of RNA tumor reads covering genomic origin of pMHC

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

proportion_rna_reads_covering_genomic_origin_with_peptide_cds

Proportion of RNA tumor reads covering genomic origin of pMHC that contain the coding sequence of pMHC (rna_reads_covering_genomic_origin_with_peptide_cds / rna_reads_covering_genomic_origin)

mhcflurry_agretopicity

Agretopicity calculated by MHCflurry using BLASTP-derived nearest wild-type sequence

vaf

Variant allele frequency

totcopynum

Total copy number of the variant allele

multiplicity

Multiplicity value used for cancer cell fraction (CCF) calculations

ccf

Cancer cell fraction (clonality)

gene_name

Gene symbol

gene_id

Ensembl gene identifier

mean_mtec_tpm

Mean expression (in TPM) of SNV-harboring transcript in mTEC cells

median_mtec_tpm

Median expression (in TPM) of SNV-harboring transcript in mTEC cells

stdev_mtec_tpm

Standard deviation of expression (in TPM) of SNV-harboring transcript in mTEC cells

mean_mtec_num_reads

Mean count of RNA-seq reads for SNV-harboring transcript in mTEC cells

median_mtec_num_reads

Median count of RNA-seq reads for SNV-harboring transcript in mTEC cells

stdev_mtec_num_reads

Standard deviation of count of RNA-seq reads for SNV-harboring transcript in mTEC cells

gene_detectable_normal_tissues

Normal tissues with detectable expression of SNV-harboring transcript (likely irrelevant for pMHC prioritization)

gene_main_subcellular_location

Subcellular location of SNV-harboring transcript

allele_raw_read_aligned_count

Raw count of reads aligned to HLA allele (seq2HLA)

allele_proportion_rna_tumor_reads

Proportion of total RNA tumor sample reads aligned to HLA allele (seq2HLA)

allele_proportion_hla_rna_tumor_reads

Proportion of HLA-aligned RNA tumor sample reads aligned to HLA allele (seq2HLA)

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

priority_score

Prioritization score of pMHC

priority_score_no_ccf

Prioritization score of pMHC (calculated without CCF)
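The read-support proportion in the table above can be recomputed from the two read-count columns. A minimal sketch, assuming the proportion is the count of reads carrying the peptide coding sequence divided by all reads covering the genomic origin, and that zero coverage yields no value (LENS may handle that case differently). The function name is hypothetical.

```python
def proportion_with_peptide_cds(reads_with_cds, reads_covering):
    """Fraction of covering RNA reads that carry the peptide's coding sequence."""
    if reads_covering == 0:
        return None  # assumption: undefined when no reads cover the locus
    return reads_with_cds / reads_covering


# Hypothetical counts: 12 of 48 covering reads contain the peptide CDS.
print(proportion_with_peptide_cds(12, 48))
```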

Somatic Insertions and Deletions (InDels)

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“INDEL” for somatic insertion and deletion pMHCs)

variant_callers

Variant callers which detected variant

variant_effect

Variant’s effect on protein

indel_alt_allele

Coding variant allele

indel_ref_allele

Reference allele

indel_type

Insertion or deletion type

nt_context

Nucleotide sequence context around the coding variant (includes coding variant)

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

transcript_id

Ensembl transcript identifier

valid_ref_orf

Confirms translated reading frame is correct

variant_coords

Genomic coordinates of coding variant

rna_reads_covering_genomic_origin

Number of RNA tumor reads covering genomic origin of pMHC

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

proportion_rna_reads_covering_genomic_origin_with_peptide_cds

Proportion of RNA tumor reads covering genomic origin of pMHC that contain the coding sequence of pMHC (rna_reads_covering_genomic_origin_with_peptide_cds / rna_reads_covering_genomic_origin)

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

mhcflurry_agretopicity

Agretopicity calculated by MHCflurry using BLASTP-derived nearest wild-type sequence

vaf

Variant allele frequency

totcopynum

Total copy number of the variant allele

multiplicity

Multiplicity value used for cancer cell fraction (CCF) calculations

ccf

Cancer cell fraction (clonality)

gene_name

Gene symbol

gene_id

Ensembl gene identifier

mean_mtec_tpm

Mean expression (in TPM) of InDel-harboring transcript in mTEC cells

median_mtec_tpm

Median expression (in TPM) of InDel-harboring transcript in mTEC cells

stdev_mtec_tpm

Standard deviation of expression (in TPM) of InDel-harboring transcript in mTEC cells

mean_mtec_num_reads

Mean count of RNA-seq reads for InDel-harboring transcript in mTEC cells

median_mtec_num_reads

Median count of RNA-seq reads for InDel-harboring transcript in mTEC cells

stdev_mtec_num_reads

Standard deviation of count of RNA-seq reads for InDel-harboring transcript in mTEC cells

gene_detectable_normal_tissues

Normal tissues with detectable expression of InDel-harboring transcript (likely irrelevant for pMHC prioritization)

gene_main_subcellular_location

Subcellular location of InDel-harboring transcript

allele_raw_read_aligned_count

Raw count of reads aligned to HLA allele (seq2HLA)

allele_proportion_rna_tumor_reads

Proportion of total RNA tumor sample reads aligned to HLA allele (seq2HLA)

allele_proportion_hla_rna_tumor_reads

Proportion of HLA-aligned RNA tumor sample reads aligned to HLA allele (seq2HLA)

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

priority_score

Prioritization score of pMHC

priority_score_no_ccf

Prioritization score of pMHC (calculated without CCF)

Splice variants

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“SPLICE” for splice variant pMHCs)

allele_raw_read_aligned_count

Raw count of reads aligned to HLA allele (seq2HLA)

allele_proportion_rna_tumor_reads

Proportion of total RNA tumor sample reads aligned to HLA allele (seq2HLA)

allele_proportion_hla_rna_tumor_reads

Proportion of HLA-aligned RNA tumor sample reads aligned to HLA allele (seq2HLA)

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

Fusions

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“FUSION” for fusion-derived pMHCs)

fusion_annotation

Annotation information about the fusion event

fusion_id

Fusion identifier

fusion_left_breakpoint

Fusion’s left breakpoint

fusion_left_gene

Fusion’s left gene symbol

fusion_left_transcript

Fusion’s left Ensembl transcript identifier

fusion_right_breakpoint

Fusion’s right breakpoint

fusion_right_gene

Fusion’s right gene symbol

fusion_right_transcript

Fusion’s right Ensembl transcript identifier

fusion_type

Type of fusion

nt_context

Nucleotide sequence context around the coding variant (includes coding variant)

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

allele_raw_read_aligned_count

Raw count of reads aligned to HLA allele (seq2HLA)

allele_proportion_rna_tumor_reads

Proportion of total RNA tumor sample reads aligned to HLA allele (seq2HLA)

allele_proportion_hla_rna_tumor_reads

Proportion of HLA-aligned RNA tumor sample reads aligned to HLA allele (seq2HLA)

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

priority_score_no_ccf

Prioritization score of pMHC (calculated without CCF)

ERVs

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“ERV” for ERV-derived pMHCs)

erv_norm_cpm

ERV counts per million (normal RNA sample)

erv_orf_id

ERV open reading frame identifier (from gEVE)

erv_orf_raw_read_count

ERV open reading frame raw read count (tumor RNA sample)

erv_orf_tpm

ERV open reading frame transcripts per million

erv_tumor_cpm

ERV counts per million (tumor RNA sample)

erv_tumor_cpm_to_norm_cpm_delta

Difference between tumor CPM and normal CPM (log(tumor cpm + 1) - log(normal cpm + 1))

pep_context

Peptide sequence context around the variant amino acid (includes variant amino acid)

rna_reads_covering_genomic_origin_with_peptide_cds

Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds

Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC

erv_hervq_region

ERV open reading frame’s hERVQuant region (if any)

erv_geve_annot

ERV open reading frame gEVE annotation

erv_ribo_cov_mean

ERV open reading frame ribo-seq mean coverage (external references)

erv_ribo_probe_count

ERV open reading frame ribo-seq probe count (external references)

erv_hervq_region_total_erv_orf_count

Total number of ERV open reading frames in corresponding hERVQuant region (if any) – more is better

erv_hervq_region_ribo_covd_erv_orf_count

Total number of ERV open reading frames in corresponding hERVQuant region with ribo-seq coverage (if any) – more is better

erv_mtec_exp_status

Whether the ERV open reading frame is expressed in mTEC cells (no expression is better)

erv_norm_exp_status

Whether the ERV open reading frame is expressed in normal cells (no expression is better)

erv_hervq_region_proteins_list

List of ERV proteins within ERV open reading frame’s corresponding hERVQuant region

erv_hervq_region_erv_uniq_proteins_count

Number of ERV proteins within ERV open reading frame’s corresponding hERVQuant region

erv_hervq_region_avg_exp_corr

Average expression correlation among ERV open reading frames within corresponding hERVQuant region (external references)

erv_hervq_region_pairwise_corr_count

Number of pairwise ERV open reading frame expression correlations used to calculate average expression correlation (external references)

erv_hervq_region_score

Score used for calculating ERV confidence score (1 if in hERVQuant region; 0 otherwise)

erv_annot_score

Score used for calculating ERV confidence score (1 if valid ERV protein; 0 otherwise)

erv_ribo_cov_mean_score

Score used for calculating ERV confidence score (1 if ribo-seq coverage; 0 otherwise)

erv_total_erv_count_score

Score used for calculating ERV confidence score (1 if >1 ERV ORFs in hERVQuant region; 0 otherwise)

erv_ribo_covd_erv_count_score

Score used for calculating ERV confidence score (1 if >1 ERV ORFs with ribo-seq coverage; 0 otherwise)

erv_mtec_exp_status_score

Score used for calculating ERV confidence score (1 if not expressed in mTECs; 0 otherwise)

erv_norm_exp_status_score

Score used for calculating ERV confidence score (1 if not expressed in normal tissues; 0 otherwise)

erv_uniq_proteins_count_score

Score used for calculating ERV confidence score (0 if no ERV proteins; 0.25 if one unique ERV protein in hERVQuant region; 0.5 if two unique ERV proteins in hERVQuant region; 0.75 if three unique ERV proteins in hERVQuant region; 1.0 if four unique ERV proteins in hERVQuant region)

erv_avg_exp_corr_within_hervq_region_score

Score used for calculating ERV confidence score (expression correlation value, bounded within [0, 1])

erv_raw_erv_orf_confidence_score

Raw ERV score

erv_normd_erv_orf_confidence_score

Normalized ERV score (bounded within [0, 1])

allele_raw_read_aligned_count

Raw count of reads aligned to HLA allele (seq2HLA)

allele_proportion_rna_tumor_reads

Proportion of total RNA tumor sample reads aligned to HLA allele (seq2HLA)

allele_proportion_hla_rna_tumor_reads

Proportion of HLA-aligned RNA tumor sample reads aligned to HLA allele (seq2HLA)

origin_descriptor

Generic identifier

tpm

Transcripts per million (Salmon by default)

priority_score_no_ccf

Prioritization score of pMHC (calculated without CCF)
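Two of the ERV columns above are defined by simple formulas: erv_tumor_cpm_to_norm_cpm_delta and erv_uniq_proteins_count_score. The sketch below restates them as code. It is not the LENS implementation: the base of the logarithm is not given in the table (natural log is assumed here), and the table defines the protein-count score only for zero through four proteins (counts above four are capped at 1.0 here as an assumption). Function names are hypothetical.

```python
import math


def cpm_delta(tumor_cpm, norm_cpm):
    """log(tumor CPM + 1) - log(normal CPM + 1); natural log is an assumption."""
    return math.log(tumor_cpm + 1) - math.log(norm_cpm + 1)


def uniq_proteins_count_score(count):
    """0.25 per unique ERV protein in the hERVQuant region, reaching 1.0 at four.

    Capping counts above four at 1.0 is an assumption; the table only
    defines the score for zero through four proteins.
    """
    return min(count, 4) * 0.25


# Hypothetical values for illustration.
print(cpm_delta(9.0, 0.0))
print(uniq_proteins_count_score(3))
```

The +1 pseudocount keeps the delta defined when either sample has zero CPM, and an ORF with equal tumor and normal CPM gets a delta of zero.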

Viruses

Column Name

Description

allele

Relevant HLA allele

peptide

Peptide sequence

mhcflurry_2.1.1.aff

Binding affinity from MHCflurry

mhcflurry_2.1.1.aff_perc

Binding affinity percent from MHCflurry

mhcflurry_2.1.1.proc_score

Processing score from MHCflurry

mhcflurry_2.1.1.pres_score

Presentation score from MHCflurry

mhcflurry_2.1.1.pres_perc

Presentation score percent from MHCflurry

antigen_source

Antigen source (“VIRUS” for viral pMHCs)

allele_raw_read_aligned_count

Raw count of reads aligned to HLA allele (seq2HLA)

allele_proportion_rna_tumor_reads

Proportion of total RNA tumor sample reads aligned to HLA allele (seq2HLA)

allele_proportion_hla_rna_tumor_reads

Proportion of HLA-aligned RNA tumor sample reads aligned to HLA allele (seq2HLA)

Cancer-testis Antigens and Self-antigens

Tumor-level Metrics

Column Name

Description

b2m_mutations

List of any B2M mutations

b2m_tpm

B2M gene-level TPM in RNA tumor sample

tap1_mutations

List of any TAP1 mutations

tap1_tpm

TAP1 gene-level TPM in RNA tumor sample

tap2_mutations

List of any TAP2 mutations

tap2_tpm

TAP2 gene-level TPM in RNA tumor sample

HLA Allele-level Metrics

Column Name

Description

allele

HLA allele of interest

hla_allele_raw_read_aligned_count

Total number of HLA-aligned RNA tumor reads

hla_allele_proportion_rna_tumor_reads

Proportion of all RNA tumor reads assigned to HLA allele

hla_allele_proportion_hla_rna_tumor_reads

Proportion of HLA-aligned RNA tumor reads assigned to HLA allele

hla_allele_support

Tool-level support for HLA allele

lohhla_allele_loss_pval

LOHHLA allele-specific loss test p-value

Optional columns

The LENS report may contain additional columns depending upon the tools and references utilized. For example, users who hold licenses for the NetMHC tool suite and add those tools to the workflow (see LINK) will see NetMHC-specific columns.

The optional columns are listed below.

Optional Columns

Column Name

Description

netmhcpan_4.1b.score_el

Elution score from NetMHCpan

netmhcpan_4.1b.perc_rank_el

Elution percent rank from NetMHCpan

netmhcpan_4.1b.score_ba

Binding affinity score from NetMHCpan

netmhcpan_4.1b.perc_rank_ba

Binding affinity percent rank from NetMHCpan

netmhcpan_4.1b.aff_nm

Binding affinity from NetMHCpan

netmhcstabpan_1.0.stab_pred_score

Binding stability score from NetMHCstabpan

netmhcstabpan_1.0.halflife_hours

Half life in hours from NetMHCstabpan

netmhcstabpan_1.0.perc_rank_stab

Binding stability percent rank from NetMHCstabpan

netmhcpan_agretopicity

Agretopicity calculated by NetMHCpan using BLASTP-derived nearest wild-type sequence
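Since the set of tool-derived columns varies between runs, it can help to check which tools contributed to a given report before relying on a column. The sketch below groups column names by their tool prefix, assuming the tool_version.metric naming pattern seen in the tables above (for example mhcflurry_2.1.1.aff) holds for every tool-derived column. The function name and the sample header are hypothetical.

```python
def tools_in_header(columns):
    """Group versioned tool columns (tool_version.metric) by their prefix."""
    tools = {}
    for col in columns:
        # Split on the last dot: "netmhcpan_4.1b.score_el" ->
        # ("netmhcpan_4.1b", "score_el"). Columns without a dot are skipped.
        prefix, sep, metric = col.rpartition(".")
        if sep:
            tools.setdefault(prefix, []).append(metric)
    return tools


# Hypothetical report header, for illustration only.
header = [
    "allele",
    "peptide",
    "mhcflurry_2.1.1.aff",
    "netmhcpan_4.1b.score_el",
    "netmhcpan_4.1b.aff_nm",
    "netmhcpan_agretopicity",
]
found = tools_in_header(header)
print(sorted(found))
```

Columns without a version prefix, such as netmhcpan_agretopicity, are not picked up by this pattern and would need to be checked by name.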