Report Column Reference
_______________________

.. note::

   For a surface-level guide to reading and using the LENS report, see :doc:`interpreting_report`. This page provides exhaustive column-by-column documentation for every antigen source.

The LENS workflow concludes by generating a report consisting of predicted pMHCs. The report can be found in the project's ``outputs/lens////`` directory.

The LENS report, by default, includes information about each predicted pMHC from each antigen source (e.g. SNV, InDel, ERV). Because all pMHCs from every antigen source share one report, many columns are not relevant to a given pMHC (for example, a ``left_gene`` column intended for a fusion-derived pMHC is not relevant to an SNV-derived pMHC). The columns relevant to each antigen source are broken down below.

.. note::

   The inclusion of pMHC descriptive columns (e.g. derived from external tools) is conditional upon that tool being specified within the workflow.

Final report semantics
======================

The final ``*.report.tsv`` file is the reviewed LENS report. It includes a stable ``candidate_id`` for each row and the review scoring columns produced by the review evidence workflow. Intermediate unreviewed reports may be present in sample-level output directories, but the report under ``outputs/lens`` is the file intended for downstream review.

The same directory also contains ``*.run_qc_summary.tsv``. This long-format summary records final report row counts, candidate ID integrity checks, candidate counts by antigen source, missingness for important columns, and final-report validation messages. It does not include Nextflow task status counts because trace files are produced after the workflow DAG finishes; use RAFT project status or ``raft generate-reports`` to inspect failed or retried tasks from ``outputs/reports/trace*.txt``.

``candidate_id`` values are unique within a report and are assigned in report row order when they are not already present.
Evidence JSON files produced during review use the same identifiers, so ``candidate_id`` is the preferred key for auditing a report row back to detailed review evidence.

The final report includes these review columns:

.. list-table::
   :widths: 35 65
   :header-rows: 1

   * - Column Name
     - Description
   * - candidate_id
     - Stable row identifier used to join report rows, review scores, and review evidence JSON.
   * - alignment_confidence_score
     - Review score summarizing alignment and read-level confidence evidence.
   * - final_tier
     - Review tier assigned to the candidate, such as ``PASS``, ``REVIEW``, or ``FAIL``.
   * - reasons
     - Semicolon-delimited review flags or explanations that contributed to the score or tier. This field can be empty for candidates without triggered review flags.

Missing and non-applicable values
=================================

Because the report combines candidates from several antigen sources, many columns only apply to one source. For example, fusion breakpoint columns are expected to be populated only for fusion-derived pMHCs, and InDel allele columns are expected to be populated only for InDel-derived pMHCs.

LENS uses explicit values where possible:

.. list-table::
   :widths: 35 65
   :header-rows: 1

   * - Value
     - Meaning
   * - ``NOT_APPLICABLE``
     - The field does not apply to the candidate's antigen source or context.
   * - ``NOT_COMPUTED``
     - The computation did not run or the required input was unavailable.
   * - ``NO_DETECTED_MUTATION``
     - A mutation scan ran and did not detect a mutation in the requested gene.
   * - ``NO_MATCHED_WT``
     - A matched wildtype peptide or sequence could not be identified for the candidate.
   * - ``NO_LOHHLA_RESULT_FOR_ALLELE``
     - LOHHLA ran, but did not report a p-value for the candidate's presented allele.
   * - ``NA``
     - Legacy or tool-derived missing value. Check the column and antigen source before interpreting it.

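When a report is loaded programmatically, the explicit values listed above have to be separated from real numbers before any sorting or arithmetic. The following is a minimal sketch using only the Python standard library; the helper names (``parse_numeric``, ``load_rows``), the two-row excerpt, and the ``c1``/``c2`` identifiers are illustrative assumptions and are not part of LENS.

```python
import csv
import io

# Explicit non-numeric values documented for the LENS report, plus the
# legacy "NA" marker and the empty string.
NON_NUMERIC_VALUES = {
    "NOT_APPLICABLE",
    "NOT_COMPUTED",
    "NO_DETECTED_MUTATION",
    "NO_MATCHED_WT",
    "NO_LOHHLA_RESULT_FOR_ALLELE",
    "NA",
    "",
}


def parse_numeric(value):
    """Return a float, or None when the cell holds a documented non-numeric value."""
    if value is None or value.strip() in NON_NUMERIC_VALUES:
        return None
    return float(value)


def load_rows(handle, numeric_columns):
    """Read a tab-separated report and convert the named columns to float or None."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for column in numeric_columns:
            if column in row:
                row[column] = parse_numeric(row[column])
        rows.append(row)
    return rows


# Hypothetical excerpt for illustration; real reports carry many more columns
# and the candidate_id format shown here is made up.
example = io.StringIO(
    "candidate_id\tantigen_source\tmhcflurry_agretopicity\n"
    "c1\tSNV\t1.8\n"
    "c2\tERV\tNOT_COMPUTED\n"
)
rows = load_rows(example, ["mhcflurry_agretopicity"])
```

Mapping every explicit value to ``None`` is convenient for numeric work, but it discards the distinction between, for example, ``NOT_APPLICABLE`` and ``NOT_COMPUTED``; keep the original string alongside the parsed value if that distinction matters for review.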
Agretopicity and CCF-aware priority scores are most meaningful for variant sources where a matched wildtype sequence and tumor context are available. For non-variant sources, and for variant rows without a matched wildtype peptide or CCF input, these fields can be missing or marked as not computed.

Use ``priority_score_recommended`` as the primary cross-source sorting score. The companion ``priority_score_basis`` column records which source score was selected for that row.

LOHHLA p-values are allele-level annotations. Use ``lohhla_allele_loss_status`` to distinguish an available p-value from an allele that did not have a LOHHLA result. Missing p-values should not be interpreted as evidence against HLA loss.

Somatic Single Nucleotide Variants (SNVs)
=========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - allele
     - Relevant HLA allele
   * - peptide
     - Peptide sequence
   * - mhcflurry_2.1.1.aff
     - Binding affinity from MHCflurry
   * - mhcflurry_2.1.1.aff_perc
     - Binding affinity percent from MHCflurry
   * - mhcflurry_2.1.1.proc_score
     - Processing score from MHCflurry
   * - mhcflurry_2.1.1.pres_score
     - Presentation score from MHCflurry
   * - mhcflurry_2.1.1.pres_perc
     - Presentation score percent from MHCflurry
   * - antigen_source
     - Antigen source ("SNV" for somatic single nucleotide variant pMHCs)
   * - variant_callers
     - Variant callers which detected variant
   * - variant_effect
     - Variant's effect on protein
   * - mut_aa_pos
     - Mutated amino acid position index within ``peptide`` sequence (0-indexed)
   * - nt_context
     - Nucleotide sequence context around the coding variant (includes coding variant)
   * - pep_context
     - Peptide sequence context around the variant amino acid (includes variant amino acid)
   * - snv_alt_allele
     - Coding variant allele
   * - snv_ref_allele
     - Reference allele
   * - snv_type
     - SNV type ("missense" for relevant SNVs)
   * - transcript_id
     - Ensembl transcript identifier
   * - variant_coords
     - Genomic coordinates of
       coding variant
   * - variant_position_in_cds
     - Where, within the transcript coding sequence, the variant occurs (0-indexed)
   * - n_flank
     - The N flank protein sequence upstream of peptide
   * - c_flank
     - The C flank protein sequence downstream of peptide
   * - hlapollo_train_allele
     - Whether the pMHC's allele is within HLApollo's training set or not
   * - hlapollo_mhc_pred_0
     - Logit score for the input example, higher numbers indicate more likely binding/presentation
   * - hlapollo_mhc_pred_0_rank
     - Percentile Rank for the input example, lower numbers indicate more likely binding/presentation
   * - pepsickle_0.2.1_max_pep_aa_cleav_score
     - The maximum cleavage score among the amino acids within the peptide of interest
   * - pepsickle_0.2.1_plus1_aa_cleav_score
     - The cleavage score of the amino acid immediately downstream of peptide of interest
   * - pepsickle_0.2.1_score
     - ``pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score)`` -- higher is better
   * - rna_reads_covering_genomic_origin
     - Number of RNA tumor reads covering genomic origin of pMHC
   * - rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - proportion_rna_reads_covering_genomic_origin_with_peptide_cds
     - ``rna_reads_covering_genomic_origin_with_peptide_cds / rna_reads_covering_genomic_origin``
   * - mhcflurry_agretopicity
     - Agretopicity calculated by MHCflurry using BLASTP-derived nearest Wildtype sequence
   * - mhcflurry_agretopicity_status
     - Status for MHCflurry agretopicity calculation: ``WT_MATCH_FOUND``, ``NO_WT_MATCH``, or ``WT_TOOL_SCORE_MISSING``
   * - vaf
     - Variant allele frequency
   * - totcopynum
     - Total copy number of the variant allele
   * - multiplicity
     - Multiplicity value used for cancer cell fraction (CCF)
       calculations
   * - ccf
     - Cancer cell fraction (clonality)
   * - gene_name
     - Gene symbol
   * - gene_id
     - Ensembl gene identifier
   * - mean_mtec_tpm
     - Mean expression (in TPM) of SNV-harboring transcript in mTEC cells
   * - median_mtec_tpm
     - Median expression (in TPM) of SNV-harboring transcript in mTEC cells
   * - stdev_mtec_tpm
     - Standard deviation of expression (in TPM) of SNV-harboring transcript in mTEC cells
   * - mean_mtec_num_reads
     - Mean count of RNA-seq reads for SNV-harboring transcript in mTEC cells
   * - median_mtec_num_reads
     - Median count of RNA-seq reads for SNV-harboring transcript in mTEC cells
   * - stdev_mtec_num_reads
     - Standard deviation of count of RNA-seq reads for SNV-harboring transcript in mTEC cells
   * - gene_detectable_normal_tissues
     - Normal tissues with detectable expression of SNV-harboring transcript (likely irrelevant for pMHC prioritization)
   * - gene_main_subcellular_location
     - Subcellular location of SNV-harboring transcript
   * - origin_descriptor
     - Generic identifier
   * - tpm
     - Transcripts per million (Salmon by default)
   * - priority_score_mhcflurry
     - Prioritization score of pMHC using MHCflurry binding affinity
   * - priority_score_mhcflurry_no_ccf
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity
   * - priority_score_mhcflurry_prim_alns
     - Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification
   * - priority_score_mhcflurry_no_ccf_prim_alns
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification
   * - priority_score_recommended
     - Recommended priority score for cross-source ranking, selected from the best available priority score column for the row
   * - priority_score_basis
     - Basis for ``priority_score_recommended``, such as ``MHCFLURRY_PRIMARY_ALIGNMENTS_WITH_CCF``, ``MHCFLURRY_WITH_CCF``, ``MHCFLURRY_PRIMARY_ALIGNMENTS_NO_CCF``,
       ``MHCFLURRY_NO_CCF``, ``NETMHCPAN_PRIMARY_ALIGNMENTS_WITH_CCF``, ``NETMHCPAN_WITH_CCF``, ``NETMHCPAN_PRIMARY_ALIGNMENTS_NO_CCF``, ``NETMHCPAN_NO_CCF``, ``MAXIMUM_AVAILABLE_PRIORITY_SCORE``, or ``NOT_COMPUTED``

Somatic Insertions and Deletions (InDels)
=========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - allele
     - Relevant HLA allele
   * - peptide
     - Peptide sequence
   * - mhcflurry_2.1.1.aff
     - Binding affinity from MHCflurry
   * - mhcflurry_2.1.1.aff_perc
     - Binding affinity percent from MHCflurry
   * - mhcflurry_2.1.1.proc_score
     - Processing score from MHCflurry
   * - mhcflurry_2.1.1.pres_score
     - Presentation score from MHCflurry
   * - mhcflurry_2.1.1.pres_perc
     - Presentation score percent from MHCflurry
   * - antigen_source
     - Antigen source ("INDEL" for somatic insertion and deletion pMHCs)
   * - variant_callers
     - Variant callers which detected variant
   * - variant_effect
     - Variant's effect on protein
   * - indel_alt_allele
     - Coding variant allele
   * - indel_ref_allele
     - Reference allele
   * - indel_type
     - Insertion or deletion type
   * - nt_context
     - Nucleotide sequence context around the coding variant (includes coding variant)
   * - pep_context
     - Peptide sequence context around the variant amino acid (includes variant amino acid)
   * - transcript_id
     - Ensembl transcript identifier
   * - valid_ref_orf
     - Confirms translated reading frame is correct
   * - variant_coords
     - Genomic coordinates of coding variant
   * - n_flank
     - The N flank protein sequence upstream of peptide
   * - c_flank
     - The C flank protein sequence downstream of peptide
   * - hlapollo_train_allele
     - Whether the pMHC's allele is within HLApollo's training set or not
   * - hlapollo_mhc_pred_0
     - Logit score for the input example, higher numbers indicate more likely binding/presentation
   * - hlapollo_mhc_pred_0_rank
     - Percentile Rank for the input example, lower numbers indicate more likely binding/presentation
   * - pepsickle_0.2.1_max_pep_aa_cleav_score
     - The maximum
       cleavage score among the amino acids within the peptide of interest
   * - pepsickle_0.2.1_plus1_aa_cleav_score
     - The cleavage score of the amino acid immediately downstream of peptide of interest
   * - pepsickle_0.2.1_score
     - ``pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score)`` -- higher is better
   * - rna_reads_covering_genomic_origin
     - Number of RNA tumor reads covering genomic origin of pMHC
   * - rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - proportion_rna_reads_covering_genomic_origin_with_peptide_cds
     - ``rna_reads_covering_genomic_origin_with_peptide_cds / rna_reads_covering_genomic_origin``
   * - primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - mhcflurry_agretopicity
     - Agretopicity calculated by MHCflurry using BLASTP-derived nearest Wildtype sequence
   * - mhcflurry_agretopicity_status
     - Status for MHCflurry agretopicity calculation: ``WT_MATCH_FOUND``, ``NO_WT_MATCH``, or ``WT_TOOL_SCORE_MISSING``
   * - vaf
     - Variant allele frequency
   * - totcopynum
     - Total copy number of the variant allele
   * - multiplicity
     - Multiplicity value used for cancer cell fraction (CCF) calculations
   * - ccf
     - Cancer cell fraction (clonality)
   * - gene_name
     - Gene symbol
   * - gene_id
     - Ensembl gene identifier
   * - mean_mtec_tpm
     - Mean expression (in TPM) of InDel-harboring transcript in mTEC cells
   * - median_mtec_tpm
     - Median expression (in TPM) of InDel-harboring transcript in mTEC cells
   * - stdev_mtec_tpm
     - Standard deviation of expression (in TPM) of InDel-harboring transcript in mTEC cells
   * - mean_mtec_num_reads
     - Mean count of RNA-seq reads for InDel-harboring transcript in mTEC cells
   * - median_mtec_num_reads
     - Median count of RNA-seq reads for InDel-harboring transcript in mTEC cells
   * - stdev_mtec_num_reads
     - Standard
       deviation of count of RNA-seq reads for InDel-harboring transcript in mTEC cells
   * - gene_detectable_normal_tissues
     - Normal tissues with detectable expression of InDel-harboring transcript (likely irrelevant for pMHC prioritization)
   * - gene_main_subcellular_location
     - Subcellular location of InDel-harboring transcript
   * - origin_descriptor
     - Generic identifier
   * - tpm
     - Transcripts per million (Salmon by default)
   * - priority_score_mhcflurry
     - Prioritization score of pMHC using MHCflurry binding affinity
   * - priority_score_mhcflurry_no_ccf
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity
   * - priority_score_mhcflurry_prim_alns
     - Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification
   * - priority_score_mhcflurry_no_ccf_prim_alns
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

Splice variants
========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - allele
     - Relevant HLA allele
   * - peptide
     - Peptide sequence
   * - mhcflurry_2.1.1.aff
     - Binding affinity from MHCflurry
   * - mhcflurry_2.1.1.aff_perc
     - Binding affinity percent from MHCflurry
   * - mhcflurry_2.1.1.proc_score
     - Processing score from MHCflurry
   * - mhcflurry_2.1.1.pres_score
     - Presentation score from MHCflurry
   * - mhcflurry_2.1.1.pres_perc
     - Presentation score percent from MHCflurry
   * - antigen_source
     - Antigen source ("SPLICE" for splice variant pMHCs)
   * - coding_sequence
     - Nucleotide coding sequence for peptide of interest
   * - pep_context
     - Peptide sequence context around the variant amino acid (includes variant amino acid)
   * - splice_coords
     - Coordinates of the splice event
   * - splice_description
     - Description of the splice event (see SNAF documentation)
   * - n_flank
     - The N flank protein sequence upstream of peptide
   * - c_flank
     - The C flank protein sequence downstream of peptide
   * - hlapollo_train_allele
     - Whether the pMHC's allele is within HLApollo's training set or not
   * - hlapollo_mhc_pred_0
     - Logit score for the input example, higher numbers indicate more likely binding/presentation
   * - hlapollo_mhc_pred_0_rank
     - Percentile Rank for the input example, lower numbers indicate more likely binding/presentation
   * - pepsickle_0.2.1_max_pep_aa_cleav_score
     - The maximum cleavage score among the amino acids within the peptide of interest
   * - pepsickle_0.2.1_plus1_aa_cleav_score
     - The cleavage score of the amino acid immediately downstream of peptide of interest
   * - pepsickle_0.2.1_score
     - ``pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score)`` -- higher is better
   * - rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of primary alignment RNA tumor reads covering genomic origin of pMHC with
       coding sequence of pMHC
   * - gene_id
     - Ensembl gene identifier
   * - origin_descriptor
     - Generic identifier
   * - priority_score_mhcflurry
     - Prioritization score of pMHC using MHCflurry binding affinity
   * - priority_score_mhcflurry_no_ccf
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity
   * - priority_score_mhcflurry_prim_alns
     - Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification
   * - priority_score_mhcflurry_no_ccf_prim_alns
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

Fusions
========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - allele
     - Relevant HLA allele
   * - peptide
     - Peptide sequence
   * - mhcflurry_2.1.1.aff
     - Binding affinity from MHCflurry
   * - mhcflurry_2.1.1.aff_perc
     - Binding affinity percent from MHCflurry
   * - mhcflurry_2.1.1.proc_score
     - Processing score from MHCflurry
   * - mhcflurry_2.1.1.pres_score
     - Presentation score from MHCflurry
   * - mhcflurry_2.1.1.pres_perc
     - Presentation score percent from MHCflurry
   * - antigen_source
     - Antigen source ("FUSION" for fusion-derived pMHCs)
   * - fusion_annotation
     - Annotation information about the fusion event
   * - fusion_id
     - Fusion identifier
   * - fusion_left_breakpoint
     - Fusion's left breakpoint
   * - fusion_left_gene
     - Fusion's left gene symbol
   * - fusion_left_transcript
     - Fusion's left Ensembl transcript identifier
   * - fusion_right_breakpoint
     - Fusion's right breakpoint
   * - fusion_right_gene
     - Fusion's right gene symbol
   * - fusion_right_transcript
     - Fusion's right Ensembl transcript identifier
   * - fusion_type
     - Type of fusion
   * - nt_context
     - Nucleotide sequence context around the coding variant (includes coding variant)
   * - pep_context
     - Peptide sequence context around the variant amino acid (includes variant amino acid)
   * - n_flank
     - The N flank protein sequence upstream of peptide
   * - c_flank
     - The C flank protein sequence downstream of peptide
   * - hlapollo_train_allele
     - Whether the pMHC's allele is within HLApollo's training set or not
   * - hlapollo_mhc_pred_0
     - Logit score for the input example, higher numbers indicate more likely binding/presentation
   * - hlapollo_mhc_pred_0_rank
     - Percentile Rank for the input example, lower numbers indicate more likely binding/presentation
   * - pepsickle_0.2.1_max_pep_aa_cleav_score
     - The maximum cleavage score among the amino acids within the peptide of interest
   * - pepsickle_0.2.1_plus1_aa_cleav_score
     - The cleavage score of the amino acid immediately downstream of peptide of interest
   * - pepsickle_0.2.1_score
     - ``pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score)`` -- higher is better
   * - rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - mhcflurry_agretopicity
     - Agretopicity calculated by MHCflurry using BLASTP-derived nearest Wildtype sequence
   * - mhcflurry_agretopicity_status
     - Status for MHCflurry agretopicity calculation: ``WT_MATCH_FOUND``, ``NO_WT_MATCH``, or ``WT_TOOL_SCORE_MISSING``
   * - origin_descriptor
     - Generic identifier
   * - tpm
     - Transcripts per million (Salmon by default; TPMs listed separately for each original transcript)
   * - priority_score_mhcflurry
     - Prioritization score of pMHC using MHCflurry binding affinity
   * - priority_score_mhcflurry_no_ccf
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity
   * - priority_score_mhcflurry_prim_alns
     - Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification
   * - priority_score_mhcflurry_no_ccf_prim_alns
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

ERVs
========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - allele
     - Relevant HLA allele
   * - peptide
     - Peptide sequence
   * - mhcflurry_2.1.1.aff
     - Binding affinity from MHCflurry
   * - mhcflurry_2.1.1.aff_perc
     - Binding affinity percent from MHCflurry
   * - mhcflurry_2.1.1.proc_score
     - Processing score from MHCflurry
   * - mhcflurry_2.1.1.pres_score
     - Presentation score from MHCflurry
   * - mhcflurry_2.1.1.pres_perc
     - Presentation score percent from MHCflurry
   * - antigen_source
     - Antigen source ("ERV" for ERV-derived pMHCs)
   * - erv_norm_cpm
     - ERV counts per million (normal RNA sample)
   * - erv_orf_id
     - ERV open reading frame identifier (from gEVE)
   * - erv_orf_raw_read_count
     - ERV open reading frame raw read count (tumor RNA sample)
   * - erv_orf_tpm
     - ERV open reading frame transcripts per million
   * - erv_tumor_cpm
     - ERV counts per million (tumor RNA sample)
   * - erv_tumor_cpm_to_norm_cpm_delta
     - Difference between tumor CPM and normal CPM (log(tumor cpm + 1) - log(normal cpm + 1))
   * - pep_context
     - Peptide sequence context around the variant amino acid (includes variant amino acid)
   * - n_flank
     - The N flank protein sequence upstream of peptide
   * - c_flank
     - The C flank protein sequence downstream of peptide
   * - hlapollo_train_allele
     - Whether the pMHC's allele is within HLApollo's training set or not
   * - hlapollo_mhc_pred_0
     - Logit score for the input example, higher numbers indicate more likely binding/presentation
   * - hlapollo_mhc_pred_0_rank
     - Percentile Rank for the input example, lower numbers indicate more likely binding/presentation
   * - pepsickle_0.2.1_max_pep_aa_cleav_score
     - The maximum cleavage score among the amino acids within the peptide of interest
   * - pepsickle_0.2.1_plus1_aa_cleav_score
     - The cleavage score of the
       amino acid immediately downstream of peptide of interest
   * - pepsickle_0.2.1_score
     - ``pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score)`` -- higher is better
   * - rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - erv_hervq_region
     - ERV open reading frame's hERVQuant region (if any)
   * - erv_geve_annot
     - ERV open reading frame gEVE annotation
   * - erv_ribo_cov_mean
     - ERV open reading frame ribo-seq mean coverage (external references)
   * - erv_ribo_probe_count
     - ERV open reading frame ribo-seq probe count (external references)
   * - erv_hervq_region_total_erv_orf_count
     - Total number of ERV open reading frames in corresponding hERVQuant region (if any) -- more is better
   * - erv_hervq_region_ribo_covd_erv_orf_count
     - Total number of ERV open reading frames in corresponding hERVQuant region with ribo-seq coverage (if any) -- more is better
   * - erv_mtec_exp_status
     - Whether the ERV open reading frame is expressed in mTEC cells (no expression is better)
   * - erv_norm_exp_status
     - Whether the ERV open reading frame is expressed in normal cells (no expression is better)
   * - erv_hervq_region_proteins_list
     - List of ERV proteins within ERV open reading frame's corresponding hERVQuant region
   * - erv_hervq_region_erv_uniq_proteins_count
     - Number of ERV proteins within ERV open reading frame's corresponding hERVQuant region
   * - erv_hervq_region_avg_exp_corr
     - Average expression correlation among ERV open reading frames within corresponding hERVQuant region (external references)
   * - erv_hervq_region_pairwise_corr_count
     - Number of pairwise ERV open reading frame expression correlations used to calculate average expression correlation (external references)
   * - erv_hervq_region_score
     - Score used for
       calculating ERV confidence score (1 if in hERVQuant region; 0 otherwise)
   * - erv_annot_score
     - Score used for calculating ERV confidence score (1 if valid ERV protein; 0 otherwise)
   * - erv_ribo_cov_mean_score
     - Score used for calculating ERV confidence score (1 if ribo-seq coverage; 0 otherwise)
   * - erv_total_erv_count_score
     - Score used for calculating ERV confidence score (1 if >1 ERV ORFs in hERVQuant region; 0 otherwise)
   * - erv_ribo_covd_erv_count_score
     - Score used for calculating ERV confidence score (1 if >1 ERV ORFs with ribo-seq coverage; 0 otherwise)
   * - erv_mtec_exp_status_score
     - Score used for calculating ERV confidence score (1 if not expressed in mTECs; 0 otherwise)
   * - erv_norm_exp_status_score
     - Score used for calculating ERV confidence score (1 if not expressed in normal tissues; 0 otherwise)
   * - erv_uniq_proteins_count_score
     - Score used for calculating ERV confidence score (0 if no ERV proteins; 0.25 if one unique ERV protein in hERVQuant region; 0.5 if two unique ERV proteins in hERVQuant region; 0.75 if three unique ERV proteins in hERVQuant region; 1.0 if four unique ERV proteins in hERVQuant region)
   * - erv_avg_exp_corr_within_hervq_region_score
     - Score used for calculating ERV confidence score (expression correlation value, bounded to [0, 1])
   * - erv_raw_erv_orf_confidence_score
     - Raw ERV score
   * - erv_normd_erv_orf_confidence_score
     - Normalized ERV score (bounded to [0, 1])
   * - origin_descriptor
     - Generic identifier
   * - tpm
     - Transcripts per million (Salmon by default)
   * - priority_score_mhcflurry
     - Prioritization score of pMHC using MHCflurry binding affinity
   * - priority_score_mhcflurry_no_ccf
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity
   * - priority_score_mhcflurry_prim_alns
     - Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification
   * - priority_score_mhcflurry_no_ccf_prim_alns
     - Prioritization
       score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

Viruses
========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - allele
     - Relevant HLA allele
   * - peptide
     - Peptide sequence
   * - mhcflurry_2.1.1.aff
     - Binding affinity from MHCflurry
   * - mhcflurry_2.1.1.aff_perc
     - Binding affinity percent from MHCflurry
   * - mhcflurry_2.1.1.proc_score
     - Processing score from MHCflurry
   * - mhcflurry_2.1.1.pres_score
     - Presentation score from MHCflurry
   * - mhcflurry_2.1.1.pres_perc
     - Presentation score percent from MHCflurry
   * - antigen_source
     - Antigen source ("VIRUS" for viral pMHCs)
   * - pep_context
     - Sequence context around peptide of interest
   * - virus_id
     - Viral identifier
   * - rna_reads_covering_genomic_origin
     - Number of RNA tumor reads covering genomic origin of pMHC
   * - rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - origin_descriptor
     - Generic identifier
   * - priority_score_mhcflurry
     - Prioritization score of pMHC using MHCflurry binding affinity
   * - priority_score_mhcflurry_no_ccf
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity
   * - priority_score_mhcflurry_prim_alns
     - Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification
   * - priority_score_mhcflurry_no_ccf_prim_alns
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

Cancer-testis Antigens and Self-antigens
========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - allele
     - Relevant HLA allele
   * - peptide
     - Peptide sequence
   * - mhcflurry_2.1.1.aff
     - Binding affinity from MHCflurry
   * - mhcflurry_2.1.1.aff_perc
     - Binding affinity percent from MHCflurry
   * - mhcflurry_2.1.1.proc_score
     - Processing score from MHCflurry
   * - mhcflurry_2.1.1.pres_score
     - Presentation score from MHCflurry
   * - mhcflurry_2.1.1.pres_perc
     - Presentation score percent from MHCflurry
   * - antigen_source
     - Antigen source ("CTA/SELF" for CTA/Self-antigen pMHCs)
   * - pep_context
     - Sequence context around peptide of interest
   * - transcript_id
     - Ensembl transcript identifier
   * - n_flank
     - The N flank protein sequence upstream of peptide
   * - c_flank
     - The C flank protein sequence downstream of peptide
   * - hlapollo_train_allele
     - Whether the pMHC's allele is within HLApollo's training set or not
   * - hlapollo_mhc_pred_0
     - Logit score for the input example, higher numbers indicate more likely binding/presentation
   * - hlapollo_mhc_pred_0_rank
     - Percentile Rank for the input example, lower numbers indicate more likely binding/presentation
   * - pepsickle_0.2.1_max_pep_aa_cleav_score
     - The maximum cleavage score among the amino acids within the peptide of interest
   * - pepsickle_0.2.1_plus1_aa_cleav_score
     - The cleavage score of the amino acid immediately downstream of peptide of interest
   * - pepsickle_0.2.1_score
     - ``pepsickle_0.2.1_plus1_aa_cleav_score * (1 - pepsickle_0.2.1_plus1_aa_cleav_score)`` -- higher is better
   * - all_transcript_ids_encoding_peptide
     - All transcript identifiers encoding peptide of interest -- useful for ensuring non-CTA transcripts are not encoding the peptide
   * - all_gene_ids_encoding_peptide
     - All gene identifiers encoding peptide of interest -- useful for ensuring non-CTA genes are not encoding the peptide
   * - all_gene_names_encoding_peptide
     - All gene symbols encoding peptide of interest -- useful for ensuring non-CTA genes are not encoding the peptide
   * - rna_reads_covering_genomic_origin
     - Number of RNA tumor reads covering genomic origin of pMHC
   * - rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - primary_aln_rna_reads_covering_genomic_origin_with_peptide_cds
     - Number of primary alignment RNA tumor reads covering genomic origin of pMHC with coding sequence of pMHC
   * - proportion_rna_reads_covering_genomic_origin_with_peptide_cds
     - ``rna_reads_covering_genomic_origin_with_peptide_cds / rna_reads_covering_genomic_origin``
   * - mhcflurry_agretopicity
     - Agretopicity calculated by MHCflurry using BLASTP-derived nearest Wildtype sequence
   * - mhcflurry_agretopicity_status
     - Status for MHCflurry agretopicity calculation: ``WT_MATCH_FOUND``, ``NO_WT_MATCH``, or ``WT_TOOL_SCORE_MISSING``
   * - mean_mtec_tpm
     - Mean expression (in TPM) of CTA transcript in mTEC cells
   * - median_mtec_tpm
     - Median expression (in TPM) of CTA transcript in mTEC cells
   * - stdev_mtec_tpm
     - Standard deviation of expression (in TPM) of CTA transcript in mTEC cells
   * - mean_mtec_num_reads
     - Mean count of RNA-seq reads for CTA transcript in mTEC cells
   * - median_mtec_num_reads
     - Median count of RNA-seq reads for CTA transcript in mTEC cells
   * - stdev_mtec_num_reads
     - Standard deviation of count of RNA-seq reads for CTA transcript in mTEC cells
   * - gene_detectable_normal_tissues
     - Normal tissues with detectable expression of CTA
   * - gene_main_subcellular_location
     - Subcellular location of CTA
   * - gene_name
     - Gene symbol
   * - gene_id
     - Ensembl gene identifier
   * - origin_descriptor
     - Generic identifier
   * - tpm
     - Transcripts per million (Salmon by default)
   * - priority_score_mhcflurry
     - Prioritization score of pMHC using MHCflurry binding affinity
   * - priority_score_mhcflurry_no_ccf
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity
   * - priority_score_mhcflurry_prim_alns
     - Prioritization score of pMHC using MHCflurry binding affinity and RNA tumor primary alignments for quantification
   * - priority_score_mhcflurry_no_ccf_prim_alns
     - Prioritization score of pMHC (calculated without CCF) using MHCflurry binding affinity and RNA tumor primary alignments for quantification

Tumor-level Metrics
========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - b2m_mutations
     - List of detected B2M mutations, or ``NO_DETECTED_MUTATION`` when the mutation scan ran and found none
   * - b2m_tpm
     - B2M gene-level TPM in RNA tumor sample, or ``NOT_COMPUTED`` if expression could not be extracted
   * - tap1_mutations
     - List of detected TAP1 mutations, or ``NO_DETECTED_MUTATION`` when the mutation scan ran and found none
   * - tap1_tpm
     - TAP1 gene-level TPM in RNA tumor sample, or ``NOT_COMPUTED`` if expression could not be extracted
   * - tap2_mutations
     - List of detected TAP2 mutations, or ``NO_DETECTED_MUTATION`` when the mutation scan ran and found none
   * - tap2_tpm
     - TAP2 gene-level TPM in RNA tumor sample, or ``NOT_COMPUTED`` if expression could not be extracted

HLA Allele-level Metrics
========================================

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - allele
     - HLA allele of interest
   * - hla_allele_raw_read_aligned_count
     - Total number of HLA-aligned RNA tumor reads
   * - hla_allele_proportion_rna_tumor_reads
     - Proportion of all RNA tumor reads assigned to HLA allele
   * - hla_allele_proportion_hla_rna_tumor_reads
     - Proportion of HLA-aligned RNA tumor reads assigned to HLA allele
   * - hla_allele_support
     - Tool-level support for HLA allele
   * - lohhla_allele_loss_pval
     - LOHHLA allele-specific loss test p-value
   * - lohhla_allele_loss_status
     - Status for LOHHLA p-value availability: ``LOH_RESULT_AVAILABLE`` or ``NO_LOHHLA_RESULT_FOR_ALLELE``

Optional columns
================

The LENS report may contain additional columns depending upon the tools and references utilized. For example, users who hold licenses for the ``NetMHC`` tool suite and add those tools to the workflow (see LINK) will see ``NetMHC``-specific columns. The optional columns are listed below.

.. list-table:: Optional Columns
   :widths: 50 50
   :header-rows: 1

   * - Column Name
     - Description
   * - netmhcpan_4.1b.score_el
     - Elution score from NetMHCpan
   * - netmhcpan_4.1b.perc_rank_el
     - Elution percent rank from NetMHCpan
   * - netmhcpan_4.1b.score_ba
     - Binding affinity score from NetMHCpan
   * - netmhcpan_4.1b.perc_rank_ba
     - Binding affinity percent rank from NetMHCpan
   * - netmhcpan_4.1b.aff_nm
     - Binding affinity from NetMHCpan
   * - netmhcstabpan_1.0.stab_pred_score
     - Binding stability score from NetMHCstabpan
   * - netmhcstabpan_1.0.halflife_hours
     - Half life in hours from NetMHCstabpan
   * - netmhcstabpan_1.0.perc_rank_stab
     - Binding stability percent rank from NetMHCstabpan
   * - netmhcpan_agretopicity
     - Agretopicity calculated by NetMHCpan using BLASTP-derived nearest Wildtype sequence
   * - netmhcpan_agretopicity_status
     - Status for NetMHCpan agretopicity calculation: ``WT_MATCH_FOUND``, ``NO_WT_MATCH``, or ``WT_TOOL_SCORE_MISSING``
   * - priority_score_netmhcpan
     - Prioritization score of pMHC using
NetMHCpan binding affinity * - priority_score_netmhcpan_no_ccf using NetMHCpan binding affinity - Prioritization score of pMHC (calculated without CCF) using NetMHCpan binding affinity * - priority_score_netmhcpan_prim_alns - Prioritization score of pMHC using NetMHCpan binding affinity and RNA tumor primary alignments for quantification * - priority_score_netmhcpan_no_ccf_prim_alns - Prioritization score of pMHC (calculated without CCF) using NetMHCpan and RNA tumor primary alignments for quantification * - priority_score_recommended - Recommended priority score for cross-source ranking, selected from the best available priority score column for the row * - priority_score_basis - Basis for ``priority_score_recommended``, such as ``MHCFLURRY_PRIMARY_ALIGNMENTS_WITH_CCF``, ``MHCFLURRY_WITH_CCF``, ``MHCFLURRY_PRIMARY_ALIGNMENTS_NO_CCF``, ``MHCFLURRY_NO_CCF``, ``NETMHCPAN_PRIMARY_ALIGNMENTS_WITH_CCF``, ``NETMHCPAN_WITH_CCF``, ``NETMHCPAN_PRIMARY_ALIGNMENTS_NO_CCF``, ``NETMHCPAN_NO_CCF``, ``MAXIMUM_AVAILABLE_PRIORITY_SCORE``, or ``NOT_COMPUTED``
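
Because ``priority_score_recommended`` can hold sentinel strings instead of numbers, downstream scripts need to parse it defensively. The following is a minimal sketch of one way to do that, not part of LENS itself. The helper names ``parse_score`` and ``rank_candidates`` and the three example rows are hypothetical, and the sketch assumes that larger priority scores should be reviewed first; confirm that ordering against your own workflow before relying on it.

```python
import csv
import io
import math

# Sentinel strings the report uses for values that are absent or do not apply.
MISSING = {"", "NA", "NOT_APPLICABLE", "NOT_COMPUTED"}


def parse_score(value):
    """Return the value as a float, or None for sentinels and non-numeric text."""
    if value is None or value.strip() in MISSING:
        return None
    try:
        score = float(value)
    except ValueError:
        return None
    return None if math.isnan(score) else score


def rank_candidates(handle):
    """Read a tab-separated report and return the scored rows in ranked order."""
    scored = []
    for row in csv.DictReader(handle, delimiter="\t"):
        score = parse_score(row.get("priority_score_recommended"))
        if score is None or row.get("priority_score_basis") == "NOT_COMPUTED":
            continue  # no usable priority score for this row
        scored.append((score, row))
    # Assumption: larger priority scores should be reviewed first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored]


# Hypothetical three-row report, used only to illustrate the parsing.
EXAMPLE = (
    "peptide\tallele\tpriority_score_recommended\tpriority_score_basis\n"
    "AAAAAAAAA\tHLA-A*02:01\t0.25\tMHCFLURRY_WITH_CCF\n"
    "CCCCCCCCC\tHLA-B*07:02\tNOT_COMPUTED\tNOT_COMPUTED\n"
    "DDDDDDDDD\tHLA-A*02:01\t1.5\tNETMHCPAN_NO_CCF\n"
)

ranked = rank_candidates(io.StringIO(EXAMPLE))
for row in ranked:
    print(row["peptide"], row["priority_score_recommended"], row["priority_score_basis"])
```

To apply the same logic to a real report, pass an open file handle for the ``*.report.tsv`` file in place of the ``io.StringIO`` object. Keeping ``priority_score_basis`` alongside each ranked row makes it clear when two adjacent candidates were scored from different tools or with and without CCF.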