Changelog

Version 1.9

  • Fixed subset filtering in get_groups process to remove groups whose runs are a strict subset of another group’s runs.

  • Fixed bug in get_groups process where group entries with exact dataset matches (e.g., combo[2] == entry[3]) were not being included in fulfilled combos.

  • Fixed bug in subjoin process where empty run sets produced invalid group entries; these are now skipped with continue.

  • Added comprehensive QC to support the LENS report generator UI

  • Added Qualimap v2.3 as part of the comprehensive QC effort

  • Included cnvkit diagram as QC output

  • Changed tissue-specific expression filtering (e.g. for ERVs) to pan-tissue expression filtering

  • Updated peptide-level ERV reference

  • Improved memory usage (along with RAFT changes) to allow LENS to run on systems with fewer resources

  • Added hg38 bwa-mem2 index files to reference download to allow LENS to run on systems with fewer resources

  • Updated bwa-mem2 to v2.3

  • Fixed bug with user-provided alleles

  • Fixed MHCflurry input file generator, which interfered with splice variant pMHC characterization

  • Updated to a date-tagged HLApollo Docker image

  • Updated to versioned longgf Docker image

  • Updated to versioned snpEff Docker image (BioContainers v5.4)

  • Removed unneeded ERVs from viral reference

  • Added viral integration detection

  • Improved memory usage by accounting for input file size when allocating resources

  • Improved MHCflurry and NetMHCpan agretopicity calculations by using short-peptide BLASTP settings and adding explicit agretopicity status columns for missing matched-wildtype cases

  • Added explicit LOHHLA allele-loss status reporting so missing LOHHLA p-values are distinguished from available allele-loss results

  • Added recommended priority score and basis columns for consistent cross-source candidate ranking

  • Added a workflow-produced LENS run QC summary next to final reports, covering report row counts, candidate IDs, antigen-source counts, important-column missingness, and validation messages

  • Suppressed empty per-transcript BED files when requested transcripts have no CDS intervals in the workflow GTF, and added a transcript BED summary TSV for auditability
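
Note on the get_groups subset filtering fix above: the following is a minimal Python sketch of the idea, not the actual process code. The function name drop_strict_subsets, the dictionary layout, and the group and run names are hypothetical and chosen only for illustration. A group is dropped when its runs are a strict subset of another group's runs; groups with identical run sets are both kept.

```python
def drop_strict_subsets(groups):
    """Return only the groups whose runs are not a strict subset of another group's runs.

    `groups` maps a (hypothetical) group name to a list of run identifiers.
    """
    run_sets = {name: frozenset(runs) for name, runs in groups.items()}
    return {
        name: groups[name]
        for name, runs in run_sets.items()
        # `<` on sets is the strict (proper) subset test, so equal sets are kept.
        if not any(runs < other for other in run_sets.values())
    }


# Illustrative input only; these identifiers are made up.
groups = {
    "g1": ["run_a", "run_b"],
    "g2": ["run_a", "run_b", "run_c"],
    "g3": ["run_d"],
    "g4": ["run_c", "run_b", "run_a"],
}
kept = drop_strict_subsets(groups)
```

Here g1 would be removed because its runs are strictly contained in those of g2, while g4 is retained because it matches g2 exactly rather than being a strict subset.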

Version 1.8

  • Updated reference download script to allow detection of symlinked references

  • Removed VarScan2 from default variant detectors – VarScan2 is still available if needed

  • Added support for DeepSomatic (primarily for TATTALR development)

  • Improved FASTQ sanity checking such that all FASTQs in manifest are confirmed to exist prior to processing

  • Added bypass for pMHC characterization second pass in cases where no tools are specified

  • Fixed bug in pMHC characterization second pass in cases where all tools are not allele-specific

  • Fixed issue where NetMHC tools were breaking second pass aggregation

  • Fixed issue with MHCflurry mouse allele nomenclature preventing proper pMHC metric aggregation

  • Removed unneeded chrEBV samtools call in virdetect

  • Added ability to specify custom prioritization scores

  • Changed HLA consensus tool threshold to prevent hard-coded value interrupting RNA only samples

  • Corrected order of gene count transformations for gene signatures

  • Updated Jacquard’s captured tags to include total depth and alt. depth to fix MAF

  • Secondary pass of pMHC characterization can now be skipped

  • Moved consensus HLA allele decision logic into the process (to handle RNA-only samples)

  • Updated SNAF filtering parameters

  • Added resource allocations to LENSTools processes

  • Changed STAR sorting bin count to alleviate memory issues

  • Fixed intermediate file cleaning

  • Changed STAR BAM sorting to samtools (rather than using STAR’s internal sorting) to reduce OoM errors

  • Improved resource allocations for multiple tools to reduce OoM errors
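
Note on the FASTQ sanity checking entry above: the following is a minimal Python sketch of a pre-flight existence check, assuming the manifest has already been parsed into a list of FASTQ paths. The function names find_missing_fastqs and check_manifest_fastqs are hypothetical and do not come from the LENS or RAFT code.

```python
import os


def find_missing_fastqs(fastq_paths):
    """Return the listed FASTQ paths that do not exist as regular files."""
    return [path for path in fastq_paths if not os.path.isfile(path)]


def check_manifest_fastqs(fastq_paths):
    """Raise before any processing starts if one or more listed FASTQs is missing."""
    missing = find_missing_fastqs(fastq_paths)
    if missing:
        raise FileNotFoundError(
            "FASTQ files listed in manifest were not found: " + ", ".join(missing)
        )
```

Collecting every missing path before raising lets a user correct the whole manifest in one pass instead of discovering missing files one at a time.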

Version 1.7.1

  • Fixed infinite loop bug in make_snv_peptides_context

  • Fixed inefficiencies affecting splice variant RNA tumor read support counting

  • Updated lenstools_cta_funnel to handle cases where no CTA pMHCs present

Version 1.7

  • Updated Jacquard to 1.1.5

  • Updated sample delimiters from * to + to prevent edge case of substring run identifiers

  • Added HLApollo support

  • Updated reference download script to improve compatibility among LENS versions

  • Fixed gEVE-related references in download script to circumvent third party server downtime (with permission from Dr. So Nakagawa, PhD.)

  • Fixed input tuples and process blocks for netctlpan, antigen.garnish, deephlapan, and abra2 (thanks @kevinpryan).

  • Improved OptiType process block for speed improvements (thanks @kevinpryan)

  • Fixed ‘grep awk’ bug in snaf.nf

  • Fixed processing of ambiguous amino acids by MHCflurry

  • Added pepsickle support

  • Updated LENSTools to fix lenstools_make_indel_peptides_context bugs

  • Added preliminary intra-patient inter-group analysis

  • Added multipass pMHC characterization to better handle long running tools

  • Modified workflow to natively support RNA only patients (no SNV or InDels pMHCs will be reported)

  • Reimplemented support for intermediate file cleaning of trimmed FASTQs and BAMs (using raft.py run-workflow ... --clean-intermediates) – more information available in documentation and https://tinyurl.com/trick-nf-caching

  • Reduced HLA allele consensus support requirement to a single tool so that every allele called by any tool is processed. Users can filter alleles (including by support) in the LENS report.

  • Added vcf2maf as part of workflow with more stringent variant filtering

  • Added automatic checking of user-provided metadata files for raft.py run-ots and raft.py run-workflow modes

  • Fixed bug preventing intersect variant combining method from working

  • Fixed LENStools use of bcftools Docker image

  • Added documentation for LENS report columns previously missing from documentation

Version 1.6

  • Fixed bugs within subjoin that prevented proper joining in some scenarios.

  • Added bed input file to mpileup_parallel process for speed improvement.

  • Incorporated a manifest check within the LENS workflow that will terminate the workflow if an improper manifest is detected.

  • Added run-level suffix to some LENSTools outputs that did not already include it.

  • Improved split_snaf_by_sample efficiency.

  • Standardized to BioContainers Docker images where possible. Users can expect further standardization using BioContainers in the future.

  • Updated HLA typing subworkflow to incorporate multiple HLA typers.

  • Removed alns_to_lens workflow until it can be properly integrated into the overall LENS workflow.

  • Updated somalier from 0.2.17 to 0.2.19.

  • Updated igv-reports from 1.8.0 to 1.14.1.

  • Updated gtfparse from 1.0.1 to 1.0.7.

  • Updated cnvkit from 0.9.9 to 0.9.12.

  • Updated blast from 2.13.0 to 2.16.0.

  • Updated seqtk from 1.3 to 1.4.

  • Updated mhcflurry from 2.1.1 to 2.1.4.

  • Updated whatshap from 1.2.1 to 2.4.

  • Updated lenstools from 1.5.1 to 1.6.

  • Updated varscan from 2.1.1 to 2.4.6.

  • Updated deepvariant from 1.1.0 to 1.8.0.

  • Updated starfusion from 1.10.1 to 1.14.0.

  • Updated salmon from 1.1.0 to 1.10.3.

  • Updated gffread from 0.11.7 to 0.12.7.

  • Updated bedtools from 2.28.0 to 2.31.1.

  • Updated abra2 from 2.20 to 2.24.

  • Updated bcftools from 1.11 to 1.21.

  • Updated gatk4 from 4.1.6.0 to 4.6.1.0.

  • Updated bowtie2 from 2.5.1 to 2.5.4.

  • Updated minimap2 from 2.2.4 to 2.28.

  • Updated bbmap from 38.86 to 39.17.

  • Updated bwa from 0.7.17 to 0.7.8.

  • Updated star from 2.7.0f to 2.7.3a.

  • Updated fastp from 0.23.1 to 0.24.0.

Version 1.5.1

  • Fixed frameshift InDel peptide generation bug (where coding reading frame exceeds canonical stop codon).

  • Changed to only use seq2HLA for HLA typing due to repeated Optitype failures.

Version 1.5

  • Added antigen processing and presentation machinery (APPM) outputs to LENS report.

  • Updated CTA peptide filtering such that peptides that occur in non-CTA transcripts are excluded.

  • Updated LENS report to include all transcript ids, gene ids, and gene names that express the peptide of interest.

  • Added primary alignment RNA tumor read support for each pMHC. Note that all fusion-supporting RNA tumor reads are assumed to be primarily aligned.

  • Included LENS version in output file name.

  • Updated LENS workflow to support mouse samples.

  • Updated LENS workflow to start from BAM files.

  • Updated LENS report to include a column describing which variant callers detected each SNV/InDel variant.

  • Added LOHHLA output to LENS report.

  • Added consensus-based approach to HLA typing.

  • Updated HLA allele-specific expression to apply to consensus-based HLA calls.

  • Added columns describing which HLA typers support each HLA allele.

  • Added additional prioritization metrics to LENS reports.

Version 1.4

  • Fixed resource allocations (defined in *.config files) to reduce cloud usage burden.

  • Added gene signatures workflow (see gene_signatures module).

  • Added gene signatures workflow dependencies (binfotron, tximport, and generic).

  • Included set -o pipefail in bwa_mem2_samtools_sort process.

  • Added HLA allele-specific expression (ASE) estimates in LENS report (requires seq2hla_ase call).

  • Made HLA allele emission by seq2hla optional since sometimes seq2hla fails to identify alleles.

  • Modified manifest parsing to allow for bypassing FASTQ symlinking (for AWS and GCP applications).

  • Added QC analysis showing how filtering steps affect potential pMHC target removal (funneling).

  • Included support for ar_, nd_, and ad_ prefixes. Users are still encouraged to use the original prefix style.

  • Modified MHCflurry process to allow 0 exit code if no HLA alleles present.

  • Modified several neos module processes and workflows to allow emissions of outputs required for funneling analysis.

  • Modified cnvkit workflow in onco module to use all normal DNA samples for guessing baits. Will be optimized in the future.

  • Fixed bamblaster process call in samblaster to properly use provided CPUs and memory.

  • Added seq2hla_ase process in seq2hla module.

  • Added sequenza_merge_seqz label to merge_seqz process in sequenza module to allow proper resource allocation.

  • Modified split_snaf_by_sample in snaf module to prevent non-zero exit if no sample-specific splice variants are detected.

  • Added somatic_filter_parameter_dump process in somatic module to provide filtering information in SNV and InDel funneling plots.

  • Modified tximport to fix transcript-to-gene raw counts for gene signatures workflow.

  • Modified get_fastqs to allow copying (not symlinking) in AWS and GCP environments.

  • Fixed varscan2_somatic_parallel workflow in varscan2 module to fix InDel variant emission.

  • Added labels to processes in viral module to ensure proper resource allocation.

  • Added lohhla workflow to onco module to detect HLA loss of heterozygosity.

  • Modified some processes (e.g. varscan2_somatic_by_chr) to deal with tarball input files due to subdirectories not working properly in Google Cloud.

  • Modified tximport Docker image to support ps dependencies for Nextflow stats reporting.