Manifest Specifications

This page provides the full column reference, sample naming conventions, complex sample set configurations, and manifest validation for RAFT manifests. For a task-oriented guide to creating manifests, see Preparing Your Samples.

Manifest columns

A RAFT manifest must contain at least the columns defined in the table below. Columns can be in any order, and additional columns containing non-RAFT metadata are allowed.

RAFT Columns

Column             Description                                Allowed values
Dataset            Name for collection of patients            Free text
Patient_Name       Name for collection of samples             Free text
Run_Name           Name for the specific sample               Free text (see note below)
File_Prefix        Base name (or full path) of input files    Free text
Sequencing_Method  Sequencing protocol for sample             (RNA-seq, WES, WXS, WGS)
Normal             Is the sample normal or abnormal (tumor)?  (TRUE, FALSE)
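The column requirements above can be illustrated with a minimal Python sketch. It is not how raft itself validates a manifest; the names REQUIRED_COLUMNS, ALLOWED_VALUES, and check_rows are hypothetical, and the manifest is assumed to have already been read into a header list and a list of row dictionaries. Values are compared case-insensitively, which is an assumption made here because the table lists RNA-seq while the example manifest later on this page uses RNA-Seq.

```python
# Hypothetical helper names; the column names and allowed values come from the table above.
REQUIRED_COLUMNS = (
    "Dataset", "Patient_Name", "Run_Name",
    "File_Prefix", "Sequencing_Method", "Normal",
)
ALLOWED_VALUES = {
    "Sequencing_Method": {"RNA-seq", "WES", "WXS", "WGS"},
    "Normal": {"TRUE", "FALSE"},
}


def check_rows(header, rows):
    """Return a list of problems; an empty list means none were found."""
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        # Without the required columns, per-row checks are not meaningful.
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    for number, row in enumerate(rows, start=1):
        for column, allowed in ALLOWED_VALUES.items():
            value = row[column]
            # Assumption: matching ignores case (RNA-seq vs. RNA-Seq).
            if value.lower() not in {a.lower() for a in allowed}:
                problems.append(
                    f"row {number}: {column}={value!r} is not one of {sorted(allowed)}"
                )
    return problems
```

Extra columns in the header are ignored, matching the rule that non-RAFT metadata columns are allowed.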

Note

Some RAFT workflows use a sample’s Run_Name to guide the sample through the workflow. A Run_Name should consist of a two-letter prefix that describes the type of sample, a delimiter (- or _), and an arbitrary unique identifier. The first letter of the prefix is either a (for abnormal) or n (for normal). The second letter is either r (for RNA) or d (for DNA). For example, a sample with an ar- (or ar_) prefix is an abnormal (tumor) RNA sample, while a sample with an nd- prefix is a normal DNA sample.
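As an illustration of the naming convention in this note, the following Python sketch splits a Run_Name into its parts. The function name parse_run_name and the returned dictionary layout are hypothetical, and the sketch assumes the prefix letters are lowercase, as they are written above; it is not taken from RAFT's own code.

```python
import re

# Two-letter prefix (a|n, then r|d), a "-" or "_" delimiter, then a non-empty identifier.
RUN_NAME_PATTERN = re.compile(r"^([an])([rd])[-_](.+)$")


def parse_run_name(run_name):
    """Return the parts of a Run_Name, or None if it does not follow the convention."""
    match = RUN_NAME_PATTERN.match(run_name)
    if match is None:
        return None
    status, molecule, identifier = match.groups()
    return {
        "normal": status == "n",                       # a = abnormal, n = normal
        "molecule": "RNA" if molecule == "r" else "DNA",
        "identifier": identifier,
    }
```

For instance, parse_run_name("ar-Pt01-03A") reports an abnormal RNA sample with identifier Pt01-03A, while a name without a recognized prefix yields None.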

Complex sample sets

Users may encounter situations that require more than one sample per sample type per patient. For example, users may have a single set of DNA samples (normal DNA and tumor DNA), but may have multiple RNA-seq samples (e.g. multiple timepoints). RAFT’s subjoin functionality supports these cases via a Group column in the manifest:

Patient_Name  Run_Name     Dataset  File_Prefix  Sequencing_Method  Normal  Group
Pt01          ad-Pt01-03A  AML      9f7f7        WES                FALSE   1-2
Pt01          nd-Pt01-11A  AML      8e74a        WES                TRUE    1-2
Pt01          ar-Pt01-03A  AML      cdb288       RNA-Seq            FALSE   1
Pt01          ar-Pt01-03B  AML      cdb289       RNA-Seq            FALSE   2

This scenario depicts a patient (Pt01) with a DNA normal sample, a DNA tumor sample, and two RNA-seq samples. In this example, ar-Pt01-03A is a pre-treatment sample and ar-Pt01-03B is a post-treatment sample. LENS will be run on two distinct sample sets:

The pre-treatment sample set:

  • ad-Pt01-03A

  • nd-Pt01-11A

  • ar-Pt01-03A

The post-treatment sample set:

  • ad-Pt01-03A

  • nd-Pt01-11A

  • ar-Pt01-03B

Each sample set produces its own LENS report.

Note

Group identifiers do not have to be numbers. Descriptive identifiers (e.g. pre-treatment and post-treatment) are also supported.
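The way the Group column yields the two sample sets can be sketched in Python. This is only an illustration, not RAFT's subjoin implementation: the function name sample_sets is hypothetical, and reading a value such as 1-2 as "groups 1 and 2" separated by a hyphen is an assumption inferred from the example manifest. This page does not state how RAFT separates multiple identifiers, and a hyphen split would not suit identifiers that themselves contain a hyphen, so the delimiter is left as a parameter.

```python
from collections import defaultdict


def sample_sets(rows, delimiter="-"):
    """Map (Patient_Name, group) to the Run_Names that belong to that sample set."""
    sets = defaultdict(list)
    for row in rows:
        # Assumption: one Group cell may name several groups, e.g. "1-2".
        for group in row["Group"].split(delimiter):
            sets[(row["Patient_Name"], group)].append(row["Run_Name"])
    return dict(sets)


# Rows taken from the example manifest (only the columns needed here).
rows = [
    {"Patient_Name": "Pt01", "Run_Name": "ad-Pt01-03A", "Group": "1-2"},
    {"Patient_Name": "Pt01", "Run_Name": "nd-Pt01-11A", "Group": "1-2"},
    {"Patient_Name": "Pt01", "Run_Name": "ar-Pt01-03A", "Group": "1"},
    {"Patient_Name": "Pt01", "Run_Name": "ar-Pt01-03B", "Group": "2"},
]

for key, run_names in sample_sets(rows).items():
    print(key, run_names)
```

Under these assumptions the shared DNA samples appear in both sets, and each RNA-seq sample appears only in its own set, mirroring the pre-treatment and post-treatment sets listed above.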

Validating a manifest

LENS automatically checks manifest integrity when raft is run in either run-ots or run-workflow mode.

To manually verify a manifest:

raft check-manifest -m </PATH/TO/MANIFEST>

This validates required columns, allowed values, Run_Name prefix conventions, cross-sample HLA consistency, and other integrity checks.