A systematic approach to the reporting of medically relevant findings from whole genome sequencing

Background
The MedSeq Project is a randomized clinical trial developing approaches to assess the impact of integrating genome sequencing into clinical medicine. To facilitate the return of results of potential medical relevance to physicians and patients participating in the MedSeq Project, we sought to develop a reporting approach for the effective communication of such findings.

Methods
Genome sequencing was performed on the Illumina HiSeq platform. Variants were filtered, interpreted, and validated according to methods developed by the Laboratory for Molecular Medicine and consistent with current professional guidelines. The GeneInsight software suite, which is integrated with the Partners HealthCare electronic health record, was used for variant curation, report drafting, and delivery.

Results
We developed a concise 5–6 page Genome Report (GR) featuring a single-page summary of results of potential medical relevance with additional pages containing structured variant, gene, and disease information along with supporting evidence for reported variants and brief descriptions of associated diseases and clinical implications. The GR is formatted to provide a succinct summary of genomic findings, enabling physicians to take appropriate steps for disease diagnosis, prevention, and management in their patients.

Conclusions
Our experience highlights important considerations for the reporting of results of potential medical relevance and provides a framework for interpretation and reporting practices in clinical genome sequencing.

Electronic supplementary material
The online version of this article (doi:10.1186/s12881-014-0134-1) contains supplementary material, which is available to authorized users.


I. RESULTS RELEVANT TO INDICATION FOR TESTING
For this patient with a diagnosis of cardiomyopathy, we reviewed all variants found in 62 genes with known association with hereditary cardiovascular disease and identified one variant of uncertain significance. More information is needed to determine if this variant contributes to disease.

Phenotype
Gene Transcript

A. MONOGENIC DISEASE RISK: 0 VARIANTS IDENTIFIED
This test did not identify any genetic variants that may be responsible for existing disease or the development of disease in this individual's lifetime.

As a carrier for recessive genetic variants, this individual is at higher risk for having a child with one or more of these highly penetrant disorders. To determine the risk for this individual's future children to be affected, the partner of this individual would also need to be tested for variants in these genes. Other biologically related family members may also be carriers of these variants.

*Carriers for some recessive disorders may be at risk for certain phenotypes. Please see variant descriptions for more information.

C. PHARMACOGENOMIC ASSOCIATIONS
This test identified the following pharmacogenomic associations. Additional pharmacogenomic results may be requested, but will require additional molecular confirmation prior to disclosure.

D. RED BLOOD CELL AND PLATELET ANTIGENS
This test identified the ABO Rh blood type as B Negative. Additional blood group information is available at the end of the report.

Dilated cardiomyopathy (DCM) usually presents with any one of the following: heart failure with symptoms of congestion and/or reduced cardiac output; arrhythmias and/or conduction system disease; and thromboembolic disease, including stroke. The incidence of DCM is currently underestimated. Familial dilated cardiomyopathy is principally caused by mutations in genes that encode cytoskeletal and sarcomeric proteins in the cardiac myocyte. Adapted from GeneReviews abstract: http://www.ncbi.nlm.nih.gov/books/NBK1309/.

FAMILIAL RISK: Dilated cardiomyopathy due to pathogenic variants in the RBM20 gene is typically inherited in an autosomal dominant pattern. Each first-degree relative has a 50% chance of inheriting the variant and its risk for disease.
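The 50% figure quoted for autosomal dominant inheritance follows from simple Mendelian segregation. The sketch below is illustrative only: the function name `offspring_risk` and the single-letter allele labels are assumptions made for this example, not part of the report or of any laboratory software. It enumerates the four equally likely allele combinations a child can receive from two parents and counts those meeting a condition. It gives the chance of inheriting a genotype, not of developing disease, since penetrance is not modeled.

```python
from fractions import Fraction
from itertools import product


def offspring_risk(parent1, parent2, affected):
    """Probability that a child has a genotype for which `affected` is true.

    Each parent is a two-character string of alleles (for example "Vn").
    Each allele is transmitted with probability 1/2, so the four
    combinations produced by `product` are equally likely.
    """
    children = ["".join(sorted(pair)) for pair in product(parent1, parent2)]
    hits = sum(1 for child in children if affected(child))
    return Fraction(hits, len(children))


# Autosomal dominant: a heterozygous parent ("Vn") and a non-carrier ("nn").
# A child inherits the variant if at least one allele is "V".
dominant = offspring_risk("Vn", "nn", lambda g: "V" in g)

# Autosomal recessive: two carriers ("Nv"); a child is affected only with "vv".
recessive = offspring_risk("Nv", "Nv", lambda g: g == "vv")

print(dominant, recessive)  # 1/2 1/4
```

The same enumeration reproduces the 25% risk that applies when both parents carry a recessive variant.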

A. MONOGENIC DISEASE RISK
This test did not identify any genetic variants that may be responsible for existing disease or the development of disease in this individual's lifetime.

VARIANT INTERPRETATION:
The p.Glu366Lys variant in SERPINA1 (also known as p.Glu342Lys or PI*Z) is the most common alpha-1 antitrypsin deficiency allele, leading to a high risk of emphysema (and to a lesser extent liver disease) when homozygous. In summary, even with the high population frequency of this variant, it meets our criteria to be classified as pathogenic.

DISEASE INFORMATION: Alpha-1 Antitrypsin Deficiency Disorder (AATD) is one of the most common metabolic disorders in persons of northern European heritage, occurring in approximately one in 5,000-7,000 individuals in North America and one in 1,500-3,000 in Scandinavians. COPD, specifically emphysema, is the most common manifestation of AATD and smoking is the major factor influencing age of onset and course of disease. Some individuals also present with liver disease. AATD is caused by homozygosity for the common deficiency allele, PI*Z, of SERPINA1. Clinical manifestations are infrequent in heterozygotes, except in some smokers. Adapted from GeneReviews: http://www.ncbi.nlm.nih.gov/books/NBK1519/

FAMILIAL RISK: AATD is inherited in an autosomal recessive manner. The risk of this patient's child having AATD is dependent on the carrier status of the patient's partner. Two carriers have a 25% risk for having a child with AATD. Other biologically related family members may also be carriers of this variant.

Elevated HDL

VARIANT INTERPRETATION:
The p.Ser289Phe variant in LIPC has been reported in 1 compound heterozygous individual with hepatic lipase deficiency and segregated with disease in 3 affected compound heterozygous relatives from 1 family (Hegele 1991). This variant has been identified in 0.13% (11/8584) of European American chromosomes and 0.05% (4/4384) of African American chromosomes by the NHLBI Exome Sequencing Project (http://evs.gs.washington.edu/EVS/; dbSNP rs121912502). Although this variant has been seen in the general population, its frequency is low enough to be consistent with a recessive carrier frequency. In vitro assays indicate the p.Ser289Phe variant leads to reduced LIPC activity (Durstenfeld 1994). However, these types of assays may not accurately represent biological function. Computational prediction tools and conservation analysis also suggest that the p.Ser289Phe variant may impact the protein, though this information is not predictive enough to determine pathogenicity. In summary, while there is some suspicion for a pathogenic role, the clinical significance of the p.Ser289Phe variant is uncertain.
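The population-frequency argument above can be illustrated with a short calculation. This is a minimal sketch and not the laboratory's actual method: it recomputes the European American allele frequency from the quoted counts (11 of 8584 chromosomes) and then estimates a heterozygous carrier frequency under the assumption of Hardy-Weinberg equilibrium, which the report does not state. The function names are hypothetical.

```python
def allele_frequency(alt_count, total_chromosomes):
    """Fraction of sampled chromosomes that carry the variant allele."""
    return alt_count / total_chromosomes


def carrier_frequency(q):
    """Expected heterozygote frequency 2pq, with p = 1 - q.

    Assumes Hardy-Weinberg equilibrium (an assumption of this sketch).
    """
    return 2 * (1 - q) * q


# Counts quoted in the interpretation for European American chromosomes.
q = allele_frequency(11, 8584)
carriers = carrier_frequency(q)

print(f"allele frequency: {q:.2%}")
print(f"estimated carrier frequency: {carriers:.2%}")
```

Under that assumption, roughly one person in a few hundred would be expected to carry the variant, which is the sense in which a low allele frequency remains compatible with a recessive carrier state.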

Rare RBC Antigens
No rare presence or absence of RBC antigens was identified.

Rare Platelet Antigens
No rare presence or absence of platelet antigens was identified.

DISCUSSION
These red blood cell (RBC) and human platelet antigen (HPA) predictions are based on published genotype-to-phenotype correlations for the alleles present. Some antigens have also been serologically determined using traditional blood typing methods. During pregnancy or transfusion, alloantibodies to blood group antigens and platelet antigens can form against foreign RBCs and platelets that carry immunogenic antigens the recipient is missing. These alloantibodies can cause clinically important complications during future transfusions and pregnancy.

Blood Product Transfusion
This individual does NOT have an increased risk of forming unusual RBC or platelet alloantibodies, since this test revealed a normal presence of high frequency antigens and no antigen gene rearrangements.

LABORATORY FOR MOLECULAR MEDICINE
CENTER FOR PERSONALIZED GENETIC MEDICINE
Accession ID: PMXX-12345
Name: Doe, Jane

GENOME REPORT (CONTINUED)

Blood Product Donation
This individual does NOT pose an increased risk to blood product recipients since this test revealed a normal presence of high frequency antigens and no antigen gene rearrangements.

RED BLOOD CELL ANTIGENS
Key: [+] presence of antigen predicted by genotyping; + presence of antigen predicted by genotyping and confirmed by serology; +* presence of antigen detected by serology, genotype prediction not available; [+w] weak presence of antigen predicted by genotyping; +w weak presence of antigen predicted by genotyping and confirmed by serology; +w* weak presence of antigen detected by serology, genotype prediction not available; [-] absence of antigen predicted by genotyping; - absence of antigen predicted by genotyping and confirmed by serology; -* absence of antigen detected by serology, genotype prediction not available; NC indicates no sequencing coverage; Dis indicates discordant. Rare (less than 5% population frequency) presence or absence of antigen is indicated in red.
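The key is in effect a small notation with three components: a sign for presence or absence, an optional "w" for weak presence, and brackets or an asterisk for the source of evidence. As a hedged illustration, the sketch below decodes a single call into the wording used in the key. The function `decode_call` and the regular expression are assumptions made for this example and are not part of the report or the GeneInsight software.

```python
import re

# Optional "[", a sign, optional "w", optional "]", optional "*".
# Both the ASCII hyphen and the Unicode minus sign are accepted.
CALL = re.compile(r"^(\[)?([+\-−])(w)?(\])?(\*)?$")

SPECIAL = {"NC": "no sequencing coverage", "Dis": "discordant"}


def decode_call(call):
    """Translate one antigen call (for example "[+w]") into plain text."""
    if call in SPECIAL:
        return SPECIAL[call]
    match = CALL.match(call)
    if match is None:
        raise ValueError(f"unrecognized antigen call: {call!r}")
    open_br, sign, weak, close_br, star = match.groups()
    bracketed = open_br is not None
    # Reject combinations that the key does not define.
    if bracketed != (close_br is not None) or (bracketed and star) or (weak and sign != "+"):
        raise ValueError(f"unrecognized antigen call: {call!r}")
    if sign != "+":
        status = "absence"
    else:
        status = "weak presence" if weak else "presence"
    if bracketed:
        evidence = "predicted by genotyping"
    elif star:
        evidence = "detected by serology, genotype prediction not available"
    else:
        evidence = "predicted by genotyping and confirmed by serology"
    return f"{status} of antigen {evidence}"


for example in ["[+]", "+w*", "-", "NC"]:
    print(example, "->", decode_call(example))
```

Keeping the evidence source separate from the presence or absence call mirrors how the key distinguishes genotype-only predictions from serologically confirmed ones.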

METHODOLOGY
Genomic sequencing is performed using next generation sequencing on the Illumina HiSeq platform. Genomes are sequenced to at least 30X mean coverage and a minimum of 95% of bases are sequenced to at least 8X coverage. Paired-end 100 bp reads are aligned to the NCBI reference sequence (GRCh37) using the Burrows-Wheeler Aligner (BWA), and variant calls are made using the Genome Analysis Toolkit (GATK). Variants are subsequently filtered to identify: (1) variants classified as disease causing in public databases; (2) nonsense, frameshift, and +/-

LIMITATIONS
This test does not sequence all bases in a human genome, and not all variants have been identified or interpreted. Triplet repeat expansions, translocations and large copy number events are currently not reliably detected by genome sequencing. Furthermore, not all disease-associated genes have been identified and the clinical significance of variation in many genes is not well understood. It is recommended that genomic sequencing data be periodically reinterpreted, especially when new symptoms arise.

COVERAGE OF ANALYZED GENES RELEVANT TO CARDIOVASCULAR DISEASE
The table below provides a list of genes relevant to cardiovascular disease that were evaluated during this individual's genome sequencing analysis. The proportion of the gene covered at ≥8X, i.e. the proportion of the gene with at least 8 mapped reads, is also provided. Please note that the presence of pathogenic variation in genes not analyzed or with incomplete coverage cannot be fully excluded.
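To make the coverage metric concrete, the sketch below computes the proportion of positions with at least 8 mapped reads from a list of per-base read depths, together with the mean depth. It is a minimal illustration under stated assumptions: the function names and the toy depth values are hypothetical, and the 30X and 95% defaults are taken from the thresholds quoted in the METHODOLOGY section. It does not reproduce the laboratory's pipeline.

```python
def coverage_summary(depths, min_depth=8):
    """Return (mean depth, fraction of positions with depth >= min_depth)."""
    if not depths:
        raise ValueError("no positions supplied")
    mean_depth = sum(depths) / len(depths)
    covered = sum(1 for d in depths if d >= min_depth)
    return mean_depth, covered / len(depths)


def meets_thresholds(mean_depth, fraction_covered, min_mean=30.0, min_fraction=0.95):
    """Check the quoted thresholds: mean >= 30X and >= 95% of bases at >= 8X."""
    return mean_depth >= min_mean and fraction_covered >= min_fraction


# Toy per-base depths for a ten-position region (illustrative values only).
depths = [35, 40, 7, 32, 36, 0, 31, 33, 38, 48]
mean_depth, fraction_8x = coverage_summary(depths)

print(f"mean depth: {mean_depth:.1f}X")
print(f"proportion covered at >=8X: {fraction_8x:.0%}")
print("meets thresholds:", meets_thresholds(mean_depth, fraction_8x))
```

In this toy region the mean depth reaches 30X but two of ten positions fall below 8X, which shows why the report lists the covered proportion per gene rather than relying on mean depth alone.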