Leading Edge Essay

Distilling Pathophysiology from Complex Disease Genetics

Aravinda Chakravarti,1,* Andrew G. Clark,2 and Vamsi K. Mootha3
1 Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
2 Cornell University, Ithaca, NY 14850, USA
3 Massachusetts General Hospital, Boston, MA 02114, USA
*Correspondence: aravinda@jhmi.edu
http://dx.doi.org/10.1016/j.cell.2013.09.001
Cell 155, September 26, 2013. ©2013 Elsevier Inc.

Technologies for genome-wide sequence interrogation have dramatically improved our ability to identify loci associated with complex human disease. However, a chasm remains between correlations and causality that stems, in part, from a limiting theoretical framework derived from Mendelian genetics and an incomplete understanding of disease physiology. Here we propose a set of criteria, akin to Koch’s postulates for infectious disease, for assigning causality between genetic variants and human disease phenotypes.

...Thus it is easy to prove that the wearing of tall hats and the carrying of umbrellas enlarges the chest, prolongs life, and confers comparative immunity from disease; for the statistics show that the classes which use these articles are bigger, healthier, and live longer than the class which never dreams of possessing such things. It does not take much perspicacity to see that what really makes this difference is not the tall hat and the umbrella, but the wealth and nourishment of which they are evidence, and that a gold watch or membership of a club in Pall Mall might be proved in the same way to have the like sovereign virtues...
George Bernard Shaw, The Doctor’s Dilemma (Preface), 1909

Distinguishing correlation from causality is the essence of experimental science. Nowhere is the need for this distinction greater today than in complex disease genetics, where proof that specific genes have causal effects on human disease phenotypes remains an enormous burden and challenge. Given the potential scientific and medical payoffs of disease gene discovery (Chakravarti, 2001), we argue in this Essay of the need for a rigorous examination of the assumptions under which we connect genes to phenotypes. This is particularly so in this age of routine -omic surveys, which can produce more false-positive than true-positive findings (Kohane et al., 2006). Moreover, genomic mapping and sequencing approaches that are invaluable for producing a list of unbiased candidates are, by themselves, insufficient for implicating specific gene(s) in a disease or biological process. Consequently, we suggest that specific genetic criteria, analogous to Koch’s postulates in microbiology, need to be satisfied in order to promote the role of one or more genes as being “causal,” rather than just “associated,” in a disease process (Brown and Goldstein, 1992; Falkow, 1988, 2004) (Box 1).

Box 1. Koch’s Postulates for Complex Human Diseases and Traits
(1) Candidate gene variants are enriched in patients.
(2) Disruption of the gene in a model system gives rise to a model phenotype that is accepted as relevant and “equivalent” to the human phenotype.
(3) The model phenotype can be rescued with the wild-type human alleles.
(4) The model phenotype cannot be rescued with the mutant human alleles.

Below we discuss the nature of the “proof” that we desire in order to make fundamental discoveries in human pathophysiology. We admit at the outset that the answers are not straightforward, and that there are serious technical and intellectual impediments to demonstrating causality for the common complex disorders of man where multiple interacting genes are involved. We acknowledge that even unproven candidate genes may lead to significant insight into disease pathophysiology. Nevertheless, the casual conflation of “mapped locus” to “proven gene” is a constant source of confusion and obfuscation in biology and medicine that requires remedy. We hope to offer some concrete suggestions, however difficult they may be to satisfy, because incorrect knowledge is worse than no knowledge at all (Brown and Goldstein, 1992).

Consider that two types of genomic surveys, one horizontal and the other vertical, are now routine for attempting to understand human biology and disease. In horizontal or broad surveys, we can obtain the full genome sequence in tens to hundreds of thousands of individuals to sort out which genomic segments are important and which are innocent bystanders, to a particular comparison between individuals, such as those with versus without coronary artery disease or cases with early versus late onset of dementia. In contrast, in vertical or deep surveys, we examine the effects of the genome as the DNA information gets processed, and its encoded functions get executed through its transcriptome, proteome, and effectors such as the metabolome. Both of these classes of studies are relevant to analysis of a disease of unknown etiology and have re-emphasized the long-held suspicion that studying genes one-at-a-time may not be meaningful because a gene’s effect is usually pleiotropic, context dependent, and contingent upon the state of many other genetic and nongenetic factors (Chin et al., 2012). In turn, this implies that proving a gene’s specific role in a biological process, either in wild-type or mutant form, may not be straightforward because its role may only be evident when examined in relation to its biochemical partners, and in particular contexts of diet, pathogen exposure, etc. (Zerba et al., 1996). This is a particular problem in genetic studies of any outbred nonexperimental organism, such as the human, and studies of human disease, where investigations are observational not experimental. It is the strong belief of contemporary human geneticists that uncovering the genetic underpinnings of any disease, however complex, is the surest unbiased route to understanding its pathophysiology and, thus, enabling its future rational therapies (Brooke et al., 2008). Consequently, for this view to prevail, we should require experimental evidence, be it in cells, tissues, experimental models, or the rare patient, for the role of a specific gene in a disease process. We discuss here the types of evidence that we consider incontrovertible.

Success in this difficult task requires us to solve a logical conundrum: how can we understand the genes underlying a phenotype if some of these component factors, in isolation, do not have recognizable phenotypes on their own? We know that even in a simple model organism, budding yeast, synthetic lethality (where death or some other phenotype occurs only through the conspiracy of mutations at two different genes) is widely prevalent (Costanzo et al., 2010). Interactions of greater complexity and involving more than two genes are also known in yeast (Hartman et al., 2001) and must be true for humans as well. A human genome will typically harbor 20 genes that are fully inactivated, without any overt disease phenotype, presumably due to the buffering by other genes (MacArthur et al., 2012). Acknowledging this complexity, there are two general ways forward. First, at this stage of our knowledge, perhaps we should not worry about “all” of the genes in a disease, in many ways an undefinable goal, but rather those whose effects are demonstrable, i.e., through a mutation that, irrespective of its interactions, can by itself affect a critical pathway. Second, as we unravel the effects of multiple genes on a phenotype, we should advance the same criterion, namely, that a set of mutations affects that same critical process. Both of these goals are approachable, particularly with recent advances in genome-editing technologies that allow the creation of multiple mutations within a single experimental organism (Wang et al., 2013). The question then is how “complex” are complex traits and diseases?

The New Genetics: Understanding the Function of Variation

With the rediscovery of Mendel’s rules of transmission more than 100 years ago, there was a vicious debate on the relative importance of single-gene versus multifactorial inheritance (Provine, 1971). Geneticists quickly, and successfully, focused on deciphering the specific mechanisms of gene inheritance and understanding the physiology of the gene in lieu of answering why some phenotypes had complex etiology and transmission. Nevertheless, the rare examples of deciphering the genetic basis of complex phenotypes, such as for truncate (wing) in Drosophila (Altenburg and Muller, 1920), clearly emphasized that traits were more than the additive properties of multiple genes. Today, it is quite clear that Mendelian inheritance of traits, including diseases, is the exception not the rule. Nevertheless, the entire language of genetics is in terms of individual genes for individual phenotypes, with one function, rather than the ensemble and emergent properties of genomes. This absence of a specific genetics language for the proper description of the multigenic architecture of traits (the ensemble) remains as an impediment to our understanding of the nature and degree of genetic complexity of the phenotype.

The case of amyotrophic lateral sclerosis (ALS), a devastating, progressive motor neuron disease, illustrates this point (Ludolph et al., 2012). Despite the lack of evidence, we largely describe ALS as being “heterogeneous” and comprised of single-gene mutations that can individually lead to disease. In 1993, mutations in superoxide dismutase 1 (SOD1) were identified in an autosomal-dominant form of the disease; subsequently, the disorder has become synonymous with aberrant clearance of free radicals as its central pathology. What is often not appreciated, however, is that fewer than 10% of all cases of ALS are familial and even fewer follow an apparent Mendelian pattern. Even within this subset of cases, more than 20 distinct genes, spanning other pathways including RNA homeostasis, have been identified, and SOD1 represents a minority of cases. The molecular etiology for the majority of the sporadic forms of the disease remains unclear, and the scientific problem in understanding ALS is more than simply identification of additional genes. We may ask, can SOD1 and the other described gene mutations lead to ALS by themselves? Are these the key rate-limiting steps to ALS or simply one of several required in concert? Is the aberrant clearance of free radicals the fundamental defect or one of many such pathologies or a common downstream consequence? Given the diversity and number of deleterious, even loss-of-function, genetic variants in all of our genomes (Abecasis et al., 2012; MacArthur et al., 2012) and, in the absence of stronger evidence bearing on these questions, it is fair to assume that ALS patients harbor multiple mutations with a plurality of molecular defects and that free radical metabolism is only one of a set of canonical pathophysiologies that define the disease. No doubt, this plurality is the case for cancer (Vogelstein et al., 2013), Crohn’s disease (Jostins et al., 2012), and even rare developmental disorders such as Hirschsprung disease (McCallion et al., 2003). In all of these cases, a richer genetics vocabulary may improve our understanding of the phenotypes through recognizing what we know and what we don’t; our current language limits us to describing genes not phenotypes.

Molecular biology, genetics’ twin, on the other hand, appears to have been far more successful in deciphering and describing not only its individual components (e.g., DNA, RNA, protein) but also their mutual relationships (e.g., DNA-protein interaction) and ensembles (e.g., transcriptional complex), although this is also far from complete (Watson et al., 2007). Not only do we understand the structure of individual genes and how their molecular functions get executed, but we are also starting to learn how functions get regulated through a diversity of cis- and trans-acting functions. The consequences of the primary and interaction effects are often well understood, even though not completely described, at both the molecular and cellular levels (Alberts et al., 2007). There are also improving technologies and understanding of the structures and functions of ensembles of proteins and cells, and how these interact and communicate with one another to create complexity (Ilsley et al., 2013). Although the use of genetic tools and genetic perspectives are fundamental to this progress, these advances have not as yet led to a major revision of our understanding of trait or disease variation. The major reason for this discrepancy is that, with few exceptions (Raj et al., 2010), molecular and cell biology has focused on the impact of deleting or overexpressing genes and not grappled with the consequences of allelic variation.

Classical Mendelian genetics has been a boon to uncovering biology from yeast to humans whenever a mutation with a simple inheritance pattern can be isolated. This approach has been revolutionary in the unicellular yeast, particularly because genetics (and gene manipulation), biochemistry, and cell biology were melded to understand function at a variety of levels. This kind of multilevel approach has been less straightforward, but still largely successful, for a metazoan such as Drosophila where more genes and multiple specialized cells often rescue the effects of a mutation or enhance its minor effect. These lessons suggest to us that the current approach, based strictly on genetic variation, to understanding complex human disease is also grossly insufficient and, as in yeast and flies, will require the contemporaneous analysis of the molecular biology, biochemistry, and physiology of the genes within a mapped locus to even identify the disease gene, let alone understand its functions. Success in this endeavor will require a synthesis of many biological disciplines that includes the role of genetic variation as intrinsic to the biological process, not an aspect to be ignored.

Consequently, melding variation-based genetic and molecular biological thinking is of critical importance for both fields and is central to our understanding of mechanisms of trait variation, including interindividual variation in disease risk. If most disease, in most humans, is the consequence of the effects of variation at many genes, then knowledge of their functional relationships, rather than merely their identities, is central to understanding the phenotype. This is clearly a problem of “Systems Biology” but one that incorporates genetic variation directly. The ability to integrate the realities of such widespread genetic variation, which are ultimately at the causal root of disease mechanisms, with systems biology approaches to understand functional contingencies is central to the challenge of deciphering complex human disease. Importantly, it is likely to spur new thinking in both fields.

Genetic Dissection of Complex Phenotypes

Genetic transmission rules imply that, even in an intractable species such as us, one can map genomic segments that must contain a disease or trait gene. The lure and success of this method is that we can map a disease locus in the absence of any knowledge of the underlying biology of the phenotype. Such mapping requires identification of the segregation of common sites of variation across the genome, now easy to identify through sequencing, and recognition of a genomic segment identical-by-descent in affected individuals, both within and between families. This task has become easier and more powerful as sequencing technology has improved to provide a nearly complete catalog of variants above 1% frequency in the population; further improvements to sample rarer variants are ongoing (Abecasis et al., 2012). Consequently, genetic mapping, once the province of rare Mendelian disorders, is now applicable to any human trait or disease. In fact, more than 2,000 confirmed loci, each containing multiple genes, affecting susceptibility to more than 100 medically relevant traits (e.g., blood pressure) and disease (e.g., hypertension) are now known (Hindorff et al., 2009). For most complex traits examined, many such loci have been mapped, but the vast majority of the specific genes remain unidentified. We can sometimes guess at a candidate gene within the locus (Jostins et al., 2012), sometimes implicate a gene by virtue of an abundance of rare variants among affected individuals (Jostins et al., 2012), in rare circumstances, use therapeutic modulation of a pathway to pinpoint the gene (Moon et al., 2004), and sometimes identify one by painstaking experimental dissection (Musunuru et al., 2010), but, generally, identification of the underlying gene has not become easier. In fact, most of the mapped loci underlying complex traits remain unresolved at the gene or mechanistic level.

Despite the beginning clues to human disease pathophysiology that complex disease mapping is providing, and the slow identification of individual genes, it appears highly unlikely that we can understand traits and diseases this way. There is indeed evidence for scenarios in which variation in complex traits, including risk of complex disease, is mediated by a myriad of variants of minute effect, spread evenly across the genome (Yang et al., 2011). Therefore, we need other approaches to override this bottleneck.

For Mendelian disorders, gene identification within a locus is made possible by each mutation being necessary and sufficient for the phenotype, being functionally deleterious and rare, and having an inheritance pattern consistent with the phenotype. It’s the mutation that eventually reveals the biology and explains the phenotype. Any component locus for a complex disease has no such restriction, as the causal variants are neither necessary nor sufficient, nor coding (in fact, they are frequently noncoding and regulatory) nor rare (Emison et al., 2010; Jostins et al., 2012). Currently, the major attempts to overcome this impediment involve reliance on single severe mutations at the very same component genes and demonstrating Mendelian inheritance of the same or similar phenotype, and/or identifying single genes with a demonstrable excess of rare coding variants. The first of these two strategies is a strong unproven hypothesis and probably not universally true, whereas the second relies on very large sample sizes of patients and suffers from the unknown functional effect of the majority of rare coding variants. Consequently, these strategies themselves depend on the hidden biology we seek and are applicable only to the most common human diseases. It appears to us that ignorance of biology has become rate limiting for understanding disease pathophysiology, except perhaps for the Mendelian disorders. There are two ways to get out of this vicious cycle (Figure 1).

Figure 1. Complementary Approaches Necessary for Proving Genetic Causality and Understanding the Pathophysiology of Complex Disease
Genetic association studies in humans can synergize with prior knowledge and systems-level quantitative analysis to generate predictions of what pathways and modules are disrupted, where (anatomically), and when (developmentally) to yield a specific morphological or biochemical phenotype. These predictions can then be tested in an appropriate model system while adhering to the postulates outlined in Box 1.

One approach may be to use a set of model traits and diseases and employ their existing mapped loci to identify a small set of the component genes by brute-force (or, luck) and use the uncovered biology to infer which other genes in their “pathways” can explain the disease. This approach has been highly profitable in Crohn’s disease, a common inflammatory disorder whose root causes remained cryptic until genome-wide association studies identified a large number of loci with fundamental defects in mucosal immunity (Graham and Xavier, 2013), but not in type 2 diabetes, where the pathophysiology awaits clarification (Groop and Pociot, 2013). Although we suspect that the numbers of pathways involved are fewer than the numbers of genes involved, this is merely suspicion. Nevertheless, can we reduce the complexity of the problem by identifying all of the relevant pathways? Despite uncertainty, this approach has the advantage of leading to specific testable hypotheses.

The second approach is to focus research on why the disease is complex in the first place. Although the genome is linear, its expression and biology are highly nonlinear and hierarchical, being sequestered in specific cells and organelles (Ilsley et al., 2013). Understanding this hierarchy, the province of systems biology, is critical to the solution of the complex inheritance problem (Yosef et al., 2013). Even more importantly, this approach might, through the effect of mutations, allow us to decipher cell circuitry and understand which pathways are limiting and which are redundant. This last aspect is critical: as we argue below, with our current state of knowledge, we are likely to have our greatest success with understanding how genes map onto pathways, and how pathways map onto disease, before a true quantitative understanding of disease biology emerges. One might counter that existing gene ontologies do precisely that, but, even in yeast, this appears to be highly incomplete (Dutkowski et al., 2013).

Proving Causality: Molecular Koch’s Postulates

The evidence that a specific gene is involved in a particular human disease has historically been nonstatistical and based on our experience with identifying mutations in Mendelian diseases. The chief criteria have been to demonstrate cosegregation with the phenotype in families, exclusivity of the mutation to affected individuals (rare alleles absent in controls), and the nature of the mutation (a plausibly deleterious allele at a conserved site within a protein). Unfortunately, as already mentioned, all of these rules break down in complex phenotypes where neither cosegregation nor exclusivity to affecteds nor obviously deleterious alleles are likely; moreover, many mutations are suspected to be noncoding and in a diversity of regulatory RNA molecules. Consequently, statistical evidence of enrichment has been the mainstay, but this has two negative consequences: first, scanning across the genome or multiple loci covering tens to hundreds of megabases requires very large sample sizes and very strict levels of significance to guard against the many expected false-positive findings; second, genetic effects that are small or genes with only a few causal alleles are notoriously difficult to detect, although they may be very important to understanding pathogenesis. This difficulty translates into a low power of detection, as common disease alleles cannot be distinguished from bystander associated alleles, whereas rare alleles are observed too infrequently to provide statistical significance. Consequently,