Tuesday, November 9, 2010

Thanks for the Histones, Mom!

Several papers have emerged over the last year or so indicating transgenerational effects that influence the behavior and/or physiology of offspring. For example, a recent study in Nature from Margaret Morris's group at the University of New South Wales proposes that obese fathers transmit an epigenetic signature through the germ line to female offspring, resulting in impaired beta cell function, impaired insulin secretion and glucose intolerance (Ng et al. Nature 2010). Other recent studies have found similar evidence for paternal inheritance of non-genetic information (for example, see Pentinat et al. Endocrinology 2010; Nelson et al. Epigenomics 2010). However, an outstanding issue is the identity of the underlying molecular mechanisms involved in these effects. In this blog entry, I highlight some emerging pathways that might contribute to epigenetic inheritance through the germ line.

Two major studies have characterized modified histones in human and mouse sperm (Hammoud et al. Nature 2009; Brykczynska et al. Nat Struct Mol Biol 2010). Previously, it was thought that modified histones were unlikely to be a major component of the highly compact chromatin contained in sperm. However, these two studies indicate that approximately 30% of human promoters contain modified histones. Further, many of these epigenetic signatures are conserved between humans and mice. It has been postulated that these histone signatures are retained in the zygote and play an important role at early stages of development in offspring. However, direct evidence for this model is not yet strong. Interestingly, two noteworthy studies in C. elegans suggest that modified histones established in the parental germ cells transmit essential information to offspring through the germ line (Furuhashi et al. Epigenetics 2010; Rechtsteiner et al. PLoS Genetics 2010).

Both of these studies pick up on an older study by Susan Strome's group, which identified 6 loci, including the H3K36 methyltransferase MES-4 [an NSD homolog], that are required for normal germ cell development in offspring (Capowski et al. Genetics 1991). Null MES-4 mutant offspring undergo normal germ cell development when MES-4 is expressed by the mother, but not if the mother is homozygous for the mutation. Thus, a transgenerational maternal effect occurs. In the two most recent studies, it was found that MES-4 establishes H3K36 trimethylated histone marks independent of transcription, and this maternally established epigenetic signature is required for normal germ cell development in offspring. The authors propose that MES-4 transmits a memory of gene expression in the parental germline to offspring.

Taken together, these early observations suggest that epigenetic signatures in the form of modified histones in eggs and/or sperm might impact upon gene expression and the development and physiology of offspring.

Monday, November 8, 2010

Eppendorf & Science Prize for Neurobiology

Thank you to everyone sending congratulatory messages regarding the Eppendorf & Science Prize.  A link to my essay is provided here:

Parental Control Over the Brain

This work is the result of a major collaborative effort.  Jiangwen Zhang at Harvard FAS Computing played a central role in the development of the informatics pipeline.  While next generation sequencing data analysis is becoming more mainstream, there was almost nothing available to help in early 2007 (beyond Eland and a few other aligners) when we started, and Jiangwen's work was essential to getting it up and running.  David Haig at Harvard played a vital role in the development of the statistical analysis and data interpretation.  Gary Schroth and Shujun Luo at Illumina kindly collaborated by sharing their early versions of an RNA-Seq protocol and by carrying out some pilot sequencing studies for us to determine if the approach would succeed (yes, the initiation of the study predated publication of RNA-Seq).  Jim Butler worked on the qPCR analysis of Il18 heterozygous mice.  The entire study was carried out under the guidance and mentorship of Catherine Dulac in her lab in the Molecular and Cellular Biology Department at Harvard.

Also, thank you to the Jaenisch lab for sharing mice and the outstanding members of the Dulac lab for discussions and ideas.

The work was funded by the Klarman Family Foundation for Eating Disorders, the Howard Hughes Medical Institute, and an award from Merck.  I was funded by the Human Frontiers Science Program and the Alberta Heritage Foundation for Medical Research.

Thank you to the Canadian Press for highlighting the prize at home!

Friday, October 22, 2010

Number One on F1000!

The Faculty of 1000 (F1000) is a post-publication peer review service in which leading scientists rank publications in a variety of scientific fields. Our paper has the number one spot in Neuroscience! Fantastic.

Tuesday, October 19, 2010

Are The Effects of Stress and Anxiety Communicated Through the Germline to Offspring? -- Inconceivable!



Life is full of stress. Jobs are lost, divorces ensue, accidents happen…even wars and terrorism are a part of life for some. There is no doubt that these things impact tremendously on children, but could chronic stress in one generation really influence future descendants for generations to come? It seems daunting to imagine that life works that way or that natural selection gave rise to mechanisms that communicate stressful experiences to future generations. Nonetheless, the evidence is building and I highlight some surprises here.

There has been a major interest in the effects of stress on the development and behavior of offspring for a long time. The work of Michael Meaney and colleagues revealed that patterns of maternal behavior can influence the development of stress responses in rats. This effect establishes a transgenerational form of behavioral inheritance from mother to offspring (Francis et al. 1999). Further work by the Meaney and Szyf groups went on to establish evidence that the transgenerational effects of maternal behavior are correlated with changes in the expression of the glucocorticoid receptor (GCR) in the hippocampus of rats, such that rats exhibiting robust maternal behavior characterized by high licking and grooming of the pups, exhibited elevated GCR expression leading to a feedback effect that dampens the HPA axis response to stress. The increase in GCR expression is proposed to be downstream of increased serotonin signaling and is stabilized in offspring through an epigenetic program in which the methylation status of an alternative promoter for the GCR gene is reduced (Weaver 2004). The loss of this methyl mark is proposed to allow for a stable increase in GCR expression and dampened stress response in the offspring (Weaver 2004). In short, good mothers that groom and care for their pups will trigger an epigenetic program in the offspring that results in those offspring ultimately becoming good mothers themselves and having reduced stress and anxiety-like behaviors. Bad mothers that offer less care to the pups instill a different pattern of epigenetic marks on the DNA of their offspring, such that their pups grow up to be poor mothers and exhibit greater stress responses. Cross fostering of the pups breaks the cycle, revealing that it truly is the early life exposure to maternal behavior that instills this cycle of anxiety and poor mothering.

This work is a striking example of epigenetic inheritance, but has been met with skepticism (see Miller, Science 2010 and Buchen, Nature 2010). A recent study by these groups correlates such processes with early life stress in humans by examining GCR methylation in the brains of 12 suicide victims that were abused as children (McGowan 2009). This is a tough sell for this reader, given that methylation is highly variable in humans and genome wide association studies need thousands of people to detect anything meaningful in a study of this sort. In general, it is likely that the epigenetic story of this one gene is a small piece of a much larger biological picture that involves an epigenetic program that encompasses many genes and other adaptations involving synaptic plasticity, altered cell death, etc. Indeed, new studies are emerging that suggest this is the case. However, the Meaney studies are foundational in that they appear to bring together two disparate fields, namely molecular neuroscience and behavioral neuroscience.

-------------------------------------------------------------------------------------------------

Recently, the evidence for transgenerational effects of stress has become even more astounding. Two papers have suggested evidence for germline transmission of stress effects to offspring. Over the years our awareness of the transgenerational effects of stress has been growing (Matthews and Phillips, Endocrinology 2010). Children born from survivors of the Dutch famine during WWII exhibit elevated stress responses. Children born from mothers at the World Trade Center attacks have depressed cortisol responses, as do those born from Jewish Holocaust survivors. Most believe that these effects are the result of exposure to a "stressful" environment in the womb, and no doubt many effects are the result of changes to the uterine environment. However, two independent studies in mice have recently suggested that anxiety-like behaviors can be transmitted through the paternal germline, which may indicate that a stress-related epigenetic signature is carried on the DNA packaged in sperm. These studies appear to be remarkable examples of epigenetic transgenerational inheritance.

The first study is from Rene Hen's group at Columbia (Alter et al. Biol Psychiatry 2009) and is entitled: "Paternal Transmission of Complex Phenotypes in Inbred Mice"

This study used genetically identical BALB/cJ mice and found that males could be separated into those with high anxiety-like (HA) behaviors in an open field test and those with low anxiety-like (LA) behaviors. These males were then mated to genetically identical BALB/cJ female mice. Remarkably, correlation matrices and multiple regression models accounting for a variety of variables revealed that the anxiety-like behaviors of fathers were significantly associated with anxiety-like behaviors and hippocampal size of daughters, but with body weight in sons. The authors propose that these surprising and complex associations are suggestive of a nongenetic mode of inheritance through the male germline. The results are very striking. However, from my own experience with mouse genomics, these "genetically identical" inbred mouse strains still have many polymorphic sites drifting through the population. Imposing the behavioral categorization of HA vs LA may select for subsets of BALB/cJ male mice with sets of polymorphisms that influence anxiety-related behaviors, and these polymorphic sites may also influence phenotypes in offspring in a complex and sex specific manner. In this case, the transmission would be genetic. The underlying cause(s) of these bizarre effects are not yet clear. It would be nice to know what would happen if the same experiment were performed for mothers.

The second study is from Isabelle Mansuy's lab (Franklin et al. Biol Psychiatry 2010) and is entitled: "Epigenetic Transmission of the Impact of Early Stress Across Generations"

In this study the authors expose pups to chronic and unpredictable maternal separation, which is extremely stressful to the pups. Not surprisingly, these offspring exhibit depression-like behaviors as adults, but the depression/stress related phenotype was restricted to male mice (F1s). This sex bias effect of maternal separation in rodents was observed previously by others. Remarkably however, the offspring of these depressed males also exhibited depressed behavior, but this time the males were normal and the females were significantly affected (F2s). Finally, the normal behaving F2 males were mated to normal females, and despite the fact that the F2 males seemed normal, their offspring still exhibited altered depression-like phenotypes (F3s). However!!….this time it was only the male F3s that exhibited the phenotype, not the female F3s. Thus, a complex transgenerational epigenetic effect is proposed that is transmitted through the paternal germline and interacts with sex effects in offspring to produce stress-related behavioral phenotypes. Finally, the authors note some modest changes to the expression and/or methylation status of a few genes in F1 sperm and F2 brain. However, it seems unlikely that these small effects play a major role in the observed phenotype, thus the underlying mechanisms remain largely unknown. In summary, the study proposes that exposure to chronic stress during early postnatal development leaves an epigenetic signature that can be propagated through the germline to future offspring for at least 3 generations.

The results are hard to reconcile with our current mechanistic understanding of epigenetic programming in the genome. However, several studies that have found evidence for transgenerational epigenetic inheritance report that complex sex effects are associated with the observed phenotype. These complex effects seem to appear in studies of both humans and mice.

These findings are preliminary and require further investigation by others.  However, if this is found to be true, then there is new and fundamental biology to be uncovered. 

Saturday, September 25, 2010

The Long Epigenetic Shadow of A Genetic Mutation


The evidence that parents pass epigenetic information on to their offspring, and that this information influences offspring physiology and behavior, is continuing to grow. In the field of obesity research, it is now beyond doubt that maternal obesity, and paradoxically calorie restriction, programs the physiology of offspring such that they are extremely susceptible to developing metabolic syndrome. Some evidence has indicated that paternal obesity can also have similar effects, which separates out the in utero effects of maternal obesity and indicates that epigenetic programs are being passed on to the next generation through the germ cells.


Paternal germ line transmission of epigenetic programs has been understudied compared to maternal transmission. However, a particularly remarkable study on paternal transmission of metabolic state has recently been published in Human Molecular Genetics by Jo Nadeau and David Buchner (http://www.ncbi.nlm.nih.gov/pubmed/20696673) entitled:


Hum Mol Genet. 2010 Aug 18. [Epub ahead of print]

Ancestral paternal genotype controls body weight and food intake for multiple generations.

Yazbek SN, Spiezio SH, Nadeau JH, Buchner DA


There are several remarkable points that make this paper extremely interesting. The first is the high quality of the work. The sample sizes are large, many different crosses are tested and the readout was body weight, which is a very reliable and simple measure of effect. The authors build on a previous discovery of a mutation that causes resistance to obesity. They find that the obesity resistance phenotype is passed on to offspring even when the mutation itself is not inherited, thus indicating the existence of some heritable epigenetic effect. Astoundingly, the phenotype is only transmitted from fathers and cannot be inherited from mothers. Further, it can be transmitted for at least two generations through the paternal germline, which means that a mouse is obesity-resistant if its paternal grandfather had the mutation. This work clearly emphasizes the challenges ahead for understanding non-Mendelian diseases. We are the sum of many effects: genetic programs, environmental epigenetic programs, transgenerational epigenetic programs, stochastic effects, microbiome effects and lifestyle decisions.


Food for thought.

Tuesday, September 14, 2010

Article In New York Times

There is a very nice article in the New York Times today on our work. Here is the link:

Sunday, September 12, 2010

ManyMoon.com highlights the Gregg Lab!

I have been searching for a good way to manage projects and ideas within the lab. I especially want to find a way to motivate creativity and productivity. My belief is that much of what has worked in the IT industry could be used to improve innovation in academic labs (like blogging!). Most of the project management software I am aware of didn't look like a good fit. I am a big fan of Google Docs and found a very useful platform, called ManyMoon (http://www.manymoon.com/), that interfaces with Google Docs and functions as a personnel and project management platform. I have been using it to set up a structure for the lab and believe that it will work well in the long term with some tinkering....

The company recently highlighted the Gregg Lab (even though it doesn't quite exist yet) and the manner in which I am attempting to use their platform. Here is the link to the article:




Other ideas I am planning for the lab include:
1.) A large flatscreen monitor, which is being designed into the construction of the lab and will allow people to post ideas, papers, findings, art, pictures, etc. from their computers to communicate and promote innovation and interaction with others in the lab.

2.) A small budget (when grants allow) for individual innovation projects in which students/postdocs can begin to develop high-risk high-reward ideas on their own within a limited budget without fear of failure. Special lab meetings will be set aside for these projects in which we will discuss big picture problems and emerging technologies and ideas. The idea is inspired by Google, which allows some employees to spend a certain percentage of their time on their own private projects and ideas....this has led to many great breakthroughs for that company and I hope to experiment with it in different ways in an academic setting.

Saturday, September 4, 2010

Parents Rule The Brain

My postdoctoral studies from Catherine Dulac’s laboratory at Harvard have finally been published as two companion papers in the August 6th issue of Science [Gregg et al., Science 2010; Gregg et al., Science 2010b]. For those who would like some different perspectives on the findings, the work has been reviewed in Nature [Keverne 2010], Science [Wilkinson 2010], and Neuron [Shah 2010]. In this blog entry I wish to provide a summary of the findings for a general audience. I hope you find it interesting.

In many ways, adult health and behavior is a reflection of developmental processes and early life experiences. Uncovering the early life processes and mechanisms of inheritance that influence adult behavior and health is fundamental to our understanding and treatment of neurological and psychiatric diseases, as well as to broader social issues related to diet, parenting, education, and socioeconomic policies. When we think of infant and childhood development and early life experiences, we can’t help but think of the paramount roles that parents play. Parents influence our brain development and behavior in many ways and so substantially that they can set us on a course for life.

Often when we consider the impact of parents on their offspring we think in terms of either parental behavior or genetic inheritance. Recall that each of us inherits 23 chromosomes from Mum and 23 chromosomes from Dad (20 in the case of mice). Importantly, maternally and paternally inherited chromosomes are not functionally equivalent, due to heritable epigenetic marks established on the chromosomal DNA in the parental gametes, called genomic imprints. Genomic imprinting is thought to be rare in the genome, affecting ~100 genes in mice, and yet examples of parental effects upon gene expression, brain function and the behavior of offspring are growing in number and becoming increasingly mysterious. For example, experiments in which chimeric mice were generated from a mix of wildtype cells and parthenogenetic cells (PG, cells from embryos with two mothers and no father) or androgenetic cells (AG, cells from embryos with two fathers and no mother) revealed the preferential contribution of cells with a maternally-derived genome (PG cells) to cortical and limbic brain regions, whereas cells with a paternally-derived genome (AG cells) contributed strictly to hypothalamic regions. Cortical brain regions have executive functions in the brain (top-down control), while hypothalamic regions control primary drives (feeding, sleeping, sex, etc). From these striking observations, Barry Keverne proposed the idea that mothers and fathers differentially influence the evolution and function of the cortical versus the hypothalamic brain regions. Several other transgenerational parental effects have also been uncovered. In a study of genetically identical uniparental mice, a complex paternal transmission pattern of anxiety-related behaviors and growth effects was found that suggests epigenetic and sex-specific transgenerational effects. Similarly complex effects have been described in humans.
These studies all suggest that some information is passed through the germline from parents to offspring in a manner that is distinct from the genome sequence, and further, that this information can influence the behavior and physiology of the offspring. Scientists call this information “epigenetic”. My studies have worked to uncover forms of epigenetic regulation that are differentially inherited from mothers versus fathers, thereby resulting in genomic imprinting that influences gene expression in the brain.

“Epigenetics is the study of changes in phenotype (appearance) or gene expression caused by mechanisms other than changes in the underlying DNA sequence.”

One outstanding question in the field of genomic imprinting is why a mechanism that results in differential gene expression from maternally versus paternally inherited chromosomes would ever have evolved. The leading theoretical explanation is even more captivating than the phenomenon itself and is called the Parental Conflict Theory (or Kinship Theory). Imprinting has been identified in placental mammals and flowering plants only. What is unique about these species is the preferential investment of maternal resources in the growth, development and rearing of the offspring. In mammals, a second important point is that monogamy is extremely rare, and therefore, fathers not only invest comparatively few resources in their offspring, but they also can’t guarantee that they will father their partner’s future offspring. David Haig’s Parental Conflict Theory proposes that a conflict results from this situation, such that fathers effectively compete to have their offspring consume the maximum amount of maternal resources possible, even at the expense of the mother’s long term survival. To counter this genetic arms race, mothers evolve mechanisms that reduce offspring consumption of maternal resources, so they can distribute those resources to future litters. Thus, unique maternal and paternal gene expression programs evolve through this conflict to function antagonistically. Currently, there is good evidence to support this theory for many, but not all cases of imprinting.

“The Kinship Theory for the evolution of genomic imprinting proposes that maternal and paternal gene expression programs are in conflict and function antagonistically to each other.”

Genomic imprinting has been clearly linked to human brain function and behavior through studies of Prader-Willi Syndrome (PWS) and Angelman Syndrome (AS), which result from a paternally or maternally inherited deletion of an imprinted gene cluster on chromosome 15, respectively. PWS is associated with hyperphagia, stubbornness and compulsive traits, while AS is associated with absent speech, happy affect and inappropriate laughter. Importantly, the only difference between PWS and AS at the level of the genome is that a mutation in a cluster of imprinted genes on chromosome 15 is inherited from the father, in the case of PWS, or the mother, in the case of AS. Recently, it has also been proposed that imprinted genes play a role in schizophrenia and autism. This theory, proposed by Crespi and Badcock, suggests that autism spectrum disorders are due to an imbalance in maternal and paternal gene expression programs in the brain such that the balance is too paternal. The same theory proposes that schizophrenia is an imbalance in the opposite direction, such that gene expression is too maternal. It is not yet clear whether these ideas are correct. Finally, disruptions in imprinting have also been identified in several forms of cancer, such as Wilms’ tumor and colorectal cancer. In sum, parental effects associated with imprinting influence the behavior and physiology of offspring and contribute to human disease, suggesting an underlying biology and epigenetic mode of inheritance that is clearly important, but poorly understood.

My recently published companion studies, performed with collaborators at Harvard, are focused on understanding the nature and functions of paternal and maternal gene expression programs in the developing and adult brain. We initially mapped the expression pattern of 45 previously known imprinted genes across 118 adult brain regions. This study identified neural systems that are enriched for the expression of imprinted genes. We found that imprinted genes are preferentially expressed by the major monoaminergic nuclei of the brain (serotonin, dopamine, or noradrenaline expressing neurons). These areas of the brain have been implicated in a wide range of neurological and psychiatric disorders, including major depression, eating disorders, schizophrenia, and autism spectrum disorders. In addition, nuclei involved in feeding behavior, such as the arcuate nucleus, and social behaviors, such as the preoptic area, were also enriched for imprinted gene expression. This initial set of observations suggested that imprinted genes regulate neural systems of major interest to neuroscience, and prompted us to develop a genome-wide approach to study genomic imprinting in specific brain regions at different developmental stages.

“We found that imprinted genes are preferentially expressed in serotonergic, dopaminergic and noradrenergic brain nuclei. These are regions of the brain implicated in numerous psychiatric disorders and believed to be the sites of action for several anti-depressant drugs (e.g., SSRIs).”

Our approach utilizes next generation sequencing technology. This was a new technology that had just emerged when I began my postdoc in 2006 and, in collaboration with the company making the sequencing technology (Illumina Inc.), we were one of the first groups to begin using the system at Harvard. It allows one to sequence genome information with an incredibly high throughput. We sequenced the mRNA (rather than the genomic DNA) of RNA samples harvested from crosses of two distantly related subspecies of mice, called CAST/EiJ and C57BL/6J. Our first step was to sequence the entire transcriptome (all the mRNA) of the individual CAST/EiJ and C57BL/6J mice to identify all coding single nucleotide polymorphisms (SNPs, single base differences in the genome sequence) that distinguish the genes of the two strains. We then performed RNA-Seq on specific brain regions of F1 hybrid offspring generated by reciprocal crosses of CAST/EiJ and C57BL/6J mothers and fathers. I could then distinguish gene expression levels from maternally versus paternally inherited alleles (gene copies) using the SNP base call information we had first generated.
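
The allele-assignment step described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the study: the function names, the data layout and the read counts are all hypothetical, and it assumes reads have already been aligned and counted per strain-specific allele at each SNP.

```python
from collections import defaultdict

# Hypothetical read counts at strain-distinguishing SNPs (made-up numbers).
# Cross labels name the mother first: "CASTxB6" = CAST/EiJ mother, C57BL/6J father.
# Each entry: (gene, cross, reads with the CAST allele, reads with the B6 allele)
EXAMPLE_COUNTS = [
    ("GeneA", "CASTxB6", 20, 80),
    ("GeneA", "B6xCAST", 75, 25),
    ("GeneB", "CASTxB6", 50, 50),
    ("GeneB", "B6xCAST", 48, 52),
]

def parental_counts(records):
    """Sum maternal and paternal allele reads per gene across both reciprocal crosses."""
    totals = defaultdict(lambda: {"maternal": 0, "paternal": 0})
    for gene, cross, cast_reads, b6_reads in records:
        if cross == "CASTxB6":      # mother is CAST, so CAST reads are maternal
            maternal, paternal = cast_reads, b6_reads
        elif cross == "B6xCAST":    # mother is B6, so B6 reads are maternal
            maternal, paternal = b6_reads, cast_reads
        else:
            raise ValueError(f"unknown cross: {cross}")
        totals[gene]["maternal"] += maternal
        totals[gene]["paternal"] += paternal
    return dict(totals)

def paternal_fraction(counts):
    """Fraction of allele-informative reads that come from the paternal allele."""
    total = counts["maternal"] + counts["paternal"]
    return counts["paternal"] / total if total else float("nan")

if __name__ == "__main__":
    for gene, counts in parental_counts(EXAMPLE_COUNTS).items():
        print(gene, counts, round(paternal_fraction(counts), 3))
```

The reason for using both reciprocal crosses is that a gene with a simple strain bias would favour the same strain's allele in both crosses, whereas a parent-of-origin effect (like the made-up GeneA here) follows the mother or the father regardless of strain.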

Inspired by the chimera studies described above, which suggested preferential maternal control over cortical regions and preferential paternal control over hypothalamic regions, we compared parent specific gene expression programs in the adult medial prefrontal cortex (mPFC) and the preoptic area (POA) of the hypothalamus. The number of genes subject to parental effects that significantly altered maternal or paternal allele-specific expression in these regions is greater than expected (~372 genes) and involves complex isoform-specific parental effects. However, we did not find evidence for biased maternal control over the cortex. Instead, we found that in both the mPFC and POA, ~70% of autosomal genes (autosomes are all the chromosomes that are not X or Y chromosomes) exhibiting parental effects preferentially express the paternal allele. Thus, instead of mums controlling the cortex and dads controlling the hypothalamus, we found that dads appear to have strongly biased control over both regions. We do not currently know if there are any regions of the adult brain that mothers have biased control over. Interestingly, an analysis of gene expression on the X chromosome in females revealed preferential expression of the maternally inherited X chromosome in the adult female brain, especially in the cortex. This result was further confirmed with a transgenic approach. In males, the X is strictly maternally derived, and since females have two X chromosomes while males have one, the X spends 2/3 of its time in a female body. David Haig has therefore proposed that evolution will select for genes and mutations on the X that favour maternal interests. Remarkably, we know from previous work that the X chromosome has evolved a preferential role in the regulation of the brain and many forms of mental retardation are associated with X-linked genes.
We therefore speculate that the autosomes and X chromosome give rise to paternal and maternal gene expression programs, respectively, which influence adult brain function and behavior.
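
For readers who want to check the "2/3 of its time in a female body" figure, here is a small worked calculation in Python. It is a back-of-the-envelope sketch that assumes an equal number of males and females, two X copies per female and one per male; the function name is my own.

```python
from fractions import Fraction

def x_fraction_in_females(female_share=Fraction(1, 2)):
    """Fraction of all X chromosome copies in a population that sit in females.

    Assumes each female carries two X copies and each male carries one;
    female_share is the proportion of individuals that are female.
    """
    x_in_females = 2 * female_share
    x_in_males = 1 * (1 - female_share)
    return x_in_females / (x_in_females + x_in_males)

print(x_fraction_in_females())  # 2/3
```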

The parental effects in the developing fetal brain differed from those found in the adult. We found ~553 genes subject to parental effects in the fetal brain, compared to 257 in the adult POA and 153 in the adult mPFC. Thus, parent specific gene expression programs dominate during development of the brain, rather than in the adult brain. Further, rather than a paternal expression bias, 61% of the imprinted genes in the developing brain exhibited preferential expression of the maternal allele. These results reveal maternal effects that are specifically associated with brain development.

“Our data suggests mothers have biased control over gene expression in the developing fetal brain. However, in the adult brain, fathers have biased control over autosomal gene expression, while the X chromosome appears to function as the nexus of maternal control.”

Finally, we analyzed males and females separately and uncovered evidence for sex specific parental effects. An important example is interleukin 18 (Il18), which exhibits a parental allele expression bias in the female, but not male, mPFC. Il18 has been linked to multiple sclerosis, a sexually dimorphic neurological disease that predominates in women and is associated with maternal parent-of-origin effects. In the POA of the hypothalamus, we also noted that females have 3 times the number of genes subject to sex specific parental effects as males. The POA plays a central role in regulating maternal behavior. Given that maternal behavior alone impacts offspring brain development and behavior, this result suggests a remarkable convergence of parental influences.

Our studies of parent-specific gene expression programs in the CNS suggest surprising and complex modes of parental influence over brain development and adult brain function in offspring. What are the mechanisms that regulate these effects? What are the functions of maternal and paternal gene expression programs in the brain? Do parental influences on gene expression adapt to environmental pressures? In what ways do imprinted genes influence the behavior and physiology of offspring? What is the nature of genomic imprinting in humans? Do maternal and/or paternal gene expression programs play a role in human diseases and disorders? These questions set a course for an exciting frontier and may shed new light on our understanding of brain evolution, function and disease.

Thank you for reading about my work. Please contact me with any questions through this blog site.

Friday, September 3, 2010

Notes On Experimental Design


The goal of any experiment is to address a specific question such that the results will be reproducible and serve as a solid foundation for further thoughts and experimental work. Solid experimental design and statistical analysis are essential for one to be able to draw correct conclusions and generate reproducible work. However, academic science programs almost never formally teach the foundations of experimental design and statistical analysis. This is incredible, really, but I have yet to see it in my 11 years in science. Experimental design and interpretation are an art form and a deeply philosophical issue, but over the years people have worked out standards and fundamentals that need to be considered. I present some resources for these issues in this post.





This is a link to a well-reviewed web article on experimental design by Sid Sytsma:


http://liutaiomottola.com/myth/expdesig.html



Here I also highlight two recent studies that serve as thoughtful reminders of the basics and issues of experimental design.



The first idea of interest is explained in two papers in Nature Methods by Richter et al., who argue that systematic variation improves the reproducibility of experiments compared to traditional, highly controlled standardization:


Nat Methods. 2009 Apr;6(4):257-61.

Environmental standardization: cure or cause of poor reproducibility in animal experiments?

Richter SH, Garner JP, Würbel H.

Justus-Liebig-University of Giessen, Germany.

Abstract

It is widely believed that environmental standardization is the best way to guarantee reproducible results in animal experiments. However, mounting evidence indicates that even subtle differences in laboratory or test conditions can lead to conflicting test outcomes. Because experimental treatments may interact with environmental conditions, experiments conducted under highly standardized conditions may reveal local 'truths' with little external validity. We review this hypothesis here and present a proof of principle based on data from a multilaboratory study on behavioral differences between inbred mouse strains. Our findings suggest that environmental standardization is a cause of, rather than a cure for, poor reproducibility of experimental outcomes. Environmental standardization can contribute to spurious and conflicting findings in the literature and unnecessary animal use. This conclusion calls for research into practicable and effective ways of systematic environmental heterogenization to attenuate these scientific, economic and ethical costs.

Nat Methods. 2010 Mar;7(3):167-8.

Systematic variation improves reproducibility of animal experiments.

Richter SH, Garner JP, Auer C, Kunert J, Würbel H.

Behavioural Biology, University of Münster, Münster, Germany.



The second paper by Auer and Doerge in Genetics emphasizes the importance of sampling, randomization, replication and blocking in experimental design. They deal with RNA-seq experiments specifically, but the issues are broadly relevant.



Genetics. 2010 Jun;185(2):405-16. Epub 2010 May 3.

Statistical design and analysis of RNA sequencing data.

Auer PL, Doerge RW.

Department of Statistics, Purdue University, West Lafayette, Indiana 47907, USA.

Abstract

Next-generation sequencing technologies are quickly becoming the preferred approach for characterizing and quantifying entire genomes. Even though data produced from these technologies are proving to be the most informative of any thus far, very little attention has been paid to fundamental design aspects of data collection and analysis, namely sampling, randomization, replication, and blocking. We discuss these concepts in an RNA sequencing framework. Using simulations we demonstrate the benefits of collecting replicated RNA sequencing data according to well known statistical designs that partition the sources of biological and technical variation. Examples of these designs and their corresponding models are presented with the goal of testing differential expression.


These are useful references for individuals in the process of designing experiments.
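As a small illustration of the randomization, replication and blocking concepts that Auer and Doerge emphasize, here is a minimal Python sketch of a randomized block layout. It is not taken from their paper. The function name, the litter labels and the sample IDs are all hypothetical, and the sketch assumes each block holds a multiple of the number of treatments so that every treatment is replicated equally inside every block.

```python
import random


def randomized_block_assignment(blocks, treatments, seed=None):
    """Randomly assign treatments to samples within each block.

    blocks: dict mapping a block label (for example a litter, a cage
            or a sequencing lane) to the list of sample IDs in it.
    treatments: list of treatment labels.
    Returns a dict mapping sample ID -> (block label, treatment).
    """
    rng = random.Random(seed)  # seeded so the layout can be reproduced
    assignment = {}
    for block, samples in blocks.items():
        if len(samples) % len(treatments) != 0:
            raise ValueError(f"block {block!r} cannot be balanced")
        replicates = len(samples) // len(treatments)
        labels = list(treatments) * replicates  # equal replication per block
        rng.shuffle(labels)                     # randomize within the block
        for sample, label in zip(samples, labels):
            assignment[sample] = (block, label)
    return assignment


# Hypothetical example: three litters (blocks) of four mice each.
blocks = {
    "litter_A": ["A1", "A2", "A3", "A4"],
    "litter_B": ["B1", "B2", "B3", "B4"],
    "litter_C": ["C1", "C2", "C3", "C4"],
}
treatments = ["control", "treated"]
design = randomized_block_assignment(blocks, treatments, seed=1)
for sample in sorted(design):
    print(sample, *design[sample])
```

Because every litter contributes the same number of control and treated animals, litter-to-litter differences cannot be confused with the treatment effect, and the shuffle keeps the experimenter from choosing which animal gets which treatment.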

Friday, July 23, 2010

How to Discover


For myself and my future trainees, I am using this blog entry to reflect on the principal components of the path to discovery in bench science. We have a limited lifetime to discover things and make an impact in our careers, so it is worth thinking carefully about the questions and directions one chooses to focus on. Discovery is at its best when it arrives in the form of an unexpected observation, but there is an art to discovery and a correct approach that will position an individual to make unexpected observations. Intertwined with that approach is the right attitude and the art of phrasing an interesting question. I want to comment on both aspects of discovery from my experiences. It is important to note that discovery leads to more problems/questions, as much as it does to solutions. An important part of the scientific process is solving problems and I think of this as separate from discovery. A full scientific research program in a lab balances a spectrum of projects focused on discovery, problem solving, and the generation of new technology within a focused area of expertise. Research projects within the lab should complement each other to promote synergy, rather than dilute expertise and interest across too many poorly related questions. A central bread-and-butter theme is needed to unite the lab environment, especially at early stages.

I start with an outline of the scientific process and end with some personal perspectives.

The Scientific Process
The scientific process is a series of deductive and inductive steps (see Nisbet, Elder and Miner. Handbook of Statistical Analysis and Data Mining Applications, 2009).

1. Define the problem and central question to be answered
2. Gather existing information about the phenomenon
3. Form one or more hypotheses
4. Collect new experimental data
5. Analyze the information in the new data set
6. Interpret the results.
7. Synthesize conclusions, based on the old data, new data, and intuition.
8. Form new hypotheses for further testing
9. Do it again.

Before beginning a major new body of work, it is useful to reflect on aspects of this process in the design of your project. Consider the following checklist:

The Experimenter's Checklist

1. What is the question you are asking? Refine this question so it is clear and simple.
2. What have others found with regard to this question? On a scale of 1 to 10, what are the opportunities for discovery? Is this a crowded, old field?
3. How can you improve your question to make it more interesting? To ask something that has not been asked previously?
4. How does your question fit into a story or conceptual framework? What stories could you tell? Scientific publishing is about the story. This is simply how human beings understand information.
5. What is interesting or important about your question? Where could it take you in your dreams? Rank it on a scale of 1 to 10.
6. Can you optimize your question to maximize your chances of discovery?
7. What are the different approaches you could take to address the question? Balance Risk vs Reward in the decision for the approach.
8. What controls will you need to interpret your experimental results?
9. How will you analyze your results? What statistical approaches will you use? Think of this beforehand, so you don't forget the controls you will need to interpret the data.
10. What preliminary tests could you do to optimize technical aspects of your approach to improve the quality of your data? (Always take the time to optimize so that your results will be clear and interpretable. Do not base conclusions on weak results!)
11. Write out each of the steps in your experiment, so you don't need to think while you are doing it!
12. What will contribute to variability in your experiment? How can you control that variability?
13. What will the final figure look like that will present your results? What are the possible comparisons and different ways of looking at the data you will get?
14. What degree of effect do you expect in your study? Do you expect a lot of variability? How many replicates will you need to overcome that variability and detect a real effect?
15. Do you have all the reagents and/or samples (animals) you need for your experiment? Figure this out ahead of time.
16. Plan everything out in a day planner schedule to determine how long it will take and when certain milestones will be achieved.
17. Do not underpower your study. Plan for an appropriately high n, so you can make solid conclusions at the end.
18. Order everything you need and FINISH your experiment! A well conceived experiment is always worth finishing to the END! (Always finish your experiments, even if you lose heart halfway through.)
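Items 14 and 17 of the checklist ask how many replicates are needed to detect an effect of a given size. A rough way to get a first estimate is the standard normal-approximation formula for comparing the means of two groups. The sketch below is only an illustration: the function name and default values are my own assumptions, the effect size is expressed as Cohen's d (the difference in means divided by the standard deviation), and an exact t-test based calculation will ask for slightly more samples at small n.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided comparison of two means.

    effect_size: expected Cohen's d (mean difference / standard deviation).
    alpha: acceptable false positive rate.
    power: desired probability of detecting a real effect of that size.
    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)  # round up: you cannot run a fraction of an animal


# Smaller expected effects (or more variability) demand many more replicates.
for d in (0.5, 0.8, 1.0):
    print(f"d = {d}: about {samples_per_group(d)} samples per group")
```

The useful part is the shape of the relationship: halving the expected effect size roughly quadruples the n you need, which is why it pays to estimate variability from pilot data before committing to a design.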

Points to Consider

1. When you think about your experiment from a technical perspective, think about efficiency. At each step you capture a certain percentage of the effect and with each step you will introduce a certain amount of noise. How do you optimize your signal to noise? How big of an effect are you seeking?
2. Make sure your initial results are truly rock SOLID. These are the foundations of your project and if they are flimsy, you will reach a stage where you will feel pressure to bend your data to fit those early observations so you can get that paper together under extreme stress in 3-4 years' time. The pressure is coming... make sure the foundations of your study, your first experiments, are rock solid, so you are set on the right path from the beginning. BE PATIENT if your observations are ambiguous and keep working to NAIL THAT RESULT!
3. Thinking about the final figure that will represent your data is essential at the very beginning, so you know where you are headed. Design your figures mentally before and during an experiment. (I constantly pencil out figures to think about results.)
4. FINISH YOUR EXPERIMENT AND IMMEDIATELY MAKE A PUBLICATION-QUALITY FIGURE. This ensures you are making progress and on top of your data, even if the results don't make sense at that time. They may make perfect sense years later!
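The first point above, that each step captures only part of the effect and adds some noise, can be made concrete with a toy calculation. This is a deliberately simplified sketch under assumptions of my own: each step multiplies both the signal and the noise already present by an efficiency between 0 and 1, and then adds independent noise whose variance simply adds to the total. The function name and all numbers are hypothetical.

```python
import math


def signal_to_noise(initial_signal, steps):
    """Signal-to-noise ratio after a chain of experimental steps.

    steps: list of (efficiency, noise_sd) tuples, one per step.
      efficiency: fraction of the incoming signal the step retains.
      noise_sd: standard deviation of the independent noise it adds.
    """
    signal = initial_signal
    variance = 0.0
    for efficiency, noise_sd in steps:
        signal *= efficiency
        # Existing noise is scaled along with the signal, new noise adds on.
        variance = variance * efficiency ** 2 + noise_sd ** 2
    if variance == 0:
        return math.inf
    return signal / math.sqrt(variance)


# Hypothetical three-step protocol: (efficiency, noise added at that step).
protocol = [(0.9, 0.1), (0.8, 0.2), (0.5, 0.3)]
print(round(signal_to_noise(10.0, protocol), 2))

# Optimizing only the weakest step shows where the effort pays off.
optimized = [(0.9, 0.1), (0.8, 0.2), (0.8, 0.1)]
print(round(signal_to_noise(10.0, optimized), 2))
```

Even in this crude model, losses compound multiplicatively across steps, so the least efficient or noisiest step tends to dominate the final result and is usually the first one worth optimizing.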

Final Thoughts:

The Power of Characterizing Biological Systems
Science is a balance of risk and reward. The existing funding system forces you to balance your risk carefully. You must show productivity and at the same time push into new frontiers. Though recipes for discovery are limiting in themselves, I tend to favor characterization projects to get going. Simple approaches that involve just "looking and learning", like observing a behavior pattern or staining for several marker proteins to visualize the organization of a neural circuit, can expose lots of new questions. Mouse genetic approaches to characterizing a system can also give great and elegant insights, but carry more risk and time investment. An unbiased characterization and careful systematic analysis of the results will foster ideas. Look for new technical approaches to characterizing the system associated with your question. A new angle can change everything!! Always think of new technical approaches to ask questions in ways that could not be done before; you will always learn something. Characterization projects lead to papers and useful knowledge with relatively little risk and they set you up for discovery. Everyone should have a component of their research that simply involves looking and describing.

How to look:
1. Make an unbiased list of the features you are seeing in your data. For example, if you are looking at the expression pattern of several genes of interest, where are they expressed? In what types of cells? Where are the cells? What do we know about their functions?
2. Design some specific questions (or hypotheses) from your initial observations. Make a list of several different ideas and questions and use your gut to judge the best place to start.
3. Think of methods to rigorously quantitate and statistically analyze your characterization of the system. For example, quantitate cell numbers, measure dendritic projection patterns, monitor feeding patterns, etc.


Balance Characterization Projects with Innovation Projects
Characterization projects generate new knowledge and I think it is important to distinguish knowledge generation from innovation. Innovation involves solving a problem for the first time or in some novel manner, or following a crazy idea to see where it might lead. Innovation is about following your gut and trying a high risk, high reward idea. You must be comfortable with, and accept, regular failure on the road to innovation, and that is why it is important to balance your research program by having two classes of projects. In the optimal circumstances, characterization projects and innovation projects complement each other.


Build A Discovery Niche
Big discoveries are made by:
(1) Doing something others cannot do, because they don't have access to the knowledge, resources and tools necessary - i.e., build new tools/resources that only you have and learn new fields whenever possible.
(2) Doing something others won't do, because it is very difficult - there is no replacement for hard work
(3) Fortunate insight or chance discovery that is capitalized on (the prepared mind!) - pay attention, do carefully controlled experiments, think deeply about your results
(4) Taking risks. You must take some risks in your career and recognize that the path you start down is rarely headed where you think it is...hold weak opinions.
(5) Reading and talking. You must learn broadly in order to understand the impact of your results and observations and connections to other fields. Something that seems mundane might be huge when cast in the right light.

Think Differently
(1) Characterization projects are powerful ways to develop hypotheses and break new ground, but you must use them strategically. Think differently. Look for untracked territory and ways to bring together different fields.
(2) If you are uncomfortable and unsure of where your work is leading, but you find it very interesting....that is normal and that is life on the front of innovation. Just keep asking good questions.
(3) The genius is in the details. Think carefully about the what, why, where and when details of your observations.

MOST IMPORTANTLY
JUST TRY! JUST KEEP TRYING! Never fear failure or a new technique. Dust yourself off and try again. Science is mostly about just trying....you often won't know if you are on to a good thing until late in the game.