7.1.4 Developing and Evaluating Hypotheses

Developing Hypotheses

After interviewing affected individuals, gathering data to characterize the outbreak by time, place, and person, and consulting with other health officials, a disease detective will have more focused hypotheses about the source of the disease, its mode of transmission, and the exposures which cause the disease. Hypotheses should be stated in a manner that can be tested.

Hypotheses are developed in a variety of ways. First, consider the known epidemiology for the disease: What is the agent's usual reservoir? How is it usually transmitted? What are the known risk factors? Consider all the 'usual suspects.'

Open-ended conversations with those who fell ill, or even visits to their homes to look for clues in refrigerators and on shelves, can be helpful. If the epidemic curve points to a short period of exposure, ask what events occurred around that time. If people living in a particular area have the highest attack rates, or if groups with a particular age, sex, or other personal characteristic are at greatest risk, ask why. Such questions about the data should lead to hypotheses that can be tested.

Evaluating Hypotheses

There are two approaches to evaluating hypotheses: comparing the hypotheses with the established facts, and analytic epidemiology, which allows hypotheses to be tested formally.

A comparison with established facts is useful when the evidence is so strong that the hypothesis does not need to be tested. A 1991 investigation of an outbreak of vitamin D intoxication in Massachusetts is a good example. All of the people affected drank milk delivered to their homes by a local dairy. Investigators hypothesized that the dairy was the source, and the milk was the vehicle of excess vitamin D. When they visited the dairy, they quickly recognized that far more than the recommended dose of vitamin D was inadvertently being added to the milk. No further analysis was necessary.

Analytic epidemiology is used when the cause is less clear. Hypotheses are tested using a comparison group to quantify the relationships between various exposures and the disease. Case-control studies, and occasionally cohort studies, are useful for this purpose.

Case-control studies

As you recall from last week's lesson, in a case-control study case-patients and controls are asked about their exposures. An odds ratio is calculated to quantify the relationship between exposure and disease.
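For illustration, the odds ratio from a 2x2 exposure table can be computed in a few lines. This is a minimal sketch; the counts below are hypothetical, not from any real outbreak:

```python
# Hypothetical 2x2 table from a case-control study:
#                 exposed   unexposed
#   case-patients    30         10
#   controls         20         40
cases_exposed, cases_unexposed = 30, 10
controls_exposed, controls_unexposed = 20, 40

# Odds ratio = (a * d) / (b * c) for the table [[a, b], [c, d]]
odds_ratio = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)
print(f"odds ratio = {odds_ratio:.1f}")  # 6.0
```

An odds ratio well above 1 suggests the exposure is associated with being a case; an odds ratio near 1 suggests no association.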

In general, the more case-patients (and controls) you have, the easier it is to find an association. Often, however, an outbreak is small; for example, 4 or 5 cases may constitute an outbreak. An adequate number of potential controls is usually easier to locate. In an outbreak of 50 or more cases, 1 control per case-patient will usually suffice. In smaller outbreaks, you might use 2, 3, or 4 controls per case-patient. More than 4 controls per case-patient are rarely worth the effort, because study power increases very little beyond that point (we will talk more about power and sample size in epidemiologic studies later in this course!).

Testing statistical significance

The final step in testing a hypothesis is to determine how likely it is that the study results could have occurred by chance alone. In other words, is the exposure that the study identifies as the likely source of the outbreak truly related to the disease? The significance of the odds ratio can be assessed with a chi-square test. We will also discuss statistical tests that control for many possible factors later in the course.
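As a sketch of how such a test works, the Pearson chi-square statistic for a 2x2 table has a closed form, and for one degree of freedom its p-value can be obtained from the complementary error function. The counts below are hypothetical:

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for the 2x2 table [[a, b], [c, d]]
    (rows: cases/controls, columns: exposed/unexposed)."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, the survival function is P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

stat, p = chi_square_2x2(30, 10, 20, 40)
print(f"chi-square = {stat:.2f}, p = {p:.5f}")  # p well below 0.05
```

With real outbreak data you would usually reach for a statistics library, but the arithmetic itself is this simple.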

Cohort studies

If the outbreak occurs in a small, well-defined population, a cohort study may be possible. For example, if an outbreak of gastroenteritis occurs among people who attended a particular social function, such as a banquet, and a complete list of guests is available, it is possible to ask each attendee the same set of questions about potential exposures and whether he or she became ill with gastroenteritis.

After collecting this information from each guest, an attack rate can be calculated for people who ate a particular item (were exposed) and an attack rate for those who did not eat that item (were not exposed). For the exposed group, the attack rate is found by dividing the number of people who ate the item and became ill by the total number of people who ate that item. For those who were not exposed, the attack rate is found by dividing the number of people who did not eat the item but still became ill by the total number of people who did not eat that item.

To identify the source of the outbreak from this information, you would look for an item with:

  • a high attack rate among those exposed and
  • a low attack rate among those not exposed (so that the difference or ratio between the attack rates for the two exposure groups is high); in addition,
  • most of the people who became ill should have consumed the item, so that the exposure can explain most, if not all, of the cases.
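The attack-rate comparison for a single food item can be sketched as follows (all counts are hypothetical):

```python
def attack_rates(ill_exposed, well_exposed, ill_unexposed, well_unexposed):
    """Attack rate in each exposure group and the risk ratio between them."""
    ar_exposed = ill_exposed / (ill_exposed + well_exposed)
    ar_unexposed = ill_unexposed / (ill_unexposed + well_unexposed)
    return ar_exposed, ar_unexposed, ar_exposed / ar_unexposed

# Hypothetical banquet data: 40 guests ate the item (30 became ill);
# 35 guests did not eat it (3 became ill).
ar_e, ar_u, risk_ratio = attack_rates(30, 10, 3, 32)
print(f"attack rate (ate item)     = {ar_e:.0%}")   # 75%
print(f"attack rate (did not eat)  = {ar_u:.1%}")   # 8.6%
print(f"risk ratio = {risk_ratio:.1f}")             # ~8.8
```

Here the item would also explain most cases (30 of the 33 ill guests ate it), satisfying the third criterion above.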

We will learn more about cohort studies in Week 9 of this course.


Open access | Published: 06 November 2020

Epidemiological hypothesis testing using a phylogeographic and phylodynamic framework

  • Simon Dellicour (ORCID: orcid.org/0000-0001-9558-1052) 1,2
  • Sebastian Lequime (ORCID: orcid.org/0000-0002-3140-0651) 2
  • Bram Vrancken (ORCID: orcid.org/0000-0001-6547-5283) 2
  • Mandev S. Gill 2
  • Paul Bastide (ORCID: orcid.org/0000-0002-8084-9893) 2
  • Karthik Gangavarapu 3
  • Nathaniel L. Matteson 3
  • Yi Tan 4,5
  • Louis du Plessis (ORCID: orcid.org/0000-0003-0352-6289) 6
  • Alexander A. Fisher 7
  • Martha I. Nelson 8
  • Marius Gilbert 1
  • Marc A. Suchard (ORCID: orcid.org/0000-0001-9818-479X) 7,9,10
  • Kristian G. Andersen 3,11
  • Nathan D. Grubaugh 12
  • Oliver G. Pybus (ORCID: orcid.org/0000-0002-8797-2667) 6
  • Philippe Lemey (ORCID: orcid.org/0000-0003-2826-5353) 2

Nature Communications, volume 11, Article number: 5620 (2020)


  • Ecological epidemiology
  • Molecular ecology
  • Phylogenetics
  • Viral epidemiology
  • West Nile virus

Abstract

Computational analyses of pathogen genomes are increasingly used to unravel the dispersal history and transmission dynamics of epidemics. Here, we show how to go beyond historical reconstructions and use spatially-explicit phylogeographic and phylodynamic approaches to formally test epidemiological hypotheses. We illustrate our approach by focusing on the West Nile virus (WNV) spread in North America that has substantially impacted public, veterinary, and wildlife health. We apply an analytical workflow to a comprehensive WNV genome collection to test the impact of environmental factors on the dispersal of viral lineages and on viral population genetic diversity through time. We find that WNV lineages tend to disperse faster in areas with higher temperatures and we identify temporal variation in temperature as a main predictor of viral genetic diversity through time. By contrasting inference with simulation, we find no evidence for viral lineages to preferentially circulate within the same migratory bird flyway, suggesting a substantial role for non-migratory birds or mosquito dispersal along the longitudinal gradient.


Introduction

The evolutionary analysis of rapidly evolving pathogens, particularly RNA viruses, allows us to establish the epidemiological relatedness of cases through time and space. Such transmission information can be difficult to uncover using classical epidemiological approaches. The development of spatially explicit phylogeographic models [1,2], which place time-referenced phylogenies in a geographical context, can provide a detailed spatio-temporal picture of the dispersal history of viral lineages [3]. These spatially explicit reconstructions frequently serve illustrative or descriptive purposes, and remain underused for testing epidemiological hypotheses in a quantitative fashion. However, recent methodological advances offer the ability to analyse the impact of underlying factors on the dispersal dynamics of virus lineages [4,5,6], giving rise to the concept of landscape phylogeography [7]. Similar improvements have been made to phylodynamic analyses that use flexible coalescent models to reconstruct virus demographic history [8,9]; these methods can now provide insights into epidemiological or environmental variables that might be associated with population size change [10].

In this study, we focus on the spread of West Nile virus (WNV) across North America, which has considerably impacted public, veterinary, and wildlife health [11]. WNV is the most widely distributed encephalitic flavivirus transmitted by the bite of infected mosquitoes [12,13]. It is a single-stranded RNA virus maintained by an enzootic transmission cycle primarily involving Culex mosquitoes and birds [14,15,16,17]. Humans are incidental terminal hosts, because viremia does not reach a level sufficient for subsequent transmission to mosquitoes [17,18]. WNV human infections are mostly subclinical, although symptoms may range from fever to meningoencephalitis and can occasionally lead to death [17,19]. It has been estimated that only 20–25% of infected people become symptomatic, and that fewer than 1 in 150 develop neuroinvasive disease [20]. The WNV epidemic in North America likely resulted from a single introduction to the continent 20 years ago [21]. Its persistence is likely not the result of successive reintroductions from outside the hemisphere, but rather of local overwintering and the maintenance of long-term avian and/or mosquito transmission cycles [11]. Overwintering could also be facilitated by vertical transmission of WNV from infected female mosquitoes to their offspring [22,23,24]. WNV represents one of the most important vector-borne diseases in North America [15]; an estimated 7 million human infections have occurred in the U.S. [25], with 24,657 reported neuroinvasive cases between 1999 and 2018 leading to 2,199 deaths (www.cdc.gov/westnile). In addition, WNV has had a notable impact on North American bird populations [26,27], with several species [28], such as the American crow (Corvus brachyrhynchos), being particularly severely affected.

Since the beginning of the epidemic in North America in 1999 [21], WNV has received considerable attention from local and national health institutions and the scientific community. This has led to the sequencing of >2000 complete viral genomes collected at various times and locations across the continent. The resulting availability of virus genetic data represents a unique opportunity to better understand the evolutionary history of the WNV invasion into an originally non-endemic area. Here, we take advantage of these genomic data to address epidemiological questions that are challenging to tackle with non-molecular approaches.

The overall goal of this study is to go beyond historical reconstructions and formally test epidemiological hypotheses by exploiting phylodynamic and spatially explicit phylogeographic models. We detail and apply an analytical workflow that consists of state-of-the-art methods that we further improve to test hypotheses in molecular epidemiology. We demonstrate the power of this approach by analysing a comprehensive data set of WNV genomes with the objective of unveiling the dispersal and demographic dynamics of the virus in North America. Specifically, we aim to (i) reconstruct the dispersal history of WNV on the continent, (ii) compare the dispersal dynamics of the three WNV genotypes, (iii) test the impact of environmental factors on the dispersal locations of WNV lineages, (iv) test the impact of environmental factors on the dispersal velocity of WNV lineages, (v) test the impact of migratory bird flyways on the dispersal history of WNV lineages, and (vi) test the impact of environmental factors on viral genetic diversity through time.

Reconstructing the dispersal history and dynamics of WNV lineages

To infer the dispersal history of WNV lineages in North America, we performed a spatially explicit phylogeographic analysis [1] of 801 viral genomes (Supplementary Figs. S1 and S2), which is almost an order of magnitude larger than the early US-wide study by Pybus et al. [2] (104 WNV genomes). The resulting sampling shows a reasonable correspondence between West Nile fever prevalence in the human population and sampling density in most areas associated with the highest numbers of reported cases (e.g., Los Angeles, Houston, Dallas, Chicago, New York), but also some under-sampled locations (e.g., in Colorado; Supplementary Fig. S1). Year-by-year visualisation of the reconstructed invasion history highlights both relatively fast and relatively slow long-distance dispersal events across the continent (Supplementary Fig. S3), which is further confirmed by the comparison between the durations and geographic distances travelled by phylogeographic branches (Supplementary Fig. S4). Some of these long-distance dispersal events were notably fast, with >2000 km travelled in only a couple of months (Supplementary Fig. S4).

To quantify the spatial dissemination of virus lineages, we extracted the spatio-temporal information embedded in molecular clock phylogenies sampled by Bayesian phylogeographic analysis. From the resulting collection of lineage movement vectors, we estimated several key statistics of spatial dynamics (Fig. 1). We estimated a mean lineage dispersal velocity of ~1200 km/year, which is consistent with previous estimates [2]. We further inferred how the mean lineage dispersal velocity changed through time, and found that dispersal velocity was notably higher in the earlier years of the epidemic (Fig. 1). The early peak of lineage dispersal velocity around 2001 corresponds to the expansion phase of the epidemic. This is corroborated by our estimate of the maximal wavefront distance from the epidemic origin through time (Fig. 1). This expansion phase lasted until 2002, when WNV lineages first reached the west coast (Fig. 1 and Supplementary Fig. S3). From east to west, WNV lineages dispersed across a variety of North American environmental conditions in terms of land cover, altitude, and climate (Fig. 2).

Figure 1

Maximum clade credibility (MCC) tree obtained by continuous phylogeographic inference based on 100 posterior trees (see the text for further details). Nodes of the tree are coloured from red (the time to the most recent common ancestor, TMRCA) to green (most recent sampling time). Older nodes are plotted on top of younger nodes, but we also provide an alternative year-by-year representation in Supplementary Fig. S1. In addition, this figure reports global dispersal statistics (mean lineage dispersal velocity and mean diffusion coefficient) averaged over the entire virus spread, the evolution of the mean lineage dispersal velocity through time, the evolution of the maximal wavefront distance from the origin of the epidemic, and the delimitations of the North American Migratory Flyways (NAMF) considered in the USA.

Figure 2

See Table S1 for the source of data for each environmental raster.

We also compared the dispersal velocity estimated for five subsets of WNV phylogenetic branches (Fig. 3): branches occurring during (before 2002) and after (after 2002) the expansion phase, as well as branches assigned to each of the three commonly defined WNV genotypes that circulated in North America (NY99, WN02, and SW03; Supplementary Figs. S1–S2). While NY99 is the WNV genotype that initially invaded North America, WN02 and subsequently SW03 emerged as the two main co-circulating genotypes, characterised by distinct amino acid substitutions [29,30,31,32]. We specifically compare the dispersal history and dynamics of lineages belonging to these three genotypes in order to investigate the assumption that WNV dispersal might have been facilitated by local environmental adaptations [32]. To address this question, we performed the three landscape phylogeographic tests presented below on the complete data set including all viral lineages inferred by continuous phylogeographic inference, as well as on these different subsets of lineages. We first compared the lineage dispersal velocity estimated for each subset, using both the mean and the weighted lineage dispersal velocity. As shown in Fig. 3 and detailed in the Methods section, the weighted metric is more suitable for comparing the dispersal velocity associated with different data sets, or subsets in the present situation. Posterior distributions of the weighted dispersal velocity confirmed that lineage dispersal was much faster during the expansion phase (<2002; Fig. 3). Second, these estimates also indicated that SW03 is associated with a higher dispersal velocity than the dominant genotype, WN02.
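The difference between the two velocity summaries can be illustrated with a small sketch. The branch distances and durations below are invented; real analyses derive them from posterior phylogeographic trees:

```python
# Each phylogeny branch is summarised by a great-circle distance (km)
# and a duration (years); the values here are hypothetical.
branches = [(850.0, 0.5), (120.0, 0.4), (2100.0, 1.2), (60.0, 0.8)]

# Mean lineage dispersal velocity: the mean of per-branch velocities.
mean_velocity = sum(d / t for d, t in branches) / len(branches)

# Weighted lineage dispersal velocity: total distance over total duration,
# which down-weights short branches with very small durations.
weighted_velocity = sum(d for d, _ in branches) / sum(t for _, t in branches)

print(f"mean = {mean_velocity:.0f} km/yr, weighted = {weighted_velocity:.0f} km/yr")
```

The weighted version is less dominated by a few very short branches, which is one reason it compares more robustly across data sets.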

Figure 3

The map displays the maximum clade credibility (MCC) tree obtained by continuous phylogeographic inference with nodes coloured according to three different genotypes.

Testing the impact of environmental factors on the dispersal locations of viral lineages

To investigate the impact of environmental factors on the dispersal dynamics of WNV, we performed three different statistical tests in a landscape phylogeographic framework. First, we tested whether lineage dispersal locations tended to be associated with specific environmental conditions. In practice, we started by computing the E statistic, which measures the mean environmental value at tree node positions. These values were extracted from rasters (geo-referenced grids) that summarised the different environmental factors to be tested: elevation, the main land cover variables in the study area (forests, shrublands, savannas, grasslands, croplands, urban areas; Fig. 2), and monthly time-series collections of climatic rasters (for temperature and precipitation; Supplementary Table S1). For the time-series climatic factors, the raster used for extracting the environmental value was selected according to the time of occurrence of each tree node. The E statistic was computed for each posterior tree sampled during the phylogeographic analysis, yielding a posterior distribution of this metric (Supplementary Fig. S5). To determine whether the posterior distributions of E were significantly lower or higher than expected by chance under a null dispersal model, we also computed E based on simulations, using the inferred set of tree topologies along which a new stochastic diffusion history was simulated according to the estimated diffusion parameters. Statistical support was assessed by comparing the inferred and simulated distributions of E: an inferred distribution significantly lower than the simulated distribution provides evidence that the environmental factor repels viral lineages, while an inferred distribution significantly higher than the simulated distribution provides evidence that the factor attracts viral lineages.
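A drastically simplified sketch of this idea follows, with a made-up raster, made-up node positions, and a made-up null model; the real test extracts values from geo-referenced rasters at node coordinates of posterior trees and simulates diffusion histories along those trees:

```python
import random

random.seed(42)

# Hypothetical 100x100 environmental raster (e.g., elevation classes).
raster = [[(x + y) % 10 for x in range(100)] for y in range(100)]

def E(node_positions):
    """Mean environmental value extracted at (x, y) node positions."""
    return sum(raster[y][x] for x, y in node_positions) / len(node_positions)

# "Inferred" node positions, plus 100 sets of "simulated" null positions
# standing in for stochastic diffusion histories.
inferred = [(random.randrange(100), random.randrange(100)) for _ in range(200)]
null_Es = [E([(random.randrange(100), random.randrange(100)) for _ in range(200)])
           for _ in range(100)]

e_obs = E(inferred)
frac_null_below = sum(e < e_obs for e in null_Es) / len(null_Es)
print(f"inferred E = {e_obs:.2f}; {frac_null_below:.0%} of null E values are lower")
```

If nearly all null E values fell below the inferred E, that would suggest the factor attracts lineages; if nearly all fell above, that it repels them.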

These first landscape phylogeographic tests revealed that WNV lineages (i) tended to avoid areas with relatively higher elevation, forest coverage, and precipitation, and (ii) tended to disperse in areas with relatively higher urban coverage, temperature, and shrubland coverage (Supplementary Table S2). However, when analysing each genotype separately, different trends emerged. For instance, SW03 lineages did not tend to significantly avoid (or disperse towards) areas with relatively higher elevation (~600–750 m above sea level), and only SW03 lineages significantly dispersed towards shrubland areas (Supplementary Table S2). Furthermore, when focusing only on WNV lineages occurring before 2002, we did not identify any significant association between environmental values and node positions. Interestingly, this implies that we cannot associate viral dispersal during the expansion phase with specific favourable environmental conditions (Supplementary Table S2). As these tests are directly based on the environmental values extracted at internal and tip node positions, their outcome can be particularly affected by the nature of the sampling: half of the node positions, i.e., the tip node positions, are directly determined by the sampling. To assess the sensitivity of the tests to heterogeneous sampling, we therefore repeated them while considering only internal tree nodes. Since internal nodes are phylogeographically linked to tip nodes, discarding tip branches can only mitigate the direct impact of the sampling pattern on the outcome of the analysis. These additional tests provided consistent results, with one exception: precipitation, initially identified as a factor repelling viral lineages, was no longer significant when only internal tree branches were considered, indicating that the initial result could be attributed to a sampling artefact.

Testing the impact of environmental factors on the dispersal velocity of viral lineages

In the second landscape phylogeographic test, we analysed whether the heterogeneity observed in lineage dispersal velocity could be explained by specific environmental factors that are predominant in the study area. For this purpose, we used a computational method that assesses the correlation between lineage dispersal durations and environmentally scaled distances [4,33]. These distances were computed on several environmental rasters (Fig. 2 and Supplementary Table S1): elevation, the main land cover variables in the study area (forests, shrublands, savannas, grasslands, croplands, urban areas), as well as annual mean temperature and annual precipitation. This analysis aimed to quantify the impact of each factor on virus movement by calculating a statistic, Q, that measures the correlation between lineage durations and environmentally scaled distances. Specifically, the Q statistic describes the difference in the strength of the correlation when distances are scaled using the environmental raster versus when they are computed using a "null" raster (i.e., a uniform raster with the value "1" assigned to all cells). As detailed in the Methods section, two alternative path models were used to compute these environmentally scaled distances: the least-cost path model [34] and a model based on circuit theory [35]. The Q statistic was estimated for each posterior tree sampled during the phylogeographic analysis, yielding a posterior distribution of this metric. As for the statistic E, statistical support for Q was obtained by comparing the inferred and simulated distributions of Q; the latter was obtained by estimating Q on the same set of tree topologies, along which a new stochastic diffusion history was simulated. This simulation procedure thereby generated a null model of dispersal, and the comparison between the inferred and simulated Q distributions enabled us to approximate a Bayes factor support (see "Methods" for further details).
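In outline, and with entirely hypothetical numbers, the Q statistic amounts to comparing two correlation strengths. Here correlation strength is taken as the squared Pearson coefficient; the published statistic is defined precisely in the paper's Methods, and the real computation uses least-cost or circuit-theory distances over posterior tree samples:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient (plain Python, no dependencies)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

durations = [0.5, 0.4, 1.2, 0.8, 0.3]              # branch durations (years)
null_dists = [850.0, 120.0, 2100.0, 900.0, 200.0]  # distances on a uniform raster
env_dists = [700.0, 100.0, 2600.0, 1100.0, 150.0]  # environmentally scaled distances

# Q > 0 means the environmental scaling strengthens the duration-distance
# correlation relative to the null raster.
Q = pearson(durations, env_dists) ** 2 - pearson(durations, null_dists) ** 2
print(f"Q = {Q:.3f}")
```

In the study, support for a positive Q is then assessed against Q values recomputed under simulated (null) diffusion histories.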

As summarised in Supplementary Table S3, we found strong support for one variable: the annual temperature raster treated as a conductance factor. For this factor, the association between lineage durations and environmentally scaled distances was significant using the path model based on circuit theory [35]. As detailed in Fig. 4, this environmental variable explained the heterogeneity in lineage dispersal velocity better than geographic distance alone (i.e., its Q distribution was positive). Furthermore, this result received strong statistical support (Bayes factor > 20), obtained by comparing the distribution of Q values with that obtained under a null model (Fig. 4). We also performed these tests on each WNV genotype separately (Supplementary Table S4). In these additional tests, the same statistical support associated with temperature was found only for viral lineages belonging to the WN02 genotype. These subset analyses also revealed that higher elevation was significantly associated with lower dispersal velocity of WN02 lineages.

Figure 4

The graph displays the distribution of the correlation metric Q computed on 100 spatially annotated trees obtained by continuous phylogeographic inference (red distributions). The metric Q measures the extent to which considering a heterogeneous environmental raster increases the correlation between lineage durations and environmentally scaled distances compared with a homogeneous raster. If Q is positive and supported, the heterogeneity in lineage dispersal velocity can be at least partially explained by the environmental factor under investigation. The graph also displays the distribution of Q values computed on the same 100 posterior trees along which we simulated a new forward-in-time diffusion process (grey distributions). These simulations are used as a null dispersal model to estimate the support associated with the inferred distribution of Q values. For both inferred and simulated trees, we report the Q distributions obtained when transforming the original environmental raster according to two different values of the scaling parameter k (100 and 1000; full and dashed lines, respectively; see the text for further details on this transformation). The annual mean temperature raster, transformed into conductance values using these two k values, is the only environmental factor for which we detect a positive distribution of Q that is also associated with strong statistical support (Bayes factor > 20).

Testing the impact of environmental factors on the dispersal frequency of viral lineages

The third landscape phylogeographic test focused on the impact of specific environmental factors on the dispersal frequency of viral lineages. Specifically, we aimed to investigate the impact of migratory bird flyways on the dispersal history of WNV. For this purpose, we first tested whether virus lineages tended to remain within the same North American Migratory Flyway (NAMF; Fig. 1). As in the first two testing approaches, we again compared inferred and simulated diffusion dynamics (i.e., simulation of a new stochastic diffusion process along the estimated trees). Under the null hypothesis (i.e., NAMFs have no impact on WNV dispersal history), virus lineages should not transition between flyways less often than under the null dispersal model. Our test did not reject this null hypothesis (BF < 1). As the NAMF borders are based on administrative areas (US counties), we also performed a similar test using the alternative delimitation of migratory bird flyways estimated for terrestrial bird species by La Sorte et al. [36] (Supplementary Fig. S6). Again, the null hypothesis was not rejected, indicating that inferred virus lineages did not tend to remain within specific flyways more often than expected by chance. Finally, these tests were repeated on each of the five subsets of WNV lineages (<2002, >2002, NY99, WN02, SW03) and yielded the same results, i.e., no rejection of the null hypothesis that flyways do not constrain WNV dispersal.
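The core of this test is simply counting between-flyway transitions along tree branches and comparing that count with its null distribution. A toy version, with invented branch assignments and invented null counts, looks like this:

```python
def count_transitions(branches):
    """Number of branches whose parent and child nodes fall in different flyways."""
    return sum(parent != child for parent, child in branches)

# Hypothetical (parent_flyway, child_flyway) pairs for inferred tree branches.
inferred = [("Atlantic", "Atlantic"), ("Atlantic", "Mississippi"),
            ("Mississippi", "Mississippi"), ("Mississippi", "Central"),
            ("Central", "Central"), ("Central", "Pacific")]

# Hypothetical transition counts from simulated (null) diffusion histories.
null_counts = [3, 4, 2, 3, 3, 4, 2, 3, 4, 3]

observed = count_transitions(inferred)
frac_null_above = sum(observed < c for c in null_counts) / len(null_counts)
print(f"observed = {observed} transitions; null exceeds it in {frac_null_above:.0%} of simulations")
```

Flyways would be supported as a constraint only if the observed count fell well below nearly all null counts; in the study it did not.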

Testing the impact of environmental factors on viral genetic diversity through time

We next employed a phylodynamic approach to investigate predictors of the dynamics of viral genetic diversity through time. In particular, we used the generalised linear model (GLM) extension [10] of the skygrid coalescent model [9], hereafter referred to as the "skygrid-GLM" approach, to statistically test for associations between the estimated dynamics of virus effective population size and several covariates. Coalescent models that estimate effective population size (Ne) typically assume a single panmictic population that encompasses all individuals. As this assumption is frequently violated in practice, the estimated effective population size is sometimes interpreted as an estimate of the genetic diversity of the whole virus population [37]. The skygrid-GLM approach accounts for uncertainty in effective population size estimates when testing for associations with covariates; neglecting this uncertainty can lead to spurious conclusions [10].

We first performed univariate skygrid-GLM analyses of four distinct time-varying covariates reflecting seasonal changes: monthly human WNV case counts (log-transformed), temperature, precipitation, and a greenness index. For the human case count covariate, we only detected a significant association with viral effective population size when considering a lag period of at least one month. In addition, in univariate analyses the temperature and precipitation time-series were also associated with viral genetic diversity dynamics (i.e., the posterior GLM coefficients for these covariates had 95% credible intervals that did not include zero; Fig. 5). To further assess the relative importance of each covariate, we performed multivariate skygrid-GLM analyses to rank covariates based on their inclusion probabilities [38]. The first multivariate analysis involved all covariates and suggested that the lagged human case counts best explain viral population size dynamics, with an inclusion probability close to 1. However, because human case counts are a consequence rather than a potential causal driver of the WNV epidemic, we performed a second multivariate analysis excluding this covariate. This time, the temperature time-series emerged as the covariate with the highest inclusion probability.
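As a loose analogue of the lagged univariate test, one can correlate a log-Ne trajectory with a covariate shifted back by one time step. This is only a toy illustration with invented series; the actual skygrid-GLM jointly infers Ne and the GLM coefficients while propagating the uncertainty in Ne:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient (no external dependencies)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical monthly series: log case counts and log effective population size.
log_cases = [1, 2, 4, 8, 6, 3, 2, 1, 2, 4]
log_ne = [0.6, 0.5, 1.0, 2.0, 4.0, 3.0, 1.5, 1.0, 0.5, 1.0]

r_unlagged = pearson(log_cases, log_ne)
r_lag1 = pearson(log_cases[:-1], log_ne[1:])  # covariate lagged by one month
print(f"r (no lag) = {r_unlagged:.2f}, r (1-month lag) = {r_lag1:.2f}")
```

In this contrived example the lagged correlation is much stronger, mirroring the pattern the study reports for case counts.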

Figure 5

These associations were tested with a generalised linear model (GLM) extension of the coalescent model used to infer the dynamics of the viral effective population size (Ne) through time. Specifically, we tested the following time-series variables as potential covariates (orange curves): the number of human cases (log-transformed and lagged by one month), mean temperature, mean precipitation, and the Normalised Difference Vegetation Index (NDVI, a greenness index). Posterior mean estimates of the viral effective population size based on both sequence data and covariate data are represented by blue curves, and the corresponding blue polygons reflect the 95% HPD regions. Posterior mean estimates of the viral effective population size inferred strictly from sequence data are represented by grey curves, and the corresponding grey polygons reflect the 95% HPD regions. A significant association between a covariate and effective population size is inferred when the 95% HPD interval of the GLM coefficient excludes zero, which is the case for the case count, temperature, and precipitation covariates.

In this study, we use spatially explicit phylogeographic and phylodynamic inference to reconstruct the dispersal history and dynamics of a continental viral spread. Through comparative analyses of lineage dispersal statistics, we highlight distinct trends within the overall spread of WNV. First, we demonstrate that the WNV spread in North America can be divided into an initial “invasion phase” and a subsequent “maintenance phase” (see Carrington et al. 39 for similar terminology used in the context of spatial invasion of dengue viruses). The invasion phase is characterised by an increase in virus effective population size until the west coast was reached, followed by a maintenance phase associated with a more stable, cyclic variation of effective population size (Fig. 5). In only 2–3 years, WNV rapidly spread from the east to the west coast of North America, despite the fact that the migratory flyways of its avian hosts are primarily north-south directed. This could suggest potentially important roles for non-migratory bird movements, as well as natural or human-mediated mosquito dispersal, in spreading WNV along a longitudinal gradient 40, 41. However, the absence of clear within-flyway clustering of viral lineages could also arise when different avian migration routes intersect at southern connections. If local WNV transmission occurs at these locations, viruses could travel along different flyways when the birds make their return northward migration, as proposed by Swetnam et al. 42. While this scenario is possible, there are insufficient data to investigate it formally with our approaches. Overall, we uncover a higher lineage dispersal velocity during the invasion phase; the subsequent slowdown of spatial dispersal could be a consequence of increasing bird immunity through time. It has indeed been demonstrated that avian immunity can impact WNV transmission dynamics 43.
Second, we also reveal different dispersal velocities associated with the three WNV genotypes that have circulated in North America: viral lineages of the currently dominant genotype (WN02) have spread more slowly than lineages of NY99 and SW03. NY99 was the main genotype during the invasion phase but has not been detected in the US since the mid-2000s. A faster dispersal associated with NY99 is thus consistent with the higher dispersal velocity identified for lineages circulating during the invasion phase. The higher dispersal velocity for SW03 compared to WN02 is in line with recently reported evidence that SW03 spread faster than WN02 in California 44.

In the second part of the study, we illustrate the application of a phylogeographic framework for hypothesis testing that builds on previously developed models. These analytical approaches are based on a spatially explicit phylogeographic or phylodynamic (skygrid coalescent) reconstruction, and aim to assess the impact of environmental factors on the dispersal locations, velocity, and frequency of viral lineages, as well as on the overall genetic diversity of the viral population. The WNV epidemic in North America is a powerful illustration of viral invasion and emergence in a new environment 31, making it a highly relevant case study for such hypothesis testing approaches. We first test the association between environmental factors and lineage dispersal locations, demonstrating that, overall, WNV lineages have preferentially circulated in specific environmental conditions (higher urban coverage, temperature, and shrublands) and tended to avoid others (higher elevation and forest coverage). Second, we test the association between environmental factors and lineage dispersal velocity, finding evidence for the impact of only one environmental factor, namely annual mean temperature. Third, we test the impact of migratory flyways on the dispersal frequency of viral lineages among areas: we formally test the hypothesis that WNV lineages are contained within, or preferentially circulate within, the same migratory flyway, and find no statistical support for this.

We have also performed these three different landscape phylogeographic tests on subsets of WNV lineages (lineages occurring during and after the invasion phase, as well as NY99, WN02, and SW03 lineages). When focusing on lineages occurring during the invasion phase (<2002), we do not identify any significant association between a particular environmental factor and the dispersal location or velocity of lineages. This result corroborates the idea that, during the early phase of the epidemic, the virus rapidly conquered the continent despite varied environmental conditions, likely helped by the large populations of susceptible hosts/vectors already present in North America 32. These additional tests also highlight interesting differences among the three WNV genotypes. For instance, we found that the dispersal of the SW03 genotype is faster than that of WN02 and occurs preferentially in shrublands and at higher temperatures. At face value, it appears that the mutations that define the SW03 genotype, NS4A-A85T and NS5-K314R 45, may be signatures of adaptation to such specific environmental conditions. It may, however, be an artefact of the SW03 genotype being most commonly detected in mosquito species such as Cx. tarsalis and Cx. quinquefasciatus that are found in the relatively high-elevation shrublands of the southwest US 44, 46. In this scenario, the faster dispersal velocities could result from preferential utilisation of these two highly efficient WNV vectors 47, especially when considering the warm temperatures of the southwest 48, 49. It is also important to note that, to date, no specific phenotypic advantage has been observed for SW03 genotypes compared to WN02 genotypes 50, 51. Further research is needed to discern whether the differences among the three WNV genotypes are due to virus-specific factors, heterogeneous sampling effort, or ecological variation.

When testing the impact of flyways on the five different subsets of lineages, we reach the same result of no preferential circulation within flyways. This overall result contrasts with previously reported phylogenetic clustering by flyways 31, 42. However, the clustering analysis of Di Giallonardo et al. 31 was based on a discrete phylogeographic analysis and, as recognised by the authors, it is difficult to distinguish the effect of these flyways from that of geographic distance. Here, we circumvent this issue by performing a spatial analysis that explicitly represents dispersal as a function of geographic distance. Our results do not contradict the established role of migratory birds in spreading the virus 52, 53, but we find no evidence that viral lineage dispersal is structured by flyway. Specifically, our test does not reject the null hypothesis of no clustering by flyways, which at least signals that the tested flyways do not have a discernible impact on WNV lineage circulation. Dissecting the precise involvement of migratory birds in WNV spread thus requires the collection of additional empirical data. Furthermore, our phylogeographic analysis highlights the occurrence of several fast and long-distance dispersal events along a longitudinal gradient. A potential anthropogenic contribution to such long-distance dispersal (e.g., through commercial transport) warrants further investigation.

In addition to its significant association with the dispersal locations and velocity of WNV lineages, the relevance of temperature is further demonstrated by the association between the virus genetic dynamics and several time-dependent covariates. Indeed, among the three environmental time-series we tested, temporal variation in temperature is the most important predictor of cycles in viral genetic diversity. Temperature is known to have a dramatic impact on the biology of arboviruses and their arthropod hosts 54, including WNV. Higher temperatures have been shown to directly impact the mosquito life cycle, by accelerating larval development 11, decreasing the interval between blood meals, and prolonging the mosquito breeding season 55. Higher temperatures have also been associated with shorter extrinsic incubation periods, accelerating WNV transmission by the mosquito vector 56, 57. Interestingly, temperature has also been suggested as a variable that can increase the predictive power of WNV forecast models 58. The impact of temperature that we reveal here on both dispersal velocity and viral genetic diversity is particularly important in the context of global warming. In addition to altering mosquito species distributions 59, 60, an overall temperature increase in North America could imply an increase in enzootic transmission and hence increased spill-over risk in different regions. In addition to temperature, we find evidence for an association between viral genetic diversity dynamics and the number of human cases, but only when a lag period of at least one month is added to the model (having only monthly case counts available, it was not possible to test shorter lag periods). Such a lag could, at least in part, be explained by the time needed for mosquitoes to become infectious and bite humans. As human case counts are in theory backdated to the date of onset of illness, incubation time in humans should not contribute to this lag.

Our study illustrates and details the utility of landscape phylogeographic and phylodynamic hypothesis tests when applied to a comprehensive data set of viral genomes sampled during an epidemic. Such spatially explicit investigations are only possible when viral genomes (whether recently collected or available on public databases such as GenBank) are associated with sufficiently precise metadata, in particular the collection date and the sampling location. The availability of precise collection dates (ideally known to the day) for isolates obtained over a sufficiently long time-span enables reliable timing of epidemic events through accurate calibration of molecular clock models. Further, spatially explicit phylogeographic inference is possible only when viral genomes are associated with sampling coordinates. However, geographic coordinates are frequently unknown or unreported. In practice this may not represent a limitation if a sufficiently precise descriptive sampling location is specified (e.g., a district or administrative area), as this information can be converted into geographic coordinates. The full benefits of comprehensive phylogeographic analyses of viral epidemics will be realised only when precise location and time metadata are made systematically available.

Although we use a comprehensive collection of WNV genomes in this study, it would be useful to perform analyses based on even larger data sets that cover regions under-sampled in the current study; this work is the focus of an ongoing collaborative project (westnile4k.org). While the resolution of phylogeographic analyses will always depend on the spatial granularity of available samples, they can still be powerful in elucidating the dispersal history of sampled lineages. When testing the impact of environmental factors on lineage dispersal velocity and frequency, heterogeneous sampling density will primarily affect the statistical power to detect the impact of relevant environmental factors in under-sampled or unsampled areas 33. However, the sampling pattern can have a much greater impact on the tests dedicated to the impact of environmental factors on the dispersal locations of viral lineages. As stated above, in this test, half of the environmental values are extracted at tip node locations, which are directly determined by the sampling effort. To assess the robustness of this test to the sampling pattern, we repeated the analysis after discarding all tip branches, which mitigates a potential impact of the sampling pattern on its outcome. Furthermore, in this study, we note that heterogeneous sampling density across counties can be at least partially mitigated by performing phylogenetic subsampling (detailed in the “Methods” section). Another limitation to underline is that, contrary to the tests focusing on the impact of environmental factors on dispersal locations and frequency, the present framework does not allow testing the impact of time-series environmental variables on the dispersal velocity of viral lineages.
It would be interesting to extend this framework so that it can, e.g., test the impact of spatio-temporal variation in temperature on the dispersal velocity of WNV lineages. Conversely, while skygrid-GLM analyses intrinsically integrate temporal variation of covariates, these tests treat the epidemic as a single panmictic population of viruses. In addition to ignoring the actual population structure, this implies comparing the viral effective population size with a single environmental value per time slice for the entire study area. To mitigate spatial heterogeneity as much as possible, we used the continuous phylogeographic reconstruction to define successive minimum convex hull polygons delimiting the study area at each time slice. These polygons were used to extract the environmental values that were then averaged to obtain a single environmental value per time slice considered in the skygrid-GLM analysis.

By placing virus lineages in a spatio-temporal context, phylogeographic inference provides information on the linkage of infections through space and time. Mapping lineage dispersal can provide a valuable source of information for epidemiological investigations and can shed light on the ecological and environmental processes that have impacted the epidemic dispersal history and transmission dynamics. When complemented with phylodynamic testing approaches, such as the skygrid-GLM approach used here, these methods offer new opportunities for epidemiological hypothesis testing. These tests can complement traditional epidemiological approaches that employ occurrence data. If coupled to real-time virus genome sequencing, landscape phylogeographic and phylodynamic testing approaches have the potential to inform epidemic control and surveillance decisions 61.

Selection of viral sequences

We started by gathering all WNV sequences available on GenBank on 20 November 2017. We only selected sequences (i) of at least 10 kb, i.e., covering almost the entire viral genome (~11 kb), and (ii) associated with a sufficiently precise sampling location, i.e., at least an administrative area of level 2. Administrative areas of level 2 are hereafter abbreviated “admin-2” and correspond to US counties. Finding the most precise sampling location (admin-2, city, village, or geographic coordinates), as well as the most precise sampling date available for each sequence, required a bibliographic screening because such metadata are often missing on GenBank. The resulting alignment of 993 geo-referenced genomic sequences of at least 10 kb was made using MAFFT 62 and manually edited in AliView 63. Based on this alignment, we performed a first phylogenetic analysis using the maximum likelihood method implemented in the programme FastTree 64 with 1000 bootstrap replicates to assess branch supports. The aim of this preliminary phylogenetic inference was solely to identify monophyletic clades of sequences sampled from the same admin-2 area and associated with a bootstrap support higher than 70%. Such phylogenetic clusters of sampled sequences largely represent lineage dispersal within a specific admin-2 area. As we randomly drew geographic coordinates from an admin-2 polygon for sequences only associated with an admin-2 area of origin, keeping more than one sequence per phylogenetic cluster would not contribute any meaningful information to subsequent phylogeographic analyses 61. Therefore, we subsampled the original alignment such that only one sequence was randomly selected per phylogenetic cluster, leading to a final alignment of 801 genomic sequences (Supplementary Fig. S1). In the end, selected sequences were mostly derived from mosquitoes (~50%) and birds (~44%), with very few (~5%) from humans.
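The subsampling step above can be sketched as follows, assuming clusters of sequence IDs have already been identified from the FastTree topology (all names and the `subsample` helper are hypothetical, not part of the actual pipeline):

```python
import random

# Hypothetical sketch of the subsampling step: given clusters of sequences
# forming well-supported monophyletic clades from the same admin-2 area, keep
# one randomly chosen representative per cluster; sequences outside any
# cluster ("singletons") pass through unchanged.
def subsample(clusters, singletons, seed=42):
    """clusters: list of lists of sequence IDs; singletons: IDs outside clusters."""
    rng = random.Random(seed)  # fixed seed for reproducibility of the sketch
    kept = [rng.choice(cluster) for cluster in clusters]
    return sorted(kept + singletons)

clusters = [["seqA1", "seqA2", "seqA3"], ["seqB1", "seqB2"]]
singletons = ["seqC", "seqD"]
selection = subsample(clusters, singletons)
# one representative per cluster + the two singletons -> 4 sequences kept
```

In the real analysis this reduction took the alignment from 993 to 801 sequences.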

Time-scaled phylogenetic analysis

Time-scaled phylogenetic trees were inferred using BEAST 1.10.4 65 and the BEAGLE 3 library 66 to improve computational performance. The substitution process was modelled according to a GTR+Γ parametrisation 67, branch-specific evolutionary rates were modelled according to a relaxed molecular clock with an underlying log-normal distribution 68, and the flexible skygrid model was specified as tree prior 9, 10. We ran and eventually combined ten independent analyses, sampling Markov chain Monte Carlo (MCMC) chains every 2 × 10⁸ generations. Combined, the different analyses were run for >10¹² generations. For each distinct analysis, the number of sampled trees to discard as burn-in was identified using Tracer 1.7 69. We used Tracer to inspect the convergence and mixing properties of the combined output, referred to as the “skygrid analysis” throughout the text, to ensure that effective sample size (ESS) values associated with estimated parameters were all >200.
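As a rough illustration of the ESS diagnostic reported by Tracer, a crude estimator divides the number of posterior samples by an autocorrelation-based inflation factor. This is a simplified sketch of the general idea, not Tracer's exact algorithm:

```python
def effective_sample_size(x):
    """Crude ESS estimate: N / (1 + 2 * sum of positive-lag autocorrelations),
    truncating the sum at the first non-positive autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = sum((x[i] - mean) * (x[i + lag] - mean)
                  for i in range(n - lag)) / (n * var)
        if rho <= 0:  # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

# An anti-correlated chain keeps all 100 samples effective; a strongly
# autocorrelated chain (a single slow drift) retains far fewer.
ess_mixed = effective_sample_size([0, 1] * 50)
ess_sticky = effective_sample_size([0] * 50 + [1] * 50)
```

A chain passing the >200 threshold used here would need both enough samples and sufficiently fast mixing.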

Spatially explicit phylogeographic analysis

The spatially explicit phylogeographic analysis was performed using the relaxed random walk (RRW) diffusion model implemented in BEAST 1, 2. This model allows the inference of spatially and temporally referenced phylogenies while accommodating variation in dispersal velocity among branches 3. Following Pybus et al. 2, we used a gamma distribution to model the among-branch heterogeneity in diffusion velocity. Even when launching multiple analyses and using GPU resources to speed up the analyses, poor MCMC mixing did not permit reaching an adequate sample from the posterior in a reasonable amount of time. This represents a challenging problem that is currently under further investigation 70. To circumvent this issue, we performed 100 independent phylogeographic analyses, each based on a distinct fixed tree sampled from the posterior distribution of the skygrid analysis. We ran each analysis until ESS values associated with estimated parameters were all greater than 100. We then extracted the last spatially annotated tree sampled in each of the 100 posterior distributions, which is equivalent to randomly sampling a post-burn-in tree within each distribution. All the subsequent landscape phylogeographic testing approaches were based on the resulting distribution of 100 spatially annotated trees. Given the computational limitations, we argue that this collection of 100 spatially annotated trees, extracted from distinct posterior distributions each based on a different fixed tree topology, represents a reasonable approach to obtaining a phylogeographic reconstruction that accounts for phylogenetic uncertainty. We note that this is similar to the approach of using a set of empirical trees that is frequently employed for discrete phylogeographic inference 71, 72, but direct integration over such a set of trees is not appropriate for the RRW model because the proposal distribution for branch-specific scaling factors does not hold in this case.
We used TreeAnnotator 1.10.4 65 to obtain the maximum clade credibility (MCC) tree representation of the spatially explicit phylogeographic reconstruction (Supplementary Fig.  S2 ).

In addition to the overall data set encompassing all lineages, we also considered five different subsets of lineages: phylogeny branches occurring before or after the end of the expansion/invasion phase (i.e., 2002; Fig.  1 ), as well as phylogeny branches assigned to each of the three WNV genotypes circulating in North America (NY99, WN02, and SW03; Supplementary Figs.  S1 – S2 ). These genotypes were identified on the basis of the WNV database published on the platform Nextstrain 32 , 73 . For the purpose of comparison, we performed all the subsequent landscape phylogeographic approaches on the overall data set but also on these five different subsets of WNV lineages.

Estimating and comparing lineage dispersal statistics

Phylogenetic branches, or “lineages”, from spatially and temporally referenced trees can be treated as conditionally independent movement vectors 2. We used the R package “seraphim” 74 to extract the spatio-temporal information embedded within each tree and to summarise lineages as movement vectors. We further used the package “seraphim” to estimate two dispersal statistics based on the collection of such vectors: the mean lineage dispersal velocity (v_mean) and the weighted lineage dispersal velocity (v_weighted) 74. While both metrics measure the dispersal velocity associated with phylogeny branches, the second involves a weighting by branch time 75:

$$v_{\text{mean}} = \frac{1}{n}\sum_{i=1}^{n}\frac{d_i}{t_i}, \qquad v_{\text{weighted}} = \frac{\sum_{i=1}^{n} d_i}{\sum_{i=1}^{n} t_i},$$

where n is the number of phylogeny branches, and d_i and t_i are the geographic distance travelled (great-circle distance in km) and the time elapsed (in years) on each phylogeny branch, respectively. The weighted metric is useful for comparing branch dispersal velocity between different data sets or different subsets of the same data set. Indeed, phylogeny branches with short duration have a lower impact on the weighted lineage dispersal velocity, which results in lower-variance estimates facilitating data set comparison 33. On the other hand, estimating the mean lineage dispersal velocity is useful when aiming to investigate the variability of lineage dispersal velocity within a distinct data set 75. Finally, we also estimated the evolution of the maximal wavefront distance from the epidemic origin, as well as the evolution of the mean lineage dispersal velocity through time.
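The two statistics can be computed directly from the collection of movement vectors; a minimal sketch with hypothetical branch values (distances in km, durations in years; the actual computation is done by the “seraphim” R package):

```python
from math import fsum

# Sketch of the two dispersal statistics named in the text: v_mean averages the
# per-branch velocities d_i/t_i, while v_weighted divides total distance by
# total branch time, down-weighting short-duration branches.
def dispersal_velocities(branches):
    """branches: list of (d_i, t_i) tuples, one per phylogeny branch."""
    n = len(branches)
    v_mean = fsum(d / t for d, t in branches) / n
    v_weighted = fsum(d for d, _ in branches) / fsum(t for _, t in branches)
    return v_mean, v_weighted

branches = [(100.0, 0.5), (300.0, 1.0), (50.0, 0.25)]
v_mean, v_weighted = dispersal_velocities(branches)
# v_mean = (200 + 300 + 200)/3 ~ 233.3 km/yr; v_weighted = 450/1.75 ~ 257.1 km/yr
```

Note how the short, fast branches pull v_mean and v_weighted apart, which is exactly why the weighted version is preferred for between-data-set comparisons.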

Generating a null model of viral lineage dispersal

To generate a null dispersal model, we simulated a forward-in-time RRW diffusion process along each tree topology used for the phylogeographic analyses. These RRW simulations were performed with the “simulatorRRW1” function of the R package “seraphim” and were based on the sampled precision matrix parameters estimated by the phylogeographic analyses 61. For each tree, the RRW simulation started from the root node position inferred by the phylogeographic analysis. Furthermore, these simulations were constrained such that the simulated node positions remain within the study area, here defined by the minimum convex hull built around all node positions, minus non-accessible sea areas. As for the annotated trees obtained by phylogeographic inference, hereafter referred to as “inferred trees”, we extracted the spatio-temporal information embedded within their simulated counterparts, hereafter referred to as “simulated trees”. As RRW diffusion processes were simulated along fixed tree topologies, each simulated tree shares a common topology with an inferred tree. Such a pair of inferred and simulated trees thus only differs in the geographic coordinates associated with their nodes, except for the root node position, which was fixed as the starting point of the RRW simulation. The distribution of 100 simulated trees served as a null dispersal model for the landscape phylogeographic testing approaches described below.

The first landscape phylogeographic testing approach consisted of testing the association between environmental conditions and the dispersal locations of viral lineages. We started by simply visualising and comparing the environmental values explored by viral lineages. For each posterior tree sampled during the phylogeographic analysis, we extracted and then averaged the environmental values at the tree node positions. We then obtained, for each analysed environmental factor, a posterior distribution of mean environmental values at tree node positions for the overall data set as well as for the five subsets of WNV lineages described above. In addition to this visualisation, we also performed a formal test comparing mean environmental values extracted at node positions in inferred (E_estimated) and simulated trees (E_simulated). E_simulated values constituted the distribution of mean environmental values explored under the null dispersal model, i.e., under a dispersal scenario that is not impacted by any underlying environmental condition. To test if a particular environmental factor e tended to attract viral lineages, we approximated the following Bayes factor (BF) support 76:

$$\mathrm{BF}_e = \frac{p_e}{1 - p_e}, \qquad (2)$$

where p_e is the posterior probability that E_estimated > E_simulated, i.e., the frequency at which E_estimated > E_simulated in the samples from the posterior distribution. The prior odds is 1 because we can assume an equal prior expectation for E_estimated and E_simulated. To test if a particular environmental factor e tended to repel viral lineages, BF_e was approximated with p_e as the posterior probability that E_estimated < E_simulated. These tests are similar to a previous approach using a null dispersal model based on randomisation of phylogeny branches 75.
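With equal prior odds, the BF reduces to the posterior odds p_e/(1 − p_e), estimated from paired posterior samples. A sketch with hypothetical values (the `bayes_factor` helper is ours):

```python
# Sketch of the BF approximation in formula (2): with equal prior odds,
# BF_e = p_e / (1 - p_e), where p_e is the fraction of posterior samples in
# which the inferred statistic exceeds its simulated (null-model) counterpart.
def bayes_factor(estimated, simulated):
    """estimated, simulated: paired posterior samples (same length)."""
    wins = sum(e > s for e, s in zip(estimated, simulated))
    p_e = wins / len(estimated)
    if p_e == 1.0:
        return float("inf")  # all samples support the hypothesis
    return p_e / (1 - p_e)

# Toy paired samples of mean environmental values at node positions.
E_estimated = [2.1, 2.4, 2.2, 2.6, 2.0]
E_simulated = [1.9, 2.5, 2.0, 2.1, 1.8]
bf = bayes_factor(E_estimated, E_simulated)  # p_e = 4/5 -> BF = 4.0
```

The same function, with the comparison reversed, gives the “repelling” version of the test.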

We tested several environmental factors as potentially attracting or repelling viral lineages: elevation, the main land cover variables in the study area, and climatic variables. Each environmental factor was described by a raster that defines its spatial heterogeneity (see Supplementary Table S1 for the source of each original raster file). Starting from the original categorical land cover raster with a resolution of 0.5 arcmin (corresponding to cells of ~1 km²), we generated distinct land cover rasters by creating lower-resolution rasters (10 arcmin) whose cell values equalled the number of occurrences of each land cover category within the 10 arcmin cells. The resolution of the other original rasters of fixed-in-time environmental factors (elevation, mean annual temperature, and annual precipitation) was also decreased to 10 arcmin for tractability, which was mostly important in the context of the second landscape phylogeographic approach detailed below. To obtain the time-series collection of temperature and precipitation rasters analysed in these first tests dedicated to the impact of environmental factors on lineage dispersal locations, we used the thin plate spline method implemented in the R package “fields” to interpolate measures obtained from the database of the US National Oceanic and Atmospheric Administration (NOAA; https://data.nodc.noaa.gov).
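The land cover coarsening step can be illustrated on a toy categorical grid, using 2 × 2 blocks to stand in for the 0.5 to 10 arcmin aggregation (all values hypothetical):

```python
# Sketch of the coarsening step: from a fine categorical grid, build one
# lower-resolution raster per land cover category whose cells count the
# occurrences of that category within each coarse block.
def count_category(grid, category, block):
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, block):
        row = []
        for c in range(0, cols, block):
            row.append(sum(grid[rr][cc] == category
                           for rr in range(r, r + block)
                           for cc in range(c, c + block)))
        coarse.append(row)
    return coarse

land_cover = [
    ["forest", "urban", "urban", "urban"],
    ["forest", "forest", "urban", "water"],
    ["water",  "water",  "forest", "forest"],
    ["water",  "water",  "forest", "urban"],
]
urban_counts = count_category(land_cover, "urban", block=2)
# -> [[1, 3], [0, 1]]: per-block occurrence counts of the "urban" category
```

One such count raster per category yields the set of continuous land cover layers used in the tests.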

The second landscape phylogeographic testing approach aimed to test the association between several environmental factors, again described by rasters (Fig. 2), and the dispersal velocity of WNV lineages in North America. Each environmental raster was tested both as a potential conductance factor (i.e., facilitating movement) and as a resistance factor (i.e., impeding movement). In addition, for each environmental factor, we generated several distinct rasters by transforming the original raster cell values with the following formula: v_t = 1 + k(v_o/v_max), where v_t and v_o are the transformed and original cell values, and v_max is the maximum cell value recorded in the raster. The rescaling parameter k allows the definition and testing of different strengths of raster cell conductance or resistance, relative to the conductance/resistance of a cell with a minimum value set to 1. For each environmental factor, we tested three different values of k (k = 10, 100, and 1000).
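The transformation v_t = 1 + k(v_o/v_max) applied to the raster cell values can be sketched as follows (toy cell values):

```python
# Sketch of the cell-value transformation v_t = 1 + k * (v_o / v_max): it
# rescales an environmental raster into conductance or resistance values
# relative to a minimum cell value of 1, with k setting the strength.
def transform_raster(cells, k):
    v_max = max(cells)
    return [1 + k * (v / v_max) for v in cells]

original = [0.0, 25.0, 50.0, 100.0]
transformed = transform_raster(original, k=10)
# -> [1.0, 3.5, 6.0, 11.0]
```

With k = 1000 the same cells would span 1 to 1001, which is how progressively stronger conductance/resistance scenarios are generated.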

The following analytical procedure is adapted from a previous framework 4 and can be summarised in three distinct steps. First, we used each environmental raster to compute an environmentally scaled distance for each branch in inferred and simulated trees. These distances were computed using two different path models: (i) the least-cost path model, which uses a least-cost algorithm to determine the route taken between the starting and ending points 34, and (ii) the Circuitscape path model, which uses circuit theory to accommodate uncertainty in the route taken 35. Second, correlations between the time elapsed on branches and environmentally scaled distances were estimated with the statistic Q, defined as the difference between two coefficients of determination: (i) the coefficient of determination obtained when branch durations are regressed against environmentally scaled distances computed on the environmental raster, and (ii) the coefficient of determination obtained when branch durations are regressed against environmentally scaled distances computed on a uniform null raster. A Q statistic was estimated for each tree, yielding two distributions of Q values, one associated with inferred trees and one associated with simulated trees. An environmental factor was only considered as potentially explanatory if both its distribution of regression coefficients and its associated distribution of Q values were positive 5. Finally, the statistical support associated with a positive Q distribution (i.e., with at least 90% of positive values) was evaluated by comparing it with its corresponding null distribution of Q values based on simulated trees, formalised by approximating a BF support using formula (2), but this time defining p_e as the posterior probability that Q_estimated > Q_simulated, i.e., the frequency at which Q_estimated > Q_simulated in the samples from the posterior distribution 33.
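The Q statistic is a difference between two coefficients of determination. A self-contained sketch with hypothetical branch durations and distances (in the real analysis the distances come from least-cost or Circuitscape path models):

```python
# Sketch of the Q statistic: R2 of durations against environmentally scaled
# distances minus R2 of durations against distances on a uniform null raster.
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    return (sxy ** 2) / (sxx * syy)

def q_statistic(durations, env_distances, null_distances):
    return r_squared(env_distances, durations) - r_squared(null_distances, durations)

# Toy branches: durations correlate well with the environmentally scaled
# distances but poorly with the null-raster distances, so Q > 0.
durations = [1.0, 2.0, 3.0, 4.0]
env_dists = [1.1, 2.0, 2.9, 4.2]
null_dists = [2.0, 1.0, 4.0, 3.0]
q = q_statistic(durations, env_dists, null_dists)
```

One such Q value per tree gives the inferred and simulated Q distributions compared by the BF test.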

In the third landscape phylogeographic testing approach, we investigated the impact of specific environmental factors on the dispersal frequency of viral lineages: we tested whether WNV lineages tended to preferentially circulate, and then remain, within a distinct migratory flyway. We first performed a test based on the four North American Migratory Flyways (NAMF). Based on observed bird migration routes, these four administrative flyways (Fig. 1) were defined by the US Fish and Wildlife Service (USFWS; https://www.fws.gov/birds/management/flyways.php) to facilitate management of migratory birds and their habitats. Although biologically questionable, we used these administrative limits to discretise the study area and investigate whether viral lineages tended to remain within the same flyway. In practice, we analysed whether viral lineages crossed NAMF borders less frequently than expected by chance, i.e., than expected under the null dispersal model in which simulated dispersal histories were not impacted by these borders. Following a procedure introduced by Dellicour et al. 61, we computed and compared the number N of flyway-changing events for each pair of inferred and simulated trees. Each “inferred” N value (N_inferred) was thus compared to its corresponding “simulated” value (N_simulated) by approximating a BF value using formula (2), but this time defining p_e as the posterior probability that N_inferred < N_simulated, i.e., the frequency at which N_inferred < N_simulated in the samples from the posterior distribution.
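Counting the number N of flyway-changing events reduces to comparing the flyway assignments of the two nodes of each branch; a sketch with hypothetical assignments:

```python
# Sketch of the N statistic: each branch is reduced to the flyway assignment
# of its start and end node; a branch whose two nodes fall in different
# flyways counts as one flyway-changing event.
def count_crossings(branches):
    """branches: list of (start_flyway, end_flyway) pairs."""
    return sum(start != end for start, end in branches)

inferred = [("Atlantic", "Atlantic"), ("Atlantic", "Mississippi"),
            ("Mississippi", "Mississippi"), ("Central", "Pacific")]
N_inferred = count_crossings(inferred)  # -> 2
```

Comparing N_inferred with its simulated counterpart across the 100 tree pairs gives the p_e entering formula (2).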

To complement the first test based on an administrative flyway delimitation, we developed and performed a second test based on flyways estimated by La Sorte et al. 36 for terrestrial bird species: the Eastern, Central and Western flyways (Supplementary Fig.  S6 ). Contrary to the NAMF, these three flyways overlap with each other and are here defined by geo-referenced grids indicating the likelihood that studied species are in migration during spring or autumn (see La Sorte et al. 36 for further details). As the spring and autumn grids are relatively similar, we built an averaged raster for each flyway. For our analysis, we then generated normalised rasters obtained by dividing each raster cell by the sum of the values assigned to the same cell in the three averaged rasters (Supplementary Fig.  S6 ). Following a procedure similar to the first test based on NAMFs, we computed and compared the average difference D defined as follows:
$$D = \frac{1}{n}\sum_{i=1}^{n}\left(v_{i,\text{start}} - v_{i,\text{end}}\right),$$

where n is the number of branches in the tree, v_i,start is the highest cell value among the three flyway normalised rasters associated with the position of the starting (oldest) node of tree branch i, and v_i,end is the cell value extracted from the same normalised raster but associated with the position of the descendant (youngest) node of tree branch i. D is thus a measure of the tendency of tree branches to remain within the same flyway. Each “inferred” D value (D_inferred) is then compared to its corresponding “simulated” value (D_simulated) by approximating a BF value using formula (2), but this time defining p_e as the posterior probability that D_inferred < D_simulated, i.e., the frequency at which D_inferred < D_simulated in the samples from the posterior distribution.
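The D statistic can be sketched as follows, with hypothetical normalised raster values at the start and end node of each branch (a small D means branches tend to end where their starting flyway is still dominant):

```python
# Sketch of the D statistic: for each branch, v_start is the highest of the
# three normalised flyway raster values at the start node, and v_end the value
# from that same raster at the end node; D averages their difference.
def d_statistic(branches):
    """branches: list of (start_values, end_values), each a dict mapping
    flyway name -> normalised raster value at that node."""
    diffs = []
    for start_values, end_values in branches:
        flyway = max(start_values, key=start_values.get)  # dominant flyway at start
        diffs.append(start_values[flyway] - end_values[flyway])
    return sum(diffs) / len(diffs)

branches = [
    ({"Eastern": 0.7, "Central": 0.2, "Western": 0.1},
     {"Eastern": 0.6, "Central": 0.3, "Western": 0.1}),   # stays mostly Eastern
    ({"Eastern": 0.1, "Central": 0.8, "Western": 0.1},
     {"Eastern": 0.5, "Central": 0.4, "Western": 0.1}),   # drifts out of Central
]
D = d_statistic(branches)  # (0.1 + 0.4) / 2 = 0.25
```

As in the NAMF test, D_inferred is then compared against D_simulated over the 100 tree pairs.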

Testing the impact of environmental factors on the viral diversity through time

We used the skygrid-GLM approach 9, 10 implemented in BEAST 1.10.4 to measure the association between viral effective population size and four covariates: human case counts, temperature, precipitation, and a greenness index. The monthly numbers of human cases were provided by the CDC and were considered with lag periods of one and two months (meaning that the viral effective population size was compared to case counts from one and two months later), as well as with no lag period. Preliminary skygrid-GLM analyses were used to determine the lag period at which we obtained a significant association between viral effective population size and the number of human cases; we then used this lag period (one month) in subsequent analyses. Data used to estimate the average temperature and precipitation time series were obtained from the NOAA database mentioned above. For each successive month, meteorological stations were selected based on their geographic location: to estimate the average temperature/precipitation value for a specific month, we only considered stations included in the corresponding monthly minimum convex hull polygon obtained from the continuous phylogeographic inference. For a given month, this polygon was simply defined around all tree node positions occurring before or during that month. To take the uncertainty of the phylogeographic inference into account, the construction of these minimum convex hull polygons was based on the 100 posterior trees used in the phylogeographic inference (see above). The rationale behind this approach was to base the analysis only on covariate values measured in areas already reached by the epidemic.
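The lag comparison can be illustrated without the skygrid-GLM machinery itself (which is fitted in BEAST): the sketch below shifts an invented monthly case-count series by 0, 1 and 2 months against an invented effective-population-size proxy and picks the best-correlating lag. Plain Pearson correlation stands in here for the model-based association test, and all numbers are made up:

```python
# Toy lag comparison: the effective-size proxy peaks one month before the
# human case counts, so a lag of 1 month should align the two series best.
ne_proxy = [1.0, 2.0, 4.0, 8.0, 6.0, 3.0, 1.5, 1.0]   # effective size per month
cases    = [0, 1, 3, 9, 17, 12, 6, 2]                 # human cases per month

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# lag k: compare effective size in month t with cases in month t + k
correlations = {
    lag: pearson(ne_proxy[:len(ne_proxy) - lag], cases[lag:]) for lag in (0, 1, 2)
}
best_lag = max(correlations, key=correlations.get)
```

With these invented series the one-month lag correlates best, mirroring the lag-selection logic described above.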
Finally, the greenness index values were based on bimonthly Normalised Difference Vegetation Index (NDVI) raster files obtained from the NASA Earth Observation database (NEO; https://neo.sci.gsfc.nasa.gov). To obtain the same temporal resolution as the human case and climatic variables, we aggregated the NDVI rasters by month; the visual comparison between the covariate and skygrid curves shown in Fig. 5 indicates that this is an appropriate level of precision. Monthly NDVI values were then obtained by cropping the NDVI rasters with the series of minimum convex hull polygons introduced above and averaging the remaining raster cell values. While univariate skygrid-GLM analyses involved only one covariate at a time, the multivariate analyses included all four covariates and used inclusion probabilities to assess their relative importance 38. To allow their inclusion within the same multivariate analysis, all covariates were log-transformed and standardised.
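The final preprocessing step named above, log-transformation followed by standardisation to zero mean and unit variance, can be sketched as follows; the monthly values and the epsilon guard are assumptions for illustration:

```python
import math

# Log-transform and standardise a covariate series so that effect sizes are
# comparable across covariates in a multivariate analysis. A small epsilon
# guards against log(0) for zero-valued months.
def log_standardise(values, eps=1e-9):
    logged = [math.log(v + eps) for v in values]
    mean = sum(logged) / len(logged)
    sd = (sum((x - mean) ** 2 for x in logged) / len(logged)) ** 0.5
    return [(x - mean) / sd for x in logged]

monthly_cases = [2, 10, 60, 120, 45, 8]   # invented monthly counts
z = log_standardise(monthly_cases)
```

The transformed series has zero mean and unit variance while preserving the ordering of the original values, which is what makes the GLM coefficients of the four covariates directly comparable.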

Reporting summary

Further information on research design is available in the  Nature Research Reporting Summary linked to this article.

Data availability

BEAST XML files of the continuous phylogeographic and skygrid-GLM analyses are available at https://github.com/sdellicour/wnv_north_america . WNV sequences analysed in the present study were available on GenBank and deposited before November 21, 2017. Accession numbers of selected genomic sequences are listed in the file “WNV_GenBank_accessions_numbers.txt” available on the GitHub repository referenced above. The source of the different raster files used in this study is provided in Supplementary Table  S1 . The administrative flyways were obtained from the US Fish and Wildlife Service (USFWS; https://www.fws.gov/birds/management/flyways.php ).

Code availability

The R script to run all the landscape phylogeographic testing analyses is available at https://github.com/sdellicour/wnv_north_america ( https://doi.org/10.5281/zenodo.4035938 ).

Lemey, P., Rambaut, A., Welch, J. J. & Suchard, M. A. Phylogeography takes a relaxed random walk in continuous space and time. Mol. Biol. Evol. 27 , 1877–1885 (2010).


Pybus, O. G. et al. Unifying the spatial epidemiology and molecular evolution of emerging epidemics. Proc. Natl Acad. Sci. USA 109 , 15066–15071 (2012).


Baele, G., Dellicour, S., Suchard, M. A., Lemey, P. & Vrancken, B. Recent advances in computational phylodynamics. Curr. Opin. Virol. 31 , 24–32 (2018).


Dellicour, S., Rose, R. & Pybus, O. G. Explaining the geographic spread of emerging epidemics: a framework for comparing viral phylogenies and environmental landscape data. BMC Bioinform. 17 , 1–12 (2016).


Jacquot, M., Nomikou, K., Palmarini, M., Mertens, P. & Biek, R. Bluetongue virus spread in Europe is a consequence of climatic, landscape and vertebrate host factors as revealed by phylogeographic inference. Proc. R. Soc. Lond. B 284 , 20170919 (2017).


Brunker, K. et al. Landscape attributes governing local transmission of an endemic zoonosis: Rabies virus in domestic dogs. Mol. Ecol. 27 , 773–788 (2018).

Dellicour, S., Vrancken, B., Trovão, N. S., Fargette, D. & Lemey, P. On the importance of negative controls in viral landscape phylogeography. Virus Evol. 4 , vey023 (2018).


Minin, V. N., Bloomquist, E. W. & Suchard, M. A. Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics. Mol. Biol. Evol. 25 , 1459–1471 (2008).

Gill, M. S. et al. Improving Bayesian population dynamics inference: A coalescent-based model for multiple loci. Mol. Biol. Evol. 30 , 713–724 (2013).


Gill, M. S., Lemey, P., Bennett, S. N., Biek, R. & Suchard, M. A. Understanding past population dynamics: Bayesian coalescent-based modeling with covariates. Syst. Biol. 65 , 1041–1056 (2016).

Reisen, W. K. Ecology of West Nile virus in North America. Viruses 5 , 2079–2105 (2013).

Hayes, E. B. et al. Epidemiology and transmission dynamics of West Nile virus disease. Emerg. Infect. Dis. 11 , 1167–1173 (2005).

May, F. J., Davis, C. T., Tesh, R. B. & Barrett, A. D. T. Phylogeography of West Nile Virus: from the cradle of evolution in Africa to Eurasia, Australia, and the Americas. J. Virol. 85 , 2964–2974 (2011).

Kramer, L. D. & Bernard, K. A. West Nile virus in the western hemisphere. Curr. Opin. Infect. Dis. 14 , 519–525 (2001).

Kilpatrick, A. M., Kramer, L. D., Jones, M. J., Marra, P. P. & Daszak, P. West Nile virus epidemics in North America are driven by shifts in mosquito feeding behavior. PLoS Biol. 4 , 606–610 (2006).

Molaei, G., Andreadis, T. G., Armstrong, P. M., Anderson, J. F. & Vossbrinck, C. R. Host feeding patterns of Culex mosquitoes and West Nile virus transmission, northeastern United States. Emerg. Infect. Dis. 12 , 468–474 (2006).

Colpitts, T. M., Conway, M. J., Montgomery, R. R. & Fikrig, E. West Nile virus: biology, transmission, and human infection. Clin. Microbiol. Rev. 25 , 635–648 (2012).

Bowen, R. A. & Nemeth, N. M. Experimental infections with West Nile virus. Curr. Opin. Infect. Dis. 20 , 293–297 (2007).

Petersen, L. R. & Marfin, A. A. West Nile Virus: A primer for the clinician. Ann. Intern. Med. 137 , 173–179 (2002).

Petersen, L. R. & Fischer, M. Unpredictable and difficult to control—the adolescence of West Nile virus. N. Engl. J. Med. 367 , 1281–1284 (2012).

Lanciotti, R. S. et al. Origin of the West Nile virus responsible for an outbreak of encephalitis in the northeastern United States. Science 286 , 2333–2337 (1999).

Dohm, D. J., Sardelis, M. R. & Turell, M. J. Experimental vertical transmission of West Nile virus by Culex pipiens (Diptera: Culicidae). J. Med. Entomol. 39 , 640–644 (2002).

Goddard, L. B., Roth, A. E., Reisen, W. K. & Scott, T. W. Vertical transmission of West Nile virus by three California Culex (Diptera: Culicidae) species. J. Med. Entomol. 40 , 743–746 (2003).

Lequime, S. & Lambrechts, L. Vertical transmission of arboviruses in mosquitoes: A historical perspective. Infect. Genet. Evol. 28 , 681–690 (2014).

Ronca, S. E., Murray, K. O. & Nolan, M. S. Cumulative incidence of West Nile virus infection, continental United States, 1999–2016. Emerg. Infect. Dis. 25 , 325–327 (2019).

George, T. L. et al. Persistent impacts of West Nile virus on North American bird populations. Proc. Natl Acad. Sci. USA 112 , 14290–14294 (2015).

Kilpatrick, A. M. & Wheeler, S. S. Impact of West Nile Virus on bird populations: limited lasting effects, evidence for recovery, and gaps in our understanding of impacts on ecosystems. J. Med. Entomol. 56 , 1491–1497 (2019).

LaDeau, S. L., Kilpatrick, A. M. & Marra, P. P. West Nile virus emergence and large-scale declines of North American bird populations. Nature 447 , 710–713 (2007).


Davis, C. T. et al. Phylogenetic analysis of North American West Nile virus isolates, 2001–2004: evidence for the emergence of a dominant genotype. Virology 342 , 252–265 (2005).

Añez, G. et al. Evolutionary dynamics of West Nile virus in the United States, 1999–2011: Phylogeny, selection pressure and evolutionary time-scale analysis. PLoS Negl. Trop. Dis. 7 , e2245 (2013).

Di Giallonardo, F. et al. Fluid spatial dynamics of West Nile Virus in the United States: Rapid spread in a permissive host environment. J. Virol. 90 , 862–872 (2016).

Hadfield, J. et al. Twenty years of West Nile virus spread and evolution in the Americas visualized by Nextstrain. PLOS Pathog. 15 , e1008042 (2019).

Dellicour, S. et al. Using viral gene sequences to compare and explain the heterogeneous spatial dynamics of virus epidemics. Mol. Biol. Evol. 34 , 2563–2571 (2017).

Dijkstra, E. W. A note on two problems in connexion with graphs. Numer. Math. 1 , 269–271 (1959).


McRae, B. H. Isolation by resistance. Evolution 60 , 1551–1561 (2006).

La Sorte, F. A. et al. The role of atmospheric conditions in the seasonal dynamics of North American migration flyways. J. Biogeogr. 41 , 1685–1696 (2014).


Holmes, E. C. & Grenfell, B. T. Discovering the phylodynamics of RNA viruses. PLoS Comput. Biol. 5 , e1000505 (2009).


Faria, N. R. et al. Genomic and epidemiological monitoring of yellow fever virus transmission potential. Science 361 , 894–899 (2018).


Carrington, C. V. F., Foster, J. E., Pybus, O. G., Bennett, S. N. & Holmes, E. C. Invasion and maintenance of dengue virus type 2 and Type 4 in the Americas. J. Virol. 79 , 14680–14687 (2005).

Rappole, J. H. et al. Modeling movement of West Nile virus in the western hemisphere. Vector Borne Zoonotic Dis. 6 , 128–139 (2006).

Goldberg, T. L., Anderson, T. K. & Hamer, G. L. West Nile virus may have hitched a ride across the Western United States on Culex tarsalis mosquitoes. Mol. Ecol. 19 , 1518–1519 (2010).

Swetnam, D. et al. Terrestrial bird migration and West Nile virus circulation, United States. Emerg. Infect. Dis. 24 , 12 (2018).

Kwan, J. L., Kluh, S. & Reisen, W. K. Antecedent avian immunity limits tangential transmission of West Nile virus to humans. PLoS ONE 7 , e34127 (2012).

Duggal, N. K. et al. Genotype-specific variation in West Nile virus dispersal in California. Virology 485 , 79–85 (2015).

McMullen, A. R. et al. Evolution of new genotype of West Nile virus in North America. Emerg. Infect. Dis. 17 , 785–793 (2011).

Hepp, C. M. et al. Phylogenetic analysis of West Nile Virus in Maricopa County, Arizona: evidence for dynamic behavior of strains in two major lineages in the American Southwest. PLOS ONE 13 , e0205801 (2018).


Goddard, L. B., Roth, A. E., Reisen, W. K. & Scott, T. W. Vector competence of California mosquitoes for West Nile virus. Emerg. Infect. Dis. 8 , 1385–1391 (2002).

Richards, S. L., Mores, C. N., Lord, C. C. & Tabachnick, W. J. Impact of extrinsic incubation temperature and virus exposure on vector competence of Culex pipiens quinquefasciatus say (Diptera: Culicidae) for West Nile virus. Vector Borne Zoonotic Dis. 7 , 629–636 (2007).

Anderson, S. L., Richards, S. L., Tabachnick, W. J. & Smartt, C. T. Effects of West Nile virus dose and extrinsic incubation temperature on temporal progression of vector competence in Culex pipiens quinquefasciatus . J. Am. Mosq. Control Assoc. 26 , 103–107 (2010).

Worwa, G. et al. Increases in the competitive fitness of West Nile virus isolates after introduction into California. Virology 514 , 170–181 (2018).

Duggal, N. K., Langwig, K. E., Ebel, G. D. & Brault, A. C. On the fly: interactions between birds, mosquitoes, and environment that have molded west nile virus genomic structure over two decades. J. Med. Entomol. 56 , 1467–1474 (2019).

Reed, K. D., Meece, J. K., Henkel, J. S. & Shukla, S. K. Birds, migration and emerging zoonoses: West Nile virus, Lyme disease, influenza A and enteropathogens. Clin. Med. Res. 1 , 5–12 (2003).

Dusek, R. J. et al. Prevalence of West Nile virus in migratory birds during spring and fall migration. Am. J. Trop. Med. Hyg. 81 , 1151–1158 (2009).

Samuel, G. H., Adelman, Z. N. & Myles, K. M. Temperature-dependent effects on the replication and transmission of arthropod-borne viruses in their insect hosts. Curr. Opin. Insect Sci. 16 , 108–113 (2016).

Paz, S. & Semenza, J. C. Environmental drivers of West Nile fever epidemiology in Europe and Western Asia-a review. Int. J. Environ. Res. Public Health 10 , 3543–3562 (2013).

Dohm, D. J., O’Guinn, M. L. & Turell, M. J. Effect of environmental temperature on the ability of Culex pipiens (Diptera: Culicidae) to transmit West Nile virus. J. Med. Entomol. 39 , 221–225 (2002).

Kilpatrick, A. M., Meola, M. A., Moudy, R. M. & Kramer, L. D. Temperature, viral genetics, and the transmission of West Nile virus by Culex pipiens mosquitoes. PLoS Path. 4 , e1000092 (2008).

DeFelice, N. B. et al. Use of temperature to improve West Nile virus forecasts. PLoS Comput. Biol. 14 , e1006047 (2018).

Morin, C. W. & Comrie, A. C. Regional and seasonal response of a West Nile virus vector to climate change. Proc. Natl Acad. Sci. USA 110 , 15620–15625 (2013).

Samy, A. M. et al. Climate change influences on the global potential distribution of the mosquito Culex quinquefasciatus , vector of West Nile virus and lymphatic filariasis. PLoS ONE 11 , e0163863 (2016).

Dellicour, S. et al. Phylodynamic assessment of intervention strategies for the West African Ebola virus outbreak. Nat. Commun. 9 , 2222 (2018).

Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30 , 772–780 (2013).

Larsson, A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics 30 , 3276–3278 (2014).

Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE 5 , e9490 (2010).

Suchard, M. A. et al. Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol. 4 , vey016 (2018).

Ayres, D. L. et al. BEAGLE 3: Improved performance, scaling, and usability for a high-performance computing library for statistical phylogenetics. Syst. Biol., https://doi.org/10.1093/sysbio/syz020 (2019).

Tavaré, S. Some probabilistic and statistical problems in the analysis of DNA sequences. Lectures Math. Life Sci. 17 , 57–86 (1986).


Drummond, A. J., Ho, S. Y. W., Phillips, M. J. & Rambaut, A. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4 , 699–710 (2006).

Rambaut, A., Drummond, A. J., Xie, D., Baele, G. & Suchard, M. A. Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst. Biol. 67 , 901–904 (2018).

Fisher, A. A., Ji, X., Zhang, Z., Lemey, P. & Suchard, M. A. Relaxed random walks at scale. Syst. Biol., https://doi.org/10.1093/sysbio/syaa056 (2020).

Lemey, P. et al. Unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza H3N2. PLoS Path. 10 , e1003932 (2014).

Bedford, T. et al. Global circulation patterns of seasonal influenza viruses vary with antigenic drift. Nature 523 , 217 (2015).

Hadfield, J. et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics 34 , 4121–4123 (2018).

Dellicour, S., Rose, R., Faria, N. R., Lemey, P. & Pybus, O. G. SERAPHIM: studying environmental rasters and phylogenetically informed movements. Bioinformatics 32 , 3204–3206 (2016).

Dellicour, S. et al. Using phylogeographic approaches to analyse the dispersal history, velocity, and direction of viral lineages–application to rabies virus spread in Iran. Mol. Ecol. 28 , 4335–4350 (2019).

Suchard, M. A., Weiss, R. E. & Sinsheimer, J. S. Models for estimating Bayes factors with applications to phylogeny and tests of monophyly. Biometrics 61 , 665–673 (2005).



Acknowledgements

We are grateful to Frank La Sorte for sharing their estimated flyway grids. The research leading to these results has received funding from the European Research Council under the European Union's Horizon 2020 research and innovation programme (grant agreement no. 725422-ReservoirDOCS), from the Wellcome Trust (ARTIC Network, project 206298/Z/17/Z), and from the European Union's Horizon 2020 project MOOD (grant agreement no. 874850). S.D. is supported by the Fonds National de la Recherche Scientifique (FNRS, Belgium) and was previously funded by the Fonds Wetenschappelijk Onderzoek (FWO, Belgium). S.L. and P.B. were funded by the Fonds Wetenschappelijk Onderzoek (FWO, Belgium). B.V. was supported by a postdoctoral grant (12U7118N) of the Research Foundation - Flanders (Fonds voor Wetenschappelijk Onderzoek). L.d.P. and O.G.P. are supported by the European Research Council under the European Commission Seventh Framework Programme (grant agreement no. 614725-PATHPHYLODYN) and by the Oxford Martin School. M.A.S. is partially supported by NSF grant DMS 1264153 and NIH grants R01 AI107034, U19 AI135995, and R56 AI149004. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. P.L. acknowledges support by the Research Foundation - Flanders (Fonds voor Wetenschappelijk Onderzoek - Vlaanderen, G066215N, G0D5117N, and G0B9317N).

Author information

Authors and affiliations

Spatial Epidemiology Lab (SpELL), Université Libre de Bruxelles, CP160/12, 50 Avenue FD Roosevelt, 1050, Bruxelles, Belgium

Simon Dellicour & Marius Gilbert

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Herestraat 49, 3000, Leuven, Belgium

Simon Dellicour, Sebastian Lequime, Bram Vrancken, Mandev S. Gill, Paul Bastide & Philippe Lemey

Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, 92037, USA

Karthik Gangavarapu, Nathaniel L. Matteson & Kristian G. Andersen

Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA

Infectious Diseases Group, J. Craig Venter Institute, Rockville, MD, USA

Department of Zoology, University of Oxford, Oxford, UK

Louis du Plessis & Oliver G. Pybus

Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA

Alexander A. Fisher & Marc A. Suchard

Fogarty International Center, National Institutes of Health, Bethesda, MD, 20894, USA

Martha I. Nelson

Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles, CA, USA

Marc A. Suchard

Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA

Scripps Research Translational Institute, La Jolla, CA, 92037, USA

Kristian G. Andersen

Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, 06510, USA

Nathan D. Grubaugh


Contributions

S.D., K.G.A., N.D.G., O.G.P., and P.L. designed the study. S.D., M.S.G., P.B., M.A.S., and P.L. developed the analytical framework. S.D., S.L., B.V., M.S.G., P.B., K.G., N.L.M., and Y.T. analysed the data. L.d.P., A.A.F., and M.A.S. provided statistical guidance. S.D. wrote the first draft of the manuscript. All the authors interpreted and discussed the results. S.D., S.L., M.I.N., M.G., K.G.A., N.D.G., O.G.P., and P.L. discussed the epidemiological implications. All the authors edited and approved the contents of the manuscript.

Corresponding author

Correspondence to Simon Dellicour .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Christine Carrington and the other, anonymous, reviewer for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information, a Peer Review File and a Reporting Summary are available for this article.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Dellicour, S., Lequime, S., Vrancken, B. et al. Epidemiological hypothesis testing using a phylogeographic and phylodynamic framework. Nat Commun 11 , 5620 (2020). https://doi.org/10.1038/s41467-020-19122-z


Received : 11 March 2020

Accepted : 30 September 2020

Published : 06 November 2020

DOI : https://doi.org/10.1038/s41467-020-19122-z





4. Test Hypotheses Using Epidemiologic and Environmental Investigation

Once a hypothesis is generated, it should be tested to determine if the source has been correctly identified. Investigators use several methods to test their hypotheses.

Epidemiologic Investigation

Case-control studies and cohort studies are the most common types of analytic studies conducted to help investigators determine the statistical association between exposures and illness. These studies compare information collected from ill persons with information from comparable well persons.

Cohort studies use well-defined groups and compare the risk of developing illness among people who were exposed to a source with the risk of developing illness among the unexposed. In a cohort study, you are determining the risk of developing illness among the exposed.

Case-control studies compare exposures among ill persons (cases) with exposures among well persons (controls). Controls for a case-control study should have had the same opportunity for exposure as the cases. In a case-control study, the comparison is the odds of exposure among cases with the odds of exposure among controls, summarised as an odds ratio.
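The difference between the two designs can be made concrete with a 2×2 table; the counts and the exposure below are purely illustrative:

```python
# Invented 2x2 table: exposure = drank from an implicated water source.
#
#                ill    well
#   exposed       30      70
#   unexposed     10     190
ill_exp, well_exp = 30, 70
ill_unexp, well_unexp = 10, 190

# Cohort framing: compare the RISK of illness among exposed vs unexposed.
risk_exposed = ill_exp / (ill_exp + well_exp)          # 30/100 = 0.30
risk_unexposed = ill_unexp / (ill_unexp + well_unexp)  # 10/200 = 0.05
risk_ratio = risk_exposed / risk_unexposed             # 6.0

# Case-control framing: compare the ODDS of exposure among cases vs
# controls; the cross-product gives the odds ratio.
odds_ratio = (ill_exp * well_unexp) / (well_exp * ill_unexp)   # about 8.1
```

A cohort study can estimate the risk ratio directly because it starts from the full exposed and unexposed groups; a case-control study samples cases and controls separately and therefore estimates the odds ratio instead.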

Using statistical tests, investigators can determine the strength of the association between illness and the implicated water source and how likely it is that the association occurred by chance alone. Investigators consider many factors when interpreting the results of these studies:

  • Frequencies of exposure
  • Strength of the statistical association
  • Dose-response relationships
  • Biologic/toxicologic plausibility

For more information and examples on designing and conducting analytic studies in the field, please see The CDC Field Epidemiology Manual .

Information on the clinical course of illness and the results of clinical laboratory testing are very important for outbreak investigations. Evaluating symptoms and sequelae across patients can guide the formulation of a clinical diagnosis, and results of advanced molecular diagnostics can be used to compare isolates from patients with those from suspected outbreak sources (e.g., water).

Environmental Investigation

Investigating an implicated water source with an onsite environmental investigation is often important for determining the outbreak’s cause and for pinpointing which factors at the water source were responsible. This requires understanding the implicated water system, potential contamination sources, the environmental controls in effect (e.g., water disinfection), and the ways that people interact with the water source. The factors considered in this investigation will differ depending on the type of implicated water source (e.g., drinking water system, swimming pool). Environmental investigation tools for different settings and venues are available.

The investigation might include collecting water samples. The sampling strategy should be driven by the goal of the water testing and by what information will be gained from evaluating water quality parameters, including measurement of disinfectant residuals and possible detection of particular contaminants. The epidemiology of each situation will typically inform the sampling effort.

  • Drinking Water
  • Healthy Swimming
  • Water, Sanitation, and Environmentally-related Hygiene
  • Harmful Algal Blooms
  • Global WASH
  • WASH Surveillance
  • WASH-related Emergencies and Outbreaks
  • Other Uses of Water



Epidemiol Infect, v. 147 (2019)

Methods for generating hypotheses in human enteric illness outbreak investigations: a scoping review of the evidence

1 School of Public Health, University of Alberta, Edmonton, Canada

2 Outbreak Management Division, Centre for Food-borne, Environmental and Zoonotic Infectious Diseases, Public Health Agency of Canada, Guelph, Canada

3 Public Health Risk Sciences Division, National Microbiology Laboratory at Guelph, Public Health Agency of Canada, Guelph, Canada

M. Mascarenhas

Associated data.

For supplementary material accompanying this paper visit https://doi.org/10.1017/S0950268819001699.

Enteric illness outbreaks are complex events; outbreak investigators therefore use many different hypothesis generation methods depending on the situation. This scoping review was conducted to describe the methods used to generate hypotheses during enteric illness outbreak investigations. The search included five databases and grey literature for articles published between 1 January 2000 and 2 May 2015. Relevance screening and article characterisation were conducted by two independent reviewers using pretested forms. There were 903 outbreaks that described hypothesis generation methods and 33 papers that focused on the evaluation of hypothesis generation methods. The most commonly described hypothesis generation methods were analytic studies (64.8%), descriptive epidemiology (33.7%), food or environmental sampling (32.8%) and facility inspections (27.9%). The least common methods included the use of a single interviewer (0.4%) and investigation of outliers (0.4%). Most studies reported using two or more methods to generate hypotheses (81.2%), and 29.2% reported using four or more. The use of multiple hypothesis generation methods both within and between outbreaks highlights the complexity of enteric illness outbreak investigations. Future research should examine the effectiveness of each method and the contexts in which each most efficiently leads to source identification.

Introduction

Enteric illnesses cause considerable morbidity and mortality worldwide. Waterborne enteric diseases cause 2 million deaths each year, the majority of which occur in children aged 5 and under [ 1 ]. Foodborne enteric diseases are responsible for 600 million illnesses and 420 000 deaths annually [ 2 ]. These illnesses impact the quality of life of those affected and result in enormous financial consequences for individuals and nations [ 3 ]. Although most enteric illnesses are transient, significant chronic sequelae associated with some foodborne pathogens can have long-term public health impacts [ 4 – 6 ].

Enteric illness outbreak investigations seek to identify the source of illnesses to prevent further illness in the population. Timely source identification is a key step towards reducing the incidence of enteric illness worldwide and can lead to change in public health policy or recommendations to prevent future outbreaks, such as changes to food manufacturing processes or regulations. Timely source identification can also lead to public health notices and recalls that may prevent further illnesses in a specific outbreak. Accurate source identification can also provide opportunities to learn more about known and emerging diseases, increase understanding of the impact of current disease prevention practices and improve public confidence in public health agencies responsiveness to disease outbreaks [ 7 ].

Outbreak investigations take many forms, depending on the pathogen, context, affected population and suspected route of transmission. Initial cases often alert public health officials that a possible outbreak is occurring. Once an outbreak has been identified, a case definition is established to support case finding activities. As cases are identified, information is gathered about the outbreak to generate hypotheses about the potential source(s) and route(s) of exposure. Information can come from a range of sources, including the cases themselves, their friends or family, staff members of businesses and institutions, experts or the literature, and physical and environmental sampling and inspections. Taken together, this information supports the development of hypotheses about the source of the outbreak.

Hypothesis generation about both the potential source(s) and route(s) of exposure is a key step in outbreak investigations, as it begins the process of narrowing the search for the transmission vehicle. Although some hypothesis generation methods have been described in summaries of outbreak investigation steps [ 7 , 8 ], neither the full range of methods used in outbreak investigations nor the frequency with which they are used is readily available. We conducted a scoping review to summarise the methods for hypothesis generation used during human enteric illness investigations, to understand the frequency and breadth of those methods, and to identify knowledge gaps and areas for future research.

Methods

A scoping review protocol was created a priori using the framework established by Arksey and O'Malley [ 9 ]. A copy of the protocol, including the search strategy, the screening tool and the data characterisation tool can be found in Supplementary Material S1. A full list of the articles identified in this scoping review can be found in Supplementary Material S2. A review team was established and included expertise in synthesis research, food safety, epidemiology and outbreak investigation.

The research question:

What methods have been used, or could be used, in human enteric illness outbreak investigations for hypothesis generation?

Search terms and strategy

A search algorithm (Supplementary Material S1) was constructed using key terms from 30 pre-selected relevant articles and implemented in five databases (PubMed, Scopus, Embase, Cumulative Index to Nursing and Allied Health Literature (CINAHL) and ProQuest Public Health) on 25 May 2015 with a date filter of 1 January 2000–25 May 2015.

The search was evaluated for capture sensitivity by searching the reference lists of 12 randomly selected relevant primary methodology papers and the 10 most recent relevant literature reviews in PubMed (Supplementary Material S1). The grey literature search targeted websites of government and research organisations, and relevant articles from conference proceedings (Supplementary Material S1); it identified 202 articles not captured by the database search, which were added to the review ( Fig. 1 ). All citations were exported and de-duplicated in RefWorks (ProQuest, LLC), an online bibliographic management program, before being uploaded into a web-based systematic review management program, DistillerSR™ (Evidence Partners, Ottawa, Canada), for evaluation and characterisation.

Fig. 1. PRISMA flow chart documenting the literature retrieval and inclusion/exclusion criteria for citations to identify methods of hypothesis generation during human illness investigations.

Relevance screening of abstracts and full-text citation

Each title and abstract was screened by two independent reviewers using a relevance screening form (Supplementary Material S1). Articles were included if they: (1) used or described methods applicable to enteric illness outbreak investigations to assist in hypothesis generation and source identification; (2) were published after 1 January 2000 and (3) were reported in English or French. Geographic location was not used as an exclusion criterion. The relevance screening form was pretested on 50 citations and resulted in a kappa >0.8, indicating good inter-rater agreement. Two reviewers screened each citation independently and conflicts were resolved by consensus.
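Inter-rater agreement of the kind reported above (kappa >0.8) is conventionally quantified with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. As a rough illustration only (the reviewer decisions below are invented, not data from this review), a minimal computation might look like:

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed proportion of items where the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical screening decisions for 10 citations (1 = include, 0 = exclude).
reviewer_1 = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
reviewer_2 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(round(cohens_kappa(reviewer_1, reviewer_2), 3))  # → 0.783
```

Thresholds such as 0.8 follow common rules of thumb (e.g., Landis and Koch's benchmarks), under which values above 0.8 are read as near-perfect agreement.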

Potentially relevant articles were procured and confirmed to be relevant and reported in English or French before being broadly characterised by two independent reviewers using a secondary relevance screening tool (Supplementary Material S1) to gather information on the outbreak, such as geographic location, type of pathogen, setting (single or general) and implicated source. This form was pretested on 10 papers to ensure good agreement and clarity within the form.

Data extraction and analysis

The data characterisation and utility tool was used to gather data on the hypothesis generation methods used in the outbreak investigation. The form contained check boxes for 23 known hypothesis generation methods and an option for reviewers to add other methods not captured in the form. Clearly established definitions were used to help data extractors distinguish between instances when a method was used for hypothesis generation or hypothesis testing. Hypothesis generation was defined as the process of developing one or more tentative explanations about the source of the outbreak used to inform further investigation. This was distinguished from hypothesis testing, which was defined as the process of confirming that a specific exposure is or is not the cause of an outbreak. Hypothesis testing is performed on a small number of suspect exposures and may include statistical testing or traceback investigation. Sometimes, when the hypothesis is refuted, additional rounds of hypothesis generation may be initiated. Several methods included in the form could be used for either hypothesis generation or for hypothesis testing in outbreak investigations. For example, analytic studies can be used to examine a wide range of exposures to help generate hypotheses about plausible sources. However, analytic studies can also be used to test a hypothesis when a specific source is suspected. Instances where methods were used to test a hypothesis were not relevant to this review and were not captured on the form.

Where more than one outbreak was described in a single paper, multiple forms were completed to capture methods used in different investigations. This form was pretested on five papers to ensure agreement between reviewers was adequate and to improve the clarity of the questions/answers where necessary. Two reviewers independently reviewed each paper and disagreements between reviewers were discussed until a consensus was reached or settled with a third reviewer.

Articles with no hypothesis generation methods described or with a known source at the outset of the investigation were excluded at this stage. Papers describing methodology, but not specific outbreak investigations, were identified and are described separately. Descriptive statistics were used to summarise the dataset using Stata 15 (StataCorp, 2017).

Results

In total, there were 10 615 unique citations captured by the search ( Fig. 1 ). Of these, 889 (8.4%) papers were fully characterised and included 903 reported outbreaks (Supplementary Material S2). Of the reported outbreaks, 25 (2.8%) were described in 11 multi-outbreak articles and the remaining 878 (97.2%) were described in single outbreak articles ( Fig. 1 ).

The pathogens associated with the outbreaks included: bacteria ( n  = 622, 68.9%), viruses ( n  = 192, 21.3%), parasites ( n  = 64, 7.1%), bio-toxins ( n  = 3, 0.3%), fungi ( n  = 1, 0.1%) and multiple pathogens ( n  = 11, 1.2%). The pathogen was not identified in 10 (1.1%) outbreaks. In terms of outbreak source, 552 (61.1%) identified food as the source, while 103 (11.4%) identified water, 34 (3.8%) identified direct contact with animals, 25 (2.8%) identified person-to-person transmission, 25 (2.8%) identified multiple modes of transmission, 20 (2.2%) identified food-handlers, 8 (0.9%) identified soil or environment and 5 (0.6%) reported other modes of transmission as the source. In 131 (14.5%) of the outbreaks, no source was identified.

Hypothesis generation methods used in the enteric illness outbreak investigations are listed and defined in Table 1 . The majority ( n  = 733, 81.2%) of investigations employed two or more methods to generate hypotheses; the median number of methods used was three (interquartile range: 2–4). Analytic studies ( n  = 585, 64.8%) were the most commonly reported method category, followed by descriptive epidemiology ( n  = 304, 33.7%), and food or environmental sampling ( n  = 296, 32.8%). Uncommon methods included tracer testing ( n  = 1, 0.1%), anthropological investigation ( n  = 1, 0.1%) and industry consultation ( n  = 1, 0.1%).
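Summary statistics of the kind reported above (a median of three methods per outbreak, interquartile range 2–4) can be reproduced from per-outbreak counts with the Python standard library. The counts below are invented purely to illustrate the computation; they are not the review's data:

```python
from statistics import median, quantiles

# Hypothetical per-outbreak counts of hypothesis generation methods used
# (chosen only to mirror the kind of summary reported, not the real dataset).
methods_per_outbreak = [1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 6]

# quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
q1, q2, q3 = quantiles(methods_per_outbreak, n=4)
print(f"median = {median(methods_per_outbreak)}, IQR = {q1}-{q3}")
```

Note that different quartile conventions (the `method='exclusive'` default here vs. `'inclusive'`) can shift the reported IQR slightly for small samples.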

Table 1. Description and frequency of methods used to generate a hypothesis in 903 human enteric illness outbreak investigations identified in scoping review citations

Single setting outbreaks

The proportion with which each method was used within single setting outbreaks, such as a restaurant, nursing home or event, is reported in Figure 2. The most commonly reported methods used in single setting outbreaks included analytic studies ( n  = 345, 27.2%), facility inspections ( n  = 209, 16.5%) and food or environmental sampling ( n  = 202, 15.9%). The least common methods used in single setting outbreaks included focus groups ( n  = 1, 0.1%) and tracer testing ( n  = 1, 0.1%). Binomial probability/comparison to population estimates, single interviewer and anthropological investigation were not reported in single setting outbreaks.

Fig. 2. Hypothesis generation methods used in single setting outbreaks.

General population outbreaks

The proportion with which each method was used in general population outbreaks, i.e. outbreaks not related to a single event or venue, is reported in Figure 3. The most commonly used methods in general population outbreaks included analytic studies ( n  = 240, 18.7%), descriptive epidemiology ( n  = 186, 14.5%) and hypothesis generation questionnaires ( n  = 141, 11.0%). The least common methods used in general population outbreaks included anthropological investigation ( n  = 1, 0.1%), contact tracing/social network analysis ( n  = 1, 0.1%) and industry consultation ( n  = 1, 0.1%). Tracer testing and food displays were not reported in general population outbreaks.

Fig. 3. Hypothesis generation methods used in general population outbreaks.

Hypothesis generation innovation and trends 2000–2015

Trends in method use over the 15-year span were examined in 5-year increments (Supplementary Material S3). Small increases were observed in the use of anecdotal reports, purchase records, binomial probability/population comparison, facility inspections and review of existing information. A decline was observed in the use of analytic studies. Other methods had variable use over the time period or were relatively stable.

Methodology papers

Of the 10 615 citations screened, 33 (0.3%) methods papers were identified (Supplementary Material S2). These papers focused on evaluating existing methods or comparing standard and novel approaches to hypothesis generation (Supplementary Material S4). Of these, the most commonly discussed method was analytic studies ( n  = 11, 33.3%). This included five on the validity of case-chaos methodology [ 10 – 14 ], two on case-case methodology [ 15 , 16 ], two on case-control methodology [ 17 , 18 ], one discussing the validity of case-cohort methodology [ 19 ] and one discussing the validity of case-crossover methodology [ 20 ].

The use of laboratory methods, including whole genome sequencing, was described in five (15.2%) papers [ 21 – 25 ]. Traceback procedures were explored in five (15.2%) papers, including three on the use of network analysis [ 26 – 28 ], one on the use of food flow information [ 29 ] and one examining the use of relational systems to identify sources common to different cases [ 30 ]. Four (12.1%) papers described broad outbreak investigation activities, which included the hypothesis generation step, one from the United Kingdom [ 31 ], one from Quebec, Canada [ 32 ], one from Minnesota [ 33 ] and one from the Centers for Disease Control and Prevention (CDC) in the United States [ 34 ]. Three (9.1%) papers explored interviewing techniques, two examining the use of computer assisted telephone interviews (CATI) technology [ 35 , 36 ] and one on when to collect interview-intensive dose-response data [ 37 ]. Three (9.1%) papers compared online questionnaires to phone or paper questionnaires [ 38 – 40 ]. Finally, one (3.0%) paper examined the use of mathematical topology methods to generate hypotheses [ 41 ] and another (3.0%) paper examined the use of sales record data to generate hypotheses [ 42 ].

Discussion

The most commonly reported hypothesis generation methods identified in this scoping review included analytic studies, descriptive epidemiology, food or environmental sampling and facility inspections. Uncommon methods included industry consultation, tracer testing, anthropological investigations and the use of food displays. Most outbreak investigations employed multiple methods to generate hypotheses, and the context of the outbreak was an important determinant for some methods.

The multitude of hypothesis generation methods described and the use of multiple methods by most outbreak investigators point to the complexity of investigating enteric illness outbreaks. Many of the methods described are complementary and may be used in sequence as an investigation progresses. For example, routine and enhanced surveillance questionnaires will often be collected before an outbreak is even identified, while hypothesis generating questionnaires are frequently used at the beginning of an outbreak when the focus of the investigation is quite broad. The use of descriptive epidemiology is generally based on questionnaire data and is often one of the first hypothesis generation methods employed in outbreak investigations. Other methods, such as food or environmental sampling, facility inspections and food handler testing, may be used in conjunction with questionnaires, particularly if the outbreak occurred in one setting or at an event. Both open-ended and iterative interviewing frequently occur later in investigations, when no obvious source has emerged or as new cases are identified.

Investigators consider many factors when choosing a hypothesis generation method. For example, the length of time that has elapsed between case exposure and identification of the outbreak affects which investigation tools are feasible, such as the collection of contaminated food and environmental samples, facility inspections and traceback investigations [ 43 – 45 ]. Cost and feasibility are also important considerations for many hypothesis generation methods. Analytic studies can be expensive and time consuming [ 46 ], while food and environmental sampling requires laboratory resources for testing [ 47 , 48 ]. Changes in the methods used over time, for example increases in the use of anecdotal reports and purchase records, likely reflect increases in available technology, such as online reporting through social media and the availability of online records. The decline in the use of analytic methods may reflect the increased availability of other, less expensive, hypothesis generation methods such as population comparisons or purchase records.
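The "binomial probability/comparison to population estimates" method mentioned above compares how often cases report an exposure with how often the general population reports it. As a hedged sketch only (the case counts and the 40% background frequency below are invented; a real investigation would draw the background from a population food-frequency survey), the tail probability can be computed directly:

```python
from math import comb

def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least k
    exposed cases if cases resembled the general population."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical: 18 of 20 interviewed cases report eating product X, which a
# population survey suggests ~40% of people eat in a comparable period.
cases_exposed, cases_total, background = 18, 20, 0.40
p_value = binomial_sf(cases_exposed, cases_total, background)
print(f"P(>= {cases_exposed}/{cases_total} exposed | {background:.0%} background) = {p_value:.2e}")
```

A very small tail probability flags the exposure as reported far more often than expected, making it a candidate hypothesis for follow-up rather than proof of causation.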

Outbreak setting can impact the choice of hypothesis generation methods. Methods frequently used in single setting outbreaks include tailored menu-based interviewing, facility inspections and food handler testing. These methods are well-suited to these settings because the common connection across cases is obvious and the source is expected to be identified at a single location common to the cases, such as a restaurant or hospital. For outbreaks related to a single event such as weddings or conferences, analytic studies such as a retrospective cohort are well suited to investigating known exposed populations. In contrast, the use of purchase records, such as store loyalty cards or credit card statements, is utilised when the outbreak is among the general population and there appears to be no obvious connection between cases. Similarly, a review of existing information is a method used frequently in outbreaks among the general population when the range of plausible sources of illness is substantially larger than would be present in single event outbreaks. Outbreak setting thus has implications for the feasibility and usefulness of many hypothesis generation methods.

One finding of this scoping review is that hypothesis generation methods are not well reported within outbreak reports. Descriptions of hypothesis generation methods and the sequence of events were often limited or entirely omitted from the publications. This incomplete reporting makes it difficult to judge how frequently some methods are used by outbreak investigation teams, as opposed to how frequently they appear in the outbreaks that are written up and published in detail. Thus, it is likely that some common methods, such as routine questionnaires, were underreported and are therefore underrepresented in this review. Methods that did not contribute to the identification of the source may also go unreported. Thorough reporting of all hypothesis generation methods used by outbreak investigators would allow for a more comprehensive understanding of the range and frequency of methods used to investigate outbreaks.

Most of the methods papers identified in this review focused on analytic studies, laboratory methods, traceback, interviews and questionnaires. No methods papers were identified related to several hypothesis generation methods reported in this review, including focus groups, iterative interviewing, open-ended interviewing, descriptive epidemiology, sub-cluster and outlier investigation, food or environmental sampling, facility inspections, food handler testing, review of existing information, menu or recipe analysis, anecdotal reports and social network analysis. The paucity of methods papers exploring hypothesis generation methods is an important literature gap. The relative merits of different hypothesis generation methods, their validity and reliability and comparable effectiveness across outbreak investigations, are needed to support outbreak investigator decision-making.

The frequencies of hypothesis generation methods reported in this scoping review may differ from their frequencies in practice as most outbreaks identified had successfully identified the source of the outbreak. Only 15% failed to identify the source of the outbreak, which is a much lower proportion than expected in practice [ 49 , 50 ]. This suggests that investigations where the source is not identified are less likely to be published and/or are published with few details, so they did not fulfil the inclusion criteria. This underreporting makes it impossible to accurately assess individual hypothesis generation methods' relative impact on investigation success based solely on published literature. Increased reporting of outbreak investigations where the source is not identified would improve our understanding of effective vs. ineffective hypothesis generation method use. Alternatively, organisations with access to administrative data on a full complement of outbreaks could analyse the relationship between the hypothesis generation methods used and associated outcomes of all outbreak investigations. For instance, Murphree et al . [ 49 ] compared the success of analytic studies to other methods in identifying a food vehicle across all outbreaks in the United States Foodborne Diseases Active Surveillance Network (FoodNet) catchment area. Analytic studies had a 47% success rate compared to all other methods with a 14% success rate [ 49 ], suggesting that analytic studies, where feasible, are more likely to lead to the identification of the source. However, given that analytic studies are not always feasible or appropriate, additional information on the relative success of other methods would help outbreak investigators choose appropriate methods to optimise the likelihood of successfully identifying the source. 
It would be valuable if outbreak investigators reported brief evaluations of their hypothesis generation methods to improve our understanding of the strengths and limitations of each method.
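An organisation with complete administrative data could formalise the kind of comparison made by Murphree et al. with a two-proportion z-test. The counts below are hypothetical, chosen only to mirror the reported 47% vs. 14% success rates; the actual denominators are not restated here:

```python
from math import sqrt, erf

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test comparing two independent proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF (via the error function).
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical: 47 successes in 100 investigations using analytic studies
# vs. 14 successes in 100 investigations using other methods.
z, p = two_proportion_z(47, 100, 14, 100)
print(f"z = {z:.2f}, p = {p:.2e}")
```

Such a test assumes independent investigations and large enough counts for the normal approximation; with sparse data an exact test would be more appropriate.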

This review employed a comprehensive search strategy to identify enteric outbreak investigations and articles on hypothesis generation methods for outbreaks or other foodborne illness investigations. It is possible that despite our efforts some outbreak reports with hypothesis generation information were missed, as outbreaks are often not reported in the peer-reviewed literature and thus are not indexed in searchable bibliographic databases. To mitigate this shortfall, we performed a comprehensive grey literature search; however, it is possible some relevant reports were missed. There may also be some language bias, as the search was conducted in English and only papers reported in English or French were included in the review. This may have caused the search to miss relevant non-English papers; the effect of this on our results and conclusions is unknown. Lastly, because some methods identified in this review could be used for either hypothesis generation or hypothesis testing, we may have misclassified some uses of those methods as hypothesis generation when the investigators actually used the method for hypothesis testing. We relied on author reporting to determine when hypothesis generation was taking place, but incomplete or inadequate reporting may have resulted in misclassification that overestimated the extent to which some methods, such as analytic studies, are used to generate hypotheses.

This review demonstrated the range of hypothesis generation methods used in enteric illness outbreak investigations in humans. Most outbreaks were investigated using a combination of methods, highlighting the complexity of outbreak investigations and the requirement to have a suite of hypothesis generation approaches to choose from, as a single approach may not be appropriate in all situations. Research is needed to comprehensively understand the effectiveness of each hypothesis generation method in identifying the source of the outbreak, improving investigators' ability to choose the most suitable hypothesis generation methods to enable successful source identification.

Acknowledgements

The authors thank the Public Health Agency of Canada library for their help in the procurement of publications, and the following contributors from the Public Health Agency of Canada Centre for Food-borne, Environmental and Zoonotic Infectious Diseases, Outbreak Management Division: Jennifer Cutler, Kristyn Franklin, Ashley Kerr, Vanessa Morton, Florence Tanguay, Joanne Tataryn, Kashmeera Meghnath, Mihaela Gheorghe and Shiona Glass-Kaastra.

Conflict of interest

Financial support.

This research received no specific grant from any funding agency, commercial or not-for-profit sectors.

Supplementary material

How Do You Formulate (Important) Hypotheses?

  • Open Access
  • First Online: 03 December 2022

  • James Hiebert,
  • Jinfa Cai,
  • Stephen Hwang,
  • Anne K. Morris &
  • Charles Hohensee

Part of the book series: Research in Mathematics Education ((RME))


Building on the ideas in Chap. 1, we describe formulating, testing, and revising hypotheses as a continuing cycle of clarifying what you want to study, making predictions about what you might find together with developing your reasons for these predictions, imagining tests of these predictions, revising your predictions and rationales, and so on. Many resources feed this process, including reading what others have found about similar phenomena, talking with colleagues, conducting pilot studies, and writing drafts as you revise your thinking. Although you might think you cannot predict what you will find, it is always possible—with enough reading and conversations and pilot studies—to make some good guesses. And, once you guess what you will find and write out the reasons for these guesses, you are on your way to scientific inquiry. As you refine your hypotheses, you can assess their research importance by asking how connected they are to problems your research community really wants to solve.


Part I. Getting Started

We want to begin by addressing a question you might have had as you read the title of this chapter. You are likely to hear, or read in other sources, that the research process begins by asking research questions . For reasons we gave in Chap. 1 , and more we will describe in this and later chapters, we emphasize formulating, testing, and revising hypotheses. However, it is important to know that asking and answering research questions involve many of the same activities, so we are not describing a completely different process.

We acknowledge that many researchers do not actually begin by formulating hypotheses. In other words, researchers rarely get a researchable idea by writing out a well-formulated hypothesis. Instead, their initial ideas for what they study come from a variety of sources. Then, after they have the idea for a study, they do lots of background reading and thinking and talking before they are ready to formulate a hypothesis. So, for readers who are at the very beginning and do not yet have an idea for a study, let’s back up. Where do research ideas come from?

There are no formulas or algorithms that spawn a researchable idea. But as you begin the process, you can ask yourself some questions. Your answers to these questions can help you move forward.

What are you curious about? What are you passionate about? What have you wondered about as an educator? These are questions that look inward, questions about yourself.

What do you think are the most pressing educational problems? Which problems are you in the best position to address? What change(s) do you think would help all students learn more productively? These are questions that look outward, questions about phenomena you have observed.

What are the main areas of research in the field? What are the big questions that are being asked? These are questions about the general landscape of the field.

What have you read about in the research literature that caught your attention? What have you read that prompted you to think about extending the profession’s knowledge about this? What have you read that made you ask, “I wonder why this is true?” These are questions about how you can build on what is known in the field.

What are some research questions or testable hypotheses that have been identified by other researchers for future research? This, too, is a question about how you can build on what is known in the field. Taking up such questions or hypotheses can help by providing some existing scaffolding that others have constructed.

What research is being done by your immediate colleagues or your advisor that is of interest to you? These are questions about topics for which you will likely receive local support.

Exercise 2.1

Brainstorm some answers for each set of questions. Record them. Then step back and look at the places of intersection. Did you have similar answers across several questions? Write out, as clearly as you can, the topic that captures your primary interest, at least at this point. We will give you a chance to update your responses as you study this book.

Part II. Paths from a General Interest to an Informed Hypothesis

There are many different paths you might take from conceiving an idea for a study, maybe even a vague idea, to formulating a prediction that leads to an informed hypothesis that can be tested. We will explore some of the paths we recommend.

We will assume you have completed Exercise 2.1 in Part I and have some written answers to the six questions that preceded it as well as a statement that describes your topic of interest. This very first statement could take several different forms: a description of a problem you want to study, a question you want to address, or a hypothesis you want to test. We recommend that you begin with one of these three forms, the one that makes most sense to you. There is an advantage to using all three and flexibly choosing the one that is most meaningful at the time and for a particular study. You can then move from one to the other as you think more about your research study and you develop your initial idea. To get a sense of how the process might unfold, consider the following alternative paths.

Beginning with a Prediction If You Have One

Sometimes, when you notice an educational problem or have a question about an educational situation or phenomenon, you quickly have an idea that might help solve the problem or answer the question. Here are three examples.

You are a teacher, and you noticed a problem with the way the textbook presented two related concepts in two consecutive lessons. Almost as soon as you noticed the problem, it occurred to you that the two lessons could be taught more effectively in the reverse order. You predicted better outcomes if the order was reversed, and you even had a preliminary rationale for why this would be true.

You are a graduate student and you read that students often misunderstand a particular aspect of graphing linear functions. You predicted that, by listening to small groups of students working together, you could hear new details that would help you understand this misconception.

You are a curriculum supervisor and you observed sixth-grade classrooms where students were learning about decimal fractions. After talking with several experienced teachers, you predicted that beginning with percentages might be a good way to introduce students to decimal fractions.

We begin with the path of making predictions because we see the other two paths as leading into this one at some point in the process (see Fig. 2.1 ). Starting with this path does not mean you did not sense a problem you wanted to solve or a question you wanted to answer.

Fig. 2.1 Three pathways to formulating informed hypotheses (flow diagram: a problem situation leads through a question and a prediction to a hypothesis).

Notice that your predictions can come from a variety of sources—your own experience, reading, and talking with colleagues. Most likely, as you write out your predictions you also think about the educational problem for which your prediction is a potential solution. Writing a clear description of the problem will be useful as you proceed. Notice also that it is easy to change each of your predictions into a question. When you formulate a prediction, you are actually answering a question, even though the question might be implicit. Making that implicit question explicit can generate a first draft of the research question that accompanies your prediction. For example, suppose you are the curriculum supervisor who predicts that teaching percentages first would be a good way to introduce decimal fractions. In an obvious shift in form, you could ask, “In what ways would teaching percentages benefit students’ initial learning of decimal fractions?”

A question simply asks what you will find, whereas a prediction also says what you expect to find.

There are advantages to starting with the prediction form if you can make an educated guess about what you will find. Making a prediction forces you to think now about several things you will need to think about at some point anyway. It is better to think about them earlier rather than later. If you state your prediction clearly and explicitly, you can begin to ask yourself three questions about your prediction: Why do I expect to observe what I am predicting? Why did I make that prediction? (These two questions essentially ask what your rationale is for your prediction.) And, how can I test to see if it’s right? This is where the benefits of making predictions begin.

Asking yourself why you predicted what you did, and then asking yourself why you answered the first “why” question as you did, can be a powerful chain of thought that lays the groundwork for an increasingly accurate prediction and an increasingly well-reasoned rationale. For example, suppose you are the curriculum supervisor above who predicted that beginning by teaching percentages would be a good way to introduce students to decimal fractions. Why did you make this prediction? Maybe because students are familiar with percentages in everyday life so they could use what they know to anchor their thinking about hundredths. Why would that be helpful? Because if students could connect hundredths in percentage form with hundredths in decimal fraction form, they could bring their meaning of percentages into decimal fractions. But how would that help? If students understood that a decimal fraction like 0.35 meant 35 of 100, then they could use their understanding of hundredths to explore the meaning of tenths, thousandths, and so on. Why would that be useful? By continuing to ask yourself why you gave the previous answer, you can begin building your rationale and, as you build your rationale, you will find yourself revisiting your prediction, often making it more precise and explicit. If you were the curriculum supervisor and continued the reasoning in the previous sentences, you might elaborate your prediction by specifying the way in which percentages should be taught in order to have a positive effect on particular aspects of students’ understanding of decimal fractions.

Developing a Rationale for Your Predictions

Keeping your initial predictions in mind, you can read what others already know about the phenomenon. Your reading can now become targeted with a clear purpose.

By reading and talking with colleagues, you can develop more complete reasons for your predictions. It is likely that you will also decide to revise your predictions based on what you learn from your reading. As you develop sound reasons for your predictions, you are creating your rationales, and your predictions together with your rationales become your hypotheses. The more you learn about what is already known about your research topic, the more refined will be your predictions and the clearer and more complete your rationales. We will use the term more informed hypotheses to describe this evolution of your hypotheses.


Developing more informed hypotheses is a good thing because it means: (1) you understand the reasons for your predictions; (2) you will be able to imagine how you can test your hypotheses; (3) you can more easily convince your colleagues that they are important hypotheses—they are hypotheses worth testing; and (4) at the end of your study, you will be able to more easily interpret the results of your test and to revise your hypotheses to demonstrate what you have learned by conducting the study.

Imagining Testing Your Hypotheses

Because we have tied together predictions and rationales to constitute hypotheses, testing hypotheses means testing predictions and rationales. Testing predictions means comparing empirical observations, or findings, with the predictions. Testing rationales means using these comparisons to evaluate the adequacy or soundness of the rationales.

Imagining how you might test your hypotheses does not mean working out the details for exactly how you would test them. Rather, it means thinking ahead about how you could do this. Recall the descriptor of scientific inquiry: “experience carefully planned in advance” (Fisher, 1935). Asking whether predictions are testable and whether rationales can be evaluated is simply planning in advance.

You might read that testing hypotheses means simply assessing whether predictions are correct or incorrect. In our view, it is more useful to think of testing as a means of gathering enough information to compare your findings with your predictions, revise your rationales, and propose more accurate predictions. So, asking yourself whether hypotheses can be tested means asking whether information could be collected to assess the accuracy of your predictions and whether the information will show you how to revise your rationales to sharpen your predictions.

Cycles of Building Rationales and Planning to Test Your Predictions

Scientific reasoning is a dialogue between the possible and the actual, an interplay between hypotheses and the logical expectations they give rise to: there is a restless to-and-fro motion of thought, the formulation and rectification of hypotheses (Medawar, 1982, p. 72).

As you ask yourself about how you could test your predictions, you will inevitably revise your rationales and sharpen your predictions. Your hypotheses will become more informed, more targeted, and more explicit. They will make clearer to you and others what, exactly, you plan to study.

When will you know that your hypotheses are clear and precise enough? Because of the way we define hypotheses, this question asks about both rationales and predictions. If a rationale you are building lets you make a number of quite different predictions that are equally plausible rather than a single, primary prediction, then your hypothesis needs further refinement by building a more complete and precise rationale. Also, if you cannot briefly describe to your colleagues a believable way to test your prediction, then you need to phrase it more clearly and precisely.

Each time you strengthen your rationales, you might need to adjust your predictions. And, each time you clarify your predictions, you might need to adjust your rationales. The cycle of going back and forth to keep your predictions and rationales tightly aligned has many payoffs down the road. Every decision you make from this point on will be in the interests of providing a transparent and convincing test of your hypotheses and explaining how the results of your test dictate specific revisions to your hypotheses. As you make these decisions (described in the succeeding chapters), you will probably return to clarify your hypotheses even further. But, you will be in a much better position, at each point, if you begin with well-informed hypotheses.

Beginning by Asking Questions to Clarify Your Interests

Instead of starting with predictions, a second path you might take devotes more time at the beginning to asking questions as you zero in on what you want to study. Some researchers suggest you start this way (e.g., Gournelos et al., 2019). Specifically, with this second path, the first statement you write to express your research interest would be a question. For example, you might ask, “Why do ninth-grade students change the way they think about linear equations after studying quadratic equations?” or “How do first graders solve simple arithmetic problems before they have been taught to add and subtract?”

The first phrasing of your question might be quite general or vague. As you think about your question and what you really want to know, you are likely to ask follow-up questions. These questions will almost always be more specific than your first question. The questions will also express more clearly what you want to know. So, the question “How do first graders solve simple arithmetic problems before they have been taught to add and subtract?” might evolve into “Before first graders have been taught to solve arithmetic problems, what strategies do they use to solve arithmetic problems with sums and products below 20?” As you read and learn about what others already know about your questions, you will continually revise your questions toward clearer, more explicit, and more precise versions that zero in on what you really want to know. The question above might become, “Before they are taught to solve arithmetic problems, what strategies do beginning first graders use to solve arithmetic problems with sums and products below 20 if they are read story problems and given physical counters to help them keep track of the quantities?”

Imagining Answers to Your Questions

If you monitor your own thinking as you ask questions, you are likely to begin forming some guesses about answers, even to the early versions of the questions. What do students learn about quadratic functions that influences changes in their proportional reasoning when dealing with linear functions? It could be that some moments during instruction on quadratic equations extend the proportional reasoning involved in solving linear equations, giving students further experience reasoning proportionally. You might predict that these are the experiences that have a “backward transfer” effect (Hohensee, 2014).

These initial guesses about answers to your questions are your first predictions. The first predicted answers are likely to be hunches or fuzzy, vague guesses. This simply means you do not know very much yet about the question you are asking. Your first predictions, no matter how unfocused or tentative, represent the most you know at the time about the question you are asking. They help you gauge where you are in your thinking.

Shifting to the Hypothesis Formulation and Testing Path

Research questions can play an important role in the research process. They provide a succinct way of capturing your research interests and communicating them to others. When colleagues want to know about your work, they will often ask “What are your research questions?” It is good to have a ready answer.

However, research questions have limitations. They do not capture the three images of scientific inquiry presented in Chap. 1. Due, in part, to this less expansive depiction of the process, research questions do not take you very far. They do not provide a guide that leads you through the phases of conducting a study.

Consequently, when you can imagine an answer to your research question, we recommend that you move on to the hypothesis formulation and testing path. Imagining an answer to your question means you can make plausible predictions. You can now begin clarifying the reasons for your predictions and transform your early predictions into hypotheses (predictions along with rationales). We recommend you do this as soon as you have guesses about the answers to your questions because formulating, testing, and revising hypotheses offers a tool that puts you squarely on the path of scientific inquiry. It is a tool that can guide you through the entire process of conducting a research study.

This does not mean you are finished asking questions. Predictions are often created as answers to questions. So, we encourage you to continue asking questions to clarify what you want to know. But your target shifts from only asking questions to also proposing predictions for the answers and developing reasons the answers will be accurate predictions. It is by predicting answers, and explaining why you made those predictions, that you become engaged in scientific inquiry.

Cycles of Refining Questions and Predicting Answers

An example might provide a sense of how this process plays out. Suppose you are reading about Vygotsky’s (1987) zone of proximal development (ZPD), and you realize this concept might help you understand why your high school students had trouble learning exponential functions. Maybe they were outside this zone when you tried to teach exponential functions. In order to recognize students who would benefit from instruction, you might ask, “How can I identify students who are within the ZPD around exponential functions?” What would you predict? Maybe students in this ZPD are those who already have knowledge of related functions. You could write out some reasons for this prediction, like “students who understand linear and quadratic functions are more likely to extend their knowledge to exponential functions.” But what kind of data would you need to test this? What would count as “understanding”? Are linear and quadratic the functions you should assess? Even if they are, how could you tell whether students who scored well on tests of linear and quadratic functions were within the ZPD of exponential functions? How, in the end, would you measure what it means to be in this ZPD? Asking this series of reasonable questions raises some red flags about the way your initial question was phrased, and you decide to revise it.

You set the stage for revising your question by defining ZPD as the zone within which students can solve an exponential function problem by making only one additional conceptual connection between what they already know and exponential functions. Your revised question is, “Based on students’ knowledge of linear and quadratic functions, which students are within the ZPD of exponential functions?” This time you know what kind of data you need: the number of conceptual connections students need to bridge from their knowledge of related functions to exponential functions. How can you collect these data? Would you need to see into the minds of the students? Or, are there ways to test the number of conceptual connections someone makes to move from one topic to another? Do methods exist for gathering these data? You decide this is not realistic, so you now have a choice: revise the question further or move your research in a different direction.

Notice that we do not use the term research question for all these early versions of questions that begin clarifying for yourself what you want to study. These early versions are too vague and general to be called research questions. In this book, we save the term research question for a question that comes near the end of the work and captures exactly what you want to study. By the time you are ready to specify a research question, you will be thinking about your study in terms of hypotheses and tests. When your hypotheses are in final form and include clear predictions about what you will find, it will be easy to state the research questions that accompany your predictions.

To reiterate one of the key points of this chapter: hypotheses carry much more information than research questions. Using our definition, hypotheses include predictions about what the answer might be to the question plus reasons for why you think so. Unlike research questions, hypotheses capture all three images of scientific inquiry presented in Chap. 1 (planning, observing and explaining, and revising one’s thinking). Your hypotheses represent the most you know, at the moment, about your research topic. The same cannot be said for research questions.

Beginning with a Research Problem

When you wrote answers to the six questions at the end of Part I of this chapter, you might have identified a research interest by stating it as a problem. This is the third path you might take to begin your research. Your description of the problem might look something like this: “When I tried to teach my middle school students by presenting them with a challenging problem without showing them how to solve similar problems, they didn’t exert much effort trying to find a solution but instead waited for me to show them how to solve the problem.” You do not have a specific question in mind, and you do not have an idea about why the problem exists, so you do not have a prediction about how to solve it. Writing a statement of this problem as clearly as possible could be the first step in your research journey.

As you think more about this problem, it will feel natural to ask questions about it. For example, why did some students show more initiative than others? What could I have done to get them started? How could I have encouraged the students to keep trying without giving away the solution? You are now on the path of asking questions—not research questions yet, but questions that are helping you focus your interest.

As you continue to think about these questions, reflect on your own experience, and read what others know about this problem, you will likely develop some guesses about the answers to the questions. They might be somewhat vague answers, and you might not have lots of confidence they are correct, but they are guesses that you can turn into predictions. Now you are on the hypothesis-formulation-and-testing path. This means you are on the path of asking yourself why you believe the predictions are correct, developing rationales for the predictions, asking what kinds of empirical observations would test your predictions, and refining your rationales and predictions as you read the literature and talk with colleagues.

A simple diagram that summarizes the three paths we have described is shown in Fig. 2.1 . Each row of arrows represents one pathway for formulating an informed hypothesis. The dotted arrows in the first two rows represent parts of the pathways that a researcher may have implicitly travelled through already (without an intent to form a prediction) but that ultimately inform the researcher’s development of a question or prediction.

Part III. One Researcher’s Experience Launching a Scientific Inquiry

Martha is in the third year of her doctoral program and is beginning to identify a topic for her dissertation. Based on (a) her experience as a high school mathematics teacher and curriculum supervisor, (b) the reading she has done to this point, and (c) her conversations with colleagues, she has developed an interest in what kinds of professional development experiences (let’s call them learning opportunities [LOs] for teachers) are most effective. Where does she go from here?

Exercise 2.2

Before you continue reading, please write down some suggestions for Martha about where she should start.

A natural thing for Martha to do at this point is to ask herself some additional questions, questions that specify further what she wants to learn: What kinds of LOs do most teachers experience? How do these experiences change teachers’ practices and beliefs? Are some LOs more effective than others? What makes them more effective?

To focus her questions and decide what she really wants to know, she continues reading but now targets her reading toward everything she can find that suggests possible answers to these questions. She also talks with her colleagues to get more ideas about possible answers to these or related questions. Over several weeks or months, she finds herself being drawn to questions about what makes LOs effective, especially for helping teachers teach more conceptually. She zeroes in on the question, “What makes LOs for teachers effective for improving their teaching for conceptual understanding?”

This question is more focused than her first questions, but it is still too general for Martha to define a research study. How does she know it is too general? She uses two criteria. First, she notices that the predictions she makes about the answers to the question are all over the place; they are not constrained by the reasons she has assembled for her predictions. One prediction is that LOs are more effective when they help teachers learn content. Martha makes this guess because previous research suggests that effective LOs for teachers include attention to content. But this rationale allows lots of different predictions. For example, LOs are more effective when they focus on the content teachers will teach; LOs are more effective when they focus on content beyond what teachers will teach so teachers see how their instruction fits with what their students will encounter later; and LOs are more effective when they are tailored to the level of content knowledge participants have when they begin the LOs. The rationale she can provide at this point does not point to a particular prediction.

A second measure Martha uses to decide that her question is too general is that the predictions she can make about the answers seem very difficult to test. How could she test, for example, whether LOs should focus on content beyond what teachers will teach? What does “content beyond what teachers teach” mean? How could you tell whether teachers use their new knowledge of later content to inform their teaching?

Before anticipating what Martha’s next question might be, it is important to pause and recognize how predicting the answers to her questions moved Martha into a new phase in the research process. As she makes predictions, works out the reasons for them, and imagines how she might test them, she is immersed in scientific inquiry. This intellectual work is the main engine that drives the research process. Also notice that revisions in the questions asked, the predictions made, and the rationales built represent the updated thinking (Chap. 1) that occurs as Martha continues to define her study.

Based on all these considerations and her continued reading, Martha revises the question again. The question now reads, “Do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?” Although she feels like the question is more specific, she realizes that the answer to the question is either “yes” or “no.” This, by itself, is a red flag. Answers of “yes” or “no” would not contribute much to understanding the relationships between these LOs for teachers and changes in their teaching. Recall from Chap. 1 that understanding how things work, explaining why things work, is the goal of scientific inquiry.

Martha continues by trying to understand why she believes the answer is “yes.” When she tries to write out reasons for predicting “yes,” she realizes that her prediction depends on a variety of factors. If teachers already have deep knowledge of the content, the LOs might not affect them as much as other teachers. If the LOs do not help teachers develop their own conceptual understanding, they are not likely to change their teaching. By trying to build the rationale for her prediction—thus formulating a hypothesis—Martha realizes that the question still is not precise and clear enough.

Martha uses what she learned when developing the rationale and rephrases the question as follows: “Under what conditions do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?” Through several additional cycles of thinking through the rationale for her predictions and how she might test them, Martha specifies her question even further: “Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from LOs that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?”

Each version of Martha’s question has become more specific. This has occurred as she has (a) identified a starting condition for the teachers—they lack conceptual knowledge of linear functions, (b) specified the mathematics content as linear functions, and (c) included a condition or purpose of the LO—it is aimed at conceptual learning.

Because of the way Martha’s question is now phrased, her predictions will require thinking about the conditions that could influence what teachers learn from the LOs and how this learning could affect their teaching. She might predict that if teachers engaged in LOs that extended over multiple sessions, they would develop deeper understanding which would, in turn, prompt changes in their teaching. Or she might predict that if the LOs included examples of how their conceptual learning could translate into different instructional activities for their students, teachers would be more likely to change their teaching. Reasons for these predictions would likely come from research about the effects of professional development on teachers’ practice.

As Martha thinks about testing her predictions, she realizes it will probably be easier to measure the conditions under which teachers are learning than the changes in the conceptual emphasis in their instruction. She makes a note to continue searching the literature for ways to measure the “conceptualness” of teaching.

As she refines her predictions and expresses her reasons for the predictions, she formulates a hypothesis (in this case several hypotheses) that will guide her research. As she makes predictions and develops the rationales for these predictions, she will probably continue revising her question. She might decide, for example, that she is not interested in studying the condition of different numbers of LO sessions and so decides to remove this condition from consideration by including in her question something like “. . . over five 2-hour sessions . . .”

At this point, Martha has developed a research question, articulated a number of predictions, and developed rationales for them. Her current question is: “Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from five 2-hour LO sessions that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?” Her hypothesis is:

Prediction: Participating teachers will show changes in their teaching with a greater emphasis on conceptual understanding, with larger changes on linear function topics directly addressed in the LOs than on other topics.

Brief Description of Rationale: (1) Past research has shown correlations between teachers’ specific mathematics knowledge of a topic and the quality of their teaching of that topic. This does not mean an increase in knowledge causes higher quality teaching, but it allows for that possibility. (2) Transfer is usually difficult for teachers, but the examples developed during the LO sessions will help them use what they learned to teach for conceptual understanding, because those examples are much like the ones the teachers will use in their own classrooms. So larger changes will be found when teachers are teaching the linear function topics addressed in the LOs.

Notice it is more straightforward to imagine how Martha could test this prediction because it is more precise than previous predictions. Notice also that by asking how to test a particular prediction, Martha will be faced with a decision about whether testing this prediction will tell her something she wants to learn. If not, she can return to the research question and consider how to specify it further and, perhaps, constrain further the conditions that could affect the data.

As Martha formulates her hypotheses and goes through multiple cycles of refining her question(s), articulating her predictions, and developing her rationales, she is constantly building the theoretical framework for her study. Because the theoretical framework is the topic of Chap. 3, we will pause here and pick up Martha’s story in the next chapter. Spoiler alert: Martha’s experience contains some surprising twists and turns.

Before leaving Martha, however, we point out two aspects of the process in which she has been engaged. First, it can be useful to think about the process as identifying (1) the variables targeted in her predictions, (2) the mechanisms she believes explain the relationships among the variables, and (3) the definitions of all the terms that are special to her educational problem. By variables, we mean things that can be measured and, when measured, can take on different values. In Martha’s case, the variables are the conceptualness of teaching and the content topics addressed in the LOs. The mechanisms are cognitive processes that enable teachers to see the relevance of what they learn in the LOs to their own teaching and that enable the transfer of learning from one setting to another. Definitions are the precise descriptions of how the important ideas relevant to the research are conceptualized. In Martha’s case, definitions must be provided for terms like conceptual understanding, linear functions, LOs, each of the topics related to linear functions, instructional setting, and knowledge transfer.

A second aspect of the process is a practice that Martha acquired as part of her graduate program, a practice that can go unnoticed. Martha writes out, in full sentences, her thinking as she wrestles with her research question, her predictions of the answers, and the rationales for her predictions. Writing is a tool for organizing thinking and we recommend you use it throughout the scientific inquiry process. We say more about this at the end of the chapter.

Here are the questions Martha wrote as she developed a clearer sense of what question she wanted to answer and what answer she predicted. The list shows the increasing refinement that occurred as she continued to read, think, talk, and write.

Early questions: What kinds of LOs do most teachers experience? How do these experiences change teachers’ practices and beliefs? Are some LOs more effective than others? What makes them more effective?

First focused question: What makes LOs for teachers effective for improving their teaching for conceptual understanding?

Question after trying to predict the answer and imagining how to test the prediction: Do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?

Question after developing an initial rationale for her prediction: Under what conditions do LOs that engage middle school mathematics teachers in studying mathematics content help teachers teach this same content with more of a conceptual emphasis?

Question after developing a more precise prediction and richer rationale: Under what conditions do middle school teachers who lack conceptual knowledge of linear functions benefit from five 2-hour LO sessions that engage them in conceptual learning of linear functions as assessed by changes in their teaching toward a more conceptual emphasis on linear functions?

Part IV. An Illustrative Dialogue

The story of Martha described the major steps she took to refine her thinking. However, there is a lot of work that went on behind the scenes that wasn’t part of the story. For example, Martha had conversations with fellow students and professors that sharpened her thinking. What do these conversations look like? Because they are such an important part of the inquiry process, it will be helpful to “listen in” on the kinds of conversations that students might have with their advisors.

Here is a dialogue between a beginning student, Sam (S), and their advisor, Dr. Avery (A). They are meeting to discuss data Sam collected for a course project. The dialogue takes place very early in Sam’s conceptualization of the study, prior even to systematic reading of the literature.

Thanks for meeting with me today. As you know, I was able to collect some data for a course project a few weeks ago, but I’m having trouble analyzing the data, so I need your help. Let me try to explain the problem. As you know, I wanted to understand what middle-school teachers do to promote girls’ achievement in a mathematics class. I conducted four observations in each of three teachers’ classrooms. I also interviewed each teacher once about the four lessons I observed, and I interviewed two girls from each of the teachers’ classes. Obviously, I have a ton of data. But when I look at all these data, I don’t really know what I learned about my topic. When I was observing the teachers, I thought I might have observed some ways the teachers were promoting girls’ achievement, but then I wasn’t sure how to interpret my data. I didn’t know if the things I was observing were actually promoting girls’ achievement.

A: What were some of your observations?

S: Well, in a couple of my classroom observations, teachers called on girls to give an answer, even when the girls didn’t have their hands up. I thought that this might be a way that teachers were promoting the girls’ achievement. But then the girls didn’t say anything about that when I interviewed them and also the teachers didn’t do it in every class. So, it’s hard to know what effect, if any, this might have had on their learning or their motivation to learn. I didn’t want to ask the girls during the interview specifically about the teacher calling on them, and without the girls bringing it up themselves, I didn’t know if it had any effect.

A: Well, why didn’t you want to ask the girls about being called on?

S: Because I wanted to leave it as open as possible; I didn’t want to influence what they were going to say. I didn’t want to put words in their mouths. I wanted to know what they thought the teacher was doing that promoted their mathematical achievement and so I only asked the girls general questions, like “Do you think the teacher does things to promote girls’ mathematical achievement?” and “Can you describe specific experiences you have had that you believe do and do not promote your mathematical achievement?”

A: So then, how did they answer those general questions?

S: Well, with very general answers, such as that the teacher knows their names, offers review sessions, grades their homework fairly, gives them opportunities to earn extra credit, lets them ask questions, and always answers their questions. Nothing specific that helps me know what teaching actions specifically target girls’ mathematics achievement.

A: OK. Any ideas about what you might do next?

S: Well, I remember that when I was planning this data collection for my course, you suggested I might want to be more targeted and specific about what I was looking for. I can see now that more targeted questions would have made my data more interpretable in terms of connecting teaching actions to the mathematical achievement of girls. But I just didn’t want to influence what the girls would say.

A: Yes, I remember when you were planning your course project, you wanted to keep it open. You didn’t want to miss out on discovering something new and interesting. What do you think now about this issue?

S: Well, I still don’t want to put words in their mouths. I want to know what they think. But I see that if I ask really open questions, I have no guarantee they will talk about what I want them to talk about. I guess I still like the idea of an open study, but I see that it’s a risky approach. Leaving the questions too open meant I didn’t constrain their responses and there were too many ways they could interpret and answer the questions. And there are too many ways I could interpret their responses.

By this point in the dialogue, Sam has realized that open data (i.e., data not testing a specific prediction) are difficult to interpret. In the next part, Dr. Avery explains why collecting open data was not helping Sam achieve the goals for their study that had motivated collecting open data in the first place.

A: Yes, I totally agree. Even for an experienced researcher, it can be difficult to make sense of this kind of open, messy data. However, if you design a study with a more specific focus, you can create questions for participants that are more targeted because you will be interested in their answers to these specific questions. Let’s reflect back on your data collection. What can you learn from it for the future?

S: When I think about it now, I realize that I didn’t think about the distinction between all the different constructs at play in my study, and I didn’t choose which one I was focusing on. One construct was the teaching moves that teachers think could be promoting achievement. Another was what teachers deliberately do to promote girls’ mathematics achievement, if anything. Another was the teaching moves that actually do support girls’ mathematics achievement. Another was what teachers were doing that supported girls’ mathematics achievement versus the mathematics achievement of all students. Another was students’ perception of what their teacher was doing to promote girls’ mathematics achievement. I now see that any one of these constructs could have been the focus of a study and that I didn’t really decide which of these was the focus of my course project prior to collecting data.

A: So, since you told me that the topic of this course project is probably what you’ll eventually want to study for your dissertation, which of these constructs are you most interested in?

S: I think I’m most interested in the moves that teachers deliberately make to promote girls’ achievement. But I’m still worried about asking teachers directly and getting too specific about what they do because I don’t want to bias what they will say. And I chose qualitative methods and an exploratory design because I thought it would allow for a more open approach, an approach that helps me see what’s going on and that doesn’t bias or predetermine the results.

A: Well, it seems to me you are conflating three issues. One issue is how to conduct an unbiased study. Another issue is how specific to make your study. And the third issue is whether or not to choose an exploratory or qualitative study design. Those three issues are not the same. For example, designing a study that’s more open or more exploratory is not how researchers make studies fair and unbiased. In fact, it would be quite easy to create an open study that is biased. For example, you could ask very open questions and then interpret the responses in a way that unintentionally, and even unknowingly, aligns with what you were hoping the findings would say. Actually, you could argue that by adding more specificity and narrowing your focus, you’re creating constraints that prevent bias. The same goes for an exploratory or qualitative study; they can be biased or unbiased. So, let’s talk about what is meant by getting more specific. Within your new focus on what teachers deliberately do, there are many things that would be interesting to look at, such as teacher moves that address math anxiety, moves that allow girls to answer questions more frequently, moves that are specifically fitted to student thinking about specific mathematical content, and so on. What are one or two things that are most interesting to you? One way to answer this question is by thinking back to where your interest in this topic began.

In the preceding part of the dialogue, Dr. Avery explained how the goals Sam had for their study were not being met with open data. In the next part, Sam begins to articulate a prediction, which Sam and Dr. Avery then sharpen.

S: Actually, I became interested in this topic because of an experience I had in college when I was in a class of mostly girls. During whole class discussions, we were supposed to critically evaluate each other’s mathematical thinking, but we were too polite to do that. Instead, we just praised each other’s work. But it was so different in our small groups. It seemed easier to critique each other’s thinking and to push each other to better solutions in small groups. I began wondering how to get girls to be more critical of each other’s thinking in a whole class discussion in order to push everyone’s thinking.

A: Okay, this is great information. Why not use this idea to zoom in on a more manageable and interpretable study? You could look specifically at how teachers support girls in critically evaluating each other’s thinking during whole class discussions. That would be a much more targeted and specific topic. Do you have predictions about what teachers could do in that situation, keeping in mind that you are looking specifically at girls’ mathematical achievement, not students in general?

S: Well, what I noticed was that small groups provided more social and emotional support for girls, whereas the whole class discussion did not provide that same support. The girls felt more comfortable critiquing each other’s thinking in small groups. So, I guess I predict that when the social and emotional supports that are present in small groups are extended to the whole class discussion, girls would be more willing to evaluate each other’s mathematical thinking critically during whole class discussion. I guess ultimately, I’d like to know how the whole class discussion could be used to enhance, rather than undermine, the social and emotional support that is present in the small groups.

A: Okay, then where would you start? Would you start with a study of what the teachers say they will do during whole class discussion and then observe if that happens during whole class discussion?

S: But part of my prediction also involves the small groups. So, I’d also like to include small groups in my study if possible. If I focus only on the whole class discussion, I won’t be exploring what I am interested in. My interest is broader than just the whole class discussion.

A: That makes sense, but there are many different things you could look at as part of your prediction, more than you can do in one study. For instance, if your prediction is that when the social and emotional supports that are present in small groups are extended to whole class discussions, girls would be more willing to evaluate each other’s mathematical thinking critically during whole class discussions, then you could ask the following questions: What are the social and emotional supports that are present in small groups? In which small groups do they exist? Is it groups that are made up only of girls? Does every small group do this, and for groups that do this, when do these supports get created? What kinds of small group activities that teachers ask them to work on are associated with these supports? Do the same social and emotional supports that apply to small groups even apply to whole group discussion?

S: All your questions make me realize that my prediction about extending social and emotional supports to whole class discussions first requires me to have a better understanding of the social and emotional supports that exist in small groups. In fact, I first need to find out whether those supports commonly exist in small groups or whether that was just my experience working in small groups. So, I think I will first have to figure out what small groups do to support each other and then, in a later study, I could ask a teacher to implement those supports during whole class discussions and find out how that can be done. Yeah, now I’m seeing that.

The previous part of the dialogue illustrates how continuing to ask questions about one’s initial prediction is a good way to make it more and more precise (and researchable). In the next part, we see how developing a precise prediction has the added benefit of setting the researcher up for future studies.

A: Yes, I agree that for your first study, you should probably look at small groups. In other words, you should focus on only a part of your prediction for now, namely the part that says there are social and emotional supports in small groups that support girls in critiquing each other’s thinking. That begins to sharpen the focus of your prediction, but you’ll want to continue to refine it. For example, right now, the question that this prediction leads to is a question with a yes or no answer, but what you’ve said so far suggests to me that you are looking for more than that.

S: Yes, I want to know more than just whether there are supports. I’d like to know what kinds. That’s why I wanted to do a qualitative study.

A: Okay, this aligns more with my thinking about research as being prediction driven. It’s about collecting data that would help you revise your existing predictions into better ones. What I mean is that you would focus on collecting data that would allow you to refine your prediction, make it more nuanced, and go beyond what is already known. Does that make sense, and if so, what would that look like for your prediction?

S: Oh yes, I like that. I guess that would mean that, based on the data I collect for this next study, I could develop a more refined prediction that, for example, more specifically identifies and differentiates between different kinds of social and emotional supports that are present in small groups, or maybe that identifies the kinds of small groups that they occur in, or that predicts when and how frequently or infrequently they occur, or about the features of the small group tasks in which they occur, etc. I now realize that, although I chose qualitative research to make my study more open, really the reason qualitative research fits my purposes is that it will allow me to explore fine-grained aspects of social and emotional supports that may exist for girls in small groups.

A: Yes, exactly! And then, based on the data you collect, you can include in your revised prediction those new fine-grained aspects. Furthermore, you will have a story to tell about your study in your written report, namely the story about your evolving prediction. In other words, your written report can largely tell how you filled out and refined your prediction as you learned more from carrying out the study. And even though you might not use them right away, you are also going to be able to develop new predictions about social and emotional supports in small groups, and about your aim of extending them to whole-class discussions, that you would not have thought of had you not done this study. That will set you up to follow up on those new predictions in future studies. For example, you might have more refined ideas after you collect the data about the goals for critiquing student thinking in small groups versus the goals for critiquing student thinking during whole class discussion. You might even begin to think that some of the social and emotional supports you observe are not replicable in, applicable to, or appropriate for whole-class discussions, because the supports play different roles in different contexts. So, to summarize what I’m saying, what you look at in this study, even though it will be very focused, sets you up for a research program that will allow you to more fully investigate your broader interest in this topic, where each new study builds on your prior body of work. That’s why it is so important to be explicit about the best place to start this research, so that you can build on it.

S: I see what you are saying. We started this conversation talking about my course project data. What I think I should have done was figure out explicitly what I needed to learn with that study with the intention of then taking what I learned and using it as the basis for the next study. I didn’t do that, and so I didn’t collect data that pushed forward my thinking in ways that would guide my next study. It would be as if I were starting over with my next study.

Sam and Dr. Avery have just explored how specifying a prediction reveals additional complexities that could become fodder for developing a systematic research program. Next, we watch Sam beginning to recognize the level of specificity required for a prediction to be testable.

A: One thing that would have really helped would have been if you had had a specific prediction going into your data collection for your course project.

S: Well, I didn’t really have much of an explicit prediction in mind when I designed my methods.

A: Think back, you must have had some kind of prediction, even if it was implicit.

S: Well, yes, I guess I was predicting that teachers would enact moves that supported girls’ mathematical achievement. And I observed classrooms to identify those teacher moves, I interviewed teachers to ask them about the moves I observed, and I interviewed students to see if they mentioned those moves as promoting their mathematical achievement. The goal of my course project was to identify teacher moves that support girls’ mathematical achievement. And my specific research question was: What teacher moves support girls’ mathematical achievement?

A: So, really you were asking the teacher and students to show and tell you what those moves are and the effects of those moves, as a result putting the onus on your participants to provide the answers to your research question for you. I have an idea, let’s try a thought experiment. You come up with data collection methods for testing the prediction that there are social and emotional supports in small groups that support girls in critiquing each other’s thinking that still put the onus on the participants. And then I’ll see if I can think of data collection methods that would not put the onus on the participants.

S: Hmm, well... I guess I could simply interview girls who participated in small groups and ask them “are there social and emotional supports that you use in small groups that support your group in critiquing each other’s thinking and if so, what are they?” In that case, I would be putting the onus on them to be aware of the social dynamics of small groups and to have thought about these constructs as much as I have. Okay, now can you continue the thought experiment? What might the data collection methods look like if I didn’t put the onus on the participants?

A: First, I would pick a setting in which it was only girls at this point to reduce the number of variables. Then, personally I would want to observe a lot of groups of girls interacting in groups around tasks. I would be looking for instances when the conversation about students’ ideas was shut down and instances when the conversation about students’ ideas involved critiquing of ideas and building on each other’s thinking. I would also look at what happened just before and during those instances, such as: did the student continue to talk after their thinking was critiqued, did other students do anything to encourage the student to build on their own thinking (i.e., constructive criticism), or how did they support or shut down continued participation. In fact, now that I think about it, “critiquing each other’s thinking” can be defined in a number of different ways. It could mean just commenting on someone’s thinking, judging correctness and incorrectness, constructive criticism that moves the thinking forward, etc. If you put the onus on the participants to answer your research question, you are stuck with their definition, and they won’t have thought about this very much, if at all.

S: I think that what you are also saying is that my definitions would affect my data collection. If I think that critiquing each other’s thinking means that the group moves their thinking forward toward more valid and complete mathematical solutions, then I’m going to focus on different moves than if I define it another way, such as just making a comment on each other’s thinking and making each other feel comfortable enough to keep participating. In fact, am I going to look at individual instances of critiquing or look at entire sequences in which the critiquing leads to a goal? This seems like a unit of analysis question, and I would need to develop a more nuanced prediction that would make explicit what that unit of analysis is.

A: I agree, your definition of “critiquing each other’s thinking” could entirely change what you are predicting. One prediction could be based on defining critiquing as a one-shot event in which someone makes one comment on another person’s thinking. In this case the prediction would be that there are social and emotional supports in small groups that support girls in making an evaluative comment on another student’s thinking. Another prediction could be based on defining critiquing as a back-and-forth process in which the thinking gets built on and refined. In that case, the prediction would be something like that there are social and emotional supports in small groups that support girls in critiquing each other’s thinking in ways that do not shut down the conversation but that lead to sustained conversations that move each other toward more valid and complete solutions.

S: Well, I think I am more interested in the second prediction because it is more compatible with my long-term interest in extending small group supports to whole class discussions. The second prediction is more appropriate for eventually looking at girls in whole class discussion. During whole class discussion, the teacher tries to get a sustained conversation going that moves the students’ thinking forward. So, if I learn about small group supports that lead to sustained conversations that move each other toward more valid and complete solutions, those supports might transfer to whole class discussions.

In the previous part of the dialogue, Dr. Avery and Sam showed how narrowing down a prediction to one that is testable requires making numerous important decisions, including how to define the constructs referred to in the prediction. In the final part of the dialogue, Dr. Avery and Sam begin to outline the reading Sam will have to do to develop a rationale for the specific prediction.

A: Do you see how your prediction and definitions are getting more and more specific? You now need to read extensively to further refine your prediction.

S: Well, I should probably read about micro dynamics of small group interactions, anything about interactions in small groups, and what is already known about small group interactions that support sustained conversations that move students’ thinking toward more valid and complete solutions. I guess I could also look at research on whole-class discussion methods that support sustained conversations that move the class to more mathematically valid and complete solutions, because it might give me ideas for what to look for in the small groups. I might also need to focus on research about how learners develop understandings about a particular subject matter so that I know what “more valid and complete solutions” look like. I also need to read about social and emotional supports but focus on how they support students cognitively, rather than in other ways.

A: Sounds good, let’s get together after you have processed some of this literature and we can talk about refining your prediction based on what you read and also the methods that will best suit testing that prediction.

S: Great! Thanks for meeting with me. I feel like I have a much better set of tools that push my own thinking forward and allow me to target something specific that will lead to more interpretable data.

Part V. Is It Always Possible to Formulate Hypotheses?

In Chap. 1, we noted you are likely to read that research does not require formulating hypotheses. Some sources describe doing research without making predictions or developing rationales for them. Some researchers say you cannot always make predictions because you do not know enough about the situation. In fact, some argue for the value of not making predictions (e.g., Glaser & Holton, 2004; Merton, 1968; Nemirovsky, 2011). These are important points of view, so we will devote this section to discussing them.

Can You Always Predict What You Will Find?

One reason some researchers say you do not need to make predictions is that it can be difficult to imagine what you will find. This argument comes up most often for descriptive studies. Suppose you want to describe the nature of a situation you do not know much about. Can you still make a prediction about what you will find? We believe that, although you do not know exactly what you will find, you probably have a hunch or, at a minimum, a very fuzzy idea. It would be unusual to ask a question about a situation you want to know about without at least a fuzzy inkling of what you might find; otherwise, the question would not have occurred to you in the first place. We acknowledge you might have only a vague idea of what you will find and you might not have much confidence in your prediction. However, we expect that if you monitor your own thinking you will discover you have developed a suspicion along the way, regardless of how vague the suspicion might be. Through the cyclic process we discussed above, that suspicion or hunch gradually evolves and turns into a prediction.

The Benefits of Making Predictions Even When They Are Wrong: An Example from the 1970s

One of us was a graduate student at the University of Wisconsin in the late 1970s, assigned as a research assistant to a project that was investigating young children’s thinking about simple arithmetic. A new curriculum was being written, and the developers wanted to know how to introduce the earliest concepts and skills to kindergarten and first-grade children. The directors of the project did not know what to expect because, at the time, there was little research on five- and six-year-olds’ pre-instruction strategies for adding and subtracting.

After consulting what literature was available, talking with teachers, analyzing the nature of different types of addition and subtraction problems, and debating with each other, the research team formulated some hypotheses about children’s performance. Following the usual assumptions at the time and recognizing the new curriculum would introduce the concepts, the researchers predicted that, before instruction, most children would not be able to solve the problems. Based on the rationale that some young children did not yet recognize the simple form for written problems (e.g., 5 + 3 = ___), the researchers predicted that the best chance for success would be to read problems as stories (e.g., Jesse had 5 apples and then found 3 more. How many does she have now?). They reasoned that, even though children would have difficulty on all the problems, some story problems would be easier because the semantic structure is easier to follow. For example, they predicted the above story about adding 3 apples to 5 would be easier than a problem like, “Jesse had some apples in the refrigerator. She put in 2 more and now has 6. How many were in the refrigerator at the beginning?” Based on the rationale that children would need to count to solve the problems and that it can be difficult to keep track of the numbers, they predicted children would be more successful if they were given counters. Finally, accepting the common reasoning that larger numbers are more difficult than smaller numbers, they predicted children would be more successful if all the numbers in a problem were below 10.

Although these predictions were not very precise and the rationales were not strongly convincing, these hypotheses prompted the researchers to design the study to test their predictions. This meant they would collect data by presenting a variety of problems under a variety of conditions. Because the goal was to describe children’s thinking, problems were presented to students in individual interviews. Problems with different semantic structures were included, counters were available for some problems but not others, and some problems had sums to 9 whereas others had sums to 20 or more.

The punchline of this story is that gathering data under these conditions, prompted by the predictions, made all the difference in what the researchers learned. Contrary to predictions, children could solve addition and subtraction problems before instruction. Counters were important because almost all the solution strategies were based on counting, which meant that memory was an issue: many strategies required counting in two ways simultaneously. For example, subtracting 4 from 7 was usually solved by counting down from 7 while counting up from 1 to 4 to keep track of the counting down. Because children acted out the stories with their counters, the semantic structure of the story was also important. Stories that were easier to act out were also easier to solve.
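The double-counting demand this strategy places on memory can be made concrete with a short simulation (purely illustrative; the original study involved interviews with children, not code, and the function name below is our own):

```python
def subtract_by_counting_down(total, subtrahend):
    """Sketch of the 'counting down' strategy children were observed using:
    count down from the total while simultaneously counting up toward the
    subtrahend to know when to stop. Keeping both counts going at once is
    what taxes working memory."""
    current = total   # the count running down: 7, 6, 5, ...
    tracker = 0       # the simultaneous count running up: 1, 2, 3, 4
    while tracker < subtrahend:
        current -= 1  # say the next number down ("6, 5, 4, 3")
        tracker += 1  # while keeping track of how many have been said
    return current

# 7 - 4: the child says "6, 5, 4, 3" while tracking "1, 2, 3, 4" -> 3
```

The two counters running in the same loop mirror the two simultaneous counts a child must hold in mind, which is why physical counters made the task so much easier.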

To make a very long story very short, other researchers were, at about the same time, reporting similar results about children’s pre-instruction arithmetic capabilities. A clear pattern emerged regarding the relative difficulty of different problem types (semantic structures) and the strategies children used to solve each type. As the data were replicated, the researchers recognized that kindergarten and first-grade teachers could make good use of this information when they introduced simple arithmetic. This is how Cognitively Guided Instruction (CGI) was born (Carpenter et al., 1989; Fennema et al., 1996).

To reiterate, the point of this example is that the study conducted to describe children’s thinking would have looked quite different if the researchers had made no predictions. They would have had no reason to choose the particular problems and present them under different conditions. The fact that some of the predictions were completely wrong is not the point. The predictions created the conditions under which they could be tested which, in turn, created learning opportunities for the researchers that would not have existed without the predictions. The lesson is that even research that aims to simply describe a phenomenon can benefit from hypotheses. As signaled in Chap. 1, this also serves as another example of “failing productively.”

Suggestions for What to Do When You Do Not Have Predictions

There likely are exceptions to our claim about being able to make a prediction about what you will find. For example, there could be rare cases where researchers truly have no idea what they will find, can come up with no predictions or even hunches, and can find no reported research on related phenomena that would offer guidance. If you find yourself in this position, we suggest one of three approaches: revise your question, conduct a pilot study, or choose another question.

Because there are many advantages to making predictions explicit and then writing out the reasons for these predictions, one approach is to adjust your question just enough to allow you to make a prediction. Perhaps you can build on descriptions that other researchers have provided for related situations and consider how you can extend this work. Building on previous descriptions will enable you to make predictions about the situation you want to describe.

A second approach is to conduct a small pilot study or, better, a series of small pilot studies to develop some preliminary ideas of what you might find. If you can identify a small sample of participants who are similar to those in your study, you can try out at least some of your research plans to help make and refine your predictions. As we detail later, you can also use pilot studies to check whether key aspects of your methods (e.g., tasks, interview questions, data collection methods) work as you expect.

A third approach is to return to your list of interests and choose one that has been studied previously. Sometimes this is the wisest choice. It is very difficult for beginning researchers to conduct research in brand-new areas where no hunches or predictions are possible. In addition, the contributions of this research can be limited. Recall the earlier story about one of us “failing productively” by completing a dissertation in a somewhat new area. If, after an exhaustive search, you find that no one has investigated the phenomenon in which you are interested or even related phenomena, it can be best to move in a different direction. You will read recommendations in other sources to find a “gap” in the research and develop a study to “fill the gap.” This can be helpful advice if the gap is very small. However, if the gap is large, too large to predict what you might find, the study will present severe challenges. It will be more productive to extend work that has already been done than to launch into an entirely new area.

Should You Always Try to Predict What You Will Find?

In short, our answer to the question in the heading is “yes.” But this calls for further explanation.

Suppose you want to observe a second-grade classroom in order to investigate how students talk about adding and subtracting whole numbers. You might think, “I don’t want to bias my thinking; I want to be completely open to what I see in the classroom.” Sam shared a similar point of view at the beginning of the dialogue: “I wanted to leave it as open as possible; I didn’t want to influence what they were going to say.” Some researchers say that beginning your research study by making predictions is inappropriate precisely because it will bias your observations and results. The argument is that by bringing a set of preconceptions, you will confirm what you expected to find and be blind to other observations and outcomes. The following quote illustrates this view: “The first step in gaining theoretical sensitivity is to enter the research setting with as few predetermined ideas as possible—especially logically deducted, a priori hypotheses. In this posture, the analyst is able to remain sensitive to the data by being able to record events and detect happenings without first having them filtered through and squared with pre-existing hypotheses and biases” (Glaser, 1978, pp. 2–3).

We take a different point of view. In fact, we believe there are several compelling reasons for making your predictions explicit.

Making Your Predictions Explicit Increases Your Chances of Productive Observations

Because your predictions are an extension of what is already known, they prepare you to identify more nuanced relationships that can advance our understanding of a phenomenon. For example, rather than simply noticing, in a general sense, that students talking about addition and subtraction leads them to better understandings, you might, based on your prediction, make the specific observation that talking about addition and subtraction in a particular way helps students to think more deeply about a particular concept related to addition and subtraction. Going into a study without predictions can bring less sensitivity rather than more to the study of a phenomenon. Drawing on knowledge about related phenomena by reading the literature and conducting pilot studies allows you to be much more sensitive and your observations to be more productive.

Making Your Predictions Explicit Allows You to Guard Against Biases

Some genres and methods of educational research are, in fact, rooted in philosophical traditions (e.g., Husserl, 1929/1973) that explicitly call for researchers to temporarily “bracket” or set aside existing theory as well as their prior knowledge and experience to better enter into the experience of the participants in the research. However, this does not mean ignoring one’s own knowledge and experience or turning a blind eye to what has been learned by others. Much more than the simplistic image of emptying one’s mind of preconceptions and implicit biases (arguably an impossible feat to begin with), the goal is to be as reflective as possible about one’s prior knowledge and conceptions and as transparent as possible about how they may guide observations and shape interpretations (Levitt et al., 2018).

We believe it is better to be honest about the predictions you are almost sure to have because then you can deliberately plan to minimize the chances they will influence what you find and how you interpret your results. For starters, it is important to recognize that acknowledging you have some guesses about what you will find does not make them more influential. Because you are likely to have them anyway, we recommend being explicit about what they are. It is easier to deal with biases that are explicit than those that lurk in the background and are not acknowledged.

What do we mean by “deal with biases”? Some journals require you to include a statement about your “positionality” with respect to the participants in your study and the observations you are making to gather data. Formulating clear hypotheses is, in our view, a direct response to this request. The reasons for your predictions are your explicit statements about your positionality. Often there are methodological strategies you can use to protect the study from undue influences of bias. In other words, making your vague predictions explicit can help you design your study so you minimize the bias of your findings.

Making Your Predictions Explicit Can Help You See What You Did Not Predict

Making your predictions explicit does not need to blind you to what is different than expected. It does not need to force you to see only what you want to see. Instead, it can actually increase your sensitivity to noticing features of the situation that are surprising, features you did not predict. Results can stand out when you did not expect to see them.

In contrast, not bringing your biases to consciousness might subtly shift your attention away from these unexpected results in ways that you are not aware of. This path can lead to claiming no biases and no unexpected findings without being conscious of them. You cannot observe everything, and some things inevitably will be overlooked. If you have predicted what you will see, you can design your study so that the unexpected results become more salient rather than less.

Returning to the example of observing a second-grade classroom, we note that the field already knows a great deal about how students talk about addition and subtraction. Being cognizant of what others have observed allows you to enter the classroom with some clear predictions about what will happen. The rationales for these predictions are based on all the related knowledge you have before stepping into the classroom, and the predictions and rationales help you to better deal with what you see. This is partly because you are likely to be surprised by the things you did not anticipate. There is almost always something that will surprise you because your predictions will almost always be incomplete or too general. This sensitivity to the unanticipated—the sense of surprise that sparks your curiosity—is an indication of your openness to the phenomenon you are studying.

Making Your Predictions Explicit Allows You to Plan in Advance

Recall from Chap. 1 the descriptor of scientific inquiry: “Experience carefully planned in advance.” If you make no predictions about what might happen, it is very difficult, if not impossible, to plan your study in advance. Again, you cannot observe everything, so you must make decisions about what you will observe. What kind of data will you plan to collect? Why would you collect these data instead of others? If you have no idea what to expect, on what basis will you make these consequential decisions? Even if your predictions are vague and your rationales for the predictions are a bit shaky, at least they provide a direction for your plan. They allow you to explain why you are planning this study and collecting these data. They allow you to “carefully plan in advance.”

Making Your Predictions Explicit Allows You to Put Your Rationales in Harm’s Way

Rationales are developed to justify the predictions. Rationales represent your best reasoning about the research problem you are studying. How can you tell whether your reasoning is sound? You can try it out with colleagues. However, the best way to test it is to put it in “harm’s way” (Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003, p. 10). And the best approach to putting your reasoning in harm’s way is to test the predictions it generates. Regardless of whether you are conducting a qualitative or quantitative study, rationales can be improved only if they generate testable predictions. This is possible only if predictions are explicit and precise. As we described earlier, rationales are evaluated for their soundness and refined in light of the specific differences between predictions and empirical observations.

Making Your Predictions Explicit Forces You to Organize and Extend Your (and the Field’s) Thinking

By writing out your predictions (even hunches or fuzzy guesses) and by reflecting on why you have these predictions and making these reasons explicit for yourself, you are advancing your thinking about the questions you really want to answer. This means you are making progress toward formulating your research questions and your final hypotheses. Making more progress in your own thinking before you conduct your study increases the chances your study will be of higher quality and will be exactly the study you intended. Making predictions, developing rationales, and imagining tests are tools you can use to push your thinking forward before you even collect data.

Suppose you wonder how preservice teachers in your university’s teacher preparation program will solve particular kinds of math problems. You are interested in this question because you have noticed several PSTs solve them in unexpected ways. As you ask the question you want to answer, you make predictions about what you expect to see. When you reflect on why you made these predictions, you realize that some PSTs might use particular solution strategies because they were taught to use some of them in an earlier course, and they might believe you expect them to solve the problems in these ways. By being explicit about why you are making particular predictions, you realize that you might be answering a different question than you intend (“How much do PSTs remember from previous courses?” or even “To what extent do PSTs believe different instructors have similar expectations?”). Now you can either change your question or change the design of your study (i.e., the sample of students you will use) or both. You are advancing your thinking by being explicit about your predictions and why you are making them.

The Costs of Not Making Predictions

Avoiding making predictions, for whatever reason, comes with significant costs. It prevents you from learning very much about your research topic. It would require not reading related research, not talking with your colleagues, and not conducting pilot studies because, if you do, you are likely to find a prediction creeping into your thinking. Not doing these things would forego the benefits of advancing your thinking before you collect data. It would amount to conducting the study with as little forethought as possible.

Part VI. How Do You Formulate Important Hypotheses?

We provided a partial answer in Chap. 1 to the question of a hypothesis’ importance when we encouraged considering the ultimate goal to which a study’s findings might contribute. You might want to reread Part III of Chap. 1, where we offered our opinions about the purposes of doing research. We also recommend reading the March 2019 editorial in the Journal for Research in Mathematics Education (Cai et al., 2019b), in which we address what constitutes important educational research.

As we argued in Chap. 1 and in the March 2019 editorial, a worthy ultimate goal for educational research is to improve the learning opportunities for all students. However, arguments can be made for other ultimate goals as well. To gauge the importance of your hypotheses, think about how clearly you can connect them to a goal the educational community considers important. In addition, given the descriptors of scientific inquiry proposed in Chap. 1, think about how testing your hypotheses will help you (and the community) understand what you are studying. Will you have a better explanation for the phenomenon after your study than before?

Although we address the question of importance again, and in more detail, in Chap. 5, it is useful to know here that you can determine the significance or importance of your hypotheses when you formulate them. The importance need not depend on the data you collect or the results you report. The importance can come from the fact that, based on the results of your study, you will be able to offer revised hypotheses that help the field better understand an important issue. In large part, it is these revised hypotheses rather than the data that determine a study’s importance.

A critical caveat to this discussion is that few hypotheses are self-evidently important. They are important only if you make the case for their importance. Even if you follow closely the guidelines we suggest for formulating an important hypothesis, you must develop an argument that convinces others. This argument will be presented in the research paper you write.


Consider Martha’s hypothesis presented earlier. When we left Martha, she predicted that “Participating teachers will show changes in their teaching with a greater emphasis on conceptual understanding with larger changes on linear function topics directly addressed in the LOs than on other topics.” For researchers and educators not intimately familiar with this area of research, it is not apparent why someone should spend a year or more conducting a dissertation to test this prediction. Her rationale, summarized earlier, begins to describe why this could be an important hypothesis. But it is by writing a clear argument that explains her rationale to readers that she will convince them of its importance.

How Martha fills in her rationale so she can create a clear written argument for its importance is taken up in Chap. 3 . As we indicated, Martha’s work in this regard led her to make some interesting decisions, in part due to her own assessment of what was important.

Part VII. Beginning to Write the Research Paper for Your Study

It is common to think that researchers conduct a study and then, after the data are collected and analyzed, begin writing the paper about the study. We recommend an alternative, especially for beginning researchers. We believe it is better to write drafts of the paper at the same time you are planning and conducting your study. The paper will gradually evolve as you work through successive phases of the scientific inquiry process. Consequently, we will call this paper your evolving research paper .


You will use your evolving research paper to communicate your study, but you can also use writing as a tool for thinking and organizing your thinking while planning and conducting the study. Used as a tool for thinking, you can write drafts of your ideas to check on the clarity of your thinking, and then you can step back and reflect on how to clarify it further. Be sure to avoid jargon and general terms that are not well defined. Ask yourself whether someone not in your field, maybe a sibling, a parent, or a friend, would be able to understand what you mean. You are likely to write multiple drafts with lots of scribbling, crossing out, and revising.

Used as a tool for communicating, writing the best version of what you know before moving to the next phase will help you record your decisions and the reasons for them before you forget important details. This best-version-for-now paper also provides the basis for your thinking about the next phase of your scientific inquiry.

At this point in the process, you will be writing your (research) questions, the answers you predict, and the rationales for your predictions. The predictions you make should be direct answers to your research questions and should flow logically from (or be directly supported by) the rationales you present. In addition, you will have a written statement of the study’s purpose or, said another way, an argument for the importance of the hypotheses you will be testing. It is in the early sections of your paper that you will convince your audience about the importance of your hypotheses.

In our experience, presenting research questions is a more common form of stating the goal of a research study than presenting well-formulated hypotheses. Authors sometimes present a hypothesis, often as a simple prediction of what they might find. The hypothesis is then forgotten and not used to guide the analysis or interpretations of the findings. In other words, authors seldom use hypotheses to do the kind of work we describe. This means that many research articles you read will not treat hypotheses as we suggest. We believe these are missed opportunities to present research in a more compelling and informative way. We intend to provide enough guidance in the remaining chapters for you to feel comfortable organizing your evolving research paper around formulating, testing, and revising hypotheses.

While we were editing one of the leading research journals in mathematics education (JRME), we conducted a study of reviewers’ critiques of papers submitted to the journal. Two of the five most common concerns were: (1) the research questions were unclear, and (2) the answers to the questions did not make a substantial contribution to the field. These are likely to be major concerns for the reviewers of all research journals. We hope the knowledge and skills you have acquired working through this chapter will allow you to write the opening to your evolving research paper in a way that addresses these concerns. Much of the chapter should help make your research questions clear, and the prior section on formulating “important hypotheses” will help you convey the contribution of your study.

Exercise 2.3

Look back at your answers to the sets of questions before Part II of this chapter.

Think about how you would argue for the importance of your current interest.

Write your interest in the form of (1) a research problem, (2) a research question, and (3) a prediction with the beginnings of a rationale. You will update these as you read the remaining chapters.

Part VIII. The Heart of Scientific Inquiry

In this chapter, we have described the process of formulating hypotheses. This process is at the heart of scientific inquiry. It is where doing research begins. Conducting research always involves formulating, testing, and revising hypotheses. This is true regardless of your research questions and whether you are using qualitative, quantitative, or mixed methods. Without engaging in this process in a deliberate, intense, relentless way, your study will reveal less than it could. By engaging in this process, you are maximizing what you, and others, can learn from conducting your study.

In the next chapter, we build on the ideas we have developed in the first two chapters to describe the purpose and nature of theoretical frameworks. The term theoretical framework, along with closely related terms like conceptual framework, can be somewhat mysterious for beginning researchers and can seem like a requirement for writing a paper rather than an aid for conducting research. We will show how theoretical frameworks grow from formulating hypotheses—from developing rationales for the predicted answers to your research questions. We will propose some practical suggestions for building theoretical frameworks and show how useful they can be. In addition, we will continue Martha’s story from the point at which we paused earlier—developing her theoretical framework.

References

Cai, J., Morris, A., Hohensee, C., Hwang, S., Robison, V., Cirillo, M., Kramer, S. L., & Hiebert, J. (2019b). Posing significant research questions. Journal for Research in Mathematics Education, 50(2), 114–120. https://doi.org/10.5951/jresematheduc.50.2.0114

Carpenter, T. P., Fennema, E., Peterson, P. L., Chiang, C. P., & Loef, M. (1989). Using knowledge of children’s mathematics thinking in classroom teaching: An experimental study. American Educational Research Journal, 26(4), 499–531.

Fennema, E., Carpenter, T. P., Franke, M. L., Levi, L., Jacobs, V. R., & Empson, S. B. (1996). A longitudinal study of learning to use children’s thinking in mathematics instruction. Journal for Research in Mathematics Education, 27(4), 403–434.

Glaser, B. G., & Holton, J. (2004). Remodeling grounded theory. Forum: Qualitative Social Research, 5(2). https://www.qualitative-research.net/index.php/fqs/article/view/607/1316

Gournelos, T., Hammonds, J. R., & Wilson, M. A. (2019). Doing academic research: A practical guide to research methods and analysis. Routledge.

Hohensee, C. (2014). Backward transfer: An investigation of the influence of quadratic functions instruction on students’ prior ways of reasoning about linear functions. Mathematical Thinking and Learning, 16(2), 135–174.

Husserl, E. (1973). Cartesian meditations: An introduction to phenomenology (D. Cairns, Trans.). Martinus Nijhoff. (Original work published 1929).

Levitt, H. M., Bamberg, M., Creswell, J. W., Frost, D. M., Josselson, R., & Suárez-Orozco, C. (2018). Journal article reporting standards for qualitative primary, qualitative meta-analytic, and mixed methods research in psychology: The APA Publications and Communications Board Task Force report. American Psychologist, 73(1), 26–46.

Medawar, P. (1982). Pluto’s republic. Oxford University Press.

Merton, R. K. (1968). Social theory and social structure (Enlarged ed.). Free Press.

Nemirovsky, R. (2011). Episodic feelings and transfer of learning. Journal of the Learning Sciences, 20(2), 308–337. https://doi.org/10.1080/10508406.2011.528316

Vygotsky, L. (1987). The development of scientific concepts in childhood: The design of a working hypothesis. In A. Kozulin (Ed.), Thought and language (pp. 146–209). The MIT Press.

Author information

Authors and Affiliations

School of Education, University of Delaware, Newark, DE, USA

James Hiebert, Anne K. Morris & Charles Hohensee

Department of Mathematical Sciences, University of Delaware, Newark, DE, USA

Jinfa Cai & Stephen Hwang

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


Copyright information

© 2023 The Author(s)

About this chapter

Hiebert, J., Cai, J., Hwang, S., Morris, A.K., Hohensee, C. (2023). How Do You Formulate (Important) Hypotheses?. In: Doing Research: A New Researcher’s Guide. Research in Mathematics Education. Springer, Cham. https://doi.org/10.1007/978-3-031-19078-0_2

DOI: https://doi.org/10.1007/978-3-031-19078-0_2

Published: 03 December 2022

Publisher Name: Springer, Cham

Print ISBN: 978-3-031-19077-3

Online ISBN: 978-3-031-19078-0

Share this chapter

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative


Outbreak Investigations


As noted previously, these steps are not undertaken in a rigid serial order. In fact, the order may vary depending on the circumstances, and some steps will be undertaken simultaneously. As soon as an outbreak is suspected, one automatically begins to consider what the cause might be and the factors that are fueling it. One of the most important steps in generating hypotheses when investigating an outbreak is to consider what is known about the biology of the disease, including its possible modes of transmission, whether there are animal reservoirs, and the length of its incubation and infectious periods. A succinct fact sheet for hepatitis A, for example, provides excellent clues about what to look for when investigating an outbreak of that disease.

Nevertheless, once descriptive epidemiology has been conducted and information about person, place, and time is available, it is useful to reflect on the collected information in order to re-evaluate and rank hypotheses about the causes. Hypotheses are generated by consciously or subconsciously looking for differences, similarities, and correlations.

  • Differences: If the frequency of disease differs in two locations or circumstances, it may be due to a factor that differs in the two circumstances.
  • Similarities: If there are similarities among the cases (e.g., many reported eating at a particular restaurant), then that common factor may be the cause.
  • Correlations: If the frequency of disease varies in relation to some factor, then that factor may be a cause of the disease. For example, communities with low rates of measles immunization may have high rates of measles cases.
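These comparison heuristics can be made concrete by tabulating attack rates for each suspected exposure. The sketch below uses entirely hypothetical exposures and line-listing totals: an exposure whose attack rate is much higher among the exposed than among the unexposed becomes a leading hypothesis.

```python
# Hypothetical line-listing summary for two suspected exposures:
# (ill among exposed, total exposed, ill among unexposed, total unexposed)
exposures = {
    "potato salad": (30, 40, 4, 35),
    "lemonade":     (18, 50, 16, 25),
}

def attack_rate(ill, total):
    """Attack rate = cases / people at risk."""
    return ill / total

for item, (a, n_exp, c, n_unexp) in exposures.items():
    ar_exposed = attack_rate(a, n_exp)
    ar_unexposed = attack_rate(c, n_unexp)
    ratio = ar_exposed / ar_unexposed  # how many times higher the risk is
    print(f"{item}: AR exposed={ar_exposed:.0%}, "
          f"AR unexposed={ar_unexposed:.0%}, ratio={ratio:.1f}")
```

With these made-up numbers, the first exposure stands out (a several-fold difference in attack rates), while the second shows no excess risk among the exposed.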

Consider the information obtained during hypothesis-generating interviews, and also consider the location of cases (a spot map) and the time course of the epidemic in relation to the incubation period of the disease (the epidemic curve).
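One common use of the epidemic curve together with the incubation period is to work backward from onset dates to a probable exposure window for a suspected point-source outbreak. A minimal sketch, with invented onset dates; the 15–50 day range used here is the commonly cited incubation period for hepatitis A:

```python
from datetime import date, timedelta

# Commonly cited incubation period for hepatitis A (~15-50 days).
MIN_INCUBATION = timedelta(days=15)
MAX_INCUBATION = timedelta(days=50)

# Hypothetical onset dates from the line listing.
onsets = [date(2016, 5, 2), date(2016, 5, 6), date(2016, 5, 10),
          date(2016, 5, 14), date(2016, 5, 20)]

first_case, last_case = min(onsets), max(onsets)

# Point-source reasoning:
#   earliest possible exposure = last onset  - maximum incubation
#   latest possible exposure   = first onset - minimum incubation
exposure_start = last_case - MAX_INCUBATION
exposure_end = first_case - MIN_INCUBATION

print(f"probable exposure window: {exposure_start} to {exposure_end}")
# probable exposure window: 2016-03-31 to 2016-04-17
```

Events (meals, gatherings, deliveries) that fall inside this window become candidate exposures worth asking about in follow-up interviews.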

The next step is to evaluate the hypotheses. In some outbreaks the descriptive epidemiology points so convincingly to a particular source that further analysis is unnecessary. For example, in 1991 Massachusetts had an outbreak of vitamin D intoxication in which all of the affected cases reported drinking milk delivered to their homes by a local dairy. Inspection of the dairy revealed that excessive quantities of vitamin D were being added to the milk. However, in other situations the source is unclear, and analytic epidemiology must be used to test the hypotheses more formally.

There are two general study designs that can be used in analytic epidemiology: a cohort study or a case-control study. Both evaluate specific hypotheses by comparing groups of people, but the strategies for sampling subjects are very different: a cohort study enrolls subjects according to their exposure status and compares the frequency of disease between exposed and unexposed groups, whereas a case-control study enrolls subjects according to their disease status (cases and controls) and compares the frequency of prior exposures between them.
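To illustrate what each design estimates, here is a brief sketch with an invented 2×2 table (using the same illustrative table for both calculations, as textbooks often do): a cohort analysis, where risks are directly estimable, yields a risk ratio, while a case-control analysis, where they are not, yields an exposure odds ratio.

```python
# Hypothetical 2x2 table:
#                ill    not ill
# exposed         a        b
# unexposed       c        d
a, b, c, d = 30, 10, 4, 31

# Cohort design: subjects are sampled by exposure status, so the risk
# (cumulative incidence) in each group can be estimated directly.
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
risk_ratio = risk_exposed / risk_unexposed

# Case-control design: subjects are sampled by disease status, so risks
# are not estimable; the exposure odds ratio is computed instead.
odds_ratio = (a * d) / (b * c)

print(f"risk ratio = {risk_ratio:.2f}")   # 6.56
print(f"odds ratio = {odds_ratio:.2f}")   # 23.25
```

When the disease is rare in the source population, the odds ratio approximates the risk ratio; with the common outcome in this fabricated table, the two measures diverge noticeably.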


Content ©2016. All Rights Reserved. Date last modified: May 3, 2016. Wayne W. LaMorte, MD, PhD, MPH

[The formulation of epidemiological hypotheses by using the methods of multivariate statistical analysis]

  • PMID: 2201148

The work demonstrates the main approaches to using multivariate statistical methods to generate hypotheses about the mechanism of the epidemic process of dysentery in organized groups. The main risk factors were identified, and their roles in the formation of annual, year-round, and seasonal dysentery morbidity were quantitatively evaluated. The results of the analysis indicate that diverse variants of the alimentary route of transmission maintain the epidemic process of dysentery, and that preventive measures against year-round and seasonal morbidity must be differentiated.

Publication types

  • English Abstract

MeSH terms

  • Acute Disease
  • Disease Outbreaks / statistics & numerical data*
  • Dysentery, Bacillary / epidemiology*
  • Epidemiologic Methods
  • Factor Analysis, Statistical
  • Multivariate Analysis
  • Risk Factors
  • Shigella flexneri


COMMENTS

  1. Hypothesis Formulation

    Hypothesis Formulation - Characteristics of Person, Place, and Time. Descriptive epidemiology searches for patterns by examining characteristics of person, place, & time. These characteristics are carefully considered when a disease outbreak occurs, because they provide important clues regarding the source of the outbreak.

  2. 1.4

    Hypotheses. An epidemiologic hypothesis is a testable statement of a putative relationship between exposure and disease. The hypothesis should be: Clear. Testable or resolvable. State the relationship between exposure and disease. Limited in scope. Not inconsistent with known facts.

  3. 7.1.4

    Evaluating Hypotheses. There are two approaches to evaluating hypotheses: comparison of the hypotheses with the established facts and analytic epidemiology, which allows testing hypotheses. A comparison with established facts is useful when the evidence is so strong that the hypothesis does not need to be tested.

  4. Formulating Hypotheses for Different Study Designs

    Abstract. Generating a testable working hypothesis is the first step towards conducting original research. Such research may prove or disprove the proposed hypothesis. Case reports, case series, online surveys and other observational studies, clinical trials, and narrative reviews help to generate hypotheses.

  5. PDF Hypothesis Generation During Outbreaks

    Overview of hypothesis generation. When an outbreak has been identi-fied, demographic, clinical and/or laboratory data are usually ob-tained from the health department, clinicians, or laboratories, and these data are organized in a line listing (see FOCUS Issue 4 for more information about line listings). The next step in the investigation in ...

  6. The Epidemiologic Toolbox: Identifying, Honing, and Using the Right

    Loosely speaking, these research goals fall along a spectrum with purely descriptive epidemiology at 1 end; hypothesis generation, prediction, and outbreak investigation somewhere in the middle; and causal effect estimation and program evaluation at the other end. Here, we envision the spectrum signifying the approximate strength of assumptions ...

  7. Principles of Epidemiology

    The next step is formulation of hypothesis. It is a statement of assumption (s) of relationship or association of a disease with a particular factor (s). These assumptions are made on the basis previous reported observations in the scientific literature on the subject, or own theories and concepts. ... Testing Hypothesis. Analytic epidemiology ...

  8. Descriptive Epidemiology

    Hypothesis Formulation - Characteristics of Person, Place, and Time Descriptive epidemiology searches for patterns by examining characteristics of person, place, & time . These characteristics are carefully considered when a disease outbreak occurs, because they provide important clues regarding the source of the outbreak.

  9. Using Epidemiologic Methods to Test Hypotheses Regarding Causal

    Epidemiology is a set of methods developed to support inferences regarding the causes of health problems and other socially significant outcomes (Susser, Schwartz, Morabia, & Bromet, 2006).Epidemiologic studies often use population-based samples to avoid biases that may confound causal relations and often use longitudinal designs to measure putative causal risk factors prior to the onset of ...

  10. Cross-Sectional Study: The Role of Observation in Epidemiological

    Cross-sectional studies, therefore, play an important role in the examination of health data and hypothesis formulation. Additionally, they may be used to provide initial evidence of causality. This chapter will explain the conceptualization, conduct, and interpretation of cross-sectional studies. ... Modern epidemiology is, however, commonly ...

  11. Advanced epidemiologic and analytical methods

    Epidemiologic evidence plays a critical role to inform the formulation of complex theoretic models and relevant hypotheses. Observational epidemiology prompts the formulation of specific hypotheses that may be tested in experimental studies, and further refined after testing (Fig. 3.1).A conceptual framework is needed to maximize this circular process.

  12. Table of Contents

    Hypothesis Formulation - Characteristics of Person, Place, and Time. Descriptive Epidemiology for Infectious Disease Outbreaks. "Person". "Place". "Time". Epidemic Curves. (Optional) Two Methods for Creating an Epidemic Curve in Excel.
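The table of contents above mentions building an epidemic curve in Excel; the same idea, a histogram of cases by date of symptom onset, can be sketched in a few lines of Python. The onset dates below are invented for illustration only.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical onset dates for 12 cases in a point-source outbreak.
onsets = [date(2023, 6, d) for d in (3, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 8)]

# Count cases per day; the day-by-day counts are the epidemic curve.
counts = Counter(onsets)
start, end = min(onsets), max(onsets)

# Print a simple text histogram, one row per calendar day.
day = start
while day <= end:
    print(f"{day}  {'#' * counts.get(day, 0)}")
    day += timedelta(days=1)
```

A single sharp peak like this one is consistent with a point-source exposure; a plateau or repeated waves would instead suggest a continuing source or person-to-person spread.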

  13. Epidemiological hypothesis testing using a phylogeographic and ...

    Here, Dellicour et al. illustrate how phylodynamic and phylogeographic analyses can be leveraged for hypothesis testing in molecular epidemiology using West Nile virus in North America as an example.

  14. 4. Test Hypotheses Using Epidemiologic and Environmental Investigation

    Once a hypothesis is generated, it should be tested to determine whether the source has been correctly identified. ... Evaluating symptoms and sequelae across patients can guide formulation of a clinical diagnosis. Results of advanced molecular diagnostics can be evaluated to compare isolates from patients and outbreak sources (e.g., water ...
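Testing a hypothesized source analytically often begins by comparing attack rates between exposed and unexposed groups. A minimal sketch, using invented counts for one hypothetical food vehicle (a retrospective-cohort-style two-by-two table):

```python
# Toy two-by-two table for one hypothesized vehicle.
# All counts are invented for illustration.
ill_exposed, well_exposed = 30, 20       # ate the suspect food
ill_unexposed, well_unexposed = 5, 45    # did not eat it

# Attack rate = ill / total in each exposure group.
ar_exposed = ill_exposed / (ill_exposed + well_exposed)
ar_unexposed = ill_unexposed / (ill_unexposed + well_unexposed)

# Risk ratio compares the two attack rates.
risk_ratio = ar_exposed / ar_unexposed

print(f"attack rate (exposed)   = {ar_exposed:.2f}")
print(f"attack rate (unexposed) = {ar_unexposed:.2f}")
print(f"risk ratio              = {risk_ratio:.1f}")
```

A risk ratio well above 1 for one exposure, with ratios near 1 for the others, supports (but does not prove) the hypothesis; a real investigation would also compute confidence intervals or a chi-square statistic.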

  15. PDF Frameworks for Causal Inference in Epidemiology

    became the main focus of epidemiology textbooks, at the expense of fundamental issues such as theories of causation or hypothesis formulation (Krieger 1994). Indeed, there has been a shift in recent years towards framing

  16. PDF Part I. Theory and Practice of Epidemiological Models

    CHAPTER 1. Construction of epidemiological models. An epidemiological model represents a dynamic system of strictly interrelated epidemiological factors, able to "mirror ...

  17. Methods for generating hypotheses in human enteric illness outbreak

    The use of descriptive epidemiology is generally based on questionnaire data and is often one of the first hypothesis generation methods employed in outbreak investigations. Other methods, such as food or environmental sampling, facility inspections and food handler testing may be used in conjunction with questionnaires, particularly if the ...

  18. How Do You Formulate (Important) Hypotheses?

    Building on the ideas in Chap. 1, we describe formulating, testing, and revising hypotheses as a continuing cycle of clarifying what you want to study, making predictions about what you might find together with developing your reasons for these predictions, imagining tests of these predictions, revising your predictions and rationales, and so ...

  19. Step 6: Develop Hypotheses

    Step 6: Develop Hypotheses. As noted previously, these steps are not undertaken in a rigid serial order. In fact, the order may vary depending on the circumstances, and some steps will be undertaken simultaneously. As soon as an outbreak is suspected, one automatically considers what the cause might be and the factors that are fueling it.

  20. [Generation and evaluation of etiologic hypotheses in epidemiology]

    Abstract. So far the problems of the generation and evaluation of etiologic hypotheses have been of too little concern to epidemiologists. Epidemiologic research usually deals with two fundamental etiologic questions: the first is 'why' an epidemiological phenomenon occurs; the second is 'how', and the question relates to the mediating mechanism.

  21. Microbiology Ch. 14 (Epidemiology) Flashcards

    1. Observation leads to formulation of a question. 2. Formulate a hypothesis to address a question. 3. Design and conduct experiments. 4. Scientist decides to accept, reject, or modify a hypothesis. Drag the description of Dr. Snow's actions or activities to the step of the scientific method it most closely fits.

  22. Descriptive epidemiology of Parkinson's disease: disease ...

    Descriptive epidemiology of Parkinson's disease: disease distribution and hypothesis formulation. Adv Neurol. 1987;45:277-83. Author: B S Schoenberg. PMID: 3493626. Abstract: In the last 30 years, there have been several studies reporting morbidity rates for PD. Age-adjusted prevalence ratios from these investigations range from a low of 30/100,000 ...

  23. [The formulation of epidemiological hypotheses by using the ...

    The work demonstrates the main approaches to the use of the methods of multidimensional analysis for the creation of a hypothesis on the mechanism of the epidemiological process of dysentery in organized groups. The main risk factors have been established, and their role in the formation of annual, …