An introduction to power and sample size estimation

Emergency Medicine Journal, Volume 20, Issue 5



S R Jones,1 S Carley,2 M Harrison3

1 North Manchester Hospital, Manchester, UK
2 Royal Bolton Hospital, Bolton, UK
3 North Staffordshire Hospital, UK

Correspondence to: Dr S R Jones, Emergency Department, Manchester Royal Infirmary, Oxford Road, Manchester M13 9WL, UK; steve.r.jones@bigfoot.com

The importance of power and sample size estimation for study design and analysis.

  • research design
  • sample size

https://doi.org/10.1136/emj.20.5.453


  • Understand power and sample size estimation.
  • Understand why power is an important part of both study design and analysis.
  • Understand the differences between sample size calculations in comparative and diagnostic studies.
  • Learn how to perform a sample size calculation: (a) for continuous data; (b) for non-continuous data; (c) for diagnostic tests.

POWER AND SAMPLE SIZE ESTIMATION

Power and sample size estimations are measures of how many patients are needed in a study. Nearly all clinical studies entail studying a sample of patients with a particular characteristic rather than the whole population. We then use this sample to draw inferences about the whole population.

In previous articles in this journal's statistics series, statistical inference has been used to determine whether the results found are true or possibly due to chance alone. Clearly we can reduce the possibility of our results arising from chance by eliminating bias in the study design, using techniques such as randomisation and blinding. However, another factor influences the possibility that our results may be incorrect: the number of patients studied. Intuitively we assume that the greater the proportion of the whole population studied, the closer we will get to the true answer for that population. But how many patients do we need to study to get as close as we need to the right answer?

WHAT IS POWER AND WHY DOES IT MATTER?

Power and sample size estimations are used by researchers to determine how many subjects are needed to answer the research question (that is, to test the null hypothesis).

An example is the case of thrombolysis in acute myocardial infarction (AMI). For many years clinicians felt that this treatment would be of benefit given the proposed aetiology of AMI; however, successive studies failed to prove the case. It was not until the completion of adequately powered “mega-trials” that the small but important benefit of thrombolysis was proved.

Generally these trials compared thrombolysis with placebo and often had a primary outcome measure of mortality at a certain number of days. The basic hypothesis for such a study might be, for example, that day 21 mortality differs between thrombolysis and placebo. There are then two hypotheses that we need to consider:

The null hypothesis is that there is no difference between the treatments in terms of mortality.

The alternative hypothesis is that there is a difference between the treatments in terms of mortality.

In trying to determine whether the two groups are the same (accepting the null hypothesis) or they are different (accepting the alternative hypothesis) we can potentially make two kinds of error. These are called a type I error and a type II error.

A type I error is said to have occurred when we reject the null hypothesis incorrectly (that is, it is true and there is no difference between the two groups) and report a difference between the two groups being studied.

A type II error is said to occur when we accept the null hypothesis incorrectly (that is, it is false and there is a difference between the two groups which is the alternative hypothesis) and report that there is no difference between the two groups.

They can be expressed as a two by two table (table 1).

Table 1 Two by two table

                          Null hypothesis true    Null hypothesis false
Reject null hypothesis    Type I error            Correct conclusion
Accept null hypothesis    Correct conclusion      Type II error

Power calculations tell us how many patients are required in order to avoid a type I or a type II error.

The term power is commonly used with reference to all sample size estimations in research. Strictly speaking “power” refers to the probability of avoiding a type II error in a comparative study. Sample size estimation is a more encompassing term that looks at more than just the type II error and is applicable to all types of studies. In common parlance the terms are used interchangeably.

WHAT AFFECTS THE POWER OF A STUDY?

There are several factors that can affect the power of a study. These should be considered early on in the development of a study. Some of the factors we have control over, others we do not.

The precision and variance of measurements within any sample

Why might a study not find a difference if there truly is one? For any given result from a sample of patients we can only determine a probability distribution around that value that suggests where the true population value lies. The best known example of this is the 95% confidence interval. The width of the confidence interval is inversely related to the square root of the number of subjects studied, so the more people we study the more precise we can be about where the true population value lies.

Figure 1 shows that for a single measurement, the more subjects studied the narrower the probability distribution becomes. In group 1 the mean is 5 with wide confidence intervals (3–7). By doubling the number of patients studied (but in our example keeping the values the same) the confidence intervals have narrowed (3.5–6.5), giving a more precise estimate of the true population mean.


Figure 1 Change in confidence interval width with increasing numbers of subjects.
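As a rough sketch of this relationship (our illustration; the numbers are not the data behind figure 1), the half width of a 95% confidence interval for a mean is about 1.96 × SD/√n, so the interval narrows with the square root of the sample size:

```python
import math

def ci_95(mean, sd, n):
    """95% confidence interval for a mean, using the normal approximation."""
    half_width = 1.96 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Illustrative values only: doubling n narrows the interval by a factor of 1/sqrt(2).
for n in (25, 50, 100):
    low, high = ci_95(mean=5.0, sd=5.0, n=n)
    print(f"n={n:3d}: 95% CI = ({low:.2f}, {high:.2f})")
```

With n = 25 this gives an interval of roughly (3.0, 7.0), and with n = 50 roughly (3.6, 6.4), mirroring the narrowing described above.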

The probability distribution of where the true value lies is an integral part of most statistical tests for comparisons between groups (for example, t tests). A study with a small sample size will have large confidence intervals and will only show a statistically significant result if there is a large difference between the two groups. Figure 2 demonstrates how increasing the number of subjects can give a more precise estimate of differences.

Figure 2 Effect of confidence interval reduction to demonstrate a true difference in means. This example shows that the initial comparison between groups 1 and 3 showed no statistical difference, as the confidence intervals overlapped. In groups 3 and 4 the number of patients is doubled (although the mean remains the same). We see that the confidence intervals no longer overlap, indicating that the difference in means is unlikely to have occurred by chance.

The magnitude of a clinically significant difference

If we are trying to detect very small differences between treatments, very precise estimates of the true population value are required. This is because we need to find the true population value very precisely for each treatment group. Conversely, if we find, or are looking for, a large difference a fairly wide probability distribution may be acceptable.

In other words, if we are looking for a big difference between treatments we might be able to accept a wide probability distribution; if we want to detect a small difference we will need great precision and a narrow probability distribution. As the width of probability distributions is largely determined by how many subjects we study, it is clear that the difference sought affects sample size calculations.

Factors affecting a power calculation

  • Magnitude of a clinically significant difference
  • How certain we want to be to avoid type I error
  • The type of statistical test we are performing

When comparing two or more samples we usually have little control over the size of the effect. However, we need to make sure that the difference is worth detecting. For example, it may be possible to design a study that would demonstrate a reduction in the onset time of local anaesthesia from 60 seconds to 59 seconds, but such a small difference would be of no clinical importance. Conversely, a study demonstrating a difference of 60 seconds to 10 minutes clearly would be. Stating the “clinically important difference” is a key component of a sample size calculation.

How important is a type I or type II error for the study in question?

We can specify how concerned we would be to avoid a type I or type II error. A type I error is said to have occurred when we reject the null hypothesis incorrectly. Conventionally we choose a probability of <0.05 for a type I error: this means that if the treatments are in fact no different, a positive result of this size (or greater) would arise by chance on less than 5% of occasions. This figure, or significance level, is designated pα and is usually pre-set early in the planning of a study, when performing a sample size calculation. By convention, rather than design, we more often than not choose 0.05. The lower the significance level, the lower the power: using 0.01 will reduce our power accordingly.

(To avoid a type I error: that is, if we find a positive result, the chance of finding this, or a greater difference, would be less than α% of occasions.)

A type II error is said to occur when we accept the null hypothesis incorrectly and report that there is no difference between the two groups. If there truly is a difference between the interventions, we can specify how likely we are to find it. This figure is referred to as pβ. There is less convention as to the accepted level of pβ, but figures of 0.8–0.9 are common (that is, if a difference truly exists between interventions then we will find it on 80%–90% of occasions).

The avoidance of a type II error is the essence of power calculations. The power of a study, pβ, is the probability that the study will detect a predetermined difference in measurement between the two groups, if it truly exists, given a pre-set value of pα and a sample size, N.

Sample size calculations indicate how the statistical tests used in the study are likely to perform. Therefore, it is no surprise that the type of test used affects how the sample size is calculated. For example, parametric tests are better at finding differences between groups than non-parametric tests (which is why we often try to transform data to a normal distribution). Consequently, an analysis reliant upon a non-parametric test (for example, Mann-Whitney U) will need more patients than one based on a parametric test (for example, Student’s t test).

SHOULD SAMPLE SIZE CALCULATIONS BE PERFORMED BEFORE OR AFTER THE STUDY?

The answer is definitely before, occasionally during, and sometimes after.

In designing a study we want to make sure that the work we do is worthwhile: that we get the correct answer, and that we get it in the most efficient way. We want to recruit enough patients to give our results adequate power, but not so many that we waste time and resources collecting more data than we need. Unfortunately, when designing the study we may have to make assumptions about the desired effect size and the variance within the data.

Interim power calculations are occasionally used when the data used in the original calculation are known to be suspect. They must be used with caution, as repeated analysis may lead a researcher to stop a study as soon as statistical significance is obtained (which may occur by chance at several points during subject recruitment). Once the study is under way, analysis of the interim results may be used to perform further power calculations, and adjustments made to the sample size accordingly. This may be done to avoid the premature ending of a study or, in the case of life saving or hazardous therapies, to avoid the prolongation of a study. Interim sample size calculations should only be used when specified a priori in the research methods.

When we are assessing trials with negative results it is particularly important to question the sample size of the study. It may well be that the study was underpowered and that we have incorrectly accepted the null hypothesis: a type II error. If the study had had more subjects, a difference might well have been detected. In an ideal world this should never happen, because a sample size calculation should appear in the methods section of all papers; in reality this is not the case. As consumers of research we should be able to estimate the power of a study from the given results.

Retrospective sample size calculations are not covered in this article. Several calculators for retrospective sample size are available on the internet: UCLA power calculators (http://calculators.stat.ucla.edu/powercalc/) and Interactive statistical pages (http://www.statistics.com/content/javastat.html).

WHAT TYPE OF STUDY SHOULD HAVE A POWER CALCULATION PERFORMED?

Nearly all quantitative studies can be subjected to a sample size calculation. However, they may be of little value in early exploratory studies where scarce data are available on which to base the calculations (though this may be addressed by performing a pilot study first and using the data from that).

Clearly sample size calculations are a key component of clinical trials as the emphasis in most of these studies is in finding the magnitude of difference between therapies. All clinical trials should have an assessment of sample size.

In other study types sample size estimation should be performed to improve the precision of our final results. For example, the principal outcome measures for many diagnostic studies will be the sensitivity and specificity of a particular test, typically reported with confidence intervals for these values. As with comparative studies, the greater the number of patients studied, the more likely the sample finding is to reflect the true population value. By performing a sample size calculation for a diagnostic study we can specify the precision with which we would like to report the confidence intervals for sensitivity and specificity.

As clinical trials and diagnostic studies are likely to form the core of research work in emergency medicine we have concentrated on these in this article.

POWER IN COMPARATIVE TRIALS

Studies reporting continuous normally distributed data

Suppose that Egbert Everard had become involved in a clinical trial involving hypertensive patients. A new antihypertensive drug, Jabba Juice, was being compared with bendrofluazide as a new first line treatment for hypertension (table 2).

Table 2 Egbert writes down some things that he thinks are important for the calculation

As you can see, the figures for pα and pβ are typical. They are usually set by convention rather than varying between studies, although, as we see below, they can change.

A key requirement is the “clinically important difference” we want to detect between the treatment groups. As discussed above this needs to be a difference that is clinically important as, if it is very small, it may not be worth knowing about.

Another figure that we need to know is the standard deviation of the variable within the study population. Blood pressure measurements are a form of normally distributed continuous data and as such will have a standard deviation, which Egbert has found from other studies looking at similar groups of people.

Once we know these last two figures we can work out the standardised difference and then use a table to give us an idea of the number of patients required.

The difference between the means is the clinically important difference—that is, it represents the difference between the mean blood pressure of the bendrofluazide group and the mean blood pressure of the new treatment group.

From Egbert’s scribblings:

standardised difference = clinically important difference ÷ standard deviation = 0.5

Using table 3 we can see that with a standardised difference of 0.5 and a power level (pβ) of 0.8 the number of patients required is 64. The table is for a one tailed hypothesis; the null hypothesis requires the study to be powerful enough to detect either treatment being better or worse than the other, so we will need a minimum of 64 × 2 = 128 patients. This is so that we make sure we capture differences falling on either side of the mean difference we have set.
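The arithmetic behind such tables can be sketched with the usual normal approximation (our assumption; the paper itself reads the figure from a published table): n per group = 2 × ((z(1 − α/2) + z(power)) / standardised difference)².

```python
from math import ceil
from statistics import NormalDist

def n_per_group(std_diff, alpha=0.05, power=0.80):
    """Per-group sample size for a two sided comparison of two means,
    using the standard normal approximation (a sketch, not the paper's table)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # 0.84 for power = 0.80
    return ceil(2 * ((z_alpha + z_power) / std_diff) ** 2)

print(n_per_group(0.5))  # 63 per group, close to the tabulated 64 (128 in total)
```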

Table 3 How power changes with standardised difference

Another method of setting the sample size is to use the nomogram developed by Gore and Altman,2 as shown in figure 3.

Figure 3 Nomogram for the calculation of sample size.

From this we can use a straight edge to join the standardised difference to the power required for the study. Where the edge crosses the middle axis, it indicates the number of patients, N, required.

The nomogram can also be used to calculate power for a two tailed hypothesis comparison of a continuous measurement with the same number of patients in each group.

If the data are not normally distributed the nomogram is unreliable and formal statistical help should be sought.

Studies reporting categorical data

Suppose that Egbert Everard, in his constant quest to improve care for his patients suffering from myocardial infarction, had been persuaded by a pharmaceutical representative to help conduct a study into the new post-thrombolysis drug, Jedi Flow. He knew from previous studies that large numbers would be needed, so he performed a sample size calculation to determine just how daunting the task would be (table 4).

Table 4 Sample size calculation

Once again the figures for pα and pβ are standard, and we have set the level for a clinically important difference.

Unlike continuous data, the sample size calculation for categorical data is based on proportions. However, similar to continuous data we still need to calculate a standardised difference. This enables us to use the nomogram to work out how many patients are needed.

p1 = proportional mortality in thrombolysis group = 12% or 0.12

p2 = proportional mortality in Jedi Flow group = 9% or 0.09 (this reflects the 3% clinically important difference in mortality we want to show)

P = (p1 + p2)/2 = (0.12 + 0.09)/2 = 0.105

standardised difference = (p1 − p2)/√(P(1 − P)) = 0.03/√(0.105 × 0.895) ≈ 0.1

The standardised difference is 0.1. If we use the nomogram and draw a line from 0.1 to the power axis at 0.8, we can see from the intercept with the central axis, at the 0.05 pα level, that we need 3000 patients in the study. This means we need 1500 patients in the Jedi Flow group and 1500 in the thrombolysis group.
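The same normal approximation can stand in for the nomogram here (our sketch; the authors read the answer from figure 3):

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_props(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two proportions via the
    standardised difference, as in the Jedi Flow example (normal
    approximation rather than the nomogram)."""
    p_bar = (p1 + p2) / 2
    std_diff = abs(p1 - p2) / sqrt(p_bar * (1 - p_bar))
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / std_diff) ** 2)

# ~1640 per group with the unrounded standardised difference (0.098);
# rounding to 0.1, as in the text, gives ~1570, and the nomogram reads ~1500.
print(n_per_group_props(0.12, 0.09))
```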

POWER IN DIAGNOSTIC TESTS

Power calculations are rarely reported in diagnostic studies and in our experience few people are aware of them. They are of particular relevance to emergency medicine practice because of the nature of our work. The methods described here are taken from the work by Buderer. 3

Dr Egbert Everard decides that the diagnosis of ankle fractures may be improved by the use of a new hand held ultrasound device in the emergency department at Death Star General. The DefRay device is used to examine the ankle and gives a read out of whether the ankle is fractured or not. Dr Everard thinks this new device may reduce the need for patients to wait hours in the radiology department, thereby avoiding all the ear ache from patients when they come back. He thinks that the DefRay may be used as a screening tool: only those patients with a positive DefRay test would be sent to the radiology department to demonstrate the exact nature of the injury.

He designs a diagnostic study where all patients with suspected ankle fracture are examined in the emergency department using the DefRay. This result is recorded and then the patients are sent around for a radiograph regardless of the result of the DefRay test. Dr Everard and a colleague will then compare the results of the DefRay against the standard radiograph.

Missed ankle fractures have cost Dr Everard’s department a lot of money in the past year, so it is very important that the DefRay performs well if it is to be accepted as a screening test. Egbert wonders how many patients he will need. He writes down some notes (table 5).

Table 5 Everard’s calculations

For a diagnostic study we calculate the power required to achieve either an adequate sensitivity or an adequate specificity. The calculations work around the standard two by two way of reporting diagnostic data, as shown in table 6.

Table 6 Two by two reporting table for diagnostic tests

To calculate the number of patients needed for adequate sensitivity:

TP + FN = Z² × SN(1 − SN) / W², and N = (TP + FN) / P

To calculate the number of patients needed for adequate specificity:

FP + TN = Z² × SP(1 − SP) / W², and N = (FP + TN) / (1 − P)

where Z is the normal deviate for the chosen confidence level (1.96 for 95%), SN and SP are the anticipated sensitivity and specificity, P is the prevalence of the condition, and W is the acceptable precision (the maximum half width of the confidence interval).

If Egbert were equally interested in having a test with adequate specificity and sensitivity, we would take the greater of the two figures, but he is not: he is most interested in making sure the test has a high sensitivity, so that it can rule out ankle fractures. He therefore takes the figure for sensitivity, 243 patients.
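Buderer's formulas (reference 3) translate directly into code. The sketch below uses placeholder values for the anticipated sensitivity, prevalence, and precision, since table 5 is not reproduced here; it is not Egbert's actual calculation.

```python
from math import ceil
from statistics import NormalDist

def n_for_sensitivity(sens, prevalence, precision, alpha=0.05):
    """Total patients so the 95% CI around sensitivity has half width
    `precision` (Buderer's method): cases needed, scaled by prevalence."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    cases = z**2 * sens * (1 - sens) / precision**2     # TP + FN
    return ceil(cases / prevalence)

def n_for_specificity(spec, prevalence, precision, alpha=0.05):
    """Total patients so the 95% CI around specificity has half width
    `precision`: non-cases needed, scaled by (1 - prevalence)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    noncases = z**2 * spec * (1 - spec) / precision**2  # FP + TN
    return ceil(noncases / (1 - prevalence))

# Placeholder values, for illustration only (not from table 5):
print(n_for_sensitivity(sens=0.90, prevalence=0.40, precision=0.05))  # 346
print(n_for_specificity(spec=0.80, prevalence=0.40, precision=0.05))  # 410
```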

Sample size estimation is key in performing effective comparative studies. An understanding of the concepts of power, sample size, and type I and II errors will help the researcher and the critical reader of the medical literature.

What factors affect a power calculation for a trial of therapy?

Dr Egbert Everard wants to test a new blood test (Sithtastic) for the diagnosis of the dark side gene. He wants the test to have a sensitivity of at least 70% and a specificity of at least 90%, estimated with 5% precision. Disease prevalence in this population is 10%.

– (i) How many patients does Egbert need to be 95% sure his test is more than 70% sensitive?

– (ii) How many patients does Egbert need to be 95% sure that his test is more than 90% specific?

Suppose Dr Everard were to trial a new treatment for light sabre burns that, it was hoped, would reduce mortality from 55% to 45%. He sets pα to 0.05 and pβ to 0.99, but finds that he needs lots of patients, so to make his life easier he changes the power to 0.80.

How many patients in each group did he need with pα of 0.05 and pβ of 0.80?

How many patients did he need with the higher (original) power?

Quiz answers

(i) 2881 patients; (ii) 81 patients

(i) about 400 patients in each group; (ii) about 900 patients in each group
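As a rough check on the second answer (our own arithmetic, using the standardised difference method described above): the mean proportion is (0.55 + 0.45)/2 = 0.5, so the standardised difference is (0.55 − 0.45)/√(0.5 × 0.5) = 0.2. With pα = 0.05 and power 0.80, n per group ≈ 2 × ((1.96 + 0.84)/0.2)² ≈ 392, the quoted “about 400”; with power 0.99 (normal deviate 2.33), n per group ≈ 2 × ((1.96 + 2.33)/0.2)² ≈ 920, the quoted “about 900”.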

Acknowledgments

We would like to thank Fiona Lecky, honorary senior lecturer in emergency medicine, Hope Hospital, Salford for her help in the preparation of this paper.

1. Driscoll P, Wardrope J. An introduction to statistics. J Accid Emerg Med 2000;17:205.
2. Gore SM, Altman DG. How large a sample. In: Statistics in practice. London: BMJ Publishing, 2001:6–8.
3. Buderer NM. Statistical methodology: I. Incorporating the prevalence of disease into the sample size calculation for sensitivity and specificity. Acad Emerg Med 1996;3:895–900.

Correction notice Following recent feedback from a reader, the authors have corrected this article. The original version of this paper stated that: “Strictly speaking, ‘power’ refers to the number of patients required to avoid a type II error in a comparative study.” However, the formal definition of power is that it is the probability of avoiding a type II error (rejecting the alternative hypothesis when it is true), rather than a reference to the number of patients. Power is, however, related to sample size, as power increases as the number of patients in the study increases. This statement has therefore been corrected to: “Strictly speaking, ‘power’ refers to the probability of avoiding a type II error in a comparative study.”

Linked Articles

  • Correction. Emergency Medicine Journal 2004;21:126. Published Online First: 20 Jan 2004.
  • Correction: An introduction to power and sample size estimation. Emergency Medicine Journal 2023;40:e4. Published Online First: 27 Sep 2023. doi:10.1136/emj.20.5.453corr2


Study design in clinical research: sample size estimation and power analysis

  • Published: February 1996
  • Volume 43, pages 184–191 (1996)

Jerrold Lerman

An Erratum to this article was published on 01 August 1996

The purpose of this review is to describe the statistical methods available to determine sample size and power analysis in clinical trials. The information was obtained from standard textbooks and personal experience. Equations are provided for the calculations and suggestions are made for the use of power tables. It is concluded that sample size calculations and power analysis can be performed with the information provided and that the validity of clinical investigation would be improved by greater use of such analyses.




Author information

Department of Anaesthesia and the Research Institute, The Hospital for Sick Children and University of Toronto, Toronto, Ontario, Canada

Jerrold Lerman

Correspondence to Jerrold Lerman.


About this article

Lerman, J. Study design in clinical research: sample size estimation and power analysis. Can J Anaesth 43 , 184–191 (1996). https://doi.org/10.1007/BF03011261


  • statistics : sample size calculation, power analysis


U.S. Food and Drug Administration


Statistical methods to improve precision and reduce the required sample size in many phase 2 and 3 clinical trials, including COVID-19 trials, by covariate adjustment

CERSI Collaborators: Michael Rosenblum, PhD; Joshua Betz, MS; Kelly Van Lancker, PhD; Bingkai Wang, PhD

FDA Collaborators:  Daniel Rubin, PhD, CDER; Greg Levin, PhD, CDER; Boguang Zhen, PhD, CBER; Gene Pennello, PhD, CDRH

CERSI Subcontractors: Weill Cornell Medicine: Art Sedrakyan, MD, PhD; Jim C. Hu, MD, MPH; Jialin Mao, MD, MS; Miko Yu, MA; Sendong Zhao, PhD; Vahan Simonyan, PhD

Project Start Date: March 1, 2020

Regulatory Science Challenge

Investigators are addressing the following FDA research priority: “Developing methods and tools to improve and streamline clinical and post-market evaluation of FDA-regulated products.”

Project Description and Goals

Clinical trials are often conducted to learn whether new medical treatments are safe and effective. Data collected when participants first enter a trial are called baseline variables. Examples of baseline variables include age, sex, and disease severity. Sometimes, due to chance, there are imbalances in these variables between those assigned to the experimental treatment arm (or group) and those assigned to the control arm (or group). For example, in some trials participants in the treatment arm may have higher or lower baseline disease severity compared with participants in the control arm.

When baseline variables (e.g., older age as a risk factor for worse outcomes among those with COVID-19 infection) are related to the outcome (e.g., one intended to support the effectiveness of a treatment), taking the baseline variables into account in the data analysis of a trial’s results can lead to more precise estimates of treatment effectiveness (i.e., smaller standard errors for the estimates). More precise estimation means that, at the planning stage, trials could be conducted with fewer participants than when baseline variables are not considered in the data analysis. Unfortunately, in many clinical trials information in baseline variables is not considered, leading to a greater chance that the trial will fail due to greater uncertainty regarding the conclusions that can be drawn, potentially wasting resources. At the planning stage, not accounting for baseline information may lead to the planning of a larger trial size and a longer trial duration than is necessary.

A major barrier to using baseline variables (called covariate adjustment) is that for many common types of clinical trial outcomes, such as binary, ordinal, and time-to-event outcomes, confusion remains as to what statistical approach is appropriate. The results of this project will help to overcome this barrier by demonstrating how to appropriately adjust for baseline variables to improve precision of treatment effect estimates for outcomes of these types. The statistical adjustments will be demonstrated on case studies in several disease areas as examples of best practices for the use of baseline variables in clinical trial data analysis. Investigators plan to disseminate these case studies to the public through a free, online tutorial on the FDA public website.
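To see why covariate adjustment buys precision, here is a toy simulation (our illustration only, not the project's methods or code): both estimators below target the same treatment effect, but the estimator that adjusts for a prognostic baseline variable has a visibly smaller standard error, which is what allows a smaller planned sample size.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_trial(n=200, effect=1.0):
    """Simulate one randomised trial with a prognostic baseline covariate."""
    baseline = rng.normal(size=n)            # e.g. baseline disease severity
    treat = rng.integers(0, 2, size=n)       # 1:1 randomisation
    outcome = effect * treat + 2.0 * baseline + rng.normal(size=n)
    # Unadjusted estimate: simple difference in group means.
    unadj = outcome[treat == 1].mean() - outcome[treat == 0].mean()
    # Adjusted estimate: least squares of outcome on treatment and baseline.
    X = np.column_stack([np.ones(n), treat, baseline])
    adj = np.linalg.lstsq(X, outcome, rcond=None)[0][1]
    return unadj, adj

results = np.array([one_trial() for _ in range(2000)])
print(f"SD of unadjusted estimate: {results[:, 0].std():.3f}")
print(f"SD of adjusted estimate:   {results[:, 1].std():.3f}")
# The adjusted estimator's smaller spread is the precision gain that
# lets a trial be planned with fewer participants for the same power.
```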

Publications

Williams, N., Rosenblum, M. & Díaz, I. (2022) Optimising precision and power by machine learning in randomised trials with ordinal and time-to-event outcomes with an application to COVID-19. Journal of the Royal Statistical Society: Series A (Statistics in Society) , 1– 23. https://doi.org/10.1111/rssa.12915

Wang, B., Susukida, R., Mojtabai, R., Amin-Esmaeili, M., and Rosenblum, M. (2021) Model-Robust Inference for Clinical Trials that Improve Precision by Stratified Randomization and Adjustment for Additional Baseline Variables. Journal of the American Statistical Association, Theory and Methods Section. https://www.tandfonline.com/doi/full/10.1080/01621459.2021.1981338

Kelly Van Lancker, Joshua Betz, Michael Rosenblum. Combining Covariate Adjustment with Group Sequential, Information Adaptive Designs to Improve Randomized Trial Efficiency. Under review: https://arxiv.org/abs/2201.12921


Sample Size Estimation in Clinical Research: From Randomized Controlled Trials to Observational Studies.

Chest, 01 Jul 2020, 158(1S):S12-S20. https://doi.org/10.1016/j.chest.2020.03.010 PMID: 32658647



Sample Size Calculator

Determines the minimum number of subjects for adequate study power.

The calculator covers two study group designs (two independent study groups, or one study group vs. a population) and two types of primary endpoint (dichotomous (yes/no) or continuous (means)). The example discussed here is a dichotomous endpoint in a two independent sample study.

About This Calculator

This calculator uses a number of different equations to determine the minimum number of subjects that need to be enrolled in a study in order to have sufficient statistical power to detect a treatment effect.1

Before a study is conducted, investigators need to determine how many subjects should be included. By enrolling too few subjects, a study may not have enough statistical power to detect a difference (type II error). Enrolling too many patients can be unnecessarily costly or time-consuming.

Generally speaking, statistical power is determined by the following variables (a worked sketch in code follows the list):

  • Baseline Incidence: If an outcome occurs infrequently, many more patients are needed in order to detect a difference.
  • Population Variance: The higher the variance (standard deviation), the more patients are needed to demonstrate a difference.
  • Treatment Effect Size: If the difference between two treatments is small, more patients will be required to detect a difference.
  • Alpha: The probability of a type-I error -- finding a difference when a difference does not exist. Most medical literature uses an alpha cut-off of 5% (0.05) -- indicating a 5% chance that a significant difference is actually due to chance and is not a true difference.
  • Beta: The probability of a type-II error -- not detecting a difference when one actually exists. Beta is directly related to study power (Power = 1 - β). Most medical literature uses a beta cut-off of 20% (0.2) -- indicating a 20% chance that a true difference will be missed.
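A sketch of how these inputs combine for a dichotomous endpoint with two independent study groups, using the standard two-proportion formula (see Rosner1); this is our illustration, and the calculator's exact implementation may differ.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, beta=0.20):
    """Subjects per group for a dichotomous endpoint, two independent
    groups (standard two-proportion normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # type I error cut-off
    z_b = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

print(n_per_group(0.55, 0.45))  # 392 per group for event rates of 55% vs 45%
```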

Post-Hoc Power Analysis

To calculate the post-hoc statistical power of an existing trial, please visit the post-hoc power analysis calculator .

References and Additional Reading

  • 1. Rosner B. Fundamentals of Biostatistics. 7th ed. Boston, MA: Brooks/Cole; 2011.


Sample size re-estimation in clinical trials

Affiliation: Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina, USA.

PMID: 34433225. DOI: 10.1002/sim.9175

In clinical trials, sample size re-estimation is often conducted at an interim analysis. The purpose is to determine whether the study will achieve its objectives if the treatment effect observed at interim persists to the end of the study. A traditional approach is to conduct a conditional power analysis for sample size based only on the observed treatment effect. This approach, however, does not take into consideration the variability of (i) the observed (estimated) treatment effect and (ii) the observed (estimated) variability associated with the treatment effect. Thus, the resultant re-estimated sample sizes may not be robust and hence may not be reliable. In this article, two methods are proposed, namely the adjusted effect size (AES) approach and the iterated expectation/variance (IEV) approach, which account for the variability associated with the observed responses at interim. The proposed methods provide interval estimates of the sample size required for the intended trial, which is useful for making critical go/no-go decisions. Statistical properties of the proposed methods are evaluated in terms of control of the type I error rate and statistical power. The results show that the traditional approach performs poorly in controlling type I error inflation, whereas the IEV approach has the best performance in most cases. Additionally, all re-estimation approaches keep statistical power over 80%; in particular, the IEV approach's statistical power, using an adjusted significance level, is over 95%. However, the IEV approach may lead to a greater increase in sample size when detecting a smaller effect size. In general, the IEV approach is effective when the effect size is large; otherwise, the AES approach is more suitable for controlling the type I error rate and keeping power over 80% with a more reasonable re-estimated sample size.
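A minimal sketch of the "traditional approach" criticised above (our own illustration; the article's AES and IEV methods are more involved and are not reproduced here): plug the interim estimates of effect and variability into the usual two-group formula as if they were the true values.

```python
from math import ceil
from statistics import NormalDist

def reestimated_n_per_group(delta_hat, sd_hat, alpha=0.05, power=0.80):
    """Naive interim re-estimation: treats the observed effect and SD
    as the truth, ignoring the sampling variability the article warns about."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_p = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_a + z_p) * sd_hat / delta_hat) ** 2)

# Hypothetical interim estimates: effect 4 units, SD 15 units.
print(reestimated_n_per_group(delta_hat=4.0, sd_hat=15.0))  # 221 per group
```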

Keywords: adjusted effect size approach; adjusted statistical power; control of type I error rate; iterated expectation/variance approach.

© 2021 John Wiley & Sons Ltd.

  • Clinical Trials as Topic*
  • Research Design*
  • Sample Size


J Hum Reprod Sci, v.5(1); Jan-Apr 2012

This article has been retracted.

Sample size estimation and power analysis for clinical research studies.

Department of Biostatistics, National Institute of Animal Nutrition and Physiology, Bangalore, India

S Chandrashekara1

1 Department of Immunology and Rheumatology, ChanRe Rheumatology and Immunology Center and Research, Bangalore, India

Determining the optimal sample size for a study assures adequate power to detect statistical significance. Hence, it is a critical step in the design of a planned research protocol. Using too many participants in a study is expensive and exposes more subjects than necessary to the procedure; similarly, if the study is underpowered, it will be statistically inconclusive and may make the whole protocol a failure. This paper covers the essentials of calculating power and sample size for a variety of applied study designs. Sample size computations for a single group mean, survey-type studies, two-group studies based on means and proportions or rates, correlation studies, and case-control studies assessing a categorical outcome are presented in detail.

INTRODUCTION

Clinical research studies can be classified into surveys, experiments, observational studies, etc. They need to be carefully planned to achieve the objective of the study. The planning of good research has many aspects. The first step is to define the problem, which should be operational. The second step is to define the experimental or observational units and the appropriate subjects and controls. One has to define meticulously the inclusion and exclusion criteria, which should take care of all possible variables that could influence the observations and the units that are measured. The study design must be clear and the procedures defined using the best possible and available methodology. Based on these factors, the study must have an adequate sample size, relative to the goals and the possible variability of the study. The sample must be ‘big enough’ for an effect of the expected magnitude of scientific significance to also be statistically significant. At the same time, it is important that the study sample is not ‘too big’, where an effect of little scientific importance is nevertheless statistically detectable. In addition, sample size is important for economic reasons: an under-sized study can be a waste of resources, since it may not produce useful results, while an over-sized study uses more resources than necessary. In an experiment involving human or animal subjects, sample size is a critical ethical issue, since an ill-designed experiment exposes the subjects to potentially harmful treatments without advancing knowledge.[1, 2] Thus, a fundamental step in the design of clinical research is the computation of power and sample size. Power is the probability of correctly rejecting the null hypothesis that sample estimates (e.g. mean, proportion, odds, correlation coefficient, etc.) do not differ between study groups in the underlying population. Large values of power, at least 80%, are desirable given the available resources and ethical considerations. Power increases proportionately as the sample size for the study increases; accordingly, an investigator can control the study power by adjusting the sample size, and vice versa.[3, 4]

A clinical study will be expressed in terms of an estimate of effect, an appropriate confidence interval, and a P value. The confidence interval indicates the likely range of values for the true effect in the population, while the P value indicates how likely it is that the observed effect in the sample is due to chance. A related quantity is the statistical power: this is the probability of identifying a difference between the 2 groups in the study samples when one genuinely exists in the populations from which the samples were drawn.

Factors that affect the sample size

The calculation of an appropriate sample size relies on the choice of certain factors and in some instances on crude estimates. There are 3 factors that should be considered in the calculation of an appropriate sample size, summarized in Table 1. Each of these factors influences the sample size independently, but it is important to combine them all in order to arrive at an appropriate sample size.

Factors that affect sample size calculations


The normal deviates for different significance levels (type I error or alpha), for one tailed and two tailed alternative hypotheses, are shown in Table 2.

The normal deviates for Type I error (Alpha)


The normal deviates for different levels of power (the probability of rejecting the null hypothesis when it is not true, or one minus the probability of a type II error) are shown in Table 3.

The normal deviates for statistical power


Study design, outcome variable and sample size

Study design has a major impact on the sample size. Descriptive studies need hundreds of subjects to give acceptable confidence intervals for small effects. Experimental studies generally need smaller samples, and cross-over designs need one-quarter of the number required for a parallel-group comparison, because every subject receives the experimental treatment in a cross-over study. An evaluation study in a single group with a pre-post design needs half the number required for a similar study with a control group. A study design with a one-tailed hypothesis requires 20% fewer subjects than a two-tailed study. Non-randomized studies need 20% more subjects than randomized studies in order to accommodate confounding factors. An additional 10-20% of subjects are required to allow for withdrawals, missing data, losses to follow-up, etc.

The “outcome” expected under study should be considered. There are 3 possible categories of outcome. The first is a simple case where 2 alternatives exist: yes/no, dead/alive, vaccinated/not vaccinated, etc. The second category covers multiple, mutually exclusive alternatives such as religious beliefs or blood groups. For these 2 categories of outcome, the data are generally expressed as percentages or rates.[5-7] The third category covers continuous response variables such as weight, height, blood pressure, VAS score, IL-6, TNF-α, homocysteine, etc., which are continuous measures and are summarized as means and standard deviations. The appropriate sample size computation depends on which of these outcome measures is critical for the study; for example, a larger sample size is required to assess a categorical outcome than a continuous one.

Alpha level

Alpha is the probability of detecting a significant difference when the treatments are in fact equally effective, that is, the risk of a false positive finding. The alpha level used in determining the sample size in most academic research studies is either 0.05 or 0.01.[7] The lower the alpha level, the larger the sample size: for example, a study with an alpha level of 0.01 requires more subjects than a study with an alpha level of 0.05 for a similar outcome variable. A lower alpha, 0.01 or less, is used when the decisions based on the research are critical and errors may cause substantial financial or personal harm.

Variance or standard deviation

The variance or standard deviation for the sample size calculation is obtained either from previous studies or from a pilot study. The larger the standard deviation, the larger the sample size required: for example, a study whose primary outcome variable is TNF-α needs more subjects than one based on birth weight or a 10-point VAS score, because the natural variability of TNF-α is wider.

Minimum detectable difference

This is the expected difference or relationship between 2 independent samples, also known as the effect size. The obvious question is how to know the difference in a study that has not yet been conducted. If available, the effect size found in prior studies may be used. Where no previous study exists, the effect size is determined from literature review, logical assertion, and conjecture.

The difference between 2 groups in a study will be explored in terms of an estimate of effect, an appropriate confidence interval, and a P value. The confidence interval indicates the likely range of values for the true effect in a population, while the P value indicates how likely it is that the observed effect in the sample is due to chance. A related quantity is the statistical power of the study: the probability of detecting a predefined clinically significant difference. The ideal study is one with high power. This means that the study has a high chance of detecting a difference between groups if one exists; consequently, if the study demonstrates no difference between the groups, the researcher can be reasonably confident in concluding that none exists. The ideal power for any study is considered to be 80%.[8]

In research, statistical power is generally calculated with 2 objectives: 1) it can be calculated before data collection, based on information from previous studies, to decide the sample size needed for the current study; 2) it can also be calculated after data analysis. The second situation occurs when the result turns out to be non-significant; in this case, statistical power is calculated to verify whether the non-significant result is due to a lack of relationship between the groups or to a lack of statistical power.

Statistical power is positively correlated with the sample size: given the levels of the other factors, namely alpha and the minimum detectable difference, a larger sample size gives greater power. However, researchers should be clear about the distinction between a statistical difference and a scientific difference. Although a larger sample size enables researchers to find a smaller difference statistically significant, that difference may not be scientifically meaningful. Therefore, it is recommended that researchers have a prior idea of what they would expect to be a scientifically meaningful difference before doing a power analysis to determine the actual sample size needed. Power analysis is now integral to the health and behavioural sciences, and its use is steadily increasing wherever empirical studies are performed.

Withdrawals, missing data and losses to follow-up

The sample size calculated is the total number of subjects required for the final study analysis. There are a few practical issues that need to be considered when calculating the number of subjects required. Not all eligible subjects may be willing to take part, and it may be necessary to screen more subjects than the final number entering the study. In addition, even in well-designed and well-conducted studies, it is unusual to finish with a dataset that is complete, in a usable format, for all the subjects recruited. The reasons may be subject factors: subjects may fail or refuse to give valid responses to particular questions, physical measurements may suffer from technical problems, and in studies involving follow-up (e.g. trials or cohort studies) there will be some degree of attrition. The reasons may also be technical or procedural, such as contamination or failure to get an assessment or test performed in time. It may therefore be necessary to consider these issues before calculating the number of subjects to be recruited in a study in order to achieve the final desired sample size.

For example, say that in a study a total of N subjects are required at the end of the study with all data complete for analysis, but a proportion q are expected to refuse to participate or to drop out before the study ends. In this case, the following total number of subjects (N1) would have to be recruited to ensure that the final sample size (N) is achieved:

N1 = N / (1 − q)

The proportion of eligible subjects who will refuse to participate or provide inadequate information is unknown at the beginning of the study. Approximate estimates are often possible using information from similar studies in comparable populations or from an appropriate pilot study.[9]

Sample size estimation for proportion in survey type of studies

A common goal of survey research is to collect data representative of population. The researcher uses information gathered from the survey to generalize findings from a drawn sample back to a population, within the limits of random error. The general rule relative to acceptable margins of error in survey research is 5 - 10%. The sample size can be estimated using the following formula

N = (Zα/2)² P(1 − P) D / E²

where P is the prevalence or proportion of the event of interest for the study, and E is the precision (or margin of error) with which the researcher wants to measure it. Generally, E will be 10% of P. Zα/2 is the normal deviate for a two-tailed alternative hypothesis at a given level of significance; for example, Zα/2 is 1.96 for the 5% level of significance and 2.58 for the 1% level, as shown in Table 2. D is the design effect, which reflects the sampling design used in the survey. It is 1 for simple random sampling, and takes higher values (usually 1 to 2) for other designs such as stratified, systematic or cluster random sampling, to compensate for the deviation from a simple random sampling procedure. The design effect for cluster random sampling is usually taken as 1.5 to 2, and for purposive, convenience or judgement sampling D can exceed 10. The higher the design effect, the larger the sample size required. Simple random sampling is unlikely to be the method used in an actual field survey; if another sampling method such as systematic, stratified or cluster sampling is used, a larger sample size is likely to be needed because of the design effect.[10-12] In an impact study, P may be set at 50% to reflect the assumption that an impact is expected in 50% of the population; a P of 50% is also the most conservative estimate.

Example: A researcher wants to know the sample size for a survey measuring the prevalence of obesity in a certain community. Previous literature estimates obesity at 20% in the population to be surveyed. Assuming a 95% confidence level (5% level of significance) and a 10% margin of error, the sample size is calculated as follows:

N = (Zα/2)² P(1 − P) D / E² = (1.96)² × 0.20 × (1 − 0.20) × 1 / (0.10 × 0.20)² = 3.8416 × 0.16 / (0.02)² = 1537 for a simple random sampling design (D = 1). Hence, a sample size of 1537 is required to conduct a community-based survey to estimate the prevalence of obesity. Note that E is the margin of error; in the present example, E = 10% × 0.20 = 0.02.

To find the final adjusted sample size, allowing for a non-response rate of 10% in the above example, the adjusted sample size will be 1537/(1 − 0.10) = 1537/0.90 = 1708.
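A minimal sketch of the survey formula and non-response adjustment above, reproducing the worked example (the function name and parameters are illustrative, not from the article):

```python
import math

def survey_n(p, rel_error, z=1.96, deff=1.0, nonresponse=0.0):
    """Sample size to estimate a proportion p with relative margin of error
    rel_error (absolute error E = rel_error * p), design effect deff, and an
    anticipated non-response proportion."""
    e = rel_error * p
    n = z**2 * p * (1 - p) * deff / e**2
    return math.ceil(n / (1.0 - nonresponse))

print(survey_n(0.20, 0.10))                    # 1537, as in the example
print(survey_n(0.20, 0.10, nonresponse=0.10))  # 1708 after 10% non-response
```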

Sample size estimation with single group mean

If a researcher is conducting a study in a single group, such as outcome assessment in a group of patients subjected to a certain treatment or patients with a particular type of illness, and the primary outcome is a continuous variable whose results are expressed as a mean and standard deviation, the sample size can be estimated using the following formula:

N = (Zα/2)² s² / d²,

where s is the standard deviation obtained from a previous study or a pilot study, and d is the precision of the estimate, that is, how close to the true mean the estimate should be. Zα/2 is the normal deviate for a two-tailed alternative hypothesis at a given level of significance.

For research studies with a one-tailed hypothesis, the above formula can be rewritten as

N = (Zα)² s² / d², where the Zα values are 1.645 and 2.33 for the 5% and 1% levels of significance respectively.

Example: In a study estimating the mean weight of a population, the investigator wants the error of estimation to be less than 2 kg of the true mean, the sample standard deviation is 5, and the confidence level is 95% (5% level of significance). The sample size is estimated as N = (1.96)² (5)² / 2² ≈ 24 subjects. If an allowance of 10% for missing data, losses to follow-up and withdrawals is assumed, the corrected sample size is 24/(1 − 0.10) = 24/0.9 ≈ 27 subjects; with a 20% allowance, the corrected sample size is 30.
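A minimal sketch of the single-mean formula with an attrition allowance (illustrative code, not from the article; note that rounding conventions differ, as the comments indicate):

```python
import math

def single_mean_n(s, d, z=1.96, attrition=0.0):
    """Sample size to estimate a mean to within +/- d, given SD s,
    inflated for an anticipated attrition proportion."""
    n = (z * s / d) ** 2
    return math.ceil(n / (1.0 - attrition))

print(single_mean_n(5, 2))                  # 25; the text rounds 24.01 down to 24
print(single_mean_n(5, 2, attrition=0.10))  # about 27 after a 10% allowance
```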

Sample size estimation with two means

In a study comparing two means, the hypotheses are: null hypothesis H0: m1 = m2 versus alternative hypothesis Ha: m1 = m2 + d, where d is the difference between the two means, and n1 and n2 are the sample sizes of Group I and Group II, such that N = n1 + n2. The ratio r = n1/n2 is used whenever the researcher needs unequal sample sizes for ethical, cost, availability or other reasons.

The total sample size for the study is then:

N = [(r + 1)² / r] × s² (Zα/2 + Zβ)² / d²,

where s is the common standard deviation of the outcome and Zβ is the normal deviate corresponding to the desired power (0.84 for 80% power, 1.28 for 90% power).
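A minimal sketch of the two-mean formula above (the difference of 0.5 and standard deviation of 1.195 are hypothetical figures chosen purely for illustration):

```python
import math

def two_means_total_n(s, d, r=1.0, z_alpha=1.96, z_beta=0.84):
    """Total sample size to detect a mean difference d with common SD s and
    allocation ratio r = n1/n2 (defaults: 5% two-tailed alpha, 80% power)."""
    n = ((r + 1) ** 2 / r) * (s ** 2) * (z_alpha + z_beta) ** 2 / d ** 2
    return math.ceil(n)

# Equal groups: detect a 0.5-unit difference assuming SD = 1.195.
print(two_means_total_n(s=1.195, d=0.5))  # about 180 in total, 90 per arm
```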

Sample size estimation with two proportions

In a study whose outcome is the proportion of events in two populations (groups), such as the percentage of complications, mortality, improvement, awareness, or surgical or medical outcomes, sample size estimation is based on the proportions of the outcome, obtained from a previous literature review or from a pilot study on a smaller sample. For a study with null hypothesis H0: π1 = π2 versus Ha: π1 = π2 + d, where π1 and π2 are the population proportions and p1 and p2 are the corresponding sample estimates, the sample size for equal groups can be estimated using the following formula:

n (per group) = (Zα/2 + Zβ)² [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)², with total N = 2n.

If a researcher is planning a study with unequal groups, he or she must first calculate N as if the groups were equal, and then calculate the modified sample size. If r = n1/n2 is the ratio of the sample sizes in the two groups, the required total is N1 = N(1 + r)²/4r. For example, if n1 = 2n2 (a 2:1 allocation for group 1 versus group 2), then N1 = 9N/8, a fairly small increase in total sample size.

Example: It is believed that the proportion of patients who develop complications after undergoing one type of surgery is 5%, while the proportion who develop complications after a second type of surgery is 15%. How large should the sample be in each of the two groups if an investigator wishes to detect, with a power of 90%, whether the second procedure has a significantly higher complication rate than the first at the 5% level of significance?

In the example,

  • a) Test value of the difference in complication rates: 0%
  • b) Anticipated complication rates: 5% and 15% in the two groups
  • c) Level of significance: 5%
  • d) Power of the test: 90%
  • e) Alternative hypothesis (one-tailed): (p1 − p2) < 0

Under these assumptions, the total sample size required is 74 with equal allocation; for unequal allocation of 1.5:1 (ie, r = 1.5), the total sample size will be 77, with 46 in group I and 31 in group II.
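As a hedged sketch (not the article's own calculation), the code below implements the standard unpooled two-proportion formula together with the unequal-allocation adjustment N(1 + r)²/4r. Note that different variants of this calculation (pooled variance, continuity correction, arcsine transformation) give materially different answers, so the results need not match the figures quoted above.

```python
import math

def two_props_total_n(p1, p2, z_alpha=1.645, z_beta=1.28, r=1.0):
    """Total N to detect proportions p1 vs p2 (defaults: one-tailed 5% alpha,
    90% power); r = n1/n2 applies the adjustment N * (1 + r)^2 / (4r)."""
    n_equal = 2 * (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
    return math.ceil(n_equal * (1 + r) ** 2 / (4 * r))

print(two_props_total_n(0.05, 0.15))         # about 300 with this variant
print(two_props_total_n(0.05, 0.15, r=1.5))  # about 312 with 1.5:1 allocation
```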

Sample size estimation with a correlation coefficient

In an observational study that aims to estimate the correlation (r) between two variables of interest, say X and Y, with a hypothesis of the form H0: r = 0 against Ha: r ≠ 0, the sample size can be obtained by computing:

N = [(Zα/2 + Zβ) / C]² + 3, where C = 0.5 × ln[(1 + r)/(1 − r)] is the Fisher z-transformation of the anticipated correlation r.

Example: According to the literature, the correlation between salt intake and systolic blood pressure is around 0.30. A study is conducted to test this correlation in a population, at the 1% level of significance and with 90% power. The sample size for such a study can be estimated as follows:

C = 0.5 × ln(1.30/0.70) ≈ 0.31, so N = [(2.58 + 1.28)/0.31]² + 3 ≈ 159 subjects for a two-tailed test.

Sample size estimation with odds ratio

In a case-control study, when the outcome variables of interest are categorical, the data are usually summarized as an odds ratio rather than as a difference between two proportions. If P1 and P2 are the proportions of cases and controls, respectively, exposed to a risk factor, then:

OR = [P1/(1 − P1)] / [P2/(1 − P2)], and the sample size per group is n = (Zα/2 + Zβ)² × [1/(P1(1 − P1)) + 1/(P2(1 − P2))] / [ln(OR)]².

Example: The prevalence of vertebral fracture in a population is 25%. A study aims to estimate the effect of smoking on fracture, targeting an odds ratio of 2, at the 5% level of significance (one-sided test) and with 80% power. The total sample size for a study with equal group sizes can be estimated by:

Taking P2 = 0.25 and OR = 2, P1 = OR × P2 / [1 + P2(OR − 1)] = 0.50/1.25 = 0.40. Then n = (1.645 + 0.84)² × [1/(0.40 × 0.60) + 1/(0.25 × 0.75)] / (ln 2)² ≈ 123 per group, or about 246 subjects in total.
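A minimal sketch of the log odds-ratio approach above, assuming p2 denotes the exposure proportion among controls (illustrative code, not from the article):

```python
import math

def case_control_n_per_group(p2, odds_ratio, z_alpha=1.645, z_beta=0.84):
    """Cases (= controls) needed to detect a given odds ratio
    (defaults: one-sided 5% alpha, 80% power)."""
    p1 = odds_ratio * p2 / (1 + p2 * (odds_ratio - 1))  # exposure among cases
    var = 1 / (p1 * (1 - p1)) + 1 / (p2 * (1 - p2))
    n = (z_alpha + z_beta) ** 2 * var / math.log(odds_ratio) ** 2
    return math.ceil(n)

print(case_control_n_per_group(0.25, 2))  # about 123 per group, 246 in total
```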

The equations in this paper assume that the selection of individuals is random and unbiased, and that the decision to include a subject in the study does not depend on whether or not that subject has the characteristic or outcome studied. Second, in studies in which a mean is calculated, the measurements are assumed to follow a normal distribution.[13,14]

Statistical power is closely tied to sample size: the power of a study increases as the sample size increases. Ideally, the minimum power required of a study is 80%. Hence, sample size calculation is a critical and fundamental step in designing a study protocol. Even after completion of a study, a retrospective power analysis can be useful, especially when a statistically non-significant result is obtained.[15] Here, the actual sample size and alpha level are known, and the variance observed in the sample provides an estimate of the population variance. A retrospective power analysis helps to establish whether a negative finding is a true negative finding.

The ideal study for the researcher is one in which the power is high. This means that the study has a high chance of detecting a difference between groups if one exists; consequently, if the study demonstrates no difference between groups, the researcher can be reasonably confident in concluding that none exists. The power of a study depends on several factors, but as a general rule higher power is achieved by increasing the sample size.[16] Many apparently null studies may be under-powered rather than genuinely demonstrating no difference between groups; absence of evidence is not evidence of absence.[9]

Sample size calculation is an essential step in research protocols, and the size of clinical studies must be justified in papers and reports. Nevertheless, one of the most common errors in papers reporting clinical trials is a lack of justification of the sample size, and it is a major concern that important therapeutic effects are being missed because of inadequately sized studies.[17,18] The purpose of this review is to provide a collection of formulas for sample size calculations, with examples, for a variety of situations likely to be encountered.

Often, researchers face constraints that may force them to use an inadequate sample size, for both practical and statistical reasons. These constraints may include limitations of budget, time, personnel and other resources. In such cases, researchers should report the appropriate sample size alongside the sample size actually used, the reasons for using an inadequate sample size, and a discussion of the effect the inadequate sample size may have on the results of the study. The researcher should exercise caution when making pragmatic recommendations based on research with an inadequate sample size.

Sample size determination is a major step in the design of a research study. Appropriately sized samples are essential to infer with confidence that sample estimates reflect the underlying population parameters. The sample size required to reject or accept a study hypothesis is determined by the power of the statistical test. A sufficiently powered study has a reasonable chance of answering the questions posed at the start of the research. Inadequately sized studies often result from investigators' unrealistic assumptions about the effectiveness of the study treatment. Misjudging the underlying variability of parameter estimates, wrongly estimating the follow-up period needed to observe the intended effects of the treatment, failing to anticipate poor compliance with the study regimen or high drop-out rates, and failing to account for the multiplicity of study endpoints are common errors in clinical research.

Conducting a study that has little chance of answering the hypothesis at hand is a misuse of time and valuable resources, and may unnecessarily expose participants to potential harm or unwarranted expectations of therapeutic benefit. As scientific and ethical issues go hand in hand, awareness of how to determine the minimum required sample size and how to apply appropriate sampling methods is extremely important for achieving scientifically and statistically sound results. Using an adequate sample size along with high quality data collection will yield more reliable, valid and generalizable results, and can also save resources. This paper is designed as a tool that researchers can use in planning and conducting quality research.

Source of Support: Nil

Conflict of Interest: None declared.
