Copyright © 2020. All Rights Reserved.
StatsToDo : Sample Size for Estimating Population Mean
This page describes the relationship between sample size and error in estimating the mean of a normally distributed measurement from a sample.

Establishing population means is a frequent research activity, particularly in the educational, social, and biomedical fields. Education departments may wish to know the mathematical abilities of a cohort of school children, obstetricians need to know the normal birth weight, and so on.

Terminology:

Level of confidence in the results. This is expressed as a percentage, the most commonly used being 95%.

Error, either defined by the analyst as tolerable during planning or estimated from the results at the end of data collection, is the distance between the mean and either end of the confidence interval, so that the confidence interval is CI = mean ± error. Error is calculated in Standard Deviation units (z), and the error in terms of the measurement itself is z x SD.

Sample size is the number of observations in the data.

This panel provides three calculations.

The first is to estimate the sample size requirement, using confidence level, tolerable error, and assumed Standard Deviation. The input data and results are in actual units of measurement. In the sample size table, however, the error/SD ratio is used, so the results are based on SD=1.

An example: We wish to establish the mean IQ of first year university students. We expect the Standard Deviation of IQ in the cohort to be 10, and we want the 95% confidence interval of the result to be ±2 IQ points. This is an error/SD ratio of z = 2 / 10 = 0.2. The sample size required is 99 subjects.

The second is to estimate error from the data already collected. This is based on the confidence level desired, and the sample size and Standard Deviation found in the data.
The result is the error in actual units of measurement, so that the confidence interval is mean ± error.

An example: We proceeded to measure the IQ of 97 university students, and found the mean and Standard Deviation of IQ in the group to be 110 and 12 respectively. From this we can establish that the error for the 95% confidence interval is ±2.4. The 95% CI is therefore 110 ± 2.4, or 107.6 to 112.4.

The third is an exploration of the relationship between sample size and error, and is used in pilot studies when the mean and Standard Deviation are not known. The program estimates the error with increasing sample size, using the value of 1 for Standard Deviation, so the results are in Standard Deviation units (z). The results are tabulated, allowing the researcher to determine the optimum sample size for a pilot study.

An example: We would like to know the optimum sample size to be used in a pilot study with 95% confidence. Examining the results from the Javascript program, we can conclude that a sample size of between 15 and 20 would allow us to obtain a 95% confidence interval of mean ± 0.5 SDs, or a sample size of approximately 65 to obtain a 95% confidence interval of mean ± 0.25 SDs. We can also conclude that, after the first 20 cases, each increase of 5 cases reduces error by less than 0.1 SDs. If the cost of data collection is great, we may decide to use 20 cases in the pilot study. However, if greater precision is a priority, we may use greater numbers.

Please note: Pilot studies obtain preliminary data that are useful during planning. The sample size is therefore an approximation, determined by a balance between the need for precision and the cost of data collection. The results of pilot studies therefore cannot be used for hypothesis testing or defining a population parameter.
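
The three calculations described above can be sketched in a few lines of Python. This is a minimal illustration under a stated assumption, not the page's own program: it uses the standard normal approximation, error = critical value x SD / sqrt(n), and the function names (`z_value`, `sample_size`, `error_from_data`, `error_table`) are hypothetical. Note that `z_value` here is the two-sided normal critical value (about 1.96 for 95%), not the error/SD ratio that the text calls z. Under this approximation the first example gives 97 subjects rather than the 99 quoted above. The larger figure would be consistent with a calculation based on Student's t distribution, but that is an assumption about the method and is not stated in the text. For small samples the normal approximation somewhat understates the error.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal critical value, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def sample_size(confidence, error, sd):
    """Smallest n for which the normal-approximation error is <= `error`."""
    return math.ceil((z_value(confidence) * sd / error) ** 2)


def error_from_data(confidence, n, sd):
    """Half-width of the confidence interval, in measurement units."""
    return z_value(confidence) * sd / math.sqrt(n)


def error_table(confidence, sizes):
    """Error in SD units (SD = 1) for each candidate sample size."""
    return [(n, error_from_data(confidence, n, 1.0)) for n in sizes]


if __name__ == "__main__":
    # First calculation: SD = 10, tolerable error = 2 IQ points.
    print(sample_size(0.95, error=2, sd=10))
    # Second calculation: 97 students, SD = 12; rounds to 2.4.
    print(round(error_from_data(0.95, n=97, sd=12), 1))
    # Third calculation: error in SD units for increasing sample sizes.
    for n, err in error_table(0.95, range(5, 70, 5)):
        print(f"{n:3d}  {err:.3f}")
```

Because the error depends on 1 / sqrt(n), the tabulated values fall quickly at first and then flatten, which is the pattern the pilot study example relies on.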
To obtain robust results, the sample size calculation should be used, and the results tested using the error calculation.

Reference

Machin D, Campbell M, Fayers P, Pinol A (1997) Sample Size Tables for Clinical Studies. Second Ed. Blackwell Science. ISBN 0-86542-870-0. p. 131-135