Copyright © 2020. All Rights Reserved.
Explanations and References
This page presents 4 programs related to sample size requirements for estimating the mean of paired differences. The programs are available as Javascript programs for immediate use via the web page and as R code, together with a table of sample sizes.
The programs and tables on this page assume that the data used and the paired differences are continuous measurements that are normally distributed. For non-parametric ordinal measurements, such as those analysed with the Wilcoxon Signed Rank Test, the power efficiency is 95.5% of that of the parametric paired t test. This means that, when calculating sample size, the β (and hence the power) used should be adjusted as follows:
for a power of 80% (0.8), β for the paired t test = 0.2, β for the Wilcoxon test = 0.2 * 0.955 = 0.191, or a power of 0.809 (80.9%)
for a power of 90% (0.9), β for the paired t test = 0.1, β for the Wilcoxon test = 0.1 * 0.955 = 0.0955, or a power of 0.905 (90.5%)
The following 4 programs are available on this page:
Program 1: Sample Size
Program 2: Power (1 - β)
Program 3: Confidence Interval
Program 4: Pilot Study
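The β adjustment for the Wilcoxon Signed Rank Test described above can be sketched in R. This is only an illustration of the arithmetic, using the 95.5% power efficiency stated in the text; the function name adjust_beta and its column names are hypothetical and do not come from the page's own programs.

```r
# Hypothetical helper (not part of the page's programs): scale beta by the
# 95.5% power efficiency quoted in the text for the Wilcoxon Signed Rank Test.
adjust_beta <- function(power, efficiency = 0.955) {
  beta_t <- 1 - power               # beta for the paired t test
  beta_w <- beta_t * efficiency     # adjusted beta for the Wilcoxon test
  data.frame(Power        = power,
             BetaT        = beta_t,
             BetaWilcoxon = beta_w,
             PowerToUse   = 1 - beta_w)  # power to enter in the sample size program
}

adj <- adjust_beta(c(0.8, 0.9))
print(adj)
```

The PowerToUse column is the value that would be entered as Power in the sample size program when the Wilcoxon test is planned instead of the paired t test.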
References
Machin D, Campbell M, Fayers P, Pinol A (1997) Sample Size Tables for Clinical Studies. Second Ed. Blackwell Science. ISBN 0-86542-870-0. p. 73-74
Johanson GA and Brooks GP (2010) Initial Scale Development: Sample Size for Pilot Studies. Educational and Psychological Measurement Vol. 70, Iss. 3; p. 394-400
Abbreviations
α = alpha, p, Probability of Type I Error
Power = 1 - β, 1 - Probability of Type II Error
es = Effect Size = Mean of paired differences to be detected / Expected Standard Deviation of the paired differences
Value in cells = sample size = number of pairs
bold : commonly used sample sizes for small, moderate, and large effect sizes
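The R code on this page computes the number of pairs from these quantities with a normal-approximation formula. It is written out here as a sketch, where z_α and z_β are notation assumed for this illustration (standard normal deviates) and are not symbols the page itself uses:

```latex
% z_alpha = \Phi^{-1}(1-\alpha) for 1 tail, \Phi^{-1}(1-\alpha/2) for 2 tail
% z_beta  = \Phi^{-1}(\text{Power}) = \Phi^{-1}(1-\beta)
n = \left\lceil \left( \frac{z_{\alpha} + z_{\beta}}{es} \right)^{2}
      + \frac{z_{\alpha}^{2}}{2} \right\rceil
\qquad\text{and, inverted for power,}\qquad
\text{Power} = \Phi\!\left( es \, \sqrt{\,n - \frac{z_{\alpha}^{2}}{2}\,} - z_{\alpha} \right)
```

For example, with α = 0.05 (2 tail), power = 0.8 and es = 0.5, z_α ≈ 1.960 and z_β ≈ 0.842, so the expression inside the ceiling is about 33.3 and n = 34 pairs.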
This panel presents R code related to sample size for paired differences.
Program 1: Sample Size
Alpha = Probability of Type I Error (α)
Power = 1 - β
Diff = mean of paired differences
SD = Standard Deviation of paired differences

    # Pgm 1: Sample Size
    # data entry
    dat = ("
    Alpha Power Diff SD
    0.05  0.8   0.5  1.0
    0.01  0.8   0.5  1.0
    0.05  0.9   0.5  1.0
    0.01  0.9   0.5  1.0
    ")
    df <- read.table(text = dat, header = TRUE)   # conversion to data frame
    # Calculations
    delta <- abs(df$Diff / df$SD)        # effect size
    zb <- abs(qnorm(1 - df$Power))       # z for beta
    # 1 tail
    za <- abs(qnorm(df$Alpha))           # 1 tail z for alpha
    f <- (za + zb) / delta
    df$SSiz1Tail <- ceiling(f^2 + za^2 / 2.0)
    # 2 tail
    za <- abs(qnorm(df$Alpha / 2))       # 2 tail z for alpha
    f <- (za + zb) / delta
    df$SSiz2Tail <- ceiling(f^2 + za^2 / 2.0)
    df   # show data frame with input data and results

The results are as follows. Sample size is the number of pairs.

    > df   # show data frame with input data and results
      Alpha Power Diff SD SSiz1Tail SSiz2Tail
    1  0.05   0.8  0.5  1        27        34
    2  0.01   0.8  0.5  1        43        51
    3  0.05   0.9  0.5  1        36        44
    4  0.01   0.9  0.5  1        55        63

Program 2: Power (1 - β)
Alpha = Probability of Type I Error (α)
N = sample size (number of pairs)
MEAN = mean of paired differences
SD = Standard Deviation of paired differences

    # Pgm 2: Power
    # data entry
    dat = ("
    Alpha N  MEAN SD
    0.05  34 0.5  1.0
    0.01  51 0.5  1.0
    0.05  44 0.5  1.0
    0.01  63 0.5  1.0
    ")
    df <- read.table(text = dat, header = TRUE)   # conversion to data frame
    # Calculations
    delta <- abs(df$MEAN / df$SD)        # effect size
    # 1 tail
    ZA <- abs(qnorm(df$Alpha))           # 1 tail z for alpha
    ZB <- delta * sqrt(df$N - ZA^2 / 2) - ZA
    df$Power1Tail <- pnorm(ZB)
    # 2 tail
    ZA <- abs(qnorm(df$Alpha / 2))       # 2 tail z for alpha
    ZB <- delta * sqrt(df$N - ZA^2 / 2) - ZA
    df$Power2Tail <- pnorm(ZB)
    df   # show data frame with input data and results

The results are as follows.

      Alpha  N MEAN SD Power1Tail Power2Tail
    1  0.05 34  0.5  1  0.8872503  0.8083861
    2  0.01 51  0.5  1  0.8745876  0.8097019
    3  0.05 44  0.5  1  0.9474256  0.9003350
    4  0.01 63  0.5  1  0.9401596  0.9009345

Program 3: Confidence Interval
Firstly, the subroutine to calculate confidence intervals, which is used by both this program and the pilot study program.

    # subroutine to calculate confidence interval
    # pc = % confidence, ssiz = number of pairs, sd = Standard Deviation of paired differences
    ConfIntv <- function(pc, ssiz, sd)
    {
      se = sd / sqrt(ssiz)                        # Standard Error
      alpha = (1 - pc / 100)                      # convert % confidence into alpha
      ci1 = qt(1 - alpha, ssiz - 1) * se          # confidence interval 1 tail
      ci2 = qt(1 - alpha / 2, ssiz - 1) * se      # confidence interval 2 tail
      return (c(ci1, ci2))                        # returns 1 and 2 tail CI
    }

Main program for confidence interval. Please note: the confidence interval here is the distance between the mean and the limit of the interval. The full confidence interval is mean ± CI, so its width is twice the value shown here.

    # data entry: PC = % confidence, N = sample size in number of pairs,
    # SD = Standard Deviation of paired differences
    dat = ("
    PC N  SD
    95 16 6.0
    99 16 6.0
    95 25 1.0
    99 25 1.0
    ")
    df <- read.table(text = dat, header = TRUE)   # conversion to data frame
    # vectors for results
    CI1 <- vector()   # confidence interval 1 tail
    CI2 <- vector()   # confidence interval 2 tail
    # calculations, one row at a time
    for(i in 1 : nrow(df))
    {
      ar <- ConfIntv(df$PC[i], df$N[i], df$SD[i])
      CI1 <- append(CI1, ar[1])   # confidence interval 1 tail
      CI2 <- append(CI2, ar[2])   # confidence interval 2 tail
    }
    # combine results with input data
    df$CI1 <- CI1
    df$CI2 <- CI2
    df   # display input data and results

The results are as follows.

    > df   # display input data and results
      PC  N SD       CI1       CI2
    1 95 16  6 2.6295755 3.1971743
    2 99 16  6 3.9037204 4.4200693
    3 95 25  1 0.3421764 0.4127797
    4 99 25  1 0.4984319 0.5593879

Program 4: Pilot Study

    # Program 4. Pilot study
    # uses the subroutine ConfIntv() defined in Program 3
    # Parameters
    pc = 95       # % confidence
    sd = 1.0      # expected Standard Deviation of paired differences
    intv = 5      # interval between sample sizes
    maxN = 100    # maximum sample size
    # vectors for results
    SSiz <- vector()       # sample size
    CI1 <- vector()        # confidence interval 1 tail
    Diff1 <- vector()      # difference in CI from previous row 1 tail
    DecCase1 <- vector()   # decrease in CI per case increase 1 tail
    PDCase1 <- vector()    # % decrease in CI per case increase 1 tail
    CI2 <- vector()        # confidence interval 2 tail
    Diff2 <- vector()      # difference in CI from previous row 2 tail
    DecCase2 <- vector()   # decrease in CI per case increase 2 tail
    PDCase2 <- vector()    # % decrease in CI per case increase 2 tail
    # Calculations
    # first row (no previous row, so differences are set to 0)
    n = intv
    SSiz <- append(SSiz, n)
    ar <- ConfIntv(pc, n, sd)
    ci1 = ar[1] * 2                                    # full width of interval
    CI1 <- append(CI1, sprintf(ci1, fmt="%#.4f"))      # confidence interval 1 tail
    Diff1 <- append(Diff1, 0)                          # difference in CI from previous row 1 tail
    DecCase1 <- append(DecCase1, 0)                    # decrease in CI per case increase 1 tail
    PDCase1 <- append(PDCase1, 0)                      # % decrease in CI per case increase 1 tail
    ci2 = ar[2] * 2                                    # full width of interval
    CI2 <- append(CI2, sprintf(ci2, fmt="%#.4f"))      # confidence interval 2 tail
    Diff2 <- append(Diff2, 0)                          # difference in CI from previous row 2 tail
    DecCase2 <- append(DecCase2, 0)                    # decrease in CI per case increase 2 tail
    PDCase2 <- append(PDCase2, 0)                      # % decrease in CI per case increase 2 tail
    # subsequent rows
    while(n < maxN)
    {
      n = n + intv
      SSiz <- append(SSiz, n)
      ar <- ConfIntv(pc, n, sd)
      # 1 tail
      oldci1 = ci1
      ci1 = ar[1] * 2
      CI1 <- append(CI1, sprintf(ci1, fmt="%#.4f"))              # confidence interval 1 tail
      diff1 = oldci1 - ci1
      Diff1 <- append(Diff1, sprintf(diff1, fmt="%#.4f"))        # difference in CI from previous row 1 tail
      decCase1 = diff1 / intv
      DecCase1 <- append(DecCase1, sprintf(decCase1, fmt="%#.4f")) # decrease in CI per case increase 1 tail
      pDCase1 = sprintf(decCase1 / oldci1 * 100, fmt="%#.1f")
      PDCase1 <- append(PDCase1, pDCase1)                        # % decrease in CI per case increase 1 tail
      # 2 tail
      oldci2 = ci2
      ci2 = ar[2] * 2
      CI2 <- append(CI2, sprintf(ci2, fmt="%#.4f"))              # confidence interval 2 tail
      diff2 = oldci2 - ci2
      Diff2 <- append(Diff2, sprintf(diff2, fmt="%#.4f"))        # difference in CI from previous row 2 tail
      decCase2 = diff2 / intv
      DecCase2 <- append(DecCase2, sprintf(decCase2, fmt="%#.4f")) # decrease in CI per case increase 2 tail
      pDCase2 = sprintf(decCase2 / oldci2 * 100, fmt="%#.1f")
      PDCase2 <- append(PDCase2, pDCase2)                        # % decrease in CI per case increase 2 tail
    }
    # combine all results into data frame for display
    df <- data.frame(SSiz, CI1, Diff1, DecCase1, PDCase1, CI2, Diff2, DecCase2, PDCase2)
    df   # display results in data frame

The resulting pilot study table is as follows.

    > df   # display results in data frame
       SSiz    CI1  Diff1 DecCase1 PDCase1    CI2  Diff2 DecCase2 PDCase2
    1     5 1.9068      0        0       0 2.4833      0        0       0
    2    10 1.1594 0.7474   0.1495     7.8 1.4307 1.0526   0.2105     8.5
    3    15 0.9095 0.2498   0.0500     4.3 1.1076 0.3232   0.0646     4.5
    4    20 0.7733 0.1362   0.0272     3.0 0.9360 0.1715   0.0343     3.1
    5    25 0.6844 0.0889   0.0178     2.3 0.8256 0.1105   0.0221     2.4
    6    30 0.6204 0.0639   0.0128     1.9 0.7468 0.0787   0.0157     1.9
    7    35 0.5716 0.0488   0.0098     1.6 0.6870 0.0598   0.0120     1.6
    8    40 0.5328 0.0388   0.0078     1.4 0.6396 0.0474   0.0095     1.4
    9    45 0.5009 0.0319   0.0064     1.2 0.6009 0.0388   0.0078     1.2
    10   50 0.4742 0.0267   0.0053     1.1 0.5684 0.0325   0.0065     1.1
    11   55 0.4513 0.0229   0.0046     1.0 0.5407 0.0277   0.0055     1.0
    12   60 0.4315 0.0199   0.0040     0.9 0.5167 0.0240   0.0048     0.9
    13   65 0.4140 0.0174   0.0035     0.8 0.4956 0.0211   0.0042     0.8
    14   70 0.3985 0.0155   0.0031     0.7 0.4769 0.0187   0.0037     0.8
    15   75 0.3847 0.0139   0.0028     0.7 0.4602 0.0167   0.0033     0.7
    16   80 0.3722 0.0125   0.0025     0.7 0.4451 0.0151   0.0030     0.7
    17   85 0.3608 0.0114   0.0023     0.6 0.4314 0.0137   0.0027     0.6
    18   90 0.3504 0.0104   0.0021     0.6 0.4189 0.0125   0.0025     0.6
    19   95 0.3409 0.0095   0.0019     0.5 0.4074 0.0115   0.0023     0.5
    20  100 0.3321 0.0088   0.0018     0.5 0.3968 0.0106   0.0021     0.5
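The 2 tail columns of the pilot study table above can also be produced without an explicit loop. The following is a minimal sketch under stated assumptions, not the page's own program: the function name pilot_table is hypothetical, it keeps numeric values instead of formatted strings, and it reports NA rather than 0 in the first row, where there is no previous row to compare with.

```r
# Hypothetical vectorised sketch of the pilot study table (2 tail columns only).
# Assumes maxN is a multiple of intv; otherwise seq() stops below maxN,
# whereas the while loop in Program 4 would run one step past it.
pilot_table <- function(pc = 95, sd = 1.0, intv = 5, maxN = 100) {
  n     <- seq(intv, maxN, by = intv)                    # number of pairs
  alpha <- 1 - pc / 100                                  # % confidence to alpha
  width <- 2 * qt(1 - alpha / 2, n - 1) * sd / sqrt(n)   # full 2 tail CI width
  prev  <- c(NA, head(width, -1))                        # width in previous row
  diff  <- prev - width                                  # decrease from previous row
  data.frame(SSiz     = n,
             CI2      = width,
             Diff2    = diff,
             DecCase2 = diff / intv,                     # decrease per extra pair
             PDCase2  = diff / intv / prev * 100)        # % decrease per extra pair
}

tab <- pilot_table()
print(round(tab, 4))
```

Keeping the columns numeric makes it easier to plot the interval width against sample size or to pick the first row where PDCase2 falls below a chosen threshold.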