Content Disclaimer
Copyright @2020.
All Rights Reserved.

StatsToDo: Sample Size for Paired Differences

Data Input for Sample Size Estimation is a table of 4 columns
  - Each row contains data from a separate study
  - Col 1 = probability of Type I error (α)
  - Col 2 = power (1-β)
  - Col 3 = mean of paired differences
  - Col 4 = Standard Deviation of paired differences

Data Input for Power Estimation is a table of 4 columns
  - Each row contains data from a separate study
  - Col 1 = probability of Type I error (α)
  - Col 2 = sample size (Pairs) used
  - Col 3 = mean of Paired Differences observed
  - Col 4 = Standard Deviation of paired differences

Data Input for Confidence Interval Estimation is a table of 3 columns
  - Each row contains data from a separate study
  - Col 1 = percent confidence (usually 95)
  - Col 2 = sample size used (pairs)
  - Col 3 = standard deviation of paired differences observed

Data Input for Pilot Study is for a single plan, a single column with 4 rows
  - Row 1 : Percent Confidence required, usually 95 or 99
  - Row 2 : Standard Deviation of paired differences
  - Row 3 : Sample size interval (step between successive sample sizes)
  - Row 4 : Maximum sample size

R Codes

This panel presents R code for sample size, power, and confidence interval calculations for paired differences.

Program 1: Sample Size

Alpha = Probability of Type I Error α
Power = 1 - β
Diff = mean of paired differences
SD = Standard Deviation of paired differences

# Pgm 1: Sample Size
# data entry
dat = ("
Alpha Power Diff  SD
0.05  0.8   0.5  1.0
0.01  0.8   0.5  1.0
0.05  0.9   0.5  1.0
0.01  0.9   0.5  1.0
       ")
df <- read.table(textConnection(dat),header=TRUE)  # conversion to data frame    

# Calculations (vectorised over the rows of the data frame)
delta <- abs(df$Diff / df$SD)            # effect size
zb <- abs(qnorm(1 - df$Power))           # z for beta
# 1 tail
za <- abs(qnorm(df$Alpha))               # 1 tail z for alpha
f <- (za + zb) / delta
df$SSiz1Tail <- ceiling(f^2 + za^2 / 2)  # 1 tail sample size (pairs)
# 2 tail
za <- abs(qnorm(df$Alpha / 2))           # 2 tail z for alpha
f <- (za + zb) / delta
df$SSiz2Tail <- ceiling(f^2 + za^2 / 2)  # 2 tail sample size (pairs)
df # show data frame with input data and results

The results are as follows. Sample size is the number of pairs

> df # show data frame with input data and results
  Alpha Power Diff SD SSiz1Tail SSiz2Tail
1  0.05   0.8  0.5  1        27        34
2  0.01   0.8  0.5  1        43        51
3  0.05   0.9  0.5  1        36        44
4  0.01   0.9  0.5  1        55        63
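As a rough cross-check, the first row can be compared with `power.t.test` from base R's stats package. This is only a sketch: `power.t.test` solves for the number of pairs using the noncentral t distribution rather than the normal approximation used in Program 1, so the two answers may differ by a pair or so. The variable names `chk1`, `chk2`, `n1`, and `n2` are chosen here for illustration.

```r
# Cross-check of row 1 (Alpha = 0.05, Power = 0.8, Diff = 0.5, SD = 1.0)
# For type = "paired", sd is the SD of the paired differences
# and n is the number of pairs.
chk1 <- power.t.test(delta = 0.5, sd = 1.0, sig.level = 0.05, power = 0.8,
                     type = "paired", alternative = "one.sided")
chk2 <- power.t.test(delta = 0.5, sd = 1.0, sig.level = 0.05, power = 0.8,
                     type = "paired", alternative = "two.sided")
n1 <- ceiling(chk1$n)   # 1 tail number of pairs, rounded up
n2 <- ceiling(chk2$n)   # 2 tail number of pairs, rounded up
c(OneTail = n1, TwoTail = n2)
```

Close agreement with the SSiz1Tail and SSiz2Tail columns suggests the approximation is adequate for these inputs.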

Program 2: Power (1 - β)

Alpha = probability of Type I Error α
N = sample size (number of pairs)
MEAN = mean of paired differences
SD = Standard Deviation of paired differences

# Pgm2: Power
# data entry
dat = ("
Alpha  N  MEAN SD
0.05  34  0.5  1.0
0.01  51  0.5  1.0
0.05  44  0.5  1.0
0.01  63  0.5  1.0
       ")
df <- read.table(textConnection(dat),header=TRUE)  # conversion to data frame  
# Calculations (vectorised over the rows of the data frame)
delta <- abs(df$MEAN / df$SD)                 # effect size
# 1 tail
ZA <- abs(qnorm(df$Alpha))                    # 1 tail z for alpha
ZB <- delta * sqrt(df$N - ZA^2 / 2) - ZA      # z for beta
df$Power1Tail <- pnorm(ZB)                    # 1 tail power
# 2 tail
ZA <- abs(qnorm(df$Alpha / 2))                # 2 tail z for alpha
ZB <- delta * sqrt(df$N - ZA^2 / 2) - ZA      # z for beta
df$Power2Tail <- pnorm(ZB)                    # 2 tail power
df # show data frame with input data and results

The results are as follows

  Alpha  N MEAN SD Power1Tail Power2Tail
1  0.05 34  0.5  1  0.8872503  0.8083861
2  0.01 51  0.5  1  0.8745876  0.8097019
3  0.05 44  0.5  1  0.9474256  0.9003350
4  0.01 63  0.5  1  0.9401596  0.9009345
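The same calculation can be wrapped in a small function for reuse. The following is a minimal sketch, not part of the original programs: the function name `paired_power` and its `tails` argument are introduced here for illustration, and the formula is the normal approximation used in Program 2.

```r
# Hypothetical helper: power for a paired difference, normal approximation
# alpha = Type I error, n = number of pairs,
# mean_diff and sd = mean and SD of the paired differences,
# tails = 1 or 2
paired_power <- function(alpha, n, mean_diff, sd, tails = 2) {
  delta <- abs(mean_diff / sd)            # effect size
  za <- abs(qnorm(alpha / tails))         # z for alpha
  zb <- delta * sqrt(n - za^2 / 2) - za   # z for beta
  pnorm(zb)                               # power = 1 - beta
}

# Power for the 2 tail sample sizes produced by Program 1
p2 <- paired_power(alpha = c(0.05, 0.01, 0.05, 0.01),
                   n = c(34, 51, 44, 63), mean_diff = 0.5, sd = 1.0)
round(p2, 4)
```

Because sample sizes are rounded up to whole pairs, the power recovered this way is expected to be at or slightly above the power originally requested.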

Program 3: Confidence Interval

First, the subroutine that calculates confidence intervals, which is used by both this program and the pilot study program

# subroutine to calculate confidence interval
# pc = % confidence, ssiz = number of pairs, sd = Standard Deviation of paired differences
ConfIntv <- function(pc, ssiz, sd)
{
  se <- sd / sqrt(ssiz)                        # Standard Error
  alpha <- 1 - pc / 100                        # convert % confidence into α
  ci1 <- qt(1 - alpha, ssiz - 1) * se          # 1 tail
  ci2 <- qt(1 - alpha / 2, ssiz - 1) * se      # 2 tail
  return(c(ci1, ci2))                          # returns 1 and 2 tail CI
}

Main program for confidence interval. Please note: the confidence interval reported here is the half-width, that is, the distance between the mean and a limit of the interval. The full 2 tail confidence interval is mean ± CI, so its total width is twice the value shown here.

# data entry: PC=% confidence, N = sample size in number of pairs, and SD=Standard Deviation of paired differences
dat = ("
PC  N   SD
95  16  6.0
99  16  6.0
95  25  1.0
99  25  1.0 
      ")
df <- read.table(textConnection(dat),header=TRUE)  # conversion to data frame  

# vectors for results
CI1 <- vector()  # Confidence interval 1 tail
CI2 <- vector()  # Confidence interval 2 tail

# calculations
# 1 and 2 tail, one row at a time
for(i in 1 : nrow(df))
{
  ar <- ConfIntv(df$PC[i],df$N[i],df$SD[i])
  CI1 <- append(CI1, ar[1])  # Confidence interval 1 tail
  CI2 <- append(CI2, ar[2])  # Confidence interval 2 tail
}

# combine results with input data
df$CI1 <- CI1
df$CI2 <- CI2
df # display input data and results

The results are as follows

> df # display input data and results
  PC  N SD       CI1       CI2
1 95 16  6 2.6295755 3.1971743
2 99 16  6 3.9037204 4.4200693
3 95 25  1 0.3421764 0.4127797
4 99 25  1 0.4984319 0.5593879
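To turn a half-width into the limits of a full interval, it is added to and subtracted from the observed mean. The sketch below does this for the first row (95% confidence, 16 pairs, SD 6.0). The observed mean of 2.0 is a made-up value used only for illustration, and the variable names are not from the original programs.

```r
# Hypothetical observed mean of paired differences (illustration only)
mean_diff <- 2.0
pc <- 95          # % confidence
n <- 16           # number of pairs
sd_diff <- 6.0    # SD of paired differences

alpha <- 1 - pc / 100
half_width <- qt(1 - alpha / 2, n - 1) * sd_diff / sqrt(n)  # 2 tail half-width
ci <- c(lower = mean_diff - half_width, upper = mean_diff + half_width)
ci
```

In this made-up case the interval spans zero, so 16 pairs would not be enough to distinguish a mean difference of 2.0 from no difference at the 95% level.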

Program 4: Pilot Study

# Program 4. Pilot study
# Parameters
pc = 95         # % confidence
sd = 1.0        # Standard Deviation of paired differences
intv = 5        # interval
maxN = 100      # maximum sample size

# vectors for results
SSiz <- vector()     # sample size
CI1 <- vector()      # confidence interval 1 tail
Diff1 <- vector()    # difference in CI from previous row 1 tail
DecCase1 <- vector() # decrease in CI per case increase 1 tail
PDCase1 <- vector()  # % decrease in CI per case increase 1 tail
CI2 <- vector()      # confidence interval 2 tail
Diff2 <- vector()    # difference in CI from previous row 2 tail
DecCase2 <- vector() # decrease in CI per case increase 2 tail
PDCase2 <- vector()  # % decrease in CI per case increase 2 tail

# Calculations
# first row
n = intv
SSiz <- append(SSiz,n)
ar <- ConfIntv(pc, n, sd)
ci1 = ar[1] * 2
CI1 <- append(CI1,sprintf(ci1, fmt="%#.4f"))      # confidence interval 1 tail
Diff1 <- append(Diff1,0)    # difference in CI from previous row 1 tail
DecCase1 <- append(DecCase1,0) # decrease in CI per case increase 1 tail
PDCase1 <- append(PDCase1,0)  # % decrease in CI per case increase 1 tail
ci2 = ar[2] * 2
CI2 <- append(CI2,sprintf(ci2, fmt="%#.4f"))      # confidence interval 2 tail
Diff2 <- append(Diff2,0)    # difference in CI from previous row 2 tail
DecCase2 <- append(DecCase2,0) # decrease in CI per case increase 2 tail
PDCase2 <- append(PDCase2,0)  # % decrease in CI per case increase 2 tail
# subsequent rows
while(n + intv <= maxN)   # stop before n would exceed maxN
{
  n = n + intv
  SSiz <- append(SSiz,n)
  ar <- ConfIntv(pc, n, sd)
  oldci1 = ci1
  ci1 = ar[1] * 2
  CI1 <- append(CI1,sprintf(ci1, fmt="%#.4f"))                # confidence interval 1 tail
  diff1 = oldci1 - ci1
  Diff1 <- append(Diff1,sprintf(diff1, fmt="%#.4f"))          # difference in CI from previous row 1 tail
  decCase1 = diff1 / intv
  DecCase1 <- append(DecCase1,sprintf(decCase1, fmt="%#.4f")) # decrease in CI per case increase 1 tail
  pDCase1 = sprintf(decCase1 / oldci1 * 100, fmt="%#.1f")
  PDCase1 <- append(PDCase1,pDCase1)                          # % decrease in CI per case increase 1 tail
  oldci2 = ci2
  ci2 = ar[2] * 2
  CI2 <- append(CI2,sprintf(ci2, fmt="%#.4f"))                # confidence interval 2 tail
  diff2 = oldci2 - ci2
  Diff2 <- append(Diff2,sprintf(diff2, fmt="%#.4f"))          # difference in CI from previous row 2 tail
  decCase2 = diff2 / intv
  DecCase2 <- append(DecCase2,sprintf(decCase2, fmt="%#.4f")) # decrease in CI per case increase 2 tail
  pDCase2 = sprintf(decCase2 / oldci2 * 100, fmt="%#.1f")
  PDCase2 <- append(PDCase2,pDCase2)                          # % decrease in CI per case increase 2 tail
}
#combine all results into data frame for display
df <- data.frame(SSiz,CI1,Diff1,DecCase1,PDCase1,CI2,Diff2,DecCase2,PDCase2)
df # display results in data frame

The resulting pilot study table is as follows. CI1 and CI2 are full interval widths (twice the distance from the mean to the limit)

> df # display results in data frame
   SSiz    CI1  Diff1 DecCase1 PDCase1    CI2  Diff2 DecCase2 PDCase2
1     5 1.9068      0        0       0 2.4833      0        0       0
2    10 1.1594 0.7474   0.1495     7.8 1.4307 1.0526   0.2105     8.5
3    15 0.9095 0.2498   0.0500     4.3 1.1076 0.3232   0.0646     4.5
4    20 0.7733 0.1362   0.0272     3.0 0.9360 0.1715   0.0343     3.1
5    25 0.6844 0.0889   0.0178     2.3 0.8256 0.1105   0.0221     2.4
6    30 0.6204 0.0639   0.0128     1.9 0.7468 0.0787   0.0157     1.9
7    35 0.5716 0.0488   0.0098     1.6 0.6870 0.0598   0.0120     1.6
8    40 0.5328 0.0388   0.0078     1.4 0.6396 0.0474   0.0095     1.4
9    45 0.5009 0.0319   0.0064     1.2 0.6009 0.0388   0.0078     1.2
10   50 0.4742 0.0267   0.0053     1.1 0.5684 0.0325   0.0065     1.1
11   55 0.4513 0.0229   0.0046     1.0 0.5407 0.0277   0.0055     1.0
12   60 0.4315 0.0199   0.0040     0.9 0.5167 0.0240   0.0048     0.9
13   65 0.4140 0.0174   0.0035     0.8 0.4956 0.0211   0.0042     0.8
14   70 0.3985 0.0155   0.0031     0.7 0.4769 0.0187   0.0037     0.8
15   75 0.3847 0.0139   0.0028     0.7 0.4602 0.0167   0.0033     0.7
16   80 0.3722 0.0125   0.0025     0.7 0.4451 0.0151   0.0030     0.7
17   85 0.3608 0.0114   0.0023     0.6 0.4314 0.0137   0.0027     0.6
18   90 0.3504 0.0104   0.0021     0.6 0.4189 0.0125   0.0025     0.6
19   95 0.3409 0.0095   0.0019     0.5 0.4074 0.0115   0.0023     0.5
20  100 0.3321 0.0088   0.0018     0.5 0.3968 0.0106   0.0021     0.5
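One way to read the table is to look for the sample size beyond which each additional pair narrows the interval by less than some chosen percentage. The sketch below recomputes the 2 tail column in vectorised form and applies such a rule. The 1% threshold and the variable names are assumptions made for illustration, not a recommendation from the original page.

```r
# Same planning values as Program 4, plus a hypothetical stopping threshold
pc <- 95; sd_diff <- 1.0; intv <- 5; maxN <- 100
threshold <- 1.0   # hypothetical: % decrease in CI width per extra pair

n <- seq(intv, maxN, by = intv)                    # candidate sample sizes
alpha <- 1 - pc / 100
ci2 <- 2 * qt(1 - alpha / 2, n - 1) * sd_diff / sqrt(n)  # full 2 tail width

# % decrease in width per extra pair, relative to the previous row
pd <- c(NA, -diff(ci2) / intv / head(ci2, -1) * 100)

# smallest sample size at which the % decrease drops below the threshold
first_n <- n[which(pd < threshold)[1]]
first_n
```

The comparison uses unrounded values, so a row shown as exactly at the threshold in the rounded table may still fall on either side of it.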