Copyright @2020.
All Rights Reserved.

StatsToDo: Compare Two Summary Regressions (Covariance Analysis)


Data

Data Entry: 2 columns (one each for groups 1 and 2) and 6 rows
  -Row 1 = sample size (n)
  -Row 2 = Mean for x
  -Row 3 = Standard Deviation for x
  -Row 4 = Mean for y
  -Row 5 = Standard Deviation for y
  -Row 6 = Regression slope (b)
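
When only raw paired observations are available, the six summary rows can be derived first. The following is a minimal sketch for one group; the vectors x and y hold invented values for illustration only, and the name summaryRows is an assumption, not part of the original program.

```r
# Hypothetical raw data for one group (illustration only)
x <- c(36, 37, 38, 39, 40, 38)
y <- c(2900, 3100, 3300, 3500, 3600, 3250)

fit <- lm(y ~ x)                    # simple linear regression of y on x

summaryRows <- c(
  n     = length(x),                # Row 1: sample size
  meanX = mean(x),                  # Row 2: mean of x
  sdX   = sd(x),                    # Row 3: standard deviation of x
  meanY = mean(y),                  # Row 4: mean of y
  sdY   = sd(y),                    # Row 5: standard deviation of y
  b     = unname(coef(fit)["x"])    # Row 6: regression slope
)
summaryRows
```

Repeating this for the second group gives the two columns of data to enter.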

R Code

The following is one continuous program, broken into parts so that the numerous results are easier to follow.

Part 1: Program parameters, consisting of the summary values entered for the two groups

# data entered as vectors of 2 elements (groups 1 and 2)
arN = c(10, 12)            # sample size
arMeanX = c(38.0, 38.1)    # mean x
arSdX = c(1.8, 2.0)        # SD x
arMeanY = c(3268, 3119)    # mean Y
arSdY = c(351.5, 380.3)    # SD   y
arB = c(185.3, 186.9)      # slope, regression coefficient b

Part 2: Derive all other parameters from the data entered for the two groups

# Calculate the other parameters
arA <- c(0, 0)            # constant a
arSsx = c(0, 0)           # sum of squares x
arSsy = c(0, 0)           # sum of squares y
arSxy = c(0, 0)           # sum products
arRho = c(0, 0)           # correlation coefficient rho 
arT = c(0, 0)             # student t 
arP = c(0, 0)             # significance p 
for(j in 1:2)
{
  arA[j] = arMeanY[j] - arB[j] * arMeanX[j]
  arSsx[j] = arSdX[j]^2 * (arN[j] - 1)
  arSsy[j] = arSdY[j]^2 * (arN[j] - 1)
  arSxy[j] = arB[j] * arSsx[j]
  arRho[j] = arSxy[j] / sqrt(arSsx[j] * arSsy[j])
  arT[j] = arRho[j] * sqrt((arN[j] - 2) / (1 - arRho[j]^2))
  arP[j] = 1 - pt(arT[j], arN[j] - 2)       # 1 tail (upper tail only, not doubled)
}
# Output of all parameter vectors
arN # sample size
arMeanX # mean x
arSdX # SD x
arMeanY # mean Y
arSdY # SD y
# slopes
arB # slope, regression coefficient b
arA # constant a
# correlation coefficients
arRho # correlation coefficient rho 
arT # student t 
arP # significance p

The resulting parameters (entered and calculated) are as follows

> arN # sample size
[1] 10 12
> arMeanX               # mean x
[1] 38.0 38.1
> arSdX                 # SD x
[1] 1.8 2.0
> arMeanY               # mean Y
[1] 3268 3119
> arSdY                 # SD y
[1] 351.5 380.3

> # slopes
> arB                   # slope, regression coefficient b
[1] 185.3 186.9
> arA                   # constant a
[1] -3773.40 -4001.89

> # correlation coefficients
> arRho                 # correlation coefficient rho 
[1] 0.9489047 0.9829082
> arT                   # student t for correlation
[1]  8.505146 16.883720
> arP                   # significance p (1 tail)
[1] 1.401494e-05 5.581409e-09
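
Because R arithmetic is element-wise, the loop in Part 2 can also be written without indexing. The sketch below is an alternative layout, not the original program. It also checks that rho reduces algebraically to b * SDx / SDy, and shows both tails for p. The printed arP values correspond to the upper tail only. The names rhoCheck, pOneTail and pTwoTail are assumptions introduced here.

```r
# Summary data from Part 1 (only the vectors needed here)
arN   <- c(10, 12)
arSdX <- c(1.8, 2.0)
arSdY <- c(351.5, 380.3)
arB   <- c(185.3, 186.9)

# Element-wise arithmetic handles both groups at once
arSsx <- arSdX^2 * (arN - 1)                      # sum of squares x
arSsy <- arSdY^2 * (arN - 1)                      # sum of squares y
arSxy <- arB * arSsx                              # sum of products
arRho <- arSxy / sqrt(arSsx * arSsy)              # correlation coefficient
arT   <- arRho * sqrt((arN - 2) / (1 - arRho^2))  # t for the correlation

# Algebraically, rho simplifies to b * SDx / SDy
rhoCheck <- arB * arSdX / arSdY

# Upper-tail p (as printed above) and the two-tailed equivalent
pOneTail <- pt(arT, arN - 2, lower.tail = FALSE)
pTwoTail <- 2 * pt(abs(arT), arN - 2, lower.tail = FALSE)

arRho
pOneTail
pTwoTail
```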

Part 3: Compare the mean x and mean y values of the two groups

# compare x
diffX = arMeanX[1] - arMeanX[2]
dfX = arN[1] + arN[2] - 2 
pooledX = sqrt(((arN[1]-1)*arSdX[1]^2 + (arN[2]-1)*arSdX[2]^2) / dfX)
seX = pooledX * sqrt(1/arN[1] + 1/arN[2]) 
llX = diffX - 1.96 * seX
ulX = diffX + 1.96 * seX
# compare y
diffY = arMeanY[1] - arMeanY[2]
dfY = arN[1] + arN[2] - 2
pooledY = sqrt(((arN[1]-1)*arSdY[1]^2 + (arN[2]-1)*arSdY[2]^2) / dfY)
seY = pooledY * sqrt(1/arN[1] + 1/arN[2])
llY = diffY - 1.96 * seY
ulY = diffY + 1.96 * seY
# output 95% CI difference in x and y
c(llX, ulX)  # 95% CI difference in x
c(llY, ulY)  # 95% CI difference in y

The results are as follows

> # output 95% CI difference in x and y
> c(llX, ulX)  # 95% CI difference in x
[1] -1.705087  1.505087
> c(llY, ulY)  # 95% CI difference in y
[1] -159.5142  457.5142
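
Part 3 uses the normal value 1.96 for the 95% limits. With samples this small, a critical value from the t distribution may be preferred, as is done in Part 5. The sketch below wraps the same pooled calculation in a helper, ciDiffMeans, which is a hypothetical name introduced here. Its crit argument allows either choice.

```r
# Hypothetical helper: CI for the difference between two means from summary data
# n, m, s are 2-element vectors of sample size, mean and SD for groups 1 and 2
ciDiffMeans <- function(n, m, s, crit = NULL, level = 0.95) {
  df     <- n[1] + n[2] - 2
  pooled <- sqrt(((n[1] - 1) * s[1]^2 + (n[2] - 1) * s[2]^2) / df)  # pooled SD
  se     <- pooled * sqrt(1 / n[1] + 1 / n[2])                      # SE of difference
  if (is.null(crit)) crit <- qt(1 - (1 - level) / 2, df)            # t critical value
  d <- m[1] - m[2]
  c(lower = d - crit * se, upper = d + crit * se)
}

arN <- c(10, 12)
zX <- ciDiffMeans(arN, c(38.0, 38.1), c(1.8, 2.0), crit = 1.96)  # same as Part 3 for x
tX <- ciDiffMeans(arN, c(38.0, 38.1), c(1.8, 2.0))               # t based, df = 20
tY <- ciDiffMeans(arN, c(3268, 3119), c(351.5, 380.3))           # t based, for y
zX
tX
tY
```

The t based intervals are slightly wider than those using 1.96, because the t critical value for 20 degrees of freedom is larger.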

Part 4: Comparing the two regressions

# comparing the two slopes
diffSlope = arB[1] - arB[2]                           # diff slope
s2 = ((arSsy[1] - arSxy[1] * arSxy[1] / arSsx[1]) + 
      (arSsy[2] - arSxy[2] * arSxy[2] / arSsx[2])) /
      (arN[1] + arN[2] - 4)
seSlope = sqrt(s2 * (1 / arSsx[1] + 1 / arSsx[2]))    # SE of difference
tSlope = diffSlope / seSlope                          # t test
dfSlope = arN[1] + arN[2] - 4                         # degrees of freedom
pSlope = (1 - pt(abs(tSlope), dfSlope)) * 2           # Type I error 2 tail
# output comparison 2 slopes
diffSlope # difference between slopes
seSlope   # standard error of difference
tSlope    # t
dfSlope   # deg freedom
pSlope    # significance p (2 tail)

The results are as follows

> # output comparison 2 slopes
> diffSlope                      # difference between slopes
[1] -1.6
> seSlope                        # standard error of difference
[1] 22.83804
> tSlope                         # t
[1] -0.07005856
> dfSlope                        # deg freedom
[1] 18
> pSlope                         # significance p (2 tail)
[1] 0.9449195
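
The slope comparison can be packaged as a function so that it can be reused with other summary data. This is a sketch of the same calculation as Part 4. The function name compareSlopes and its argument names are assumptions introduced here.

```r
# Hypothetical wrapper around the Part 4 calculation
# n, sdX, sdY, b are 2-element vectors for groups 1 and 2
compareSlopes <- function(n, sdX, sdY, b) {
  ssx <- sdX^2 * (n - 1)                    # sum of squares x
  ssy <- sdY^2 * (n - 1)                    # sum of squares y
  sxy <- b * ssx                            # sum of products
  df  <- sum(n) - 4                         # 2 parameters estimated per line
  s2  <- sum(ssy - sxy^2 / ssx) / df        # pooled residual variance
  se  <- sqrt(s2 * sum(1 / ssx))            # SE of the slope difference
  tStat <- (b[1] - b[2]) / se
  p   <- 2 * pt(abs(tStat), df, lower.tail = FALSE)   # 2 tail
  c(diff = b[1] - b[2], se = se, t = tStat, df = df, p = p)
}

res <- compareSlopes(n = c(10, 12), sdX = c(1.8, 2.0),
                     sdY = c(351.5, 380.3), b = c(185.3, 186.9))
res
```

With the example data this should agree with the Part 4 output. The large p value gives no evidence that the slopes differ, which is what is usually checked before a common slope is fitted.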

Part 5: Combine the 2 regression lines and compare the adjusted mean y values

# combining the two slopes
commonSlope = (arSxy[1] + arSxy[2]) / (arSsx[1] + arSsx[2])          # common slope
grandMean = (arMeanX[1] * arN[1] + arMeanX[2] * arN[2]) / 
            (arN[1] + arN[2])                                        # grand mean of x
adjMean1 = arMeanY[1] + commonSlope * (grandMean - arMeanX[1])       # adjusted mean y group 1
adjMean2 = arMeanY[2] + commonSlope * (grandMean - arMeanX[2])       # adjusted mean y group 2
diffMean = arMeanY[1] - arMeanY[2] - 
           commonSlope * (arMeanX[1] - arMeanX[2])                   # adjusted difference
s2 = (arSsy[1] + arSsy[2] - 
      (arSxy[1] + arSxy[2]) * (arSxy[1] + arSxy[2]) / (arSsx[1] + arSsx[2])) /
      (arN[1] + arN[2] - 3)                                          # residual variance
varMean = s2 * (1.0 / arN[1] + 1.0 / arN[2] + 
          (arMeanX[1] - arMeanX[2]) * (arMeanX[1] - arMeanX[2]) / 
          (arSsx[1] + arSsx[2]))                                     # variance of difference
seMean = sqrt(varMean)                                               # Standard Error of difference
tMean = diffMean / seMean                                            # t
dfMean = arN[1] + arN[2] - 3                                         # degrees of freedom
pMean = (1 - pt(abs(tMean), dfMean)) * 2                             # p Type I Error (2 tail)
# 95% CI
t = abs(qt(0.025, dfMean))                                           # t value for p=0.05 2 tail
ll = diffMean - t * seMean                                           # lower limit 95% CI
ul = diffMean + t * seMean                                           # upper limit 95% CI
# output of combined data
commonSlope # common slope
diffMean    # difference between adjusted means
seMean      # Standard Error of difference
tMean       # t
dfMean      # df
pMean       # significance p 2 tail
c(ll,ul)    # 95% confidence interval of adjusted difference in y

The results are as follows

> # output of combined data
> commonSlope                    # common slope
[1] 186.2623
> diffMean                       # difference between adjusted means
[1] 167.6262
> seMean                         # Standard Error of difference
[1] 39.8789
> tMean                          # t
[1] 4.203381
> dfMean                         # df
[1] 19
> pMean                          # significance p 2 tail
[1] 0.0004815876
> c(ll,ul)                       # 95% confidence interval of adjusted difference in y
[1]  84.15872 251.09373
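
The adjusted means themselves are calculated in Part 5 but not printed. The sketch below assumes each adjusted mean is the group mean of y moved along the common slope to the grand mean of x. It checks that the difference between the two adjusted means equals the adjusted difference reported above. The names adjMean, diffAdj and diffDirect are assumptions introduced here.

```r
# Summary data from Part 1 (only the vectors needed here)
arN     <- c(10, 12)
arMeanX <- c(38.0, 38.1)
arSdX   <- c(1.8, 2.0)
arMeanY <- c(3268, 3119)
arB     <- c(185.3, 186.9)

arSsx <- arSdX^2 * (arN - 1)                     # sum of squares x
arSxy <- arB * arSsx                             # sum of products
commonSlope <- sum(arSxy) / sum(arSsx)           # pooled (common) slope
grandMean   <- sum(arMeanX * arN) / sum(arN)     # weighted mean of x

# Each group's mean y, adjusted to the grand mean of x
adjMean <- arMeanY + commonSlope * (grandMean - arMeanX)
adjMean

# The two routes to the adjusted difference should agree
diffAdj    <- adjMean[1] - adjMean[2]
diffDirect <- (arMeanY[1] - arMeanY[2]) -
              commonSlope * (arMeanX[1] - arMeanX[2])
c(diffAdj, diffDirect)
```

The grand mean cancels when the two adjusted means are subtracted, which is why the adjusted difference can be calculated without it.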
