Explanations
Counts of events, based on the Poisson distribution, are a frequently encountered model in medical research. Examples are the number of falls, the number of asthma attacks, the number of cells, and so on. The Poisson parameter Lambda (λ) is the total number of events (k) divided by the number of observation units (n) in the data (λ = k/n). The unit forms the basis or denominator for calculation of the average, and need not be individual cases or research subjects. For example, the number of asthma attacks may be based on the number of child-months, or the number of pregnancies based on the number of woman-years of using a particular contraceptive.
The Poisson parameter differs from the binomial parameter of proportion or risk, where the proportion is the number of individuals classified as positive (r) divided by the total number of individuals in the data (p = r/n). A proportion or risk must always be a number between 0 and 1, while λ may be any positive number.
For example, if we have 100 people and 90 of them go shopping in a week, then the binomial risk of shopping is 90/100 = 0.9. However, some of the people will go shopping more than once in the week, and the total number of shopping trips among the 100 people may be 160, so the Poisson Lambda is 160/100 = 1.6 trips per person-week.
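The distinction can be seen by computing both parameters from the shopping example; a minimal sketch in R, using the numbers above:

```r
# 100 people observed for one week
n        <- 100
shoppers <- 90    # people who shopped at least once
trips    <- 160   # total shopping trips by all 100 people

risk   <- shoppers / n   # binomial proportion, must lie between 0 and 1
lambda <- trips / n      # Poisson rate, may be any positive number

risk     # 0.9
lambda   # 1.6 trips per person-week
```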
When Lambda (λ = k/n) is large, say over 200, the counts are approximately normally distributed, and the count (or sqrt(count)) can be used as a parametric measurement. If the events occur very few times per individual, so that individuals can be classified as positive or negative cases, then the binomial distribution can be assumed and statistics related to proportions used. In between, when events are infrequent, the Poisson distribution is used.
Some clarification of nomenclature may be useful.
 Counts of events (e.g. number of asthma attacks recorded) are represented by k. This count must be in terms of how many events over a defined period or environment (e.g. 100 attacks in 300 children over 6 months, or 10 cells seen in 5 microlitres of fluid).
 The mean count, or count rate (k/n), is represented by λ. e.g. 100 attacks (k) in 1800 child-months (n) produces λ = 100/1800 ≈ 0.06 attacks per child-month.
This page provides calculations comparing two sets of count data. Three algorithms are available for this comparison
 The most commonly used method, initially described by Przyborowski and Wilenski (see reference), is known as the Conditional Test (the C Test). The test is based on the null hypothesis that the ratio of the two count rates (λ2 / λ1) is equal to 1.
 More recently, Krishnamoorthy and Thomson (see reference) proposed an improvement on the C Test, the E Test, where the null hypothesis is that the difference between the two count rates (λ2 − λ1) is equal to 0. Although the computation for this test is more complex, the advantages are that it is more robust and the results have greater power.
 Whitehead (see reference), in his textbook on unpaired sequential analysis, provided algorithms to determine sample sizes for non-sequential methods, and a method for comparing two counts at the end of the sequence. This test depends on a transformation of the difference into a normally distributed statistic, which is tested against the null hypothesis of 0.
 Comparing the 3 tests: in most cases, the results from the 3 tests on the same set of data are approximately the same. A choice only needs to be made when the sample size is small or when the difference between the two groups is minor, so that the statistical significance is ambiguous.
 The advantage of the Whitehead algorithm is its speed of computation, as there is no need for the repeated evaluation of factorials required by the other two tests. Users may prefer this test if the sample sizes are large or there are numerous calculations to be done.
 The C Test has been used the longest, and is quoted in most textbooks.
 The E Test is the most robust, least likely to produce a false statistical significance, and so the safest one to use. With large numbers, however, it is also the most computationally intensive and time consuming.
In most cases, therefore, the C Test should be used. If the sample size is too large and the calculations take too long, then the Whitehead algorithm can be used. When the statistical significance of the results is ambiguous (e.g. close to p<0.05), the E Test is preferred.
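Base R's stats package provides poisson.test(), which performs the same conditional (C) test when given two counts and two observation totals; it can serve as a quick cross-check of the calculations on this page. A sketch using the first row of the example data below (k1=13, n1=10, k2=8, n2=10):

```r
# Exact conditional test comparing two Poisson rates (base R)
res <- poisson.test(x = c(13, 8), T = c(10, 10), r = 1,
                    alternative = "greater")
res$p.value   # one-tailed p for rate 1 > rate 2 (approx. 0.1917)
```

Note that this is the full conditional tail probability; the Pc column produced by the code below reports this value halved.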
Data Input
The data is a table with 4 columns; each row represents a separate test.
The columns are
 Col 1: k1, the total count in group 1
 Col 2: n1, the sample size in group 1
 Col 3: k2, the total count in group 2
 Col 4: n2, the sample size in group 2
More Complex Models
If there are more than 2 groups, or if the model is multivariate (2 or more independent variables), and the raw data are available, then the best approach is the General Linear Model algorithm of Multiple Regression for Poisson Data. R codes and references for this calculation are presented in GLM.php on this site
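As a minimal sketch of what such a model looks like (the data frame here is hypothetical; the glm() call with family = poisson and a log(n) offset is the standard way to model count rates in R):

```r
# Hypothetical counts (k) and observation units (n) for 3 groups
dat <- data.frame(
  group = factor(c("A", "B", "C")),
  k     = c(13, 10, 12),
  n     = c(10, 20, 100)
)

# Poisson regression of the count rate k/n, using log(n) as an offset
fit <- glm(k ~ group + offset(log(n)), family = poisson, data = dat)

summary(fit)
exp(coef(fit))   # baseline rate for group A, and rate ratios for B and C
```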
References
Poisson Probability
Poisson Distribution by Wikipedia
Steel RGD, Torrie JH, Dickey DA (1997) Principles and Procedures of Statistics: A Biomedical Approach. The McGraw-Hill Companies, Inc., New York. p. 558
The C Test
Przyborowski J and Wilenski H (1940) Homogeneity of results in testing samples from Poisson series. Biometrika 31:313-323.
The E Test
Krishnamoorthy K and Thomson J (2004) A more powerful test for comparing two Poisson means. Journal of Statistical Planning and Inference 119:23-35.
Whitehead's Algorithm
Whitehead J (1992) The Design and Analysis of Sequential Clinical Trials (Revised 2nd Edition). John Wiley & Sons Ltd., Chichester. ISBN 0-471-97550-8. p. 48-50
The R codes are divided into the following sections
 Subroutines that create and access a vector of log(factorial) values, as these are used repeatedly in the C and E Tests
 The input data and the setting up of the log(factorial) vector, as these are common to all 3 algorithms
 Three (3) algorithms: the Whitehead algorithm, the C Test, and the E Test
 Results
Section 1: Vector of log(factorial) numbers
# log(factorial) vector for repeated calculations of factorial and binomial coefficients
logFactVector = vector()
MakeLogFactVector <- function(n) # make log(factorial) vector
{
x = 0
for(i in 1:n)
{
x = x + log(i)
logFactVector <<- append(logFactVector,x)
}
}
LogFact <- function(i)
{
if(i==0) i=1 # log(0!) = log(1!) = 0
return (logFactVector[i])
}
# binomial coefficient
BinomCoeff <- function (n,k)
{
if(k==0 || k==n) return (1)
return (exp(LogFact(n) - LogFact(k) - LogFact(n-k)))
}
}
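The log-factorial approach is equivalent to base R's lgamma() (where lgamma(n+1) = log(n!)) and choose(); a quick self-contained cross-check of the idea:

```r
# log(5!) via the log-gamma function: lgamma(6) = log(120)
log_fact5 <- lgamma(6)

# Binomial coefficient C(5,2) from log factorials, as in BinomCoeff()
coeff <- exp(lgamma(6) - lgamma(3) - lgamma(4))

round(coeff)   # 10, matching choose(5, 2)
```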
Section 2: Input data and initial setup of log(factorial) vector
# input data
myDat = ("
k1 n1 k2 n2
13 10 8 10
10 20 10 50
12 100 4 110
")
myDat
df <- read.table(textConnection(myDat),header=TRUE) # conversion to data frame
df # display input data (count(k) and sample size (n) for groups 1 and 2)
# Create log(factorial) vector
maxN = max(1000, max(df)) # must also cover the x1/x2 loop limits in the E Test
MakeLogFactVector(maxN)
Section 3: Three programs comparing two counts, using the same data and supporting functions
Program 1: Whitehead's algorithm
# Pgm 1 Whitehead's algorithm
Pw <- vector() # probability of Type I Error (Whitehead) one tail
for(i in 1:nrow(df))
{
k1 = df$k1[i]
n1 = df$n1[i]
k2 = df$k2[i]
n2 = df$n2[i]
n = n1 + n2;
zi = (n1*k2 - n2*k1)/n
vi = n1*n2*(k1+k2)/(n*n);
z = abs(zi / sqrt(vi))
Pw <- append(Pw,1 - pnorm(z)) # Whitehead alpha one tail
}
df$Pw <- Pw # add Whitehead's algorithm results to data frame
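As a sanity check, the first row of the example data (k1=13, n1=10, k2=8, n2=10) can be worked through by hand with the same formulas:

```r
k1 <- 13; n1 <- 10; k2 <- 8; n2 <- 10
n  <- n1 + n2                      # 20
zi <- (n1*k2 - n2*k1) / n          # (80 - 130)/20 = -2.5
vi <- n1*n2*(k1 + k2) / (n*n)      # 2100/400 = 5.25
z  <- abs(zi / sqrt(vi))           # approx. 1.0911
p  <- 1 - pnorm(z)                 # approx. 0.1376, matching Pw in the results
p
```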
Program 2: C Test
# Program 2: C Test
Pc <- vector() # probability of Type I Error (C Test) one tail
for(i in 1:nrow(df))
{
k1 = df$k1[i]
n1 = df$n1[i]
k2 = df$k2[i]
n2 = df$n2[i]
x = 1 # for ratio = 1
if(k1/n1!=k2/n2)
{
x1=0
x2=0
k = k1 + k2
p = 0;
if(k1>k2)
{
p = (n1/n2)/(1.0 + (n1/n2))
}
else
{
k1 = k2
p = (n2/n1) / (1.0 + (n2/n1))
}
for(j in 0:1)
{
x = 0;
if(j==0)
{
for(ii in k1:k) x = x + BinomCoeff(k,ii) * p^ii * (1-p)^(k-ii) # upper tail of the conditional binomial
}
else
{
for(ii in 0:k) x = x + BinomCoeff(k,ii) * p^ii * (1-p)^(k-ii)
}
if(j==0) { x1 = x } else { x2 = x }
}
if(x1<x2) { x = x1; } else { x = x2; }
if(x>1.0)x = 1.0;
}
Pc <- append(Pc,x / 2)
}
df$Pc <- Pc # add C Test results to data frame
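For the first row (k1=13, n1=10, k2=8, n2=10), the conditional test reduces to a binomial tail probability, which base R's pbinom() reproduces without the explicit summation:

```r
k1 <- 13; k2 <- 8
k  <- k1 + k2                   # total count, 21
p  <- (10/10) / (1 + 10/10)     # conditional binomial probability, 0.5
x1 <- 1 - pbinom(k1 - 1, k, p)  # P(X >= 13) for X ~ Binomial(21, 0.5)
x1 / 2                          # matches Pc in the results
```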
Program 3: E Test
#Program 3: E Test
# Result vectors
Pe <- vector() # probability of Type I Error (E Test) one tail
for(i in 1:nrow(df))
{
k1 = df$k1[i]
n1 = df$n1[i]
k2 = df$k2[i]
n2 = df$n2[i]
diff = 0
x = 1
if(k1 / n1 != k2 / n2)
{
Vk = k1 / (n1*n1) + k2 / (n2*n2)
Tk1k2 = abs((k1 / n1 - k2 / n2 - diff) / sqrt(Vk))
Lambda2k = (k1+k2) / (n1+n2) - diff * n1 / (n1 + n2)
SumT = 0
SumF = 1
x1 = 0
while (x1<1000 && SumF>1e-10)
{
x2 = 0
SumF = 0
f = 1;
while (x2<1000 && f>1e-10)
{
Vx = x1 / (n1*n1) + x2 / (n2*n2)
if(abs(x1 / n1 - x2 / n2)<=diff) { Tx1x2 = 0 }
else { Tx1x2 = abs(x1 / n1 - x2 / n2 - diff) / sqrt(Vx); }
if (Tx1x2>=Tk1k2)
{
v1 = exp(-n1 * (Lambda2k + diff)) # Poisson probabilities e^(-lambda) lambda^x / x!
v2 = (n1 * (Lambda2k + diff))^x1
v3 = exp(-n2 * Lambda2k) * (n2 * Lambda2k)^x2
f = exp(log(v1*v2*v3) - (LogFact(round(x1)) + LogFact(round(x2))))
SumF = SumF + f
}
x2 = x2 + 1;
}
SumT = SumT + SumF
x1 = x1 + 1
}
x = SumT / 2
}
Pe <- append(Pe,x)
}
df$Pe <- Pe # add E Test results to data frame
Final display of data and results
df # display input data and the results (Type I Error, one tail), Pw=Whitehead, Pc=C Test, Pe= E Test
The results are:
> df # display input data (count(k) and sample size (n) for groups 1 and 2)
k1 n1 k2 n2
1 13 10 8 10
2 10 20 10 50
3 12 100 4 110
>
k1 n1 k2 n2 Pw Pc Pe
1 13 10 8 10 0.13761676 0.09582758 0.14336618
2 10 20 10 50 0.01694743 0.49414271 0.04229829
3 12 100 4 110 0.01415499 0.01249091 0.01611372
>
