## StatsToDo: CUSUM for Inverse Gaussian Distributed Measurements

Explanations R Code
The example is a made up one to demonstrate the numerical process, and the data is generated by the computer. It purports to be from a quality control exercise in the outpatient clinic, using the patient waiting time between arrival and being attended to as the quality indicator.
• After much organization and training, the initial benchmark has been achieved, that patients do not have to wait much more than 20 minutes between arrival and being attended to. CUSUM is then established to monitor that this standard is maintained.
• With time, experienced and trained staff leave and replaced by the less experienced and trained. The standard may then deteriorate, resulting in progressively longer waiting time for patients.
• We would like to trigger an alarm to carry out re-training and reorganization when the waiting time doubled to that of the bench mark.
• As training and re-organization is disruptive to services, we set the average run length (ARL) to 100. If we sample one waiting time per clinic session, the false alarm rate should be no more frequent than 100 clinic sessions.

### Step 0: Estimate In Control mu (μ) and lambda (λ)

```# Step 0: before monitoring, establish in control mu and lambda
refData <- c(7.5, 4, 5.5, 9, 7, 16, 3.5, 17.5, 11.5, 14.5, 11, 7.5, 12.5, 11, 6.5, 8, 9, 6.5,
20, 12, 13, 10, 11.5, 13, 7.5, 11, 2, 14.5, 17, 10, 20.5, 9, 2.5, 7, 18.5, 6, 3.5,
3, 13, 17, 7, 10.5, 10.5, 6.5, 24.5, 6, 10.5, 5.5, 8.5, 17)
#install.packages("goft")    # if not already installed
library(goft)
ig_fit(refData)              # fitting an inverse Gaussian distribution to refData
```
Step 0 is undertaken during the planning stage. We collect waiting time (in minutes closest 0.5 minutes) from 50 patients. It can be seen from the plot to the right that, when compared with the theoretical normal distribution curve, the data is skewed to the left with a long right tail, and the 95% confidence interval on the left would include the impossible negative values. This is typical of the Inverse Gaussian Distribution.

We calculate the central location (mu) and skew(λ), which describe the in control charactersitcs based on the Inverse Gaussian Distribution. The results are:

```       Inverse Gaussian MLE
mu                  10.32000
lambda              28.95172
```

### Step 1: Define CUSUM Parameters

```# Step 1 Input parameters
inControlMu = 10
inControlLambda = 29
outOfControlMu = 20
arl = 100
theModel = "F"   #F for FIR,   Z for zero, S for steady state
```
Step 1 estimates k and h, using in control μ and λ, out of control μ and the arl. The model has 3 options, which sets the first value of the CUSUM
• F means Fast Initial Response, where the initial CUSUM value is set at half of the Decision Interval h. The rationale is that, if the situation is in control then CUSUM will gradually drift towards zero, but if the situation is already out of control, an alarm would be triggered early. The down side is that a false alarm is slightly more likely early on in the monitoring. As FIR is recommended by Hawkins, it is set as the default option
• Z is for zero, and CUSUM starts at the baseline value of 0. This will lower the risk of false alarm in the early stages of monitoring, but will detect the out of control situation slower if it already exists at the begining.
• S is for steady state, intended for when monitoring is already ongoing, and a new plot is being constructed. The CUSUM starts at the value when the previous chart ends.
• Each model will make minor changes to the value of the decision interval h. The setting of the initial values is mostly intended to determine how quickly an alarm can be triggered if the out of control situation exists from the beginning.

### Step 2: Calculate Reference Value (k) and Decision Interval (h)

```# Step 2: Calculate k and h
#install.packages("CUSUMdesign")   # if not already installed
library(CUSUMdesign)
result <- getH(distr=6, ICmean=inControlMu, IClambda=inControlLambda, OOCmean=outOfControlMu,
ARL=arl, type=theModel)
k = result\$ref
h = result\$DI
if(outOfControlMu<inControlMu)
{
h = -h
}
cat("Reference Value k=",k,"\tDecision Interval h=", h, "\n")
```
result is the object that contains the results of the analysis. The result required for this program are the reference value k and decision interval h. Please note that h is calculated as a positive value. If the CUSUM is designed to detect a decrease from in control value, then h needs to be changed to a negative value.

The results are as follows

```Reference Value k= 13.33333 	Decision Interval h= 22.07419
```

### Step 3: CUSUM Plot

Step 3 is divided into 2 parts. Step 3a calculates the cusum vector, and 3b plots the vector and h in a graph.

```#Step 3a: Calculate CUSUM. dat is the vector of waiting time
dat=c(15, 17.5, 8, 16, 7.5, 13, 14.5, 11, 27, 9, 12, 12, 12, 14, 10.5, 16, 17, 18.5, 14, 15.5)
cusum <- vector()
cusumValue = 0
if(theModel=="F")
{
cusumValue = h / 2
}
for(i in 1 : length(dat))
{
cusumValue = cusumValue + dat[i] - k
#print(cusumValue)
if(outOfControlMu>inControlMu) # Up
{
if(cusumValue<0)
{
cusumValue = 0
}
}
else                               # down
{
if(cusumValue>0)
{
cusumValue = 0
}
}
cusum[i] = cusumValue
}
cusum
```
The vector dat contains the waiting time values to be monitored. The resulting cusum vector is
```> cusum
[1] 12.703763 16.870430 11.537096 14.203763  8.370430  8.037096  9.203763  6.870430
[9] 20.537096 16.203763 14.870430 13.537096 12.203763 12.870430 10.037096 12.703763
[17] 16.370430 21.537096 22.203763 24.370430
```

Step 3b plotes the cusum vector

```# Step 3b: Plot the cusum vector and h
plot(cusum,type="l")
abline(h=h)
```
In step 3b, the first line plots the cusum vector, and the second line the decision interval h. The result plot is shown to the right.

### Step 4: (Optional) Exporting results

```# Step 4: Optional export of results
#myDataFrame <- data.frame(dat,cusum)       #combine dat and cusum to dataframe
#myDataFrame                                #display dataframe
#write.csv(myDataFrame, "CusumInvGauss.csv")   # write dataframe to .csv file
```
Step 4 is optional, and in fact commented out, and included as a template only. Each line can be activated by removing the #

The first line places the two vectors, dat and cusum together into a dataframe

The second line displays the data, along with row numbers, in the console, which can then be copied and pasted into other applications for further processing

The third line saves the dataframe as a comma delimited .csv file. This is needed if the data is too large to handle by copy and paste from the console.

Javascript Plot