Chapter 13 SPC Charts for Rare Events

When dealing with potentially serious or fatal events, the number of occurrences is often (fortunately) very low, which can present challenges for traditional SPC charts designed for count data.

The challenge arises because traditional SPC charts for count data – like P and U charts – assume a relatively higher and more consistent frequency of events to function effectively. When events are rare, the data becomes sparse and highly variable, making these charts unreliable:

  • Too many zeros: Frequent zero counts can lead to control limits that are misleadingly narrow or wide. Especially, with run charts, when more than half of the data points are zero, the median will also be zero, which undermines the validity of the runs analysis – typically resulting in just one long run above the median.

  • Low sensitivity: Traditional charts may fail to detect meaningful shifts or trends because the event frequency is too low to produce statistically significant signals.

  • False alarms or missed signals: The charts may either trigger false alarms due to natural variation in rare data or miss actual process shifts, leading to poor decision-making.

One useful approach to handling rare events is to plot the number of opportunities or the time between events, rather than focusing on event proportions or rates – essentially flipping the indicator to look at the gaps between occurrences instead of the occurrences themselves.

Another approach is to look at the cumulated sums (CUSUM) of binary data.

In this chapter we introduce the G chart for number of opportunities between events, the T chart for time between events, and the Bernoulli CUSUM chart for binary data.

13.1 Introducing the birth dataset

The Robson group 1 births dataset is a data frame with 2193 observations from Robson group 1 deliveries, that is: first time pregnancy, single baby, head first, gestational age at least 37 weeks.

# import data
births <- read.csv('data/robson1_births.csv',
              comment.char = '#',
              colClasses   = c('POSIXct',
                               'Date',
                               'logical',
                               'logical',
                               'integer',
                               'integer',
                               'integer',
                               'double',
                               'logical'))

# make sure the rows are sorted in time order
births <- births[order(births$datetime), ]
# add dummy variable, case, for counting the denominator for the proportion charts
births$case <- 1
# show data structure
str(births)
## 'data.frame':    2193 obs. of  10 variables:
##  $ datetime: POSIXct, format: "2016-01-04 06:13:00" "2016-01-04 07:20:00" ...
##  $ biweek  : Date, format: "2016-01-04" "2016-01-04" ...
##  $ csect   : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ cup     : logi  FALSE FALSE FALSE FALSE TRUE TRUE ...
##  $ length  : int  55 52 50 47 55 50 50 47 50 53 ...
##  $ weight  : int  3872 3750 3458 3145 4224 2730 3130 2755 3270 3645 ...
##  $ apgar   : int  10 10 10 10 10 10 10 10 10 10 ...
##  $ ph      : num  7.1 7.26 7.36 7.31 7.16 7.14 7.23 7.21 7.17 7.2 ...
##  $ asphyxia: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ case    : num  1 1 1 1 1 1 1 1 1 1 ...

For this chapter we are interested in the asphyxia variable, which is a logical vector that is TRUE when the baby had either low apgar score or low umbilical cord pH suggesting lack of oxygen during delivery. Asphyxia is a very serious condition that may result in permanent brain damage death. In total, there are 16 cases corresponding to 0.7%.

Figure 13.1 is a run chart of the biweekly counts.

# run chart of counts
qic(biweek, asphyxia, 
    data    = births, 
    agg.fun = 'sum')  # use sum() function to aggregate data by subgroup
## Subgroup size > 1. Data have been aggregated using sum().
Run chart of number of deliveries with neonatal asphyxia

Figure 13.1: Run chart of number of deliveries with neonatal asphyxia

Notice that because more than half the data points are zero, the centre line (median) is also zero. This invalidates the runs analysis – there is only one very long run, which may lead to the false conclusion that data contains one or more shifts.

Figure 13.3 plots proportions rather than counts reaching the same conclusion.

# run chart of proportions
qic(biweek, asphyxia, case,  # use the case variable for denominators
    data = births)
Run chart of proportion deliveries with neonatal asphyxia

Figure 13.2: Run chart of proportion deliveries with neonatal asphyxia

Plotting the data on a P chart provides control limits and allows for a more meaningful runs analysis – as the average, in this case, better represents the process centre (figure 13.3.

# P chart
qic(biweek, asphyxia, case, 
    data  = births, 
    chart = 'p')
P chart of percent deliveries with neonatal asphyxia

Figure 13.3: P chart of percent deliveries with neonatal asphyxia

While the P chart remains useful, the lower control limit is zero, making it impossible to detect a signal of improvement using the three-sigma rule. As a result, we must rely solely on runs analysis to identify improvement.

One obvious solution is to increase the subgroup size by aggregating data over longer periods, such as months or quarters. However, this approach results in slower detection of process improvement or deterioration.

As an alternative, we introduce at the G chart.

13.2 G charts for opportunities between cases

The G chart plots opportunities between cases, for example the number of deliveries between neonatal asphyxia. This type of count data often follows a geometric distribution with a standard deviation given by \(\sigma = \sqrt{\bar{x}(\bar{x}+1)}\). The control limits are then calculated as:

\[\bar{x}\pm3\sqrt{\bar{x}(\bar{x}+1)}\]

where \(\bar{x}\) is the average number of opportunities between cases.

To obtain the number-between variable, we calculate the differences between the indices of cases (remember to first sort the data in time order).

# get indices of asphyxia cases
asph_cases <- which(births$asphyxia)
# calculate the number of deliveries between cases
asph_g     <- diff(c(asph_cases))

# plot G chart
qic(asph_g, chart = 'g')

Since the geometric distribution is highly skewed, the average is not ideal for runs analysis, which assumes the data are symmetrically distributed around the centre. Instead, the median may be used – as is done by default in qicharts2.

As is the case with the other SPC charts for counts data, negative lower control limits are rounded to 0, as values below this is not possible.

Note that process improvement – as in fewer cases – will present itself as the curve going up. To trigger a 3-sigma signal we would – in this case – need at least 518 deliveries with no cases of asphyxia corresponding to about seven weeks (518 / 73).

Thus, G charts are useful alternatives to P charts when occurrences are rare and the primary interest is in detecting process improvement. However, one chart does not exclude the other, and P and G charts go well together. The P chart may be more familiar to users and is more useful for signalling process deterioration, while the G charts helps trigger signals of improvement.

13.3 T charts for time between events

The T chart plots the time between events – a continuous variable. While it is possible to use an I chart to plot such data, this approach may be problematic. If events occur according to a Poisson distribution – which is often the case – the time between events is more appropriately modelled by the exponential distribution, which is highly skewed.

One way to address this issue is to transform the data prior to plotting it on an I chart. An appropriate transformation is given by:

\[x = y^{(1 / 3.6)}\]

where \(x\) is the transformed variable and \(y\) is the original time between events variable.

Control limits can then be calculated for the transformed data using the standard I chart procedure.

However, the transformed values may be difficult to interpret. A better approach is to plot the original values alongside back-transformed control limits and centre line. The back-transformation is:

\[y=x^{3.6}\].

# get time between events
asph_t <- diff(c(births$datetime[asph_cases]))
# plot T chart
qic(asph_t, chart = 't')
T chart of time (days) between deliveries with neonatal asphyxia

Figure 13.4: T chart of time (days) between deliveries with neonatal asphyxia

Note that T charts do not accommodate zero time between events – a situation that may arise when time is not recorded with sufficiently high resolution. For example, if two events – such as patient falls – occur on the same day and the time of day is not recorded, the time variable is not truly continuous. In such cases, a G chart plotting days between events, a discrete variable, may be more appropriate.

In fact, to our eyes, the control limits on the T chart in Figure 13.4 appear unnaturally wide. The upper control limit indicates that at least 190 days (or 27 weeks) must pass without any asphyxia cases before a signal is triggered. This suggests to us that these asphyxia cases (being discrete cases rather than random events) may not be suitable for analysis with a T chart.

13.4 The Bernoulli CUSUM chart for binary data

bchart(births$asphyxia, target = mean(births$asphyxia))

bchart(births$asphyxia, target = 0.02)

bchart(births$asphyxia, target = 0.002)

bchart(births$asphyxia, target = 500)