Chapter 2 Understanding SPC Charts
Having explored the concepts of common and special cause variation through the example of handwritten letters, the next step is to understand how these ideas apply to real-world processes in healthcare. This involves constructing charts that represent the “voice” or behaviour of a process over time. By drawing on statistical theory, such charts help us determine whether a process is exhibiting common cause variation or whether special causes are present. These visualisations are known as SPC charts, control charts, or process behaviour charts.
SPC charts are point-and-line plots of data collected sequentially over time. They provide an operational definition for distinguishing between common and special cause variation. An operational definition is one that is practical and useful, ensuring that different individuals interpret the data in a consistent way.
Figure 2.1: Control chart of systolic blood pressure
Figure 2.1 shows a control chart of daily systolic blood pressure measurements. One data point (#6), marked with a red dot, stands out from the others. This deviation triggers a signal, indicating that the point is unlikely to have arisen by chance alone and may therefore be attributed to a special cause.
Special causes are identified through unusual patterns in the data. By “unusual”, we mean patterns that are unlikely to occur purely by chance, such as an extreme data point. In contrast, common cause variation represents the natural, inherent noise in a stable process. When no signals are present, the observed variation is consistent with common cause variation. When signals are present, the data are consistent with special cause variation.
There are many types of control charts, and the choice of chart depends primarily on the type of data being analysed. Despite these differences, most SPC charts share a common appearance and are interpreted in the same way, using a set of statistical tests – or rules – to identify unusual patterns.
In this chapter, we introduce seven of the most commonly used control charts in healthcare, often referred to as the Magnificent Seven, along with their practical applications. In Chapter 3, we examine the rules used to detect unusual patterns in greater detail. Later, in Part 2 of the book, we provide a detailed guide to the calculations required to construct and analyse control charts.
2.1 Anatomy and physiology of SPC charts
A key innovation introduced by Shewhart was the concept of control limits, which define the expected range of common cause variation. Data points within these limits are consistent with common cause variation, whereas points outside the limits suggest the presence of special causes.
Figure 2.2: Standardised control chart. SD = standard deviation; CL = centre line; LCL = lower control limit; UCL = upper control limit
Figure 2.2 shows a standardised Shewhart control chart constructed from random numbers drawn from a normal distribution with mean 0 and standard deviation 1. The x-axis represents time or sequence order, and the y-axis shows the observed values. Each point represents a subgroup, that is, a set of observations collected under similar conditions at a given point in time. The points are connected in sequence to emphasise the temporal structure of the data.
The chart includes a centre line (CL), representing the average of the data (typically the mean or median), and two control limits: the lower control limit (LCL) and the upper control limit (UCL). These limits define the range within which common cause variation is expected to fall. Because they are placed three standard deviations (SD) on either side of the centre line, they are often referred to as 3-sigma limits.
The data points tend to cluster around the centre line, with most values close to the average and fewer observations occurring as the distance from the centre increases. The control limits are set to include almost all points arising from common cause variation. Points falling outside these limits are therefore likely to represent special causes.
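A chart like Figure 2.2 can be sketched in a few lines. The following is a minimal illustration rather than the code used to draw the figure: the seed, the number of points (24), and the helper name `outside_limits` are assumptions chosen for the example.

```python
import random

random.seed(1)  # arbitrary seed so the sketch is reproducible

# 24 hypothetical subgroup values drawn from a normal distribution
# with mean 0 and standard deviation 1
values = [random.gauss(0, 1) for _ in range(24)]

# For standardised data the centre line and SD are known in advance
cl = 0.0
sd = 1.0
lcl = cl - 3 * sd  # lower control limit
ucl = cl + 3 * sd  # upper control limit


def outside_limits(xs, lower, upper):
    """Return the 1-based positions of points outside the control limits."""
    return [i for i, x in enumerate(xs, start=1) if x < lower or x > upper]


signals = outside_limits(values, lcl, ucl)
print(f"CL = {cl}, LCL = {lcl}, UCL = {ucl}")
print("Points outside the limits:", signals)
```

With real data the centre line and standard deviation are not known in advance and must be estimated from the data.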
2.2 Why 3-sigma limits?
Shewhart’s choice of 3-sigma limits was informed by empirical observations from real-world data. For normally distributed data, such as those in Figure 2.2, approximately 99.7% of observations lie within three standard deviations of the mean. This implies that the probability of a point falling outside the limits purely by chance is about 0.3%, or roughly 3 in 1,000.
In a chart with 20 data points, the probability that all points fall within the control limits is therefore approximately 95% (\(0.9973^{20} \approx 0.947\)), assuming the points are independent of one another.
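These probabilities can be checked with the Python standard library. The sketch assumes normally distributed, independent data points; the variable names are chosen for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, SD 1

# Probability that a single point lies within 3 SD of the mean
p_within = z.cdf(3) - z.cdf(-3)

# Probability that a single point falls outside the limits by chance
p_outside = 1 - p_within

# Probability that all 20 independent points stay within the limits
n_points = 20
p_all_within = p_within ** n_points

print(f"Within 3 SD:         {p_within:.4f}")
print(f"Outside 3 SD:        {p_outside:.4f}")
print(f"All {n_points} points within: {p_all_within:.3f}")
```

The complement of the last figure is the chance of at least one false alarm over 20 points from a stable process.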
However, Shewhart did not base his method solely on theoretical considerations. He recognised that most real-world data are not normally distributed and argued that the use of 3-sigma limits is justified because “they work” (Shewhart 1931, p. 18). In practice, this choice provides a reasonable balance between sensitivity (detecting special causes) and specificity (avoiding false alarms), regardless of the underlying distribution. This issue is discussed further in Appendix C.
2.3 Common types of control charts: The Magnificent Seven
As noted above, control limits are typically expressed as:
\[\text{CL} \pm 3 \times \text{SD}\]
Constructing a control chart therefore requires estimates of both the process mean and the process standard deviation. Importantly, the standard deviation used should reflect only common cause variation. Using the overall standard deviation of all data points may incorporate variation from special causes and result in control limits that are too wide, which makes the chart less able to detect those same special causes.
To avoid this, SPC charts use estimates based on within-subgroup variation, typically obtained by pooling variation across subgroups. The exact method depends on the type of data being analysed. Detailed formulas are provided later in Table 6.1.
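The effect of a special cause on the overall standard deviation can be illustrated with a small example. The data below are hypothetical: a stable process that shifts upwards after the tenth point. As the local estimate, the sketch uses one common choice for individual measurements, the average moving range divided by the constant 1.128; the formulas used in this book are those in Table 6.1 and may differ in detail.

```python
from statistics import mean, stdev

# Hypothetical data: stable around 50, then shifted to around 60
data = [50, 52, 49, 51, 50, 48, 51, 50, 49, 52,
        60, 61, 59, 62, 60, 61, 58, 60, 62, 59]

# Overall SD: includes the shift, i.e. special cause variation
overall_sd = stdev(data)

# Local SD: based on differences between consecutive points only.
# 1.128 is the standard constant (d2) for ranges of two observations.
moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
local_sd = mean(moving_ranges) / 1.128

print(f"Overall SD: {overall_sd:.2f}")
print(f"Local SD:   {local_sd:.2f}")
```

Because only one of the moving ranges spans the shift, the local estimate is much smaller than the overall standard deviation, and limits based on it are correspondingly narrower.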
Broadly, data fall into two categories: count data and measurement data.
2.3.1 Count data
Count data consist of non-negative integers representing the number of events or cases. The distinction between events and cases is important.
An event is an occurrence in time and space, such as a patient fall. If a patient falls multiple times, each fall is counted as a separate event.
A case refers to an individual unit possessing (or not possessing) a given attribute, such as a patient who has fallen. Each individual is counted only once, regardless of how many events occur.
Both events and cases can be expressed relative to a denominator, often referred to as the area of opportunity.
Events are typically expressed as rates, such as the number of falls per 1,000 patient-days. In this case, the denominator represents time or exposure and differs in nature from the numerator.
Cases are expressed as proportions, such as the percentage of patients who experienced at least one fall. Here, the numerator is a subset of the denominator and cannot exceed it.
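The difference between a rate and a proportion can be made concrete with a small calculation. All figures below are hypothetical and refer to one ward in one month.

```python
# Hypothetical ward data for one month
falls = 12             # events: every fall is counted
patients_fallen = 9    # cases: each patient counted once, however many falls
patient_days = 3400    # area of opportunity for events (exposure time)
patients = 410         # area of opportunity for cases (people at risk)

# Rate: events relative to exposure; numerator and denominator differ in nature
falls_per_1000_patient_days = falls / patient_days * 1000

# Proportion: cases are a subset of the denominator, so the value is in [0, 1]
proportion_fallen = patients_fallen / patients

print(f"Falls per 1,000 patient-days: {falls_per_1000_patient_days:.2f}")
print(f"Patients with at least one fall: {proportion_fallen:.1%}")
```

Note that the number of cases can never exceed the number of events, because a patient who has fallen has fallen at least once.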
The most common SPC charts for count data are:
- C chart: counts of events
- U chart: rates (events per unit of time or opportunity)
- P chart: proportions (cases as a subset of the total)
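As a rough sketch of how the limits for these three charts might be computed, the functions below use the usual Poisson-based (C and U charts) and binomial-based (P chart) formulas. The function names are hypothetical, and the formulas used in this book are those in Table 6.1, which may differ in detail.

```python
from math import sqrt


def c_chart_limits(counts):
    """Return (LCL, CL, UCL) for event counts with a constant area of opportunity."""
    cl = sum(counts) / len(counts)
    half_width = 3 * sqrt(cl)
    return max(0.0, cl - half_width), cl, cl + half_width


def u_chart_limits(events, exposure):
    """Return one (LCL, CL, UCL) tuple per subgroup for event rates."""
    cl = sum(events) / sum(exposure)
    limits = []
    for n in exposure:
        half_width = 3 * sqrt(cl / n)
        limits.append((max(0.0, cl - half_width), cl, cl + half_width))
    return limits


def p_chart_limits(cases, totals):
    """Return one (LCL, CL, UCL) tuple per subgroup for proportions."""
    cl = sum(cases) / sum(totals)
    limits = []
    for n in totals:
        half_width = 3 * sqrt(cl * (1 - cl) / n)
        # A proportion cannot fall below 0 or rise above 1
        limits.append((max(0.0, cl - half_width), cl, min(1.0, cl + half_width)))
    return limits


print(c_chart_limits([4, 4, 4, 4]))
```

For U and P charts the limits depend on the size of each subgroup's denominator, so they vary from point to point when the denominators vary.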
2.3.2 Measurement data
Measurement data are continuous and may take any value within a range, often including decimals. Examples include blood pressure, height, weight, and waiting times.
Such data can be plotted either as individual observations or as subgroup summaries. For example, waiting times may be recorded for each patient individually or averaged over a period such as a day or month.
When measurements are plotted individually, an I chart (also called an X chart) is used, where each subgroup consists of a single observation. The I chart is often paired with a moving range (MR) chart, which displays the variation between consecutive observations.
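The relationship between the two charts can be sketched as follows. The function name and the blood pressure values are hypothetical, and the constants 2.66 and 3.267 are the conventional ones for moving ranges of two consecutive observations; the formulas used in this book are those in Table 6.1.

```python
from statistics import mean


def i_mr_limits(xs):
    """Return (LCL, CL, UCL) for the I chart and for the MR chart."""
    # Moving ranges: absolute differences between consecutive observations
    moving_ranges = [abs(b - a) for a, b in zip(xs, xs[1:])]
    mr_bar = mean(moving_ranges)
    x_bar = mean(xs)
    return {
        # 2.66 = 3 / 1.128, i.e. 3 SD expressed in units of the mean moving range
        "i": (x_bar - 2.66 * mr_bar, x_bar, x_bar + 2.66 * mr_bar),
        # The MR chart has no lower limit above zero
        "mr": (0.0, mr_bar, 3.267 * mr_bar),
    }


# Hypothetical daily systolic blood pressure measurements (mmHg)
bp = [132, 128, 135, 130, 127, 133, 129, 131]
limits = i_mr_limits(bp)
print("I chart: ", limits["i"])
print("MR chart:", limits["mr"])
```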
When measurements are grouped, an X-bar chart is used to display subgroup averages. This is typically paired with an S chart, which shows the variation within each subgroup.
Thus:
- I and MR charts are used for individual measurements (subgroup size = 1)
- X-bar and S charts are used for grouped measurements (subgroup size > 1)
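For grouped measurements, one conventional approach estimates the process standard deviation from the average within-subgroup standard deviation, corrected by the bias constant c4. The sketch below assumes equal subgroup sizes; the function names and waiting times are hypothetical, and the formulas used in this book are those in Table 6.1, which may differ in detail.

```python
from math import gamma, sqrt
from statistics import mean, stdev


def c4(n):
    """Bias correction constant for the sample SD of n normal observations."""
    return sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)


def xbar_s_limits(subgroups):
    """Return (LCL, CL, UCL) for the X-bar chart and for the S chart."""
    n = len(subgroups[0])
    if n < 2 or any(len(g) != n for g in subgroups):
        raise ValueError("this sketch needs equal subgroup sizes of at least 2")

    means = [mean(g) for g in subgroups]   # plotted on the X-bar chart
    sds = [stdev(g) for g in subgroups]    # plotted on the S chart

    grand_mean = mean(means)
    s_bar = mean(sds)
    sigma = s_bar / c4(n)                  # estimated process SD

    xbar_half = 3 * sigma / sqrt(n)        # 3 SD of a subgroup mean
    s_half = 3 * sigma * sqrt(1 - c4(n) ** 2)
    return {
        "xbar": (grand_mean - xbar_half, grand_mean, grand_mean + xbar_half),
        "s": (max(0.0, s_bar - s_half), s_bar, s_bar + s_half),
    }


# Hypothetical waiting times (minutes), five patients sampled per day
waits = [[32, 41, 28, 35, 39], [30, 36, 44, 29, 33], [38, 31, 27, 40, 34]]
print(xbar_s_limits(waits))
```

Because the limits of the X-bar chart apply to subgroup means rather than individual values, they narrow as the subgroup size increases.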
2.4 Summary of common SPC charts
A Shewhart control chart is a point-and-line plot of data over time, supplemented by three horizontal reference lines:
- The centre line (CL), representing the process average
- The lower control limit (LCL) and upper control limit (UCL), defining the range of expected variation
Although different chart types are used for different kinds of data, they share a common structure and interpretation. Selecting the appropriate chart begins with identifying the type of data.
For count data:
- C chart: counts of events
- U chart: rates
- P chart: proportions
For measurement data:
- I chart with MR chart: individual observations
- X-bar chart with S chart: subgroup averages and variation
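The selection logic summarised above can be written as a small helper. This is only an illustration of the decision path: the function and parameter names are hypothetical and not part of any SPC library.

```python
def choose_chart(data_type, unit="event", has_denominator=False, subgroup_size=1):
    """Suggest a chart type following the summary in this section.

    data_type:       "count" or "measurement"
    unit:            for count data, "event" or "case"
    has_denominator: for events, True when counts are expressed as rates
    subgroup_size:   for measurement data, observations per subgroup
    """
    if data_type == "count":
        if unit == "case":
            return "P chart"  # proportions: cases as a subset of the total
        if unit == "event":
            return "U chart" if has_denominator else "C chart"
        raise ValueError("unit must be 'event' or 'case'")
    if data_type == "measurement":
        if subgroup_size < 1:
            raise ValueError("subgroup_size must be at least 1")
        return "I and MR charts" if subgroup_size == 1 else "X-bar and S charts"
    raise ValueError("data_type must be 'count' or 'measurement'")


print(choose_chart("count", unit="event", has_denominator=True))
print(choose_chart("measurement", subgroup_size=5))
```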
In the next chapter, we examine the rules used to detect signals of special cause variation. These rules are designed to achieve high sensitivity while maintaining a reasonable false alarm rate.