Scoring the PHQ-9 Questionnaire Using R

Intro

The PHQ-9 is the nine-item depression module of the Patient Health Questionnaire. Each item is scored on a 4-point Likert scale ranging from 0 (not at all) to 3 (nearly every day). The items are summed to obtain a total score ranging from 0 to 27, with higher scores indicating greater severity of depression. The total score is commonly divided into five severity levels: 0–4, 5–9, 10–14, 15–19 and 20–27 points indicate “minimal”, “mild”, “moderate”, “moderately severe” and “severe” depression, respectively.
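
As a quick illustration of this scoring rule, the following base R sketch sums the nine responses of one made-up respondent and maps the total to a severity level with `cut()`. The names `answers`, `total` and `severity` are chosen only for this example and are not part of the dataset used later in the post.

```r
# Hypothetical responses of one person to the nine items (each 0-3)
answers <- c(1, 0, 3, 1, 1, 1, 0, 0, 0)

# Total score: plain sum of the nine items (possible range 0-27)
total <- sum(answers)

# Severity bands 0-4, 5-9, 10-14, 15-19, 20-27.
# right = FALSE makes the intervals left-closed: [0,5), [5,10), ..., [20,28)
severity <- cut(total,
                breaks = c(0, 5, 10, 15, 20, 28),
                labels = c("minimal", "mild", "moderate",
                           "moderately severe", "severe"),
                right = FALSE)

total                   # 7
as.character(severity)  # "mild"
```

Using `cut()` with explicit breaks keeps the band boundaries in one place, which makes them easy to check against the published cut-offs.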

The PHQ-9 questionnaire may be found under the following link.

In this blog post, I show how to calculate the PHQ-9 score and the PHQ-9 severity levels.

Packages and data

The dataset we are going to use was published in PLOS ONE. The file has a Digital Object Identifier (DOI) and can be imported into R with the `read_delim()` function from the `readr` package.

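```r
library(readr)
library(dplyr)
library(ggplot2)

# The DOI link to the data file is missing from this copy of the post.
# `data.url` is a placeholder and must be set to that link before running.
df.phq9 <- read_delim(data.url,
                      delim = ";",
                      escape_double = FALSE,
                      trim_ws = TRUE) %>%
  select(starts_with('phq9'))   # keep only the PHQ-9 item columns

glimpse(df.phq9)
```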
```
## Observations: 1,337
## Variables: 9
## $ phq9_1 <int> 1, 3, 2, 0, 0, 0, 1, 0, 0, 2, 1, 1, 0, 3, 0, 0, 0, 2, 0...
## $ phq9_2 <int> 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0...
## $ phq9_3 <int> 3, 2, 2, 2, 1, 0, 1, 3, 1, 0, 1, 1, 0, 3, 1, 0, 0, 0, 0...
## $ phq9_4 <int> 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 1, 3, 0, 1, 0, 0, 0, 1, 0...
## $ phq9_5 <int> 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 1, 0, 0, 0, 0, 0...
## $ phq9_6 <int> 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0...
## $ phq9_7 <int> 0, 1, 1, 1, 0, 1, 0, 0, 0, 3, 1, 1, 0, 1, 0, 0, 0, 0, 0...
## $ phq9_8 <int> 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0...
## $ phq9_9 <int> 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
```
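
The same import pattern can be tried without downloading anything. The sketch below feeds a small, made-up semicolon-separated string to `read_delim()` and keeps only the PHQ-9 columns. The data, the extra `id` column and the names `toy.raw` and `toy` are assumptions for illustration. Passing literal text via `I()` requires readr 2.0 or later.

```r
library(readr)
library(dplyr)

# Made-up semicolon-separated data; I() marks the string as literal
# content rather than a file path (readr >= 2.0)
toy.raw <- I("id;phq9_1;phq9_2\n1;1;0\n2;3;1\n")

toy <- read_delim(toy.raw,
                  delim = ";",
                  trim_ws = TRUE,
                  show_col_types = FALSE) %>%
  select(starts_with("phq9"))   # drops the id column

names(toy)  # "phq9_1" "phq9_2"
```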

The Scoring Function

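The `scoring_phq9` function takes two arguments: a data frame containing the PHQ-9 items (`data`) and a character vector with the names of the item columns (`items.phq9`). Missing items are replaced by the rounded mean of the respondent's valid items, and a score is only computed if at least seven of the nine items are valid.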

```r
scoring_phq9 <- function(data, items.phq9) {
  data %>%
    # count the valid (non-missing) items and compute their mean per row
    mutate(nvalid.phq9 = rowSums(!is.na(select(., all_of(items.phq9)))),
           nvalid.phq9 = as.integer(nvalid.phq9),
           mean.temp = rowSums(select(., all_of(items.phq9)), na.rm = TRUE) / nvalid.phq9,
           # temporary integer copies of the nine items
           phq.01.temp = as.integer(unlist(data[items.phq9[1]])),
           phq.02.temp = as.integer(unlist(data[items.phq9[2]])),
           phq.03.temp = as.integer(unlist(data[items.phq9[3]])),
           phq.04.temp = as.integer(unlist(data[items.phq9[4]])),
           phq.05.temp = as.integer(unlist(data[items.phq9[5]])),
           phq.06.temp = as.integer(unlist(data[items.phq9[6]])),
           phq.07.temp = as.integer(unlist(data[items.phq9[7]])),
           phq.08.temp = as.integer(unlist(data[items.phq9[8]])),
           phq.09.temp = as.integer(unlist(data[items.phq9[9]]))) %>%
    # replace missing items with the rounded mean of the valid items
    # (formula syntax instead of the deprecated funs())
    mutate_at(vars(phq.01.temp:phq.09.temp),
              ~ ifelse(is.na(.), round(mean.temp), .)) %>%
    # sum the items; a score is only assigned if at least 7 items are valid
    mutate(score.temp = rowSums(select(., phq.01.temp:phq.09.temp), na.rm = TRUE),
           score.phq9 = ifelse(nvalid.phq9 >= 7, as.integer(round(score.temp)), NA),
           cutoff.phq9 = case_when(
             score.phq9 >= 20 ~ 'severe',
             score.phq9 >= 15 ~ 'moderately severe',
             score.phq9 >= 10 ~ 'moderate',
             score.phq9 >= 5 ~ 'mild',
             score.phq9 < 5 ~ 'minimal'),
           cutoff.phq9 = factor(cutoff.phq9, levels = c('minimal', 'mild',
                                                        'moderate', 'moderately severe',
                                                        'severe'))) %>%
    # drop all temporary helper columns
    select(-ends_with("temp"))
}
```
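
The missing-data rule inside the function is easier to see for a single respondent. The following base R sketch restates it for one vector of nine item values. `score_one` is a hypothetical helper written only for this illustration, not part of the function above, and the input vectors are made up.

```r
# Hypothetical single-respondent version of the missing-data rule
score_one <- function(x) {
  nvalid <- sum(!is.na(x))               # number of answered items
  if (nvalid < 7) return(NA_integer_)    # fewer than 7 valid items: no score
  fill <- round(mean(x, na.rm = TRUE))   # rounded mean of the answered items
  x[is.na(x)] <- fill                    # impute the missing items
  as.integer(sum(x))
}

score_one(c(1, 0, 3, 1, 1, 1, 0, 0, 0))     # complete data: 7
score_one(c(2, 2, 2, 2, 2, 2, 2, 2, NA))    # one item imputed with 2: 18
score_one(c(1, 1, 1, NA, NA, NA, 0, 0, 0))  # only 6 valid items: NA
```

With up to two missing items the imputed total stays on the 0–27 scale, so the usual cut-offs still apply. With more missing items the sketch, like the function, returns `NA` rather than a score.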

Example

The function adds three variables to the original data frame:

• `nvalid.phq9`: number of items with valid (non-missing) values,
• `score.phq9`: PHQ-9 score (0–27),
• `cutoff.phq9`: PHQ-9 severity level (minimal, mild, moderate, moderately severe, severe).

```r
items.phq9 <- paste0('phq9_', seq(1, 9, 1))
df.phq9 <- df.phq9 %>%
  scoring_phq9(., items.phq9)
glimpse(df.phq9)
```
```
## Observations: 1,337
## Variables: 12
## $ phq9_1      <int> 1, 3, 2, 0, 0, 0, 1, 0, 0, 2, 1, 1, 0, 3, 0, 0, 0,...
## $ phq9_2      <int> 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,...
## $ phq9_3      <int> 3, 2, 2, 2, 1, 0, 1, 3, 1, 0, 1, 1, 0, 3, 1, 0, 0,...
## $ phq9_4      <int> 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 1, 3, 0, 1, 0, 0, 0,...
## $ phq9_5      <int> 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 1, 0, 0, 0,...
## $ phq9_6      <int> 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,...
## $ phq9_7      <int> 0, 1, 1, 1, 0, 1, 0, 0, 0, 3, 1, 1, 0, 1, 0, 0, 0,...
## $ phq9_8      <int> 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0,...
## $ phq9_9      <int> 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ nvalid.phq9 <int> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,...
## $ score.phq9  <int> 7, 10, 7, 9, 3, 2, 3, 4, 5, 7, 7, 8, 0, 11, 1, 0, ...
## $ cutoff.phq9 <fct> mild, moderate, mild, mild, minimal, minimal, mini...
```

Visualization

PHQ-9 Score

```r
ggplot(df.phq9, aes(score.phq9)) +
  geom_density(fill = 'blue', alpha = 0.2) +
  scale_x_continuous(limits = c(0, 27), breaks = c(0, 5, 10, 15, 20, 27)) +
  labs(x = 'PHQ-9 Score', y = 'Density') +
  theme_bw()
```

PHQ-9 Severity Levels

```r
ggplot(df.phq9, aes(x = cutoff.phq9, fill = cutoff.phq9)) +
  geom_bar(colour = 'black') +
  scale_fill_brewer(type = 'seq') +
  labs(x = NULL, y = NULL, fill = NULL) +
  theme_bw()
```

Author: norbert

Biometrician at Clinical Trial Centre, Leipzig University (GER), with degrees in sociology (MA) and public health (MPH).
