## Intro

The acronym CONSORT stands for “Consolidated Standards of Reporting Trials”. The CONSORT statement is published by the CONSORT group and gives recommendations aimed at improving the quality of reports of randomized clinical trials. Besides a 25-item checklist, the current CONSORT statement (2010) includes the CONSORT flow diagram, which visualizes the progress through the phases of a two-group parallel randomized clinical trial (see Figure 1).

In this blog post, I will both show how to “draw” the CONSORT flow diagram using R and discuss the pros and cons of this approach.

Figure 1: CONSORT 2010 Flow Diagram

### R Packages

The following R packages are required to create, plot and save flow diagrams: `DiagrammeR` supports the `Graphviz` software, which uses a graph description language called `DOT`. `DiagrammeRsvg` and `rsvg` are required to save diagrams to a hard drive.

```r
library(DiagrammeR)
library(DiagrammeRsvg)
library(rsvg)
```

In case we want to include the flow diagram in an `.Rmd` document and render it as PDF, we need to install `phantomjs`, a headless browser that is scripted with JavaScript. This can be done using the `install_phantomjs()` function of the `webshot` package.

```r
webshot::install_phantomjs()
```

`Graphviz` is very flexible regarding the definition of line colors, arrow shapes, node shapes, and many other layout features. However, since the actual purpose of `Graphviz` is the visualization of graphs (networks), it is difficult to control the position of the nodes.

Unlike in a Cartesian plot, the position of the nodes is not defined by x and y coordinates. Instead, the nodes' position can only be determined by edges (relationships between nodes) and their attributes.
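To illustrate the point, here is a minimal sketch of a `DOT` description in which no coordinates appear at all. The graph name `position_demo` and the node names `A`, `B`, `C` are made up for this example. The only handles on the layout are the edges and their attributes, here the `constraint` attribute that is used again further below.

```r
# Build a tiny DOT description as a plain character string.
# Node and graph names are hypothetical and only serve the illustration.
dot <- "
digraph position_demo {
  node [shape = box];
  A -> B;                       // ordinary edge: B is ranked below A
  A -> C [constraint = false];  // edge is drawn but not used for ranking
}
"
cat(dot)

# Rendering is optional and only attempted if DiagrammeR is installed.
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  widget <- DiagrammeR::grViz(dot)  # print `widget` to view the diagram
}
```

Because the edge from `A` to `C` is excluded from ranking, it should not push `C` into a lower row the way the edge from `A` to `B` does.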

### Creating a grid

When I looked at the CONSORT flow diagram (see Figure 1), I had the impression that the nodes may be placed using a grid-based system.

Figure 2: Placement of nodes using a grid

Conceptualized as a grid, the CONSORT flow diagram consists of 21 nodes. These include:

- 4 nodes with plain text (blue, T1-T4) naming the phase of the trial (Enrollment, Allocation, Follow-Up, Analysis) and
- 9 boxed nodes (black, B1-B9) containing text (“Randomized”) and numerical data (“n=200”).

In order to populate each cell of the grid, we need to add:

- 4 nodes rendered as tiny and, thus, invisible points (green, P1-P4). We use the points to mimic edges branching off from other edges.
- 4 completely invisible nodes (grey, I1-I4).

The numbers in brackets denote the nodes' ID numbers.

Eventually, the grid structure is created using both vertical and horizontal edges.

## Creating the CONSORT Flow Diagram

When creating the actual CONSORT diagram, we are going to make use of the grid structure. All we have to do now is change some attributes of the nodes and edges.

### Preparation

To make the flow diagram dynamic, we need to save constant elements (text labels) and dynamic elements (numeric values) in separate vectors (`text`, `values`). The `paste1()` function is a simple wrapper around the base R `paste0()` function that makes it more convenient to concatenate text labels and numeric values.

```r
# Values ------------------------------------------------------------------
values <- c(210, 10, 200, 100, 100, 10, 10, 90, 90)

# Defining Text Labels ----------------------------------------------------
text <- c('Assessment for\neligibility',
          'Excluded',
          'Randomized',
          'Allocated to\nintervention',
          'Allocated to\nintervention',
          'Lost to follow-up',
          'Lost to follow-up',
          'Analysed',
          'Analysed')

# Defining Function -------------------------------------------------------
paste1 <- function(x, y){
  paste0(x, ' (n=', y, ')')
}

# Concatenating Values and Text Labels ------------------------------------
LABS <- paste1(text, values)
LABS
```

```
## [1] "Assessment for\neligibility (n=210)"
## [2] "Excluded (n=10)"
## [3] "Randomized (n=200)"
## [4] "Allocated to\nintervention (n=100)"
## [5] "Allocated to\nintervention (n=100)"
## [6] "Lost to follow-up (n=10)"
## [7] "Lost to follow-up (n=10)"
## [8] "Analysed (n=90)"
## [9] "Analysed (n=90)"
```
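The `values` vector above is typed in by hand. In a real analysis the counts would more likely be derived from the trial data, which is where the "dynamic" part pays off. The following is only a sketch: the data frame `trial` and its columns (`eligible`, `arm`, `lost`) are invented for illustration and are constructed so that they reproduce the numbers used in this post.

```r
# Hypothetical participant-level data; all column names are assumptions.
trial <- data.frame(
  id       = 1:210,
  eligible = c(rep(TRUE, 200), rep(FALSE, 10)),
  arm      = c(rep(c("A", "B"), each = 100), rep(NA, 10)),
  lost     = c(rep(c(FALSE, TRUE), times = c(90, 10)),  # arm A
               rep(c(FALSE, TRUE), times = c(90, 10)),  # arm B
               rep(NA, 10)),                            # never randomized
  stringsAsFactors = FALSE
)

# TRUE for participants randomized to arm `a`, FALSE otherwise (never NA).
in_arm <- function(a) !is.na(trial$arm) & trial$arm == a

# Counts in the same order as the text labels.
values <- c(
  nrow(trial),                     # assessed for eligibility
  sum(!trial$eligible),            # excluded
  sum(trial$eligible),             # randomized
  sum(in_arm("A")),                # allocated, arm A
  sum(in_arm("B")),                # allocated, arm B
  sum(in_arm("A") & trial$lost),   # lost to follow-up, arm A
  sum(in_arm("B") & trial$lost),   # lost to follow-up, arm B
  sum(in_arm("A") & !trial$lost),  # analysed, arm A
  sum(in_arm("B") & !trial$lost)   # analysed, arm B
)
values
```

A vector computed this way can be passed to `paste1()` exactly like the hard-coded one, so the labels update whenever the data change.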

In the next step, we create a node data frame (`ndf`) using the `create_node_df()` function of the `DiagrammeR` package. The arguments of this function are pretty much self-explanatory.

### Node Data Frame

```r
ndf <- create_node_df(
  n = 21,
  label = c('Enrollment', 'Allocation', 'Follow-Up', 'Analysis',
            LABS, rep("", 8)),
  style = c(rep("solid", 13), rep('invis', 8)),
  shape = c(rep("plaintext", 4), rep("box", 9), rep("point", 8)),
  width = c(rep(2, 4), rep(2.5, 9), rep(0.001, 8)),   # node width in inches
  height = c(rep(0.5, 13), rep(0.001, 8)),            # node height in inches
  fontsize = c(rep(14, 4), rep(10, 17)),
  fontname = c(rep('Arial Rounded MT Bold', 4), rep('Courier New', 17)),
  penwidth = 2.0,
  fixedsize = "true")
```
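The nested `rep()` calls above are easy to get wrong, since every attribute vector has to end up with exactly one entry per node. One possible way to keep them in sync is to describe each block of nodes once and expand it afterwards. This is a sketch in base R only; `node_groups` and `per_node()` are hypothetical names, not part of `DiagrammeR`.

```r
# One row per block of nodes, in node-ID order (IDs 1-4, 5-13, 14-21).
node_groups <- data.frame(
  n      = c(4, 9, 8),
  shape  = c("plaintext", "box", "point"),
  style  = c("solid", "solid", "invis"),
  width  = c(2, 2.5, 0.001),
  height = c(0.5, 0.5, 0.001),
  stringsAsFactors = FALSE
)

# Hypothetical helper: repeat each group's value once per node in that group.
per_node <- function(x) rep(x, times = node_groups$n)

shape  <- per_node(node_groups$shape)
style  <- per_node(node_groups$style)
width  <- per_node(node_groups$width)
height <- per_node(node_groups$height)

# Every attribute vector must have one entry per node.
stopifnot(sum(node_groups$n) == 21,
          lengths(list(shape, style, width, height)) == 21)
```

The resulting vectors can be handed to `create_node_df()` in place of the hand-written ones.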

### Edge Data Frame

Furthermore, we create an edge data frame (`edf`) using the `create_edge_df()` function. Edges we don't want to be visible are made fully transparent (`#00000000`, i.e. black with alpha = 0) and don't get an arrowhead. To make edges horizontal, the `constraint` attribute needs to be set to `false`, which tells Graphviz not to use the edge when ranking the nodes. `from` and `to` are vectors of node IDs: `from` holds the node each edge starts at and `to` the node it ends at. Together they determine the relationships between the nodes. To arrange the code more clearly, I've ordered the edges by columns and rows according to Figure 2.

```r
edf <- create_edge_df(
  arrowhead = c(rep('none', 3), rep("vee", 3), rep('none', 2), "vee",
                rep('none', 6), rep("vee", 3), rep("none", 3), "vee",
                rep("none", 10)),
  color = c(rep('#00000000', 3), rep('black', 6), rep('#00000000', 6),
            rep('black', 3), rep('#00000000', 3), rep('black', 1),
            rep('#00000000', 2), rep('black', 2), rep('#00000000', 6)),
  constraint = c(rep("true", 18), rep('false', 14)),
  from = c(1, 19, 20, 16, 8, 10,   # column 1
           5, 14, 7, 15, 2, 3,     # column 2
           18, 6, 21, 17, 9, 11,   # column 3
           1, 5,                   # row 1
           19, 14,                 # row 2
           20, 7,                  # row 3
           16, 15,                 # row 4
           8, 2,                   # row 5
           10, 3,                  # row 6
           12, 4),                 # row 7
  to = c(19, 20, 16, 8, 10, 12,    # column 1
         14, 7, 15, 2, 3, 4,       # column 2
         6, 21, 17, 9, 11, 13,     # column 3
         5, 18,                    # row 1
         14, 6,                    # row 2
         7, 21,                    # row 3
         15, 17,                   # row 4
         2, 9,                     # row 5
         3, 11,                    # row 6
         4, 13))                   # row 7
```
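The `from` and `to` vectors above follow directly from the grid, so they do not have to be typed by hand. As a sketch, the node IDs can be written down once as a 7 x 3 matrix (the layout below is read off the column and row comments in the code above) and the vertical and horizontal edges derived from it. The variable names are my own.

```r
# Node IDs arranged as in the grid: 7 rows, 3 columns.
grid <- matrix(
  c( 1,  5, 18,
    19, 14,  6,
    20,  7, 21,
    16, 15, 17,
     8,  2,  9,
    10,  3, 11,
    12,  4, 13),
  ncol = 3, byrow = TRUE
)
n_row <- nrow(grid)
n_col <- ncol(grid)

# Vertical edges: each node points to the node below it (column by column).
v_from <- as.vector(grid[-n_row, ])
v_to   <- as.vector(grid[-1, ])

# Horizontal edges: each node points to its right neighbour (row by row).
h_from <- as.vector(t(grid[, -n_col]))
h_to   <- as.vector(t(grid[, -1]))

from <- c(v_from, h_from)
to   <- c(v_to, h_to)

length(from)  # 18 vertical + 14 horizontal edges
```

With this ordering, the first 18 edges are the vertical ones and the remaining 14 the horizontal ones, which matches the split used for the `constraint` attribute.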

### Plotting

Before plotting the flow diagram, we need to create a graph using the `create_graph()` function. The graph can then be plotted using the `render_graph()` function.

```r
# Create Graph ------------------------------------------------------------
g <- create_graph(ndf, edf, attr_theme = NULL)

# Plotting ----------------------------------------------------------------
render_graph(g)
```

Figure 3: Simplified CONSORT Flow Diagram

Unfortunately, I didn't find a way to left-align text within the text boxes. Thus, I didn't manage to draw a fully featured CONSORT flow diagram using R.

### Export

Saving the flow diagram to a hard drive is quite straightforward:

```r
export_graph(g, file_name = "CONSORT.png")
```

Supported export formats are PNG, PDF, SVG, PS and GEXF (a graph file format).
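To see where `DiagrammeRsvg` and `rsvg` come into play, the conversion can also be spelled out by hand: graph to `DOT`, `DOT` to SVG, SVG to PNG. This is a sketch that assumes all three packages are installed. `save_graph_png()` is a hypothetical helper written for this post, not a function of any of these packages, and the two-node graph is only a stand-in for the real diagram.

```r
# Hypothetical helper: write a DiagrammeR graph to a PNG file via SVG.
save_graph_png <- function(graph, file, width = 1200) {
  dot <- DiagrammeR::generate_dot(graph)                      # graph -> DOT
  svg <- DiagrammeRsvg::export_svg(DiagrammeR::grViz(dot))    # DOT -> SVG text
  rsvg::rsvg_png(charToRaw(svg), file = file, width = width)  # SVG -> PNG file
  invisible(file)
}

# Only run the example if all required packages are available.
pkgs <- c("DiagrammeR", "DiagrammeRsvg", "rsvg")
pkgs_ok <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (pkgs_ok) {
  # Toy graph with two nodes and one edge.
  toy <- DiagrammeR::create_graph(
    nodes_df = DiagrammeR::create_node_df(n = 2, label = c("A", "B")),
    edges_df = DiagrammeR::create_edge_df(from = 1, to = 2)
  )
  out <- save_graph_png(toy, file.path(tempdir(), "toy.png"))
}
```

For everyday use, `export_graph()` is the shorter route. The explicit version mainly helps when the output width or an intermediate SVG file is needed.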

## Discussion

As demonstrated in this blog post, “drawing” a CONSORT flow diagram using R and Graphviz is rather cumbersome. Because of some limitations regarding the alignment of text within boxes, only a simplified form of the CONSORT flow diagram can be drawn. However, the use of R seems to be the only way to make flow diagrams dynamic.