Intro
The acronym CONSORT stands for “Consolidated Standards of Reporting Trials”. The CONSORT statement is published by the CONSORT group and gives recommendations aiming to improve the quality of reports of randomized clinical trials. Besides a 25-item checklist, the current CONSORT statement (2010) includes the CONSORT flow diagram, which visualizes the progress through the phases of a parallel randomized clinical trial of two groups (see Figure 1).
In this blog post, I will both show how to “draw” the CONSORT flow diagram using R and discuss the pros and cons of this approach.
Figure 1: CONSORT 2010 Flow Diagram
R Packages
The following R packages are required to create, plot and save flow diagrams: DiagrammeR supports the Graphviz software, which includes a graph description language called DOT. DiagrammeRsvg and rsvg are required to save diagrams to a hard drive.
library(DiagrammeR)
library(DiagrammeRsvg)
library(rsvg)
If we want to include the flow diagram in an .Rmd document and render it as PDF, we need to install PhantomJS, a headless browser that is scripted with JavaScript. This can be done using the install_phantomjs() function of the webshot package.
webshot::install_phantomjs()
Graphviz is very flexible regarding the definition of line colors, arrow shapes, node shapes, and many other layout features. However, since the actual purpose of Graphviz is the visualization of graphs (networks), it is difficult to control the position of the nodes.
Unlike in a Cartesian plot, the position of the nodes is not defined by x and y coordinates. Instead, the nodes' position can only be determined by edges (relationships between nodes) and their attributes.
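To illustrate how edges rather than coordinates drive the layout, here is a minimal sketch in DOT, passed to DiagrammeR's grViz() function. The node names A, B and C are made up for illustration. With the default dot engine, an ordinary edge (A to B) places its target one rank below its source, whereas an edge with constraint = false (A to C) is not used for ranking, so C should be free to sit beside A.

```r
# Minimal DOT sketch; the node names A, B and C are purely illustrative.
# "A -> B" is an ordinary edge, "A -> C" is excluded from rank assignment.
dot_src <- "
digraph layout_demo {
  node [shape = box]
  A -> B
  A -> C [constraint = false]
}
"

# Render only if DiagrammeR is available
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  DiagrammeR::grViz(dot_src)
}
```

This is the same mechanism the grid below relies on: constrained edges build the columns, unconstrained edges build the rows.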
Creating a grid
When I looked at the CONSORT flow diagram (see Figure 1), I had the impression that the nodes may be placed using a grid-based system.
Figure 2: Placement of nodes using a grid
The CONSORT flow diagram conceptualized as a grid consists of 21 nodes. These include:
- 4 nodes with plain text (blue, T1-T4) describing the phase of the trial (Enrollment, Allocation, Follow-Up, Analysis) and
- 9 boxed nodes (black, B1-B9) containing text (“Randomized”) and numerical data (“n=200”).
In order to populate each cell of the grid, we need to add:
- 4 nodes rendered as tiny and, thus, invisible points (green, P1-P4). We use the points to mimic edges branching off from other edges.
- 4 completely invisible nodes (grey, I1-I4).
The numbers in brackets denote the nodes' ID numbers.
Finally, the grid structure is created using both vertical and horizontal edges.
Creating the CONSORT Flow Diagram
When creating the actual CONSORT diagram, we will make use of the grid structure. All we have to do now is to change some attributes of the nodes and edges.
Preparation
To make the flow diagram dynamic, we need to save constant elements (text labels) and dynamic elements (numeric values) into separate vectors (text, values). The paste1() function is a simple wrapper around the base R paste0() function that makes it more convenient to concatenate text labels and numeric values.
# Values ------------------------------------------------------------------
values <- c(210, 10, 200, 100, 100, 10, 10, 90, 90)

# Defining Text Labels ----------------------------------------------------
text <- c('Assessment for\neligibility',
          'Excluded',
          'Randomized',
          'Allocated to\nintervention',
          'Allocated to\nintervention',
          'Lost to follow-up',
          'Lost to follow-up',
          'Analysed',
          'Analysed')

# Defining Function -------------------------------------------------------
paste1 <- function(x, y){
  paste0(x, ' (n=', y, ')')
}

# Concatenating Values and Text Labels ------------------------------------
LABS <- paste1(text, values)
LABS
## [1] "Assessment for\neligibility (n=210)"
## [2] "Excluded (n=10)"
## [3] "Randomized (n=200)"
## [4] "Allocated to\nintervention (n=100)"
## [5] "Allocated to\nintervention (n=100)"
## [6] "Lost to follow-up (n=10)"
## [7] "Lost to follow-up (n=10)"
## [8] "Analysed (n=90)"
## [9] "Analysed (n=90)"
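In practice, the numeric values would usually be computed rather than typed in. The following base R sketch shows one way this might look. The data frame trial and its columns excluded, arm and lost are hypothetical and only constructed so that the counts match the numbers used in this post.

```r
# Hypothetical participant-level data (column names are assumptions):
# 210 assessed, 10 excluded, 200 randomized into two arms of 100,
# 10 participants lost to follow-up per arm
trial <- data.frame(
  excluded = rep(c(TRUE, FALSE), times = c(10, 200)),
  arm      = c(rep(NA, 10), rep(c("A", "B"), each = 100)),
  lost     = c(rep(NA, 10), rep(rep(c(TRUE, FALSE), times = c(10, 90)), 2))
)

randomized <- trial[!trial$excluded, ]
n_arm  <- table(randomized$arm)                    # allocated per arm
n_lost <- table(randomized$arm[randomized$lost])   # lost per arm

# Same order as the text labels: assessed, excluded, randomized,
# allocated (A, B), lost (A, B), analysed (A, B)
values <- c(
  nrow(trial),
  sum(trial$excluded),
  nrow(randomized),
  n_arm[["A"]], n_arm[["B"]],
  n_lost[["A"]], n_lost[["B"]],
  n_arm[["A"]] - n_lost[["A"]], n_arm[["B"]] - n_lost[["B"]]
)
values
```

With the values derived this way, the labels update automatically whenever the underlying data change.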
In the next step, we create a node data frame (ndf) using the create_node_df() function of the DiagrammeR package. The arguments of this function are pretty much self-explanatory.
Node Data Frame
ndf <- create_node_df(
  n = 21,
  label = c('Enrollment', 'Allocation', 'Follow-Up', 'Analysis',
            LABS, rep("", 8)),
  style = c(rep("solid", 13), rep('invis', 8)),
  shape = c(rep("plaintext", 4), rep("box", 9), rep("point", 8)),
  width = c(rep(2, 4), rep(2.5, 9), rep(0.001, 8)),
  height = c(rep(0.5, 13), rep(0.001, 8)),
  fontsize = c(rep(14, 4), rep(10, 17)),
  fontname = c(rep('Arial Rounded MT Bold', 4), rep('Courier New', 17)),
  penwidth = 2.0,
  fixedsize = "true")
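Because every attribute vector is assembled from several rep() calls, it is easy to miscount. A small base R check like the following can catch this before the data frame is built. It is only a sketch: the names n_nodes and node_attrs are made up here, and the list holds a subset of the attributes used above.

```r
# Expected number of nodes in the grid
n_nodes <- 21

# A subset of the attribute vectors from the node data frame, kept in a list
node_attrs <- list(
  style    = c(rep("solid", 13), rep("invis", 8)),
  shape    = c(rep("plaintext", 4), rep("box", 9), rep("point", 8)),
  width    = c(rep(2, 4), rep(2.5, 9), rep(0.001, 8)),
  fontsize = c(rep(14, 4), rep(10, 17))
)

# lengths() returns the length of each list element
bad <- lengths(node_attrs) != n_nodes
if (any(bad)) {
  stop("Wrong length for: ", paste(names(node_attrs)[bad], collapse = ", "))
}
```

The same idea applies to the edge attributes, which need one entry per edge.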
Edge Data Frame
Furthermore, we create an edge data frame (edf) using the create_edge_df() function. Edges we don't want to be visible are given a fully transparent color (#00000000, i.e. alpha = 0) and no arrowhead. To make edges horizontal, the constraint attribute needs to be set to false. from and to are vectors containing the IDs of the nodes at which the edges start and end. They determine the relationships between the nodes. To arrange the code more clearly, I've ordered outbound and incoming edges by columns and rows according to Figure 2.
edf <- create_edge_df(
  arrowhead = c(rep('none', 3), rep("vee", 3), rep('none', 2), "vee",
                rep('none', 6), rep("vee", 3), rep("none", 3), "vee",
                rep("none", 10)),
  color = c(rep('#00000000', 3), rep('black', 6), rep('#00000000', 6),
            rep('black', 3), rep('#00000000', 3), rep('black', 1),
            rep('#00000000', 2), rep('black', 2), rep('#00000000', 6)),
  constraint = c(rep("true", 18), rep('false', 14)),
  from = c(1, 19, 20, 16, 8, 10,  # column 1
           5, 14, 7, 15, 2, 3,    # column 2
           18, 6, 21, 17, 9, 11,  # column 3
           1, 5,                  # row 1
           19, 14,                # row 2
           20, 7,                 # row 3
           16, 15,                # row 4
           8, 2,                  # row 5
           10, 3,                 # row 6
           12, 4),                # row 7
  to = c(19, 20, 16, 8, 10, 12,   # column 1
         14, 7, 15, 2, 3, 4,      # column 2
         6, 21, 17, 9, 11, 13,    # column 3
         5, 18,                   # row 1
         14, 6,                   # row 2
         7, 21,                   # row 3
         15, 17,                  # row 4
         2, 9,                    # row 5
         3, 11,                   # row 6
         4, 13))                  # row 7
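The from and to vectors can also be derived from the grid instead of being written out by hand. The sketch below stores the node IDs in a 7 x 3 matrix, read off the column and row comments of the from and to vectors above, and builds the vertical and horizontal edges from neighbouring cells. The names grid, v_from, v_to, h_from and h_to are introduced here for illustration only.

```r
# Node IDs arranged as in the grid: 7 rows, 3 columns
grid <- cbind(
  c(1, 19, 20, 16, 8, 10, 12),   # column 1
  c(5, 14, 7, 15, 2, 3, 4),      # column 2
  c(18, 6, 21, 17, 9, 11, 13)    # column 3
)

# Vertical edges: each cell points to the cell below it (column by column)
v_from <- as.vector(grid[-nrow(grid), ])
v_to   <- as.vector(grid[-1, ])

# Horizontal edges: each cell points to its right neighbour (row by row);
# t() makes as.vector() walk through the grid row-wise
h_from <- as.vector(t(grid[, -ncol(grid)]))
h_to   <- as.vector(t(grid[, -1]))

from <- c(v_from, h_from)
to   <- c(v_to, h_to)

# Vertical edges take part in ranking, horizontal edges do not
constraint <- c(rep("true", length(v_from)), rep("false", length(h_from)))
```

This yields 18 vertical and 14 horizontal edges in the same order as the hand-written vectors, so the arrowhead and color vectors can be reused unchanged.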
Plotting
Before plotting the flow diagram, we need to create a graph using the create_graph() function. The graph can then be plotted using the render_graph() function.
# Create Graph ------------------------------------------------------------
g <- create_graph(ndf, edf, attr_theme = NULL)

# Plotting ----------------------------------------------------------------
render_graph(g)
Figure 3: Simplified CONSORT Flow Diagram
Unfortunately, I didn't find a way to left-align text within the text boxes. Thus, I didn't manage to draw a fully featured CONSORT flow diagram using R.
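One avenue that might be worth exploring is the \l escape sequence of Graphviz, which ends a label line and left-justifies it. The sketch below uses it in a raw DOT string passed to grViz(). Whether the same escape survives the create_node_df() route used in this post has not been verified here, and the reasons and sub-counts in the label are invented for illustration.

```r
# "\\l" in the R string becomes the two characters \l in the DOT source,
# which Graphviz reads as "end this line and left-justify it".
# The exclusion reasons and their counts are made up for this sketch.
dot_left <- "
digraph left_align_demo {
  node [shape = box, fontname = \"Courier New\", fontsize = 10]
  excluded [label = \"Excluded (n=10)\\l- Not meeting criteria (n=6)\\l- Declined (n=4)\\l\"]
}
"

# Render only if DiagrammeR is available
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  DiagrammeR::grViz(dot_left)
}
```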
Export
Saving the flow diagram to a hard drive is quite straightforward:
export_graph(g, file_name = "CONSORT.png")
Supported export file types are PNG, PDF, SVG, PS and GEXF (a graph file format).
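To write several formats in one go, the call can be wrapped in a loop. The following is a sketch under a few assumptions: it uses a tiny two-node stand-in graph (demo) so that it runs on its own, writes to a temporary directory, and sets the file_type argument of export_graph() explicitly instead of relying on the file extension.

```r
file_types <- c("png", "pdf", "svg")
file_names <- file.path(tempdir(), paste0("CONSORT.", file_types))

pkgs <- c("DiagrammeR", "DiagrammeRsvg", "rsvg")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  # Stand-in graph for illustration; replace with the graph g from above
  demo <- DiagrammeR::create_graph(
    nodes_df = DiagrammeR::create_node_df(n = 2, label = c("A", "B")),
    edges_df = DiagrammeR::create_edge_df(from = 1, to = 2)
  )
  # One export call per file type
  for (i in seq_along(file_types)) {
    DiagrammeR::export_graph(demo,
                             file_name = file_names[i],
                             file_type = file_types[i])
  }
}
```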
Discussion
As demonstrated in this blog post, “drawing” a CONSORT flow diagram using R and Graphviz is rather cumbersome. Because of some limitations regarding the alignment of text within boxes, only a simplified form of the CONSORT flow diagram can be drawn. However, the use of R seems to be the only way to make flow diagrams dynamic.