How to Plot Venn Diagrams Using R, ggplot2 and ggforce

Intro

Venn diagrams – named after the English logician and philosopher John Venn – “illustrate the logical relationships between two or more sets of items” with overlapping circles.

In this tutorial, I'll show how to plot a three-set Venn diagram using R and the ggplot2 package.

Packages and Data

For the R code to run, we need to install and load three R packages. Unlike tidyverse and ggforce, the limma package must be installed from Bioconductor rather than from CRAN.

We also simulate a data frame with 100 rows and three logical columns (A, B and C) using the rbinom() function.

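# biocLite() is no longer supported by Bioconductor; BiocManager is its replacement
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("limma")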
library(limma)
library(tidyverse)
library(ggforce)
set.seed(123)
mydata <- data.frame(A = rbinom(100, 1, 0.8),
                     B = rbinom(100, 1, 0.7),
                     C = rbinom(100, 1, 0.6)) %>%
  mutate_all(as.logical)
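
To see what rbinom() contributes here, the following sketch uses base R only. It draws 100 values with a success probability of 0.8 and converts them to logicals, which is what each column of mydata contains. The names draws, in_set and share_true are illustrative and not part of the tutorial code.

```r
# Illustrative only: `draws`, `in_set` and `share_true` are not tutorial objects.
set.seed(123)

# rbinom(n, size = 1, prob) returns n values that are each 0 or 1;
# prob is the chance of drawing a 1.
draws <- rbinom(100, size = 1, prob = 0.8)

# as.logical() maps 1 to TRUE and 0 to FALSE ("in the set" or not).
in_set <- as.logical(draws)

# Share of TRUE values; it should land near 0.8 but depends on the seed.
share_true <- mean(in_set)
print(share_true)
```

A larger prob therefore gives a larger set, which is why A (0.8) ends up with more members than C (0.6) in most runs.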

Drawing the Circles

Next, we create a data frame that holds the x and y coordinates of the three circle centres and a variable with their labels. To draw the circles, which form the basic structure of our Venn diagram, we use the geom_circle() function of the ggforce package. The parameter r (= 1.5) sets the radius of the circles.

df.venn <- data.frame(x = c(0, 0.866, -0.866),
                      y = c(1, -0.5, -0.5),
                      labels = c('A', 'B', 'C'))
ggplot(df.venn, aes(x0 = x, y0 = y, r = 1.5, fill = labels)) +
  geom_circle(alpha = .3, size = 1, colour = 'grey') +
  coord_fixed() +
  theme_void()
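
The centre coordinates in df.venn look like three points spaced 120 degrees apart on a circle of radius 1 around the origin (0.866 is roughly the cosine of 30 degrees). That reading is an inference from the numbers, not something stated above. Under that assumption, the coordinates can be reproduced as follows; angles_deg, angles_rad and centres are illustrative names.

```r
# Assumption: the centres sit on a unit circle at 90, -30 and 210 degrees.
angles_deg <- c(90, -30, 210)
angles_rad <- angles_deg * pi / 180

# cos() gives the x coordinate and sin() the y coordinate of each centre.
centres <- data.frame(
  x = round(cos(angles_rad), 3),
  y = round(sin(angles_rad), 3),
  labels = c("A", "B", "C")
)
print(centres)
```

With this layout, neighbouring centres are about 1.73 units apart and each centre is 1 unit from the origin, so a radius of 1.5 makes every pair of circles overlap and leaves a shared region in the middle.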

Furthermore, we need a data frame with the values we want to plot and the coordinates at which to plot them. The values can be obtained using the vennCounts() function of the limma package. Since ggplot2 requires data frames, we first need to transform the vdc object (class VennCounts) into a matrix and then into a data frame. In addition, we need to add the x and y coordinates for plotting the values.

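vdc <- vennCounts(mydata)
class(vdc) <- 'matrix'
# Row 1 (in none of the sets) is dropped; the remaining rows are ordered
# C, B, BC, A, AC, AB, ABC, so the coordinates follow that order
df.vdc <- as.data.frame(vdc)[-1,] %>%
  mutate(x = c(-1.2, 1.2, 0, 0, -0.8, 0.8, 0),
         y = c(-0.6, -0.6, -1, 1.2, 0.5, 0.5, 0))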
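
As a cross-check that does not need limma, the region counts can be tallied in base R by pasting each row's membership flags into a pattern such as "101" (in A and C, not in B). This is a minimal sketch that simulates the data with the same recipe as above. It lists the patterns in binary order with C varying fastest, which is the row order I understand vennCounts() to use. The names sim, pattern, all_patterns and region_counts are illustrative.

```r
# Same simulation recipe as the tutorial, stored under an illustrative name.
set.seed(123)
sim <- data.frame(A = as.logical(rbinom(100, 1, 0.8)),
                  B = as.logical(rbinom(100, 1, 0.7)),
                  C = as.logical(rbinom(100, 1, 0.6)))

# One three-character pattern per row: the A flag, then B, then C.
pattern <- paste0(as.integer(sim$A), as.integer(sim$B), as.integer(sim$C))

# Fix the order of the patterns so that empty regions still show up as 0.
all_patterns <- c("000", "001", "010", "011", "100", "101", "110", "111")
region_counts <- table(factor(pattern, levels = all_patterns))
print(region_counts)
```

Dropping the first entry ("000") leaves the seven regions inside the circles. Each label coordinate has to be paired with the region in the same position of that sequence.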

The Final Plot

Finally, we'll customize the look of our Venn diagram and plot the values.

ggplot(df.venn) +
  geom_circle(aes(x0 = x, y0 = y, r = 1.5, fill = labels), alpha = .3, size = 1, colour = 'grey') +
  coord_fixed() +
  theme_void() +
  theme(legend.position = 'bottom') +
  scale_fill_manual(values = c('cornflowerblue', 'firebrick',  'gold')) +
  scale_colour_manual(values = c('cornflowerblue', 'firebrick', 'gold'), guide = "none") +
  labs(fill = NULL) +
  annotate("text", x = df.vdc$x, y = df.vdc$y, label = df.vdc$Counts, size = 5)

Author: norbert

Biometrician at Clinical Trial Centre, Leipzig University (GER), with degrees in sociology (MA) and public health (MPH).

4 thoughts on “How to Plot Venn Diagrams Using R, ggplot2 and ggforce”

  1. Great example.
    To make it more widely applicable, I would remove limma from the example. The package takes a long time to install, and in the end the result is nothing more than a simple binary data frame.

    For three sets of arbitrary length I would suggest a function like this:

    get_venn_data <- function(u1, u2, u3) {
      # Empty result with the same columns as a vennCounts() table
      res <- data.frame(
        A = integer(),
        B = integer(),
        C = integer(),
        Counts = integer()
      )
      # Each vector holds the A/B/C membership flags followed by a count.
      # Counts are full set or intersection sizes, not exclusive regions.
      nix <- c(0, 0, 0, 0)
      a <- c(1, 0, 0, nrow(as.data.frame(u1)))
      b <- c(0, 1, 0, nrow(as.data.frame(u2)))
      c <- c(0, 0, 1, nrow(as.data.frame(u3)))
      ab <- c(1, 1, 0, nrow(as.data.frame(intersect(u1, u2))))
      ac <- c(1, 0, 1, nrow(as.data.frame(intersect(u1, u3))))
      bc <- c(0, 1, 1, nrow(as.data.frame(intersect(u2, u3))))
      # The triple intersection must include u1 as well as u2 and u3
      abc <- c(1, 1, 1, nrow(as.data.frame(intersect(intersect(u1, u2), u3))))
      res <- rbind(res, nix)
      res <- rbind(res, a)
      res <- rbind(res, b)
      res <- rbind(res, c)
      res <- rbind(res, ab)
      res <- rbind(res, ac)
      res <- rbind(res, bc)
      res <- rbind(res, abc)
      colnames(res) <- c("A", "B", "C", "Counts")
      return(res)
    }
    vdc <- get_venn_data(c(1, 2, 3), c(3, 4, 5), c(1, 2, 3, 4, 5))
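
If the goal is to reuse the article's plotting code with three arbitrary vectors, another option is to turn them into a logical membership table shaped like mydata. The sketch below is one possible approach in base R. The helper sets_to_membership() is hypothetical and is not part of the original post or of limma.

```r
# Hypothetical helper: one row per distinct element, one logical column per set.
sets_to_membership <- function(u1, u2, u3) {
  universe <- unique(c(u1, u2, u3))
  data.frame(A = universe %in% u1,
             B = universe %in% u2,
             C = universe %in% u3)
}

membership <- sets_to_membership(c(1, 2, 3), c(3, 4, 5), c(1, 2, 3, 4, 5))
print(membership)

# Elements that belong to all three sets at once.
n_all_three <- sum(membership$A & membership$B & membership$C)
print(n_all_three)
```

Because every element appears in exactly one row, counting rows per TRUE/FALSE combination gives the exclusive region sizes that a Venn diagram displays.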
