# How to Plot Venn Diagrams Using R, ggplot2 and ggforce

## Intro

Venn diagrams – named after the English logician and philosopher John Venn – “illustrate the logical relationships between two or more sets of items” with overlapping circles.

In this tutorial, I'll show how to plot a three-set Venn diagram using `R` and the `ggplot2` package.

## Packages and Data

For the R code to run, we need to install and load three R packages. Unlike `tidyverse` and `ggforce`, the `limma` package must be installed from Bioconductor rather than from CRAN.

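We also simulate a data frame with 100 rows and three logical columns (`A`, `B` and `C`) using the `rbinom()` function. Each column marks whether a row belongs to the corresponding set.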

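```r
# limma is distributed via Bioconductor; BiocManager replaces the retired biocLite() installer
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("limma")
library(limma)
library(tidyverse)
library(ggforce)

# Simulate membership in three sets (TRUE = the row belongs to the set)
set.seed(123)
mydata <- data.frame(A = rbinom(100, 1, 0.8),
                     B = rbinom(100, 1, 0.7),
                     C = rbinom(100, 1, 0.6)) %>%
  mutate_all(as.logical)
```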
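The simulated data stands in for real set membership. With actual sets at hand, a data frame of the same shape could be built with `%in%`. The following is only a sketch; the vectors `set_a`, `set_b` and `set_c` and their contents are made up for illustration.

```r
# Hypothetical example sets; any vectors of comparable items would do.
set_a <- c("g1", "g2", "g3", "g4")
set_b <- c("g3", "g4", "g5")
set_c <- c("g1", "g4", "g6")

# Every item that occurs in at least one set
universe <- sort(unique(c(set_a, set_b, set_c)))

# One row per item, one logical column per set
membership <- data.frame(A = universe %in% set_a,
                         B = universe %in% set_b,
                         C = universe %in% set_c,
                         row.names = universe)

# Set sizes as a quick sanity check
colSums(membership)
```

A data frame like `membership` has the same layout as `mydata`, so it should be usable in the same way in the steps that follow.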

## Drawing the Circles

Next, we create a data frame that holds the x and y coordinates of the three circle centers and a variable with their labels. The circles, which form the basic structure of our Venn diagram, are drawn with the `geom_circle()` function of the `ggforce` package. The parameter `r` (= 1.5) sets the radius of the circles.
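The center coordinates used in this section are not arbitrary: they lie on a unit circle at 90°, -30° and 210°, so the three centers form an equilateral triangle. A minimal sketch of how they could be derived (the names `angles` and `centers` are mine, not part of the original code):

```r
# Angles of the three centers on the unit circle, converted to radians
angles <- c(90, -30, 210) * pi / 180

# Rounded to three decimals to match the hard-coded values in the tutorial
centers <- data.frame(x = round(cos(angles), 3),
                      y = round(sin(angles), 3),
                      labels = c("A", "B", "C"))
centers
```

Neighboring centers are about 1.73 apart, which is less than twice the radius of 1.5, so every pair of circles overlaps. The origin is at distance 1 from each center and therefore lies inside all three circles.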

```r
df.venn <- data.frame(x = c(0, 0.866, -0.866),
                      y = c(1, -0.5, -0.5),
                      labels = c('A', 'B', 'C'))
ggplot(df.venn, aes(x0 = x, y0 = y, r = 1.5, fill = labels)) +
  geom_circle(alpha = .3, size = 1, colour = 'grey') +
  coord_fixed() +
  theme_void()
```

Furthermore, we need a data frame with the values we want to plot and the coordinates at which to plot them. The values can be obtained using the `vennCounts()` function of the `limma` package. Since `ggplot2` requires data frames, we first need to transform the `vdc` object (class `VennCounts`) into a matrix and then into a data frame. In addition, we add the x and y coordinates for plotting the values.

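```r
vdc <- vennCounts(mydata)
class(vdc) <- 'matrix'
# Drop the first row (rows that are in none of the sets). The remaining rows
# are ordered C, B, B&C, A, A&C, A&B, A&B&C, so the label positions follow
# that order relative to the circles A (top), B (right) and C (left).
df.vdc <- as.data.frame(vdc)[-1,] %>%
  mutate(x = c(-1.2, 1.2, 0, 0, -0.8, 0.8, 0),
         y = c(-0.6, -0.6, -1, 1.2, 0.5, 0.5, 0))
```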
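To see what the counts represent, the same kind of table can be approximated in base R. This is a sketch under assumptions: `count_regions()` is a hypothetical helper, not part of `limma`, and the row order is chosen to mirror what I understand `vennCounts()` to return (last column varying fastest). The toy data frame is made up for illustration.

```r
# Hypothetical helper: count the rows of d for each TRUE/FALSE pattern of A, B, C
count_regions <- function(d) {
  # All 2^3 combinations; C varies fastest, then B, then A
  patterns <- expand.grid(C = c(FALSE, TRUE),
                          B = c(FALSE, TRUE),
                          A = c(FALSE, TRUE))[, c("A", "B", "C")]
  patterns$Counts <- vapply(seq_len(nrow(patterns)), function(i) {
    sum(d$A == patterns$A[i] & d$B == patterns$B[i] & d$C == patterns$C[i])
  }, integer(1))
  patterns
}

# Made-up data: four items with known set membership
toy <- data.frame(A = c(TRUE, TRUE, FALSE, TRUE),
                  B = c(TRUE, FALSE, FALSE, TRUE),
                  C = c(FALSE, FALSE, TRUE, TRUE))
count_regions(toy)
```

Each count covers exactly one region of the diagram (for example, "in A and B but not in C"), so the counts add up to the number of rows in the data.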

## The Final Plot

Finally, we'll customize the look of our Venn diagram and plot the values.

```r
ggplot(df.venn) +
  geom_circle(aes(x0 = x, y0 = y, r = 1.5, fill = labels), alpha = .3, size = 1, colour = 'grey') +
  coord_fixed() +
  theme_void() +
  theme(legend.position = 'bottom') +
  scale_fill_manual(values = c('cornflowerblue', 'firebrick', 'gold')) +
  scale_colour_manual(values = c('cornflowerblue', 'firebrick', 'gold'), guide = 'none') +
  labs(fill = NULL) +
  annotate("text", x = df.vdc$x, y = df.vdc$y, label = df.vdc$Counts, size = 5)
```

1. Chanda says:

Thank you so much for the code.

2. nie ls says:

Great Example.
In order to enhance the applicability I would remove the limma from the example. It really takes a long time to install the library. At the end there is nothing more than a simple binary data frame.

For three sets of arbitrary length I would suggest a function like this:

```r
get_venn_data <- function(u1, u2, u3){
  res <- data.frame(
    A=integer(),
    B=integer(),
    C=integer(),
    Counts=integer()
  )
  # Each row: membership flags for A, B, C followed by the number of items
  nix <- c(0,0,0,0)
  a <- c(1,0,0,nrow(as.data.frame(u1)))
  b <- c(0,1,0,nrow(as.data.frame(u2)))
  c <- c(0,0,1,nrow(as.data.frame(u3)))
  ab <- c(1,1,0,nrow(as.data.frame(intersect(u1,u2))))
  ac <- c(1,0,1,nrow(as.data.frame(intersect(u1,u3))))
  bc <- c(0,1,1,nrow(as.data.frame(intersect(u2,u3))))
  # Items common to all three sets (u1, u2 and u3)
  abc <- c(1,1,1,nrow(as.data.frame(intersect(intersect(u1,u2),u3))))
  res <- rbind(res, nix)
  res <- rbind(res, a)
  res <- rbind(res, b)
  res <- rbind(res, c)
  res <- rbind(res, ab)
  res <- rbind(res, ac)
  res <- rbind(res, bc)
  res <- rbind(res, abc)
  colnames(res) <- c("A","B","C","Counts")
  return(res)
}
vdc <- get_venn_data(c(1,2,3),c(3,4,5),c(1,2,3,4,5))
```

1. norbert says:

Thanks a lot! That seems to be a brilliant idea!

