How to calculate Odds Ratios in Case-Control Studies using R


In June 2017 I started working at the Clinical Trial Centre Leipzig at Leipzig University. Since my knowledge of statistics was rather poor, my employer offered me the opportunity to attend some seminars in Medical Biometry at the University of Heidelberg. The first seminar I attended was called “Basics of Epidemiology”. On the first day, we learned how to calculate so-called odds ratios in case-control studies using a simple pocket calculator.

In this blog post, I will show how to calculate a simple odds ratio with a 95% CI using R.

Data simulation

The data I'm using in this blog post were simulated using the wakefield package. The following code returns a data frame with two binary variables (Exposition and Disease) and 1,000 cases.

library(wakefield) # provides the group() function
mydata <- data.frame(Exposition = group(n = 1000, x = c('yes', 'no'), 
                             prob = c(0.75, 0.25)),
                     Disease = group(n = 1000, x = c('yes', 'no'), 
                             prob = c(0.75, 0.25)))
dim(mydata)
## [1] 1000    2
head(mydata)
##   Exposition Disease
## 1        yes     yes
## 2         no      no
## 3         no     yes
## 4        yes     yes
## 5        yes     yes
## 6        yes     yes

Based on this data frame, we calculate a table showing how many patients with and without exposition developed the disease.

tab <- table(mydata$Exposition, mydata$Disease)
tab
##       yes  no
##   yes 569 210
##   no  163  58

Odds Ratio Calculation

To find out whether the risk of developing the disease is significantly higher in patients with a certain exposition, we need to calculate the odds ratio and its 95% CI.

The following function will return a data frame containing these values.

# return odds ratio with 95% CI
f <- function(x) {
  # x is a 2x2 table read column-wise:
  # x[1] = exposed cases, x[2] = unexposed cases,
  # x[3] = exposed controls, x[4] = unexposed controls
  or  <- round((x[1] * x[4]) / (x[2] * x[3]), 2)
  cil <- round(exp(log(or) - 1.96 * sqrt(1/x[1] + 1/x[2] + 1/x[3] + 1/x[4])), 2)
  ciu <- round(exp(log(or) + 1.96 * sqrt(1/x[1] + 1/x[2] + 1/x[3] + 1/x[4])), 2)
  df <- data.frame(matrix(ncol = 3, nrow = 1, 
                          dimnames = list(NULL, c('CI_95_lower', 'OR', 'CI_95_upper'))))
  df[1, ] <- c(cil, or, ciu)
  df
}

Now we can apply the function to our table tab.

df.or <- f(tab)
knitr::kable(df.or, align = 'c')
 CI_95_lower     OR    CI_95_upper
    0.68        0.96       1.35
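As a cross-check, the same 2x2 table can be passed to base R's fisher.test(), which reports an odds ratio (a conditional maximum-likelihood estimate, so it can differ slightly from the Wald-type estimate above) together with an exact 95% CI. The cell counts below are taken from the table above:

```r
# cross-check with base R: Fisher's exact test on the same 2x2 table
tab <- matrix(c(569, 163, 210, 58), nrow = 2,
              dimnames = list(Exposition = c('yes', 'no'),
                              Disease    = c('yes', 'no')))
fisher.test(tab)  # odds ratio close to 0.96, 95% CI covering 1
```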

As the results indicate, patients with the exposition have no significantly higher risk of developing the disease than patients without it: the 95% CI includes 1.

Posted in Introduction

How to parse Citavi files using R


In academic writing, the use of reference management software has become essential. A couple of years ago, I started using the free open-source software Zotero. However, most of my workmates at Leipzig University work with Citavi, a commercial program that is widely used at German, Austrian, and Swiss universities. The current version, Citavi 5, was released in April 2015.


In this blog post, I show how to import Citavi files into R.

Packages required

Citavi organizes references in SQLite databases. Although Citavi files end with .ctv5 rather than .sqlite, they may be parsed as sqlite files. For reproducing the code of this blog post, the following R packages are required.
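The code chunk listing the packages appears to have been lost; judging from the code used below, loading them presumably looked like this: DBI and RSQLite for database access, dplyr for the joins, and stringr for string manipulation.

```r
library(DBI)      # database interface: dbConnect(), dbListTables(), dbGetQuery()
library(RSQLite)  # SQLite driver
library(dplyr)    # left_join() and the pipe operator
library(stringr)  # str_c()
```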




A connection to the database myrefs.ctv5 may be established using the dbConnect() function of the DBI package and the SQLite() function of the RSQLite package.

# connect to Citavi file
con <- DBI::dbConnect(RSQLite::SQLite(), 'myrefs.ctv5')

A list of all tables may be returned using the dbListTables() function of the DBI package. Our database contains 38 tables.

# get a list of all tables
DBI::dbListTables(con)
##  [1] "Annotation"              "Category"
##  [3] "CategoryCategory"        "Changeset"
##  [5] "Collection"              "DBVersion"
##  [7] "EntityLink"              "Group"
##  [9] "ImportGroup"             "ImportGroupReference"
## [11] "Keyword"                 "KnowledgeItem"
## [13] "KnowledgeItemCategory"   "KnowledgeItemCollection"
## [15] "KnowledgeItemGroup"      "KnowledgeItemKeyword"
## [17] "Library"                 "Location"
## [19] "Periodical"              "Person"
## [21] "ProjectSettings"         "ProjectUserSettings"
## [23] "Publisher"               "Reference"
## [25] "ReferenceAuthor"         "ReferenceCategory"
## [27] "ReferenceCollaborator"   "ReferenceCollection"
## [29] "ReferenceEditor"         "ReferenceGroup"
## [31] "ReferenceKeyword"        "ReferenceOrganization"
## [33] "ReferenceOthersInvolved" "ReferencePublisher"
## [35] "ReferenceReference"      "SeriesTitle"
## [37] "SeriesTitleEditor"       "TaskItem"

Creating data frames

For reading the tables of the database, we need the dbGetQuery() function of the DBI package. Each table contains a number of variables. In case we want to save all variables of a table into a data frame, we select them using the asterisk.

df.person <- dbGetQuery(con, 'select * from Person')

In case we are only interested in a couple of variables, we need to specify their names separated by commas.

df.person <- dbGetQuery(con, 'select ID, FirstName, LastName, Sex from Person')
df.keyword <- dbGetQuery(con, 'select ID, Name from Keyword')
df.refs <- dbGetQuery(con, 'select ID, Title, Year, Abstract, CreatedOn, ISBN, PageCount, PlaceOfPublication, ReferenceType from Reference')

Data wrangling

In the next step, we try to join our data frames using the dplyr package.

mydata <- df.refs %>%
  left_join(df.person, by = 'ID') %>%
  left_join(df.keyword, by = 'ID')
## Error in eval(expr, envir, enclos): Can't join on 'ID' x 'ID' because of incompatible types (list / list)

However, R returns an error message disclosing that the data frames cannot be joined by their ID variable because of incompatible types.

When we take a closer look at the ID variables, we see that they are organized as lists of raw vectors. The ID variable of df.person, for instance, is a list of 1603 elements, each a raw vector of 16 bytes.

class(df.person$ID)
## [1] "list"
str(df.person$ID[[1]])
##  raw [1:16] 41 5b 18 51 ...
length(df.person$ID)
## [1] 1603

In order to join the data frames, we need to convert the ID variables from lists to character vectors. To do so, we collapse the 16 raw bytes of each ID into a single string, with the bytes separated by hyphens.

# df.person
for (i in 1:nrow(df.person)){
  df.person$ID[[i]] <- as.character(df.person$ID[[i]])
  df.person$ID[[i]] <- str_c(df.person$ID[[i]], sep = "", collapse = "-")
}
df.person$ID <- unlist(df.person$ID)
# df.keyword
for (i in 1:nrow(df.keyword)){
  df.keyword$ID[[i]] <- as.character(df.keyword$ID[[i]])
  df.keyword$ID[[i]] <- str_c(df.keyword$ID[[i]], sep = "", collapse = "-")
}
df.keyword$ID <- unlist(df.keyword$ID)
# df.refs
for (i in 1:nrow(df.refs)){
  df.refs$ID[[i]] <- as.character(df.refs$ID[[i]])
  df.refs$ID[[i]] <- str_c(df.refs$ID[[i]], sep = "", collapse = "-")
}
df.refs$ID <- unlist(df.refs$ID)

class(df.person$ID)
## [1] "character"
head(df.person$ID)
## [1] "34-26-a7-db-e2-79-ac-4c-ad-21-45-91-37-3f-e9-b0"
## [2] "75-e6-c1-fd-65-0a-82-48-a8-c8-e9-80-2a-79-db-2c"
## [3] "0b-8f-73-86-82-0a-c8-48-84-f7-d2-7a-04-df-12-65"
## [4] "7a-79-2b-1e-4a-ef-ae-40-ae-06-d5-bd-5d-e9-a7-cb"
## [5] "12-3c-0e-70-a2-2a-f9-48-ad-e4-b7-7e-85-63-97-96"
## [6] "bc-ae-06-78-89-48-df-47-88-b6-5f-92-20-9b-4c-7c"
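As a side note, the per-row loops above can be condensed into a small helper that performs the raw-to-string conversion for a whole ID column at once. This is a sketch, assuming the ID columns are lists of raw vectors as shown above:

```r
# collapse each raw-vector ID into a single hyphen-separated hex string
hex_id <- function(ids) {
  vapply(ids, function(x) paste(as.character(x), collapse = "-"), character(1))
}
# e.g. df.person$ID <- hex_id(df.person$ID), and likewise for the other data frames
```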

Joining the data frames

Finally, our data frames may be joined by their ID variables.

mydata <- df.refs %>%
  left_join(, by = 'ID') %>%
    left_join(df.keyword, by = 'ID') 


Our final data frame contains 1066 cases and the following 13 variables:

##  [1] "ID"                 "Title"              "Year"              
##  [4] "Abstract"           "CreatedOn"          "ISBN"              
##  [7] "PageCount"          "PlaceOfPublication" "ReferenceType"     
## [10] "FirstName"          "LastName"           "Sex"               
## [13] "Name"
Posted in Data Management

How to install and use the hexSticker package


A couple of days ago, the hexSticker package was published on CRAN. The package provides functions to plot hexagon stickers that may be used to promote R packages. As described on GitHub, the stickers can be plotted using base R's plotting functions, the lattice package, or the ggplot2 package. Moreover, it is also possible to plot image files.

Since I found it quite demanding to install the hexSticker package on a current Linux OS (Linux Mint 18.1), I decided to write a short tutorial explaining how to install and use the package on Ubuntu-based Linux distributions.

Linux packages required

First, we need to open the terminal and install the following software packages. While texinfo is required to build R packages from source, libudunits2-dev, fftw-dev, and mffm-fftw1 are needed to install some R packages that hexSticker depends on (ggforce, fftwtools).

sudo apt-get install texinfo libudunits2-dev fftw-dev mffm-fftw1 libfftw3-dev libtiff5-dev

R packages required

Recently, the fftwtools package was added to CRAN. Thus, it can be installed the usual way:

install.packages('fftwtools', dep = TRUE)

Finally, the EBImage package must be installed from the Bioconductor repository and the packages ggimage, ggforce and hexSticker must be installed from CRAN.
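The corresponding code chunk is missing here; at the time of writing, the installation presumably looked roughly like the following. biocLite() was Bioconductor's installer in this era; on current systems, BiocManager::install() would be used instead.

```r
# EBImage from Bioconductor (installer of that era)
source("https://bioconductor.org/biocLite.R")
biocLite("EBImage")
# remaining packages from CRAN
install.packages(c("ggimage", "ggforce", "hexSticker"), dependencies = TRUE)
```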


For plotting an example hexsticker we need some data provided by the streetsofle package which must be installed from GitHub:

if (!require("devtools")) install.packages("devtools")
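The line installing the package itself appears to be missing; it presumably called devtools::install_github() with the package author's GitHub account name, which is unknown here and therefore shown as a placeholder:

```r
# '<github-user>' is a placeholder for the package author's GitHub account
devtools::install_github("<github-user>/streetsofle")
library(streetsofle)
```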


With the following code chunk, we create two hexstickers for the streetsofle package. The colours come from the city flag of Leipzig. The arguments of the sticker() function are explained in the code comments.


p.1 <- ggplot(aes(x = lon, y = lat), data = shape.ortsteile) + 
  theme_map_le() + 
  coord_quickmap() + 
  geom_polygon(aes(x = lon, y = lat, group = group), 
               fill = NA, 
               size = 0.2, 
               color = "#FFCB00") + 
  geom_polygon(aes(x = lon, y = lat, group = group),
               color = "#FFCB00", 
               size = 1, 
               fill = NA, 
               data = shape.bezirke) 

p.1 <- sticker(p.1,
               s_x = 1, # horizontal position of subplot
               s_y = 1.1, # vertical position of subplot
               s_width = 1.4, # width of subplot
               s_height = 1.4, # height of subplot
               p_x = 1, # horizontal position of font
               p_y = .43, # vertical position of font
               p_size = 6, # font size
               p_color = "#FFCB00", # font colour
               h_size = 3, # hexagon border size
               h_fill = "#004CFF", # hexagon fill colour
               h_color = "#FFCB00") # hexagon border colour

p.2 <- ggplot(aes(x = lon, y = lat), data = shape.ortsteile) + 
  theme_map_le() + 
  coord_quickmap() + 
  geom_polygon(aes(x = lon, y = lat, group = group), 
               fill = NA, 
               size = 0.2, 
               color = "#004CFF") + 
  geom_polygon(aes(x = lon, y = lat, group = group),
               color = "#004CFF", 
               size = 1, 
               fill = NA, 
               data = shape.bezirke) 

p.2 <- sticker(p.2,
               s_x = 1, # horizontal position of subplot
               s_y = 1.1, # vertical position of subplot
               s_width = 1.4, # width of subplot
               s_height = 1.4, # height of subplot
               p_x = 1, # horizontal position of font
               p_y = .43, # vertical position of font
               p_size = 6, # font size
               p_color = "#004CFF", # font color
               h_size = 3, # hexagon border size
               h_fill = "#FFCB00", # hexagon fill colour
               h_color = "#004CFF") # hexagon border colour

Finally, both plots are put into a grid layout using the grid.arrange() function of the gridExtra package.

grid.arrange(p.1, p.2, ncol = 2, respect = TRUE)


I'm not sure which sticker looks better. What do you think?

Posted in Fav R Packages, Visualizing Data

How to plot a companion planting guide using ggplot2


Recently my girlfriend asked me whether I could do a data project that is useful for our family rather than just for myself. A couple of years ago, after our first son was born, we decided to rent a small allotment garden very close to our flat. In our garden we grow fruit and vegetables, e.g. strawberries, peas, beans, and beetroot.

When growing fruit and vegetables, it is important to know that plants compete for resources. While some plants benefit from one another, other plants should not be grown together. The internet provides loads of so-called companion planting guides or charts (see Link). For each plant, they list companions (plants that grow well together) and antagonists (plants that do not).

This blog post shows how to visualize a companion planting guide using R and the ggplot2 package.

Packages and data

Three R packages are required to reproduce the results of this blog post: readxl for importing data from Excel, dplyr for data wrangling, and ggplot2 for data visualization.
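The corresponding code chunk is missing; loading the three packages is straightforward:

```r
library(readxl)   # read_excel()
library(dplyr)    # mutate_all()
library(ggplot2)  # plotting
```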


The data I'm going to visualize stem from some companion planting guides published in German (see Link). I only selected the plants relevant for our garden and entered the data manually into an Excel worksheet.

In the first step, I saved the imported data into a data frame (mydata.1). The second data frame, mydata.2, is a copy of mydata.1 with the first two variables in reverse order. To obtain a symmetric matrix (with the same factor levels in each column), I combined mydata.1 and mydata.2 into mydata using the rbind() function.

mydata.1 <- readxl::read_excel('guide.xlsx', sheet = 1) 
mydata.2 <- mydata.1[, c(2, 1, 3)]
colnames(mydata.2) <- colnames(mydata.1)
mydata <- rbind(mydata.1, mydata.2)
rm(list = c(setdiff(ls(), c('mydata'))))
head(mydata, 10)
## # A tibble: 10 × 3
##          plant_1  plant_2    status
##            <chr>    <chr>     <chr>
## 1       cucumber     peas companion
## 2           corn     peas companion
## 3         radish     peas companion
## 4           peas cucumber companion
## 5           corn cucumber companion
## 6       beetroot cucumber companion
## 7           corn potatoes companion
## 8       potatoes  cabbage companion
## 9         radish  cabbage companion
## 10 strawberries   lettuce companion

Finally, I removed all objects not required for data visualization from the workspace and printed the first ten lines of the tibble (a slightly modified kind of data frame) mydata.


With the following code snippet, all variables of mydata are transformed into factors. Furthermore, the position of the outermost grid line (the number of factor levels plus 0.5) is saved as a vector named labs.n.

mydata <- dplyr::mutate_all(mydata, as.factor)
labs.n <- length(levels(mydata$plant_1)) + .5


Finally, we plot the matrix using the ggplot2 package. To adjust the position of the grid lines, we remove the default grid lines with panel.grid.major = element_blank() and panel.grid.minor = element_blank() and draw new grid lines with geom_vline() and geom_hline(). The new grid lines match the boundaries of the tiles.

ggplot(mydata, aes(x = plant_1, y = plant_2, fill = status)) + 
  theme_grey() +
  coord_equal() +
  geom_tile() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = 'bottom') +
  geom_vline(xintercept = seq(0.5, labs.n, 1), color='white') +
  geom_hline(yintercept = seq(0.5, labs.n, 1), color='white') +
  scale_fill_manual('', values = c("red3", "green3")) +
  labs(x = '', y = '',
       title = "Companion Planting Guide")


The interpretation of the companion planting guide is straightforward: green tiles signal companion plants that can be grown close to each other, while red tiles flag antagonist plants that should not be grown together. Tiles are grey when the plants are neither companions nor antagonists.

Posted in Visualizing Data

My first R package: streetsofle


A couple of days ago I published my first R package on GitHub. I've named the package streetsofle, standing for streets of Leipzig, because it includes a data set containing the street directory of the German city of Leipzig. The street directory was published by the Statistical Bureau of Leipzig and may be downloaded as a PDF file from the following website.

Leipzig is divided into 10 greater administrative districts (“Stadtbezirke”) and 63 smaller local districts (“Ortsteile”). The names of both the smaller and the greater districts can be found under the following link.

Figure 1: Map of Leipzig


Furthermore, the Leipzig area is covered by 34 postal codes (“Postleitzahlen”). That is:

##  [1] "04103" "04105" "04107" "04109" "04129" "04155" "04157" "04158"
##  [9] "04159" "04177" "04178" "04179" "04205" "04207" "04209" "04229"
## [17] "04249" "04275" "04277" "04279" "04288" "04289" "04299" "04315"
## [25] "04316" "04317" "04318" "04319" "04328" "04329" "04347" "04349"
## [33] "04356" "04357"


if (!require("devtools")) install.packages("devtools")
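The GitHub installation line appears to have been lost; it presumably called devtools::install_github() with the author's account name, shown here as a placeholder:

```r
# '<github-user>' is a placeholder for the package author's GitHub account
devtools::install_github("<github-user>/streetsofle")
library(streetsofle)
```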

The Dataset

The data frame streetsofle contains 3391 observations and 8 variables. That is:

  • plz: postal code,
  • street_key: street identification number,
  • street_name: street name,
  • street_num: a list of street numbers,
  • bz_key: identification number of the greater district (“Stadtbezirk”),
  • bz_name: name of the greater district (“Stadtbezirk”),
  • ot_key: identification number of the smaller district (“Ortsteil”),
  • ot_name: name of the smaller district (“Ortsteil”).

Since street sections without addresses are usually not covered by postal codes, the variable plz contains some missing values (n=169).

Next steps

Writing functions

Within the next couple of weeks I will write some functions to analyse the street directory data. The next code snippet, for example, shows how to calculate the number of smaller districts (“Ortsteile”) traversed by each street.

f <- function(x){length(unique(x))}
df <- data.frame(ot_num = tapply(streetsofle$ot_name, streetsofle$street_name, f))
##                      ot_num
## Aachener Straße           1
## Abrahamstraße             1
## Abtnaundorfer Straße      1
## Achatstraße               1
## ...                     ...
## Zwergmispelstraße         1
## Zwetschgenweg             1
## Zwickauer Straße          3
## Zwiebelweg                1

The following table shows the number of smaller districts traversed by the streets of Leipzig.

no. of districts no. of streets
1 2721
2 207
3 49
4 11
5 4
6 2
7 2
10 1
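The frequency table above can be computed with table(). The helper below wraps the tapply() call from the previous snippet; applying it to the package data is shown in the usage comment:

```r
# count, for each street, how many distinct districts it touches,
# then tabulate how many streets fall into each count
count_districts <- function(ot_name, street_name) {
  df <- data.frame(ot_num = tapply(ot_name, street_name,
                                   function(x) length(unique(x))))
  table(df$ot_num)
}
# usage: count_districts(streetsofle$ot_name, streetsofle$street_name)
```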

While 91% of the streets don’t traverse any district border, one street traverses the borders of 10 districts.

Shiny Web-App

Based on these functions I’m planning to write a Shiny Web-App providing several search functions.

Posted in Fav R Packages

How to scrape, import and visualize Telegram chats


Telegram is a cloud-based and cross-platform instant messaging service. Unlike WhatsApp, Telegram clients exist not only for mobile devices but also for desktop operating systems. In addition, there is also a web based client.

In this blog post I show how to import Telegram chats into R and how to plot a chat using the qdap package.

R packages

The following code will load and/or install the R packages required for this blog post.

if (!require("pacman")) install.packages("pacman")
pacman::p_load(readr, qdap, lubridate)

Scraping the data

A very straightforward way to save Telegram chats is to use the Chrome extension Save Telegram Chat History. On Quora, Stacy Wu explains how to use it:

  • Visit
  • Select a peer you want to get the chat history from.
  • Load the messages.
  • Select all text (Ctrl+A) and copy it (Ctrl+C).
  • Paste the text into a text editor (Ctrl+V) and save to a local file.

I saved the chat to a local CSV file. Since the first two lines contain non-tabular information, they need to be removed manually. Undesired line breaks must also be removed by hand.

Importing the data

The following line of code imports the CSV file into R using the read_csv() function of the readr package.

mydata <- readr::read_csv("telegram.csv", col_names=FALSE)

Data wrangling

After importing the file, our data frame consists of two string variables: X1, containing the day and time of each message, and X2, containing the names of the persons involved in the chat as well as the chat text. With the following lines of code we create 4 new variables:

  • day containing the dates of the chats,
  • time containing the times of day of the chats,
  • person containing the names of the persons involved in the chat,
  • txt containing the actual chat text.

mydata$day <- stringr::str_sub(mydata$X1, 1, 10)
mydata$day <- lubridate::dmy(mydata$day)
mydata$time <- stringr::str_sub(mydata$X1, 12, 19)
mydata$time <- lubridate::hms(mydata$time)
mydata$person <- stringr::str_extract(mydata$X2, "[^:]*")
mydata$person <- factor(mydata$person, levels = unique(mydata$person), labels = c('Me', 'Other'))
mydata$txt <- gsub(".*:\\s*","", mydata$X2)
mydata <- mydata[, c(3:6)]
head(mydata, 2)
## # A tibble: 2 × 4
##          day         time person   txt
##       <date> <S4: Period> <fctr> <chr>
## 1 2017-01-20  21H 10M 14S     Me Hello
## 2 2017-01-20  21H 11M 42S  Other  Huhu
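The two regular expressions used above separate the sender name from the message text. A minimal self-contained illustration (the sample string is made up):

```r
library(stringr)
x <- "Alice: Hello there"
str_extract(x, "[^:]*")   # everything before the first colon: "Alice"
gsub(".*:\\s*", "", x)    # everything after the last colon: "Hello there"
```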

Gradient word cloud

Since the chat involves only two persons, I decided to plot it as a gradient word cloud, a visualization technique developed by Tyler Rinker. The gradient_cloud() function used in the next code snippet is part of his qdap package. Gradient word clouds “color words with a gradient based on degree of usage between two individuals” (see the qdap documentation).

gradient_cloud(mydata$txt, mydata$person, title = "Gradient word cloud of Telegram chat")


The chat I have plotted is very short and thus not very telling. I wonder what it will look like in a couple of months.

Posted in Text Mining, Visualizing Data

How to plot animated maps with gganimate


Leipzig is the largest city in the federal state of Saxony, Germany. Since about 2009, Leipzig has been one of the fastest-growing German cities. Between 2006 and 2016, the population grew from about 506,000 to 580,000 inhabitants. The largest part of the population growth can be attributed to inward migration.

In this blog post, I show how to create an animated map of Leipzig visualizing the migration balances for each of Leipzig's 63 districts (“Ortsteile”).

The data

The data I visualize in this blog post are provided by the local Statistical Bureau. They can be downloaded as a .csv file from the following website. The shapefile I use is not publicly available; it can be purchased for about 25 Euro from the Statistical Bureau.

R packages

The following code will load and/or install the R packages required for this blog post.

if (!require("pacman")) install.packages("pacman")
pacman::p_load(readr, rgdal, ggplot2, gganimate, reshape2)

While the packages readr and rgdal are required for importing the .csv and the shapefile, ggplot2 and gganimate are needed to plot and animate the data. The reshape2 package is required for transforming data frames from wide to long format.

After importing the csv file containing the migration data, we add a key variable (id) identifying each of Leipzig's 63 districts. Using the melt() function of the reshape2 package, we reshape the data frame from wide into long format.

# import csv file
df.migrat <- read_csv2("./Data_LE/Bevölkerungsbewegung_Wanderungen_Wanderungssaldo.csv",
                      col_types = cols(`2015` = col_number()),
                      n_max = 63,
                      locale = locale(encoding = "latin1"),
                      trim_ws = TRUE)
# add id variable
df.migrat$id <- c(1:63)
# move the id variable to the second position
df.migrat <- df.migrat[,c(1,18,2:17)]
# change name of first variable
colnames(df.migrat)[1] <- 'district'
# convert from wide to long
df.migrat <- reshape2::melt(df.migrat, id.vars=c("district","id"))
# change name of third variable
colnames(df.migrat)[3] <- 'year'

With the next code snippet, we import the shapefile using the readOGR() function of the rgdal package. The object we receive (sh.file) is of class SpatialPolygonsDataFrame. After adding a key variable, we transform this object into an ordinary data frame using the fortify() function of the ggplot2 package. Furthermore, we join both data frames by the key variable id. Before plotting, we create a new variable (co) categorizing the migration balance variable (value).

# import shapefile
sh.file <- readOGR("ot.shp", layer="ot")
# add id variable
sh.file$id <- c(1:63)
# Create a data frame
sh.file <- fortify(sh.file, region = 'id')
# Merge data frames by id variable
mydata <- merge(sh.file, df.migrat, by='id')
# create a categorized variable
mydata$co <- ifelse(mydata$value >= 1000, 4,
                   ifelse(mydata$value >= 0, 3,
                                ifelse(mydata$value >= -1000, 2, 1)))
# define levels and labels
mydata$co  <- factor(mydata$co, levels = c(4:1),
                     labels = c('> 1000', '0-1000', '-1000-0', '< -1000'))
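The nested ifelse() calls can also be written with cut(), which bins and labels in one step. Note two small differences: cut()'s default right-closed intervals treat exact boundary values (e.g. a balance of exactly 1000) slightly differently from the ifelse() cascade, and the factor levels come out in ascending rather than descending order. A sketch:

```r
# equivalent binning with cut(); 'value' is the migration balance column
bin_balance <- function(value) {
  cut(value,
      breaks = c(-Inf, -1000, 0, 1000, Inf),
      labels = c('< -1000', '-1000-0', '0-1000', '> 1000'))
}
# usage: mydata$co <- bin_balance(mydata$value)
```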

Moreover, we define a theme suitable for plotting maps with ggplot2. The following theme is a customized version of a theme I found on Stack Overflow.

theme_opts <- list(theme(panel.grid.minor = element_blank(),
                         panel.grid.major = element_blank(),
                         panel.background = element_blank(),
                         plot.background = element_blank(),
                         panel.border = element_blank(),
                         axis.line = element_blank(),
                         axis.text.x = element_blank(),
                         axis.text.y = element_blank(),
                         axis.ticks = element_blank(),
                         axis.title.x = element_blank(),
                         axis.title.y = element_blank(),
                         plot.title = element_text(size=16)))

Plotting the data

Finally, we can plot the data. The first plot (static) visualizes the migration data for the whole period of time (2000–2015).

ggplot(mydata, aes(x = long, y = lat, group = group, fill=value)) +
  theme_opts + 
  coord_equal() +
  geom_polygon(color = 'grey') +
  labs(title = "Migration balance between 2000 and 2015",
       subtitle = "Static Map",
       caption = 'Data source: Stadt Leipzig, Amt für Statistik und Wahlen, 2015') +
  scale_fill_distiller(name='Migration\nbalance', direction = 1, palette='RdYlGn')


For creating an animated plot, we add the frame argument stemming from the gganimate package. With interval = 5 we define the time (in seconds) between the “slides”.

p <- ggplot(mydata, aes(x = long, y = lat, group = group, fill=value, frame=year)) +
  theme_opts + 
  coord_equal() +
  geom_polygon(color = 'grey') +
  labs(title = "Migration balance in year: ",
       subtitle = "Animated Map",
       caption = 'Data source: Stadt Leipzig, Amt für Statistik und Wahlen, 2015') +
  scale_fill_distiller(name='Migration\nbalance', direction = 1, palette='RdYlGn')

gg_animate(p, interval = 5) 

Posted in Visualizing Data