How to number and reference tables and figures in R Markdown files

Introduction

R Markdown is a great tool to make research results reproducible. However, in scientific research papers or reports, tables and figures usually need to be numbered and referenced. Unfortunately, R Markdown has no “native” method to number and reference table and figure captions. The recently published bookdown package makes it very easy to number and reference tables and figures. However, since bookdown uses LaTeX functionality, R Markdown files created with bookdown cannot be converted into MS Word (.docx) files.

In this blog post, I will explain how to number and reference tables and figures in R Markdown files using the captioner package.

Packages required

The following code will install (if necessary) and load the R packages required for this blog post. The dataset I will be using is named bundesligR and is part of the bundesligR package. It contains “all final tables of Germany's highest football league, the Bundesliga”.

if (!require("pacman")) install.packages("pacman")
pacman::p_load(knitr, captioner, bundesligR, stringr)

In the first code snippet, we create a table using the kable function of the knitr package. With caption we can specify a simple table caption. As we can see, the caption will not be numbered and, thus, cannot be referenced in the document.

knitr::kable(bundesligR::bundesligR[c(1:6), c(2,3,11,10)],
             align = c('c', 'l', 'c', 'c'),
             caption = "German Bundesliga: Final Table 2015/16, Position 1-6")
Position  Team                       Points  GD
1         FC Bayern Muenchen         88      63
2         Borussia Dortmund          78      48
3         Bayer 04 Leverkusen        60      16
4         Borussia Moenchengladbach  55      17
5         FC Schalke 04              52      2
6         1. FSV Mainz 05            50      4

Table numbering

Thanks to Alathea Letaw's captioner package, we can number tables and figures.
In a first step, we use captioner() to create a function named table_nums and call it once per table, passing the table's name and caption. We may also define a prefix (Tab. for tables and Fig. for figures).

table_nums <- captioner::captioner(prefix = "Tab.")

tab.1_cap <- table_nums(name = "tab_1", 
                        caption = "German Bundesliga: Final Table 2015/16, Position 7-12")
tab.2_cap <- table_nums(name = "tab_2",
                        caption = "German Bundesliga: Final Table 2015/16, Position 13-18")

The next code snippet combines inline code and a code chunk. With fig.cap = tab.1_cap, we specify the caption of the first table. It is important to keep the inline code and the code chunk separate; otherwise, the numbering won't work.

Tab. 1: German Bundesliga: Final Table 2015/16, Position 7-12

Position  Team              Points  GD
7         Hertha BSC        50      0
8         VfL Wolfsburg     45      -2
9         1. FC Koeln       43      -4
10        Hamburger SV      41      -6
11        FC Ingolstadt 04  40      -9
12        FC Augsburg       38      -10

Table referencing

Since we now have a numbered table, it should also be possible to reference it. However, we cannot simply use the inline code table_nums('tab_1'), because it returns the full caption:

[1] “Tab. 1: German Bundesliga: Final Table 2015/16, Position 7-12”

In order to return the desired output (prefix Tab. and table number), I have written the function f.ref. Using a regular expression, the function returns all characters of the table_nums('tab_1') output located before the first colon.

f.ref <- function(x) {
  stringr::str_extract(table_nums(x), "[^:]*")
}
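To see what the regular expression does in isolation, the following sketch applies the same pattern to a literal caption string using base R only. The variable names are illustrative and not part of the original code.

```r
# Caption string in the form returned by table_nums("tab_1")
full_caption <- "Tab. 1: German Bundesliga: Final Table 2015/16, Position 7-12"

# "[^:]*" matches the run of non-colon characters at the start of the string,
# i.e. everything before the first colon
ref <- regmatches(full_caption, regexpr("[^:]*", full_caption))
print(ref)
```

The captioner documentation also describes a display argument (for example display = "cite") that may make such a helper unnecessary; this is worth checking against the installed package version.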

When we apply this function to tab_1, the inline code returns the following result:

As we can see in `r f.ref("tab_1")`, the Berlin-based football club Hertha BSC finished seventh in the final table.

As we can see in Tab. 1, the Berlin-based football club Hertha BSC finished seventh in the final table.

Just to make the table complete, Tab. 2 shows positions 13 to 18 of the final Bundesliga table.

Tab. 2: German Bundesliga: Final Table 2015/16, Position 13-18

knitr::kable(bundesligR::bundesligR[c(13:18), c(2,3,11,10)],
             align = c('c', 'l', 'c', 'c'),
             row.names = FALSE)
Position  Team                 Points  GD
13        Werder Bremen        38      -15
14        SV Darmstadt 98      38      -15
15        TSG 1899 Hoffenheim  37      -15
16        Eintracht Frankfurt  36      -18
17        VfB Stuttgart        33      -25
18        Hannover 96          25      -31

And what about figures?

Figures can be numbered and referenced following the same principle.
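As a minimal sketch of that principle, assuming the captioner package is installed: a second counter with the prefix Fig. is created, a figure is registered, and a helper analogous to f.ref returns the short reference. The names fig_nums, fig.1_cap and f.ref.fig as well as the caption text are hypothetical and chosen for illustration only; base R's sub() is used here instead of stringr.

```r
library(captioner)

# Separate counter for figures, with its own prefix
fig_nums <- captioner(prefix = "Fig.")

# Register the figure once; the returned string is the full numbered caption
fig.1_cap <- fig_nums(name = "fig_1",
                      caption = "Points of the top six teams (illustrative caption)")

# Reference helper analogous to f.ref: drop everything from the first colon on
f.ref.fig <- function(x) {
  sub(":.*$", "", fig_nums(x))
}

f.ref.fig("fig_1")
```

In the R Markdown file, the caption would then be attached to the plotting chunk with the chunk option fig.cap = fig.1_cap, and the reference inserted via inline code calling f.ref.fig("fig_1").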

RMarkdown: How to format tables and figures in .docx files

In research, we usually publish the most important findings in tables and figures. When writing research papers using Rmarkdown (*.Rmd), we have several options to format the output of the final MS Word document (.docx).
Tables can be formatted using either the knitr package’s kable() function or several functions of the pander package.
Figure sizes can be determined in the chunk options, e.g.

{r name_of_chunk, fig.height=8, fig.width=12}.
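The same options can also be set once for the whole document instead of per chunk. A minimal sketch using knitr's opts_chunk$set(), which would typically go into a setup chunk near the top of the .Rmd file; the values are simply the ones from the chunk header above, given in inches:

```r
library(knitr)

# Default figure size (in inches) for every chunk that follows;
# individual chunks can still override these values in their header
opts_chunk$set(fig.height = 8, fig.width = 12)
```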

However, options for customizing tables and figures are rather limited in Rmarkdown. Thus, I usually customize tables and figures in the final MS Word document.

In this blog post, I show how to quickly format tables and figures in the final MS Word document using a macro. MS Word macros are written in VBA (Visual Basic for Applications); they can be accessed from a menu list or from the toolbar and run with a single click. There are loads of tutorials explaining how to write a macro for MS Word, e.g. http://www.addictivetips.com/microsoft-office/create-macros-in-word-2010/.

The following two macros are very helpful for formatting drafts. Since I want drafts to be as compact as possible, tables and figures should not take up too much space.

The first macro, called FormatTables, customizes the format of all tables in the active MS Word document. With wdTableFormatGrid2, we use a table style predefined in MS Word. Other predefined table styles are listed in Microsoft's documentation of the WdTableFormat enumeration. Furthermore, we define the font name (Arial) and font size (8 pt) as well as the space before (6 pt) and after (10 pt) the paragraphs in the table. Finally, the row height is set to exactly 18 pt.

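Sub FormatTables()

    Dim tbl As Table

    ' Apply the same formatting to every table in the active document
    For Each tbl In ActiveDocument.Tables
        ' Predefined Word table style
        tbl.AutoFormat wdTableFormatGrid2
        ' Font name and size (in points)
        tbl.Range.Font.Name = "Arial"
        tbl.Range.Font.Size = 8
        ' Paragraph spacing before and after (in points)
        tbl.Range.ParagraphFormat.SpaceBefore = 6
        tbl.Range.ParagraphFormat.SpaceAfter = 10
        ' Fixed row height of exactly 18 pt
        tbl.Range.Cells.SetHeight RowHeight:=18, HeightRule:=wdRowHeightExactly
    Next tbl

End Sub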

The second macro, called FormatFigures, merely reduces all figures in the active MS Word document to 45% of their original size.

Sub FormatFigures()

    Dim shp As InlineShape

    ' Scale every inline figure to 45% of its original height and width
    For Each shp In ActiveDocument.InlineShapes
        shp.ScaleHeight = 45
        shp.ScaleWidth = 45
    Next shp

End Sub

Please also see my blog post RMarkdown: How to insert page breaks in a MS Word document.

RMarkdown: How to insert page breaks in a MS Word document

Introduction

RStudio offers the opportunity to build MS Word documents from RMarkdown files. However, since formatting options in Markdown are very limited, there is no ‘native’ Markdown code to insert page breaks in the final MS Word output file.

In this blog post, I explain how to define page breaks in the RMarkdown document that will be kept in the final MS Word document (.docx). My post is based on Richard Layton’s article Happy collaboration with Rmd to docx, which explains how to create a MS Word .docx template in order to modify the document design of a MS Word file created from a .Rmd file in RStudio.

The MS Word template

In the first step, we create a MS Word template called ‘mystyles.docx’ (How to…). This file must be saved in the same directory as the R Markdown file. For the following modifications we have to open this file with MS Word.

Modify style ‘Heading 5’

In the next step, we modify a predefined style. However, after modifying a predefined style, we can no longer use it in the originally intended way. Thus, we must choose a style that is hardly needed for any other purpose. In this blog post, we use the Heading 5 style.

To modify this style, we select the ‘Home’ ribbon tab and click the Styles window launcher in the Styles group (lower right corner, highlighted with a red circle).

We select ‘Heading 5’ in the Word document. In the Styles window, we scroll down until we find the style already assigned to the text we selected. In our case, the assigned style is ‘Heading 5’. (In the figure it says ‘Heading 3’. However, we actually mean ‘Heading 5’)

The following modifications must be made in the Modify Style menu:

  • Set the font color to ‘white’ (rather than ‘Automatic’).
  • Select the smallest font size (8 rather than 11).
  • Select ‘Page break before’ in the ‘Line and Page Breaks’ tab.
  • Set the line spacing to ‘Exactly’ and ‘1 pt’ in the ‘Indents and Spacing’ tab.

After these tweaks, the ‘Heading 5’ style will no longer format a heading of level 5. Instead it will insert a very small and white (and, thus, invisible) line followed by a page break.

The RMarkdown document

In the RMarkdown document, a few specifications must be made.

The YAML header

RMarkdown documents contain a metadata section called YAML header. In this header, we specify the output format (word_document) and the name of the MS Word template (mystyles.docx).

---
title: 'Title'
date: "`r format(Sys.time(), '%d&period; %B %Y')`"
output: 
    word_document:
      reference_docx: mystyles.docx
---
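The date field is evaluated as inline R code when the document is knitted. As a quick check of the format string outside the YAML header, the following base R sketch uses a fixed, arbitrarily chosen date instead of Sys.time() and writes the dot as a plain character rather than as the entity &period; used in the header. The month name depends on the locale of the R session.

```r
# %d = day of the month, %B = full month name (locale dependent), %Y = year
stamp <- format(as.Date("2016-01-24"), "%d. %B %Y")
print(stamp)
```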

The Markdown code #####, originally reserved for level 5 headings, will be used to insert page breaks in the final .docx document. Since we changed the font color to ‘white’ in the MS Word template, the text following the Markdown code (here, Page Break) will not be visible in the final document.

The following example shows how to insert a page break between two paragraphs.


Example: Markdown code to insert a page break

Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break. Text before page break.

##### Page Break

Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break. Text after page break.
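When the text of a report is assembled from several blocks in R, the marker line can also be inserted programmatically. A minimal base R sketch, in which with_page_breaks is a hypothetical helper and not part of any package:

```r
# Hypothetical helper: join text sections with a level 5 heading between them.
# With the modified 'Heading 5' style, each marker becomes a page break in Word.
with_page_breaks <- function(sections, marker = "##### Page Break") {
  paste(sections, collapse = paste0("\n\n", marker, "\n\n"))
}

body <- with_page_breaks(c("Text before page break.", "Text after page break."))
cat(body)
```

The resulting string could be written to the .Rmd file with writeLines() or emitted from a chunk that uses the knitr option results = 'asis'.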


Download

My MS Word template may be downloaded here.

PS

Since I don’t have an English version of MS Word, I could not make the screenshots myself. Instead, I have used internet links. Please click on the pictures to get to the web pages.

Please also see my blog post RMarkdown: How to format tables and figures in .docx files.