Formatting p-values: A curated list of R functions

Intro

Reports of statistical analyses usually include p-values, which give the probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the one observed.

R offers many options for formatting p-values. This blog post gives an overview, which is by no means comprehensive.

First, we create a vector with six p-values and one missing value.

p <- c(0.50, 0.12, 0.045, 0.011, 0.009, 0.0000234, NA)
options(scipen = 9999) # suppress scientific notation
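
To see what the scipen setting does, here is a small base R sketch. The variable names are mine and purely illustrative. With R's default penalty, format() switches to scientific notation for very small numbers, while a large scipen keeps fixed notation.

```r
p_small <- 0.0000234

old <- options(scipen = 0)   # R's default: scientific notation is allowed
sci <- format(p_small)
options(scipen = 9999)       # strongly penalise scientific notation
fixed <- format(p_small)
options(old)                 # restore whatever was set before

sci
## [1] "2.34e-05"
fixed
## [1] "0.0000234"
```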

Formatting p-values

base package (Base R)

The first function I'm going to introduce is format.pval() from the base package, which ships with every R installation. Thus, neither install.packages() nor library() is needed to use it.

The digits argument sets the number of significant digits to be used.

format.pval(p)
## [1] "0.500"   "0.120"   "0.045"   "0.011"   "0.009"   "0.00002" "NA"
format.pval(p, 
            digits = 2) # how many significant digits are to be used
## [1] "0.50" "0.12" "0.04" "0.01" "0.01" "0"    "NA"
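
Two further arguments of the base function are worth a quick look: eps and na.form. The following is a minimal sketch with my own example vector (p_demo is just an illustrative name). Values below eps are reported with a "<" sign instead of being printed in full, and na.form sets the label used for missing values.

```r
p_demo <- c(0.50, 0.12, 0.045, 0.011, 0.009, 0.0000234, NA)

out_eps <- base::format.pval(p_demo,
                             eps = 0.001,   # values below eps are shown as "<eps"
                             na.form = "-") # label for missing values
out_eps
```

With this call the smallest value should come out as "<0.001" and the missing value as "-".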

Hmisc package

With nsmall, the format.pval() function of the popular Hmisc package includes an option to specify the minimum number of digits to the right of the decimal point.

library(Hmisc)
Hmisc::format.pval(p)
## [1] "0.500"   "0.120"   "0.045"   "0.011"   "0.009"   "0.00002" "NA"
Hmisc::format.pval(p,
                   nsmall = 3, # the minimum number of digits to the right of the decimal point
                   digits = 2) # how many significant digits are to be used
## [1] "0.500" "0.120" "0.040" "0.010" "0.010" "0.000" "NA"
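
As a side note, nsmall is an argument of base R's format(), and the base version of format.pval() hands extra arguments on to format() as well. A small sketch, using a shortened example vector of my own (p_demo is an illustrative name) and calling the base function explicitly, since Hmisc masks it once loaded:

```r
p_demo <- c(0.50, 0.12)

# base R: arguments passed via ... are forwarded to format()
out_base <- base::format.pval(p_demo, digits = 2, nsmall = 3)
out_base
## [1] "0.500" "0.120"
```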

scales package

The pvalue() function of the scales package rounds p-values to a given accuracy and reports anything smaller than that accuracy with a "<" sign, which is handy when reporting against a significance level.

library(scales)
scales::pvalue(p)
## [1] "0.500"  "0.120"  "0.045"  "0.011"  "0.009"  "<0.001" "NA"
scales::pvalue(p,
               accuracy = 0.05, # Number to round to
               decimal.mark = ".", # The character to be used to indicate the numeric decimal point
               add_p = TRUE) # Add "p=" before the value?
## [1] "p=0.50" "p=0.10" "p<0.05" "p<0.05" "p<0.05" "p<0.05" "p=NA"
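
To make the rounding logic more tangible, here is a rough base R sketch of the idea. The helper format_p_accuracy() is hypothetical and written by me for illustration. It is not how scales implements pvalue(), and it ignores the prefix and decimal mark options.

```r
# Hypothetical helper: round to the nearest multiple of `accuracy`
# and report anything smaller than `accuracy` as "<accuracy".
format_p_accuracy <- function(p, accuracy = 0.001) {
  # decimals needed to print `accuracy` (small tolerance guards against rounding error)
  digits <- max(0, -floor(log10(accuracy) + 1e-8))
  rounded <- round(p / accuracy) * accuracy
  out <- formatC(rounded, format = "f", digits = digits)
  below <- !is.na(p) & p < accuracy
  out[below] <- paste0("<", formatC(accuracy, format = "f", digits = digits))
  out[is.na(p)] <- NA_character_   # keep missing values missing
  out
}

out_acc <- format_p_accuracy(c(0.50, 0.12, 0.045, NA), accuracy = 0.05)
out_acc
## [1] "0.50"  "0.10"  "<0.05" NA
```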

finalfit package

The p_tidy() function of the finalfit package has no option for significant digits. Instead, its digits argument sets the number of decimal places to round to, and smaller values are reported with a "<" sign.

library(finalfit)
finalfit::p_tidy(p, digits = 2)
## [1] "=0.50" "=0.12" "=0.04" "=0.01" "=0.01" "<0.01" "=NA"
finalfit::p_tidy(p, 
                 digits = 3, # number of decimal places to round to (no default)
                 prefix = NULL) # suppress prefix
## [1] "0.500"  "0.120"  "0.045"  "0.011"  "0.009"  "<0.001" "NA"

psycho package

The format_p() function of the psycho package formats p-values according to predefined significance thresholds (.05, .01, .001) and reports anything above .1 as "> .1". Significance stars are added by default and can be switched off or returned on their own.

library(psycho)
psycho::format_p(p)
## [1] "> .1"      "> .1"      "< .05*"    "< .05*"    "< .01**"   "< .001***"
## [7] NA
psycho::format_p(p,
                 stars = FALSE) # remove significance stars
## [1] "> .1"   "> .1"   "< .05"  "< .05"  "< .01"  "< .001" NA
psycho::format_p(p,
                 stars_only = TRUE) # return only significance stars
## [1] ""    ""    "*"   "*"   "**"  "***" NA
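
If you only need the stars and would rather not load a package, something similar can be sketched in base R with cut(). This is my own illustration, not the code psycho uses, and the cutpoints (.001, .01, .05) are assumptions chosen to match the output above.

```r
p_demo <- c(0.50, 0.12, 0.045, 0.011, 0.009, 0.0000234, NA)

# right = FALSE gives left-closed intervals, e.g. [0.01, 0.05) maps to "*"
stars <- as.character(cut(p_demo,
                          breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
                          labels = c("***", "**", "*", ""),
                          right = FALSE))
stars
## [1] ""    ""    "*"   "*"   "**"  "***" NA
```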

About norbert

Biometrician at Clinical Trial Centre, Leipzig University (GER), with degrees in sociology (MA) and public health (MPH).