Intro
Reporting the results of statistical analyses usually involves reporting p-values, which indicate the probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the one observed.
R offers quite a few options for formatting p-values. This blog post gives an overview, which is by no means comprehensive.
First, we create a vector with six p-values and one missing value.
p <- c(0.50, 0.12, 0.045, 0.011, 0.009, 0.0000234, NA)
options(scipen = 9999) # suppress scientific notation
Formatting p-values
base package (Base R)
The first function I'm going to introduce, format.pval(), is part of the base package, which ships with the default R installation. Thus, neither install.packages() nor library() is required to use it.
Using the digits option, the number of significant digits can be specified.
format.pval(p)
## [1] "0.500" "0.120" "0.045" "0.011" "0.009" "0.00002" "NA"
format.pval(p, digits = 2) # how many significant digits are to be used
## [1] "0.50" "0.12" "0.04" "0.01" "0.01" "0" "NA"
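Besides digits, base format.pval() also accepts the arguments eps (values below it are reported as "<eps") and na.form (the label used for missing values). The following is a minimal sketch of how they can be combined; the example vector, the cutoff 0.001 and the label "n/a" are illustrative choices, and the exact spacing of the result can depend on your R options.

```r
# Illustrative vector: two ordinary p-values, one tiny value, one missing value
p_demo <- c(0.50, 0.045, 0.0000234, NA)

# eps: p-values below this cutoff are printed as "<eps"
# na.form: string used for missing values
out <- base::format.pval(p_demo, digits = 2, eps = 0.001, na.form = "n/a")
print(out)
```

Calling base::format.pval() explicitly avoids picking up a masking version of the function if another package that exports the same name is attached.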
Hmisc package
The format.pval() function of the popular Hmisc package adds an nsmall option, which specifies the minimum number of digits to the right of the decimal point.
library(Hmisc)
Hmisc::format.pval(p)
## [1] "0.500" "0.120" "0.045" "0.011" "0.009" "0.00002" "NA"
Hmisc::format.pval(p,
                   nsmall = 3, # the minimum number of digits to the right of the decimal point
                   digits = 2) # how many significant digits are to be used
## [1] "0.500" "0.120" "0.040" "0.010" "0.010" "0.000" "NA"
scales package
The pvalue() function of the scales package has an accuracy option: p-values are rounded to this number, and values below it are reported as smaller than it (for example "<0.05"), so it can be set to a given significance level.
library(scales)
scales::pvalue(p)
## [1] "0.500" "0.120" "0.045" "0.011" "0.009" "<0.001" "NA"
scales::pvalue(p,
               accuracy = 0.05,    # Number to round to
               decimal.mark = ".", # The character to be used to indicate the numeric decimal point
               add_p = TRUE)       # Add "p=" before the value?
## [1] "p=0.50" "p=0.10" "p<0.05" "p<0.05" "p<0.05" "p<0.05" "p=NA"
finalfit package
The p_tidy() function of the finalfit package doesn't have an option to specify the number of significant digits. Instead, its digits option specifies the number of decimal places to which the p-value is rounded.
library(finalfit)
finalfit::p_tidy(p, digits = 2)
## [1] "=0.50" "=0.12" "=0.04" "=0.01" "=0.01" "<0.01" "=NA"
finalfit::p_tidy(p,
                 digits = 3,    # value to round to, no default
                 prefix = NULL) # suppress prefix
## [1] "0.500" "0.120" "0.045" "0.011" "0.009" "<0.001" "NA"
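The pattern visible in the output above (round to a fixed number of decimal places and report anything smaller as "<0.001") can be approximated in base R. The following is only a sketch under assumptions: p_round_sketch() is a hypothetical helper written for illustration, it is not part of finalfit, and it may differ from p_tidy() in edge cases such as values that sit exactly on a rounding boundary.

```r
# Hypothetical helper (not from any package): fixed decimals with a "<" floor
p_round_sketch <- function(p, digits = 3) {
  threshold <- 10^(-digits)  # smallest value that can be shown, e.g. 0.001
  out <- ifelse(
    p < threshold,
    paste0("<", formatC(threshold, format = "f", digits = digits)),
    formatC(p, format = "f", digits = digits)
  )
  out[is.na(p)] <- "NA"      # keep missing values visible as the string "NA"
  out
}

p_round_sketch(c(0.5, 0.011, 0.0000234, NA))
```

Building the cutoff from digits keeps the "<" label and the rounding precision in sync, so changing one argument adjusts both.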
psycho package
The format_p() function of the psycho package formats the p-values according to predefined significance levels (> .1, < .05, < .01, < .001). By default, significance stars are appended; they can be removed or returned on their own.
library(psycho)
psycho::format_p(p)
## [1] "> .1" "> .1" "< .05*" "< .05*" "< .01**" "< .001***"
## [7] NA
psycho::format_p(p, stars = FALSE) # remove significance stars
## [1] "> .1" "> .1" "< .05" "< .05" "< .01" "< .001" NA
psycho::format_p(p, stars_only = TRUE) # return only significance stars
## [1] "" "" "*" "*" "**" "***" NA
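The star mapping shown in the last output can be reproduced in base R without extra packages. This is a minimal sketch, assuming the cut points .001, .01 and .05 read off the labels above; p_stars_sketch() is a hypothetical helper and not how psycho implements format_p().

```r
# Hypothetical helper (not from psycho): map p-values to significance stars
p_stars_sketch <- function(p) {
  # findInterval() returns 0 for p < .001, 1 for [.001, .01),
  # 2 for [.01, .05), 3 for p >= .05, and NA for missing values
  idx <- findInterval(p, c(0.001, 0.01, 0.05))
  c("***", "**", "*", "")[idx + 1]
}

p_demo <- c(0.50, 0.12, 0.045, 0.011, 0.009, 0.0000234, NA)
p_stars_sketch(p_demo)
```

Missing values stay NA because indexing a character vector with NA returns NA, which mirrors the behaviour in the output above.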