## Intro

In June 2017 I started working at the Clinical Trial Centre Leipzig at Leipzig University. Since my knowledge of statistics is rather poor, my employer offered to send me to some seminars in Medical Biometry at the University of Heidelberg. The first seminar I attended was called “Basics of Epidemiology”. On the first day, we learned how to calculate so-called odds ratios in case-control studies using a simple pocket calculator.

In this blog post, I will show how to calculate a simple odds ratio with its 95% confidence interval (CI) using R.

## Data simulation

The data I'm using in this blog post were simulated using the `wakefield` package. The following code returns a data frame with 2 binary variables (`Exposition` and `Disease`) and 1,000 observations.

```r
library(wakefield)

mydata <- data.frame(Exposition = group(n = 1000, x = c('yes', 'no'),
                                        prob = c(0.75, 0.25)),
                     Disease = group(n = 1000, x = c('yes', 'no'),
                                     prob = c(0.75, 0.25)))
dim(mydata)
```
```
## [1] 1000    2
```
```r
head(mydata)
```
```
##   Exposition Disease
## 1        yes     yes
## 2         no      no
## 3         no     yes
## 4        yes     yes
## 5        yes     yes
## 6        yes     yes
```

Based on this data frame, we build a 2×2 table that cross-tabulates exposure (rows) against disease status (columns), showing how many exposed and unexposed patients did or did not develop the disease.

```r
tab <- table(mydata$Exposition, mydata$Disease)
tab
```
```
##
##       yes  no
##   yes 569 210
##   no  163  58
```
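
The printed table carries no dimension labels, so it is easy to lose track of which margin is the exposure. The following is a minimal sketch of a labelled variant with margins. It is an illustration rather than the code used for this post: it simulates comparable data with base R's `sample()` instead of `wakefield`, sets a seed for reproducibility, and the names `sim` and `tab_labelled` are made up here. The counts will therefore differ from the table above.

```r
# Assumption: base-R stand-in for the wakefield simulation, with a fixed seed
set.seed(2017)
n <- 1000
sim <- data.frame(
  Exposition = factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.75, 0.25)),
                      levels = c("yes", "no")),
  Disease    = factor(sample(c("yes", "no"), n, replace = TRUE, prob = c(0.75, 0.25)),
                      levels = c("yes", "no"))
)

# Naming the arguments of table() labels the two dimensions
tab_labelled <- table(Exposition = sim$Exposition, Disease = sim$Disease)

# addmargins() appends row and column totals
addmargins(tab_labelled)
```

Fixing the factor levels to `c("yes", "no")` keeps the exposed and diseased group in the top-left cell, which is the layout the odds ratio calculation relies on.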

## Odds Ratio Calculation

To find out whether the odds of developing the disease are significantly higher in exposed patients than in unexposed patients, we need to calculate the odds ratio and its 95% CI.
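
For reference, this is the standard log-scale (Woolf) approximation, written out for a 2×2 table. The cell labels are notation introduced here, not taken from the seminar: $a$ = exposed and diseased, $b$ = exposed and not diseased, $c$ = unexposed and diseased, $d$ = unexposed and not diseased.

$$
\mathrm{OR} = \frac{a\,d}{b\,c}, \qquad
\mathrm{SE}(\ln \mathrm{OR}) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}, \qquad
95\%\ \mathrm{CI} = \exp\bigl(\ln \mathrm{OR} \pm 1.96 \cdot \mathrm{SE}(\ln \mathrm{OR})\bigr)
$$

With the counts from `tab` ($a = 569$, $b = 210$, $c = 163$, $d = 58$), the odds ratio is $\frac{569 \cdot 58}{210 \cdot 163} = \frac{33002}{34230} \approx 0.96$.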

The following function will return a data frame containing these values.

```r
# Return the odds ratio of a 2x2 table with its 95% CI
f <- function(x) {
  # odds ratio (a * d) / (b * c); note that the CI below is based on the rounded value
  or <- round((x[1, 1] * x[2, 2]) / (x[1, 2] * x[2, 1]), 2)
  # standard error of log(OR)
  se <- sqrt(1/x[1, 1] + 1/x[1, 2] + 1/x[2, 1] + 1/x[2, 2])
  cil <- round(exp(log(or) - 1.96 * se), 2)
  ciu <- round(exp(log(or) + 1.96 * se), 2)
  data.frame(CI_95_lower = cil, OR = or, CI_95_upper = ciu)
}
```

Now we can apply the function to our table `tab`.

```r
df.or <- f(tab)
knitr::kable(df.or, align = 'c')
```

| CI_95_lower |  OR  | CI_95_upper |
|:-----------:|:----:|:-----------:|
|    0.68     | 0.96 |    1.35     |
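
As a sanity check, these values can be reproduced step by step from the four cell counts printed earlier. This is a minimal sketch, not part of the original analysis: the variable names (`n11`, `n12`, `n21`, `n22`, `se`, `z`, `ci`) are made up for illustration, `qnorm(0.975)` stands in for the hard-coded 1.96, and nothing is rounded until the very end.

```r
# Cell counts copied from the printed table (rows: exposure, columns: disease)
n11 <- 569  # exposed, diseased
n12 <- 210  # exposed, not diseased
n21 <- 163  # unexposed, diseased
n22 <- 58   # unexposed, not diseased

# Odds ratio as the cross-product ratio of the table
or <- (n11 * n22) / (n12 * n21)

# Standard error of log(OR) and the normal quantile for a two-sided 95% interval
se <- sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)
z  <- qnorm(0.975)

# Interval on the log scale, transformed back to the odds ratio scale
ci <- exp(log(or) + c(-1, 1) * z * se)

round(c(CI_95_lower = ci[1], OR = or, CI_95_upper = ci[2]), 2)
```

Without intermediate rounding, the lower bound comes out at about 0.69 rather than 0.68, because `f()` rounds the odds ratio to two decimals before taking its logarithm. The difference does not change the interpretation.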

As the results indicate, exposed patients do not have higher odds of developing the disease than unexposed patients: the odds ratio is close to 1, and its 95% CI (0.68 to 1.35) includes 1.