Write replicable papers using the R package “Knitr”
Published:
In this section I will describe my take on literate programming: not the best, not the only one. Just the one I use.
The main idea is to outsource your R
code into LaTeX
automatically. For example, this way you do not have to copy and paste standard errors, or figures. It will be automatically generated in your PDF file after compiling the .rnw
.
Text processor and necessary dependencies: First you will need a software to handle code and syntax highlighting. I use Sublime Text 3. Have Sublime Text 3 installed.
- Inside Sublime Text:
- Install Package Control.
shift+cmd+p
, then Install Packages, and install:
- In R you have to install knitr:
install.packages(c("knitr", "formatR", "pacman"), repos = "http://cran.rstudio.com")
- In Sublime Text 3, modify user settings in LaTeXing. First line adds knitr capabilities, while the second line allows BibTeX compilation.
{"knitr": true, /* activates knitr function */
"quick_build": [ /* activates latexing capabilities? Not sure. */
{
"name": "Primary Quick Build: latexmk",
"primary": true,
"cmds": ["latexmk"]
}
],
"keep_focus": true, /* activates skim sync and pop up function */
"keep_focus_delay": 0.1, /* reduces skim sync time */
"pdf_viewer_osx": { /* makes sure skim is the default app. User should make sure that the NAME OF the app is correct too (it might change with some updates) */
"skim": [
"/Applications/Skim.app"
],
"preview": [
"/Applications/Preview.app"
]
}
}
- Writing R and LaTeX code together:
1) In your .rnw
file, just generate an environment that includes a tag that’s shared with the R
file, like so:
<<chunk:example, echo = FALSE, cache = FALSE>>=
@
In this example, the tag is chunk:example
. Also, editors and (most) readers are not interested in your coding. That’s fine. The important thing is to make your code available to the audience that needs it. You can hide it by calling the echo = FALSE
function in the preamble, i.e. within the <<>>=
symbols. Also, the cache = FALSE
makes sure that your computer does not store whatever you command it to do (graphs, data, etc.). I think it’s important to set it in FALSE
so fresh plots, tables, and datasets are generated every time. This way you are absolutely sure that whatever is in your PDF reflects the latest compile.
2) In your R script, have another environment where you actually host the code, like so:
## ---- chunk:example
set.seed(602)
library(MASS) # to generate multi-var distribution
e <- as.numeric(mvrnorm(n = 100, mu = 0, Sigma = 1))
plot(e)
##
Thus, every time you compile your .rnw
file, R
will execute the chunk:example
chunk, and insert whatever is there (a table, a plot, a number, etc.).
Some people prefer having the R
code within the .rnw
file. My R
code is always very long, which distracts me from writing. Thus, I prefer to have my R
code separated from the .rnw
file. It’s easily shareable, and more convenient.