R markdown is awesome, but hacks are needed
I love R and R markdown. But when I tried to send HTML email reports built from Rmd files, using something as simple as this:
rmd="my_report"
Rscript --vanilla -e "require(knitr); rmarkdown::render('$rmd.Rmd',params=list() )"
cat $rmd.html | mail -a "From: [email protected]" -a "MIME-Version: 1.0" -a "Content-Type: text/html" -s "$subject" $recip
…I ran into some roadblocks.
Here are the problems and solutions or hacks to address them. Hope they help you!
I wanted recipients of my Rmd HTML reports to view them in their email client. But email clients often do not like images embedded as base64 strings. The other option was to have the images live on a server. But how?
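Solution: Move the images into a public web folder, and rewrite the image links with bash and trusty coreutils. (This assumes the report is rendered with self_contained: false, so that the figures are written to a ${rmd}_files folder next to the HTML instead of being embedded.) Since there could be multiple versions of this report, use a folder named by the exact report time: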
# folder named by timestamp
dt=`date +"%Y-%m-%d--%H-%M-%S-%N"`
# move external files (same folder that the rewritten links point to)
mkdir /var/www/report_cache_folder/$dt
cp -r ${rmd}_files /var/www/report_cache_folder/$dt
# rewrite image links
cat $rmd.html | sed "s/img src=\"/img src=\"https:\/\/servername.com\/report_cache_folder\/$dt\//g" > rw_$rmd.html
Now, all images are served from the public webserver and thus accessible to email clients.
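The steps above can be sketched as one small shell function. This is a minimal sketch under stated assumptions, not the exact script behind these reports: the function name `publish_assets`, the `WEB_ROOT` and `BASE_URL` variables, and the fake one-image report are all hypothetical. The demo runs in a temporary directory so nothing under /var/www is touched.

```shell
#!/bin/sh
# Sketch only: publish_assets, WEB_ROOT and BASE_URL are made-up names.
# Usage: publish_assets <report-name> <timestamp>
# Copies <report-name>_files into $WEB_ROOT/<timestamp>/ and writes
# rw_<report-name>.html with image links pointing at $BASE_URL/<timestamp>/.
publish_assets() {
  mkdir -p "$WEB_ROOT/$2" || return 1
  cp -r "${1}_files" "$WEB_ROOT/$2/" || return 1
  # '|' as the sed delimiter avoids escaping every slash in the URL.
  # Only links that start with the report's own files folder are rewritten.
  sed "s|img src=\"${1}_files/|img src=\"$BASE_URL/$2/${1}_files/|g" \
    "$1.html" > "rw_$1.html"
}

# Demo in a scratch directory with a fake one-image report.
work=$(mktemp -d) && cd "$work" || exit 1
WEB_ROOT="$work/www"
BASE_URL="https://servername.com/report_cache_folder"
mkdir -p my_report_files/figure-html
: > my_report_files/figure-html/plot-1.png
printf '<img src="my_report_files/figure-html/plot-1.png" />\n' > my_report.html

publish_assets my_report demo-stamp
cat rw_my_report.html
```

In real use the second argument would be the timestamp, for example `publish_assets "$rmd" "$(date +"%Y-%m-%d--%H-%M-%S-%N")"`. Matching on the `${rmd}_files/` prefix, rather than on every `img src="`, leaves any image that already has an absolute URL untouched.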
I wanted data-driven headers and plots, without knowing in advance how many there would be. I found pandoc.header too restrictive, since you have to declare results='asis' in the chunk options up front, which limits what else you can do in the R block.
Now if you call pandoc.header("My pandoc header", 2) without results='asis', you get something like this in the output:
## ## My pandoc header
…which, if you look in the HTML, is wrapped in <pre> and <code> tags. It’s an unprocessed markdown header, prefixed by the “##” that knitr puts in front of chunk output.
Solution: Use htmltools to generate headers dynamically from within R chunks (no results='asis' required):
```{r echo=FALSE}
library(htmltools)
h1("main heading")
h2("smaller heading")
```
Caveats: The new headers are indented, unlike those generated directly via markdown, and they don’t show up in the table of contents. I just made sure there were “regular” higher-level headings in place so it didn’t look odd.
I was not happy with the default styling that R markdown provided, particularly for tables.
As mentioned above, for email clients we don’t want the image data embedded inline. CSS is the opposite: for email clients to render the styles, we DO need the CSS inlined.
Solution: Use python premailer to inline the styles. Then there was an issue with unicode, which was corrected with PYTHONIOENCODING like so:
cat rw2_$rmd.html | PYTHONIOENCODING=UTF-8 python -m premailer | mail -a "From: [email protected]" -a "MIME-Version: 1.0" -a "Content-Type: text/html" -s "$subject" $recip
I had multiple reports and I wanted to use the same styles. So I had something like this at the top of each of my Rmd files:
<STYLE TYPE="text/css">td{font-size: 8pt;}th {font-size: 8pt;}</STYLE>
Now how do we make this DRY?
Solution: Just put the above CSS into styles.Rmd, then include that file from each report like so:
```{r child = 'styles.Rmd'}
```
(OK, this one is less of a hack, but I found it difficult to solve using books or Google, and it can be useful in many other contexts.)
After putting all these hacks together, I finally had lovely Rmd email reports.