R Markdown Syntax Highlighting
Nov 11, 2017
Curtis Alexander

Prior art

In the past I had explored using the highr package to enable syntax highlighting of languages that were not supported within R Markdown. I never loved the solution as it required one to download and setup highlight. In addition, the solution also required some manual steps to be run from the command line to include the appropriate css file as well.

What is the problem again?

Repeating my original problem — I want to have syntax highlighting for SAS code within my R Markdown documents.

R Markdown relies on pandoc for its syntax highlighting. Specifically, the skylighting Haskell library is utilized by pandoc for its highlighting. The list of languages that are supported can be found here.

Courtesy RStudio

Courtesy RStudio

RStudio provides the nice illustration above describing how one begins with an R Markdown document and ends with an html document. pandoc is what ultimately performs sytnax highlighting using the aforementioned skylighting library.

The answer…CodeMirror

Rather than relying on pandoc, I determined a way to take advantage of the Javascript text editor CodeMirror when producing html documents. Again, CodeMirror is a Javascript text editor that I simply set to be read only which serves as a form of syntax highlighting.

Aside: In the end the answer always seems to be Javascript. For many reasons, I have avoided learning Javascript. But for my use case here, I must admit that Javascript is quite helpful!

R Markdown Example

To go straight to an example, take a look at either the R Markdown document or the rendered html. This repo will be utilized throughout as an example.

R Markdown Setup

There are quite a few steps required but once they are all stitched together, they allow for SAS (and other language) syntax highlighting that wasn’t possible before.

Requirements

The following R packages are required for the proposed solution.

In addition, one will need access to the internet as the solution makes use of <script> tags to read in Javascript libraries from a CDN.

Code Mirror Javascript/CSS

There is a stylesheet and Javascript files provided by CodeMirror that need to be included. My preference is to create a header.html file and then make sure it is included within my YAML header.

Below is an example of the YAML header for rmarkdown-with-alt-langs. Specifically take note of the in_header key.

title: "R Markdown with Alternate Languages"
author: "Curtis Alexander"
output:
  html_document:
    include:
      in_header: header.html
    mathjax: null
params:
  hilang:
    - sas
    - crystal

The file header.html contains the following. Note that I used links to a CDN rather than directly pointing to CodeMirror.

<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.31.0/codemirror.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.31.0/codemirror.min.js"></script>
<style>
  .CodeMirror {
    border: 1px solid #eee;
    height: auto;
  }
</style>

R Markdown Parameter

Within the yaml block above, the language(s) to be used for highlighting within the document is/are set with the hilang option. For a full list of all languages that can be utilized, take a look at the CodeMirror languages page.

_hilang_setup.Rmd

Next, one needs to create a knitr source hook in order to wrap up the SAS code in <textarea> tags. The file _hilang_setup.Rmd contains two R code chunks, including the actual source hook.

The first R code chunk pulls in the needed Javascript from CodeMirror for the languages set within the yaml block. Note that this block requires the asis option to be set.

# take a character vector of parameters and inject
#   the appropriate script tag for code mirror
# ensures that the script tags are only inserted once
for(i in seq_along(params$hilang)) {
  js_mode <- paste0("\n<script src=\"https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.31.0/mode/", tolower(params$hilang[i]), "/", tolower(params$hilang[i]), ".min.js\"></script>\n")
  cat(htmltools::htmlPreserve(js_mode))
}

The second R code chunk sets the actual source hook.

knitr::knit_hooks$set(source = function(x, options) {
  if (!is.null(options$hilang)) {
    textarea_id <- paste(sample(LETTERS, 5), collapse = "")
    code_open <- paste0("\n\n<textarea id=\"", textarea_id, "\">\n")
    code_close <- "\n</textarea>"
    jscript_editor <- paste0("\n<script> var codeElement = document.getElementById(\"", textarea_id, "\"); var editor = null; if (null != codeElement) { editor = CodeMirror.fromTextArea(codeElement, { lineNumbers: true, readOnly: true, viewportMargin: Infinity, mode: 'text/x-", tolower(options$hilang), "' }); } </script>\n")

    # if the option from_file is set to true then assume that
    #   whatever is in the code chunk is a file path
    if (!is.null(options$from_file) && options$from_file) {
      code_body <- readLines(file.path(x))
    } else {
      code_body <- x
    }

    knitr::asis_output(
      htmltools::htmlPreserve(
        stringr::str_c(
          code_open,
          paste(code_body, collapse = "\n"),
          code_close,
          jscript_editor
        )
      )
    )
  } else {
    stringr::str_c("\n\n```", tolower(options$engine), "\n", paste0(x, collapse = "\n"), "\n```\n\n")
  }
})

The source hook doesn’t have to be in a separate file. However, I like to include in a separate file so that within my relevant R Markdown document, I just need to include the following to bring it into scope within the document.

```{r, child="_hilang_setup.Rmd"}
```

SAS Code

Next is the easy part — just drop the SAS code into an R Markdown code chunk. A code chunk needs the following options to be set — eval=FALSE and hilang="sas". Below is a simple example.

```{r, eval=FALSE, hilang="sas"}
data _null_;
  x = 1;
  put x=;
run;
```

The source hook just established identifes the SAS code within <textarea> tags and then applys CodeMirror’s Javascript to the entire block of text within the <textarea> tag.

Full Example

A full example follows — assuming one makes use of header.html and _hilang_setup.Rmd.

Future Work

In the future, I would like to consider:

  • Creating my own custom output format
  • Creating an actual package with everything bound up that can be conveniently installed
  • See if I can further parameterize the configuration options that are passed to the CodeMirror.fromTextArea Javascript function.