tl;dr
I think you can use the {renv} package to create separate reproducible environment profiles for each of your {distill} blog posts.
Profiled
Functionality comes and goes in R packages. How do you deal with that in the context of a blog built with R? What if you need to go back and change something in a post from four years ago? [1]
I built a demo {distill} blog to test whether the {renv} package might be a viable solution for reproducibility on a post-by-post basis.
{renv} is a package by Kevin Ushey that records your dependencies in a text ‘lockfile’. It typically works on the scale of a whole project, but since version 0.13.0 you can have multiple profiles within a given project.
I think this means that each post can have its own profile with its own distinct set of packages and package versions.
That means you can easily recreate a specific environment for a given post at a given time if you need to alter and re-render it in future.
Example
I’m presenting this here as a theory, really, but I’ve also made a demo blog to try it out. It seems to work.
There are two posts on the demo blog. They both use the {dplyr} package, but one depends on an old version (0.8.5) and one depends on the current version (1.0.8).
Using {renv} profiles means that these package versions don’t interfere with each other.
The post depending on the older {dplyr} version can’t access the across() function (it only arrived in {dplyr} 1.0.0), but the post depending on the newer {dplyr} version can use across().
In other words, the environments associated with the profiles for each post are totally isolated from each other.
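If you want to confirm which situation you're in, you can ask the currently active library directly. This is a minimal sketch in base R; has_export() is a made-up helper name for illustration, not something from {renv} or {dplyr}.

```r
# Hypothetical helper: does an installed package export a given function?
# Returns FALSE if the package isn't installed in the active library.
has_export <- function(pkg, fn) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  fn %in% getNamespaceExports(pkg)
}

# With the older profile active (dplyr 0.8.5) this should be FALSE;
# with the newer profile active (dplyr 1.0.8) it should be TRUE.
has_export("dplyr", "across")
```

Running this once under each profile is a cheap way to check that the right library is in use before you re-render a post.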
How to
Of course, you first need a blog. I used {distill} [2] for the demo, a package by JJ Allaire, Rich Iannone, Alison Presmanes Hill and Yihui Xie. You can follow the guidance from RStudio, but basically:
- Create your blog with distill::create_blog()
- Build it with rmarkdown::render_site() (or ‘Build Website’ from the Build pane of RStudio)
- Initiate a reproducible environment for the blog as a whole with renv::init()
And then a new-post workflow could look like this:
- Create a new post with distill::create_post()
- Activate a profile for the new post with renv::activate(), providing a unique name to the profile argument (I suggest the post’s folder name as seen in the blog’s _posts/ folder)
- Install the packages you need for the post with renv::install()
- Capture the dependencies in the profile’s lockfile with renv::snapshot()
In code, that might look a bit like this:
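# Create the post; {distill} prefixes the folder name with the date
distill::create_post("new-post")
# Switch to a profile named after the post's folder
renv::activate(profile = "YYYY-MM-DD-new-post")
# Install the post's packages (as one character vector) into the profile's library
renv::install(c("distill", "rmarkdown", "palmerpenguins", "dplyr"))
# Record the installed versions in the profile's lockfile
renv::snapshot()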
For the demo blog, I called the two profiles ‘2022-03-14-dplyr-085’ and ‘2022-03-14-dplyr-108’, which you can see in the renv/profiles/ folder of the project repo.
These are named uniquely for the two separate folders in the _posts/ directory that contain each post’s files. This naming structure should make it easy to remember the profile associated with each post.
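Because profiles and post folders share names, it's easy to check whether any post is missing a profile. Here's a minimal sketch in base R. posts_without_profile() is a made-up helper, and it assumes the layout described above: one sub-folder per post in _posts/ and one sub-folder per profile in renv/profiles/.

```r
# Hypothetical helper: list post folders that have no matching renv profile.
# Assumes `_posts/<name>/` and `renv/profiles/<name>/` under the project root.
posts_without_profile <- function(project = ".") {
  posts <- list.dirs(
    file.path(project, "_posts"),
    full.names = FALSE, recursive = FALSE
  )
  profiles <- list.dirs(
    file.path(project, "renv", "profiles"),
    full.names = FALSE, recursive = FALSE
  )
  # Posts whose folder name doesn't appear among the profile names
  setdiff(posts, profiles)
}

# Run from the blog's root folder
posts_without_profile()
```

An empty result means every post folder has a profile of the same name.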
As I worked on the posts, I switched between the two profiles with renv::activate(), passing the relevant profile name to the profile argument.
Note that passing NULL as the profile argument switches you back to the default profile associated with the project as a whole, i.e. the one created when you ran renv::init().
Yeah, but?
There are obvious pros and cons to this approach.
For example, maybe it’s a bit too dependent on the user: they have to remember to switch between the profiles, etc.
And I don’t think you can properly rebuild the whole site with rmarkdown::render_site(), because this function will run based only on the currently active {renv} profile, rather than rendering each post in the context of its own specific profile.
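One possible way around this, which I haven't tried on the demo blog, is to render each post in its own fresh R process with that post's profile switched on. The sketch below leans on a few assumptions: that {renv} picks up the RENV_PROFILE environment variable when it loads at startup, that Rscript is on your PATH, and that you run it from the blog's root folder so the project's .Rprofile is used. render_post_in_profile() and the example .Rmd file name are made up for illustration.

```r
# Hypothetical helper: render one post in a separate R process that starts
# with the post's renv profile active (via the RENV_PROFILE variable).
render_post_in_profile <- function(rmd, profile, dry_run = FALSE) {
  # The R expression the child process will run
  expr <- sprintf("rmarkdown::render(%s)", deparse(rmd))
  # Environment variable handed to the child process only
  env <- paste0("RENV_PROFILE=", profile)
  if (dry_run) {
    # Show what would be run without starting a process
    return(list(env = env, expr = expr))
  }
  system2("Rscript", args = c("-e", shQuote(expr)), env = env)
}

# Dry run for one of the demo posts (the .Rmd file name is an assumption)
render_post_in_profile(
  rmd = "_posts/2022-03-14-dplyr-085/dplyr-085.Rmd",
  profile = "2022-03-14-dplyr-085",
  dry_run = TRUE
)
```

Looping this over the folders in _posts/ would re-render each post against its own lockfile, though it still wouldn't replace the site-level steps that rmarkdown::render_site() performs.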
But ultimately isn’t it worthwhile to be able to rebuild a post in future if you need to change or update something? Maybe.
I’d be interested to hear other criticisms, especially before I try and use this approach for real.
Meanwhile, I know that Danielle Navarro has tackled this in a more thought-out and sophisticated way and has created a work-in-progress package called {refinery} to help build a separate environment for each post in a {distill} blog.
In general, Danielle’s blog does a brilliant job of explaining the problem of blog reproducibility and the technical details behind it. I suggest you read that post if you want to know more.
Session info
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.0 (2021-05-18)
##  os       macOS Big Sur 10.16
##  system   x86_64, darwin17.0
##  ui       X11
##  language (EN)
##  collate  en_GB.UTF-8
##  ctype    en_GB.UTF-8
##  tz       Europe/London
##  date     2022-03-18
##
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date       lib source
##  blogdown      1.4     2021-07-23 [1] CRAN (R 4.1.0)
##  bookdown      0.23    2021-08-13 [1] CRAN (R 4.1.0)
##  bslib         0.3.1   2021-10-06 [1] CRAN (R 4.1.0)
##  cli           3.2.0   2022-02-14 [1] CRAN (R 4.1.2)
##  digest        0.6.29  2021-12-01 [1] CRAN (R 4.1.0)
##  evaluate      0.14    2019-05-28 [1] CRAN (R 4.1.0)
##  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
##  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.1.0)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.1.0)
##  jsonlite      1.7.3   2022-01-17 [1] CRAN (R 4.1.2)
##  knitr         1.37    2021-12-16 [1] CRAN (R 4.1.0)
##  magrittr      2.0.2   2022-01-26 [1] CRAN (R 4.1.2)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.1.0)
##  rlang         1.0.2   2022-03-04 [1] CRAN (R 4.1.2)
##  rmarkdown     2.10    2021-08-06 [1] CRAN (R 4.1.0)
##  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)
##  sass          0.4.0   2021-05-12 [1] CRAN (R 4.1.0)
##  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.1.0)
##  stringi       1.7.6   2021-11-29 [1] CRAN (R 4.1.0)
##  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.0)
##  withr         2.4.3   2021-11-30 [1] CRAN (R 4.1.0)
##  xfun          0.29    2021-12-14 [1] CRAN (R 4.1.0)
##  yaml          2.2.2   2022-01-25 [1] CRAN (R 4.1.2)
##
## [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
[1] Yes, I’m thinking about this because this blog is nearly four years old and I’ve had some headaches trying to rebuild posts from that long ago.
[2] This site is built with {blogdown} rather than {distill}, so I’m using this post as a chance to learn a bit more about {distill}. {distill} has also become quite popular in the R community, so it may be helpful for a wider readership if I use it in this demo.