Up-to-date blog stats in your README

Tags: github-actions, r, rmarkdown

Published: April 14, 2021

The README file for this blog on GitHub showing up-to-date stats on things like the number of posts, posting rates and a chart showing posts over time.

Yesterday’s render of the GitHub README for this blog.

tl;dr

You can use a scheduled GitHub Action to render up-to-date stats about your blog into its README.

Happy blogday

This blog has been knocking around for three years now. I wrote a post on its first birthday with a simple, interactive 2D plot of the posts to date.

Only now, two years later, have I thought to put this info into the blog’s README on GitHub (along with some other little stats, like the total number of posts) and have it update automatically on a schedule using a GitHub Action.[1]

This is useful for me so I can keep track of things without counting on my fingers, but it also signals activity on the blog to any curious visitors. I may change its content at some point, but it does what I want it to do for now.

Unwrap your GitHub Action

I’ve scheduled a GitHub Action for the early hours of each day. The YAML file for it reads like ‘at the specified time[2], set up a remote environment with R and some dependencies, then render the R Markdown file and push the changes to GitHub.’

I’ve modified r-lib’s pre-written YAML for this, which can be generated in the correct location in your project with usethis::use_github_action("render-rmarkdown.yaml").

The GitHub Action YAML:
name: Render README

on:
  schedule:
    - cron: '09 05 * * *'

jobs:
  render:
    name: Render README
    runs-on: macOS-latest
    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
    steps:
      - uses: actions/checkout@v2
      - uses: r-lib/actions/setup-r@v1
      - uses: r-lib/actions/setup-pandoc@v1
      - name: Install CRAN packages
        run: Rscript -e 'install.packages(c("remotes", "rmarkdown", "knitr", "tidyverse"))'
      - name: Install GitHub packages
        run: Rscript -e 'remotes::install_github("hadley/emo")'
      - name: Render README
        run: Rscript -e 'rmarkdown::render("README.Rmd")'
      - name: Commit results
        run: |
          git config --local user.email "actions@github.com"
          git config --local user.name "GitHub Actions"
          git commit README.md README_files/ -m 'Re-build README.Rmd' || echo "No changes to commit"
          git push origin || echo "No changes to commit"

Basically, the action knits the repo’s README.Rmd (R Markdown format containing R code) to a counterpart README.md (GitHub-flavoured markdown), which is displayed when you visit the repo.
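
The schedule in the YAML above is a five-field cron string, and GitHub Actions evaluates schedules in UTC, so '09 05 * * *' means 05:09 UTC every day. As a rough illustration, here's a base R sketch that labels the fields of a string like that. The helper name cron_fields() is hypothetical (it isn't from any package) and it only names the fields; it doesn't validate values or expand ranges and steps.

```r
# Hypothetical helper: label the five fields of a cron string.
# It does not validate the values or expand ranges/steps.
cron_fields <- function(cron) {
  parts <- strsplit(trimws(cron), "\\s+")[[1]]
  stopifnot(length(parts) == 5)
  names(parts) <- c("minute", "hour", "day_of_month", "month", "day_of_week")
  parts
}

# The schedule used in the workflow above
fields <- cron_fields("09 05 * * *")

# Build an HH:MM label from the hour and minute fields
run_time <- sprintf(
  "%02d:%02d",
  as.integer(fields[["hour"]]),
  as.integer(fields[["minute"]])
)

run_time
```

The asterisks in the last three fields mean 'every day of the month, every month, every day of the week', which is why the job runs daily.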

PaRty time

The real magic is in the R code chunks at the top of the README.Rmd file itself. They use {rvest} to scrape the archive page of the blog and create a dataframe of the title, link and publish date of each post.

The scraping code:
# Attach packages
library(tidyverse) # CRAN v1.3.0
library(lubridate) # for ymd(); not attached by {tidyverse} v1.3.0
library(rvest)     # CRAN v1.0.0

# Scrape the rostrum.blog home page
html <- read_html("https://rostrum.blog/")

# Extract the post titles
title <- html %>%
  html_nodes(".archive-item-link") %>%  # extract title node
  html_text()                           # extract text

# Extract the post URLs
link <- html %>% 
  html_nodes(".archive-item-link") %>%  # extract title node
  html_attr("href")                     # extract href attribute

# Extract the post dates
date <- html %>%
  html_nodes(".archive-item-date") %>%  # extract date nodes only
  html_text() %>%                       # extract text
  str_replace_all("[:space:]", "")      # remove newline/space

# Dataframe of titles, links and dates
posts <- tibble(date, title, link) %>%
  transmute(
    n = nrow(.):1,             # number starting from first post
    publish_date = ymd(date),  # convert to date class
    title,                     # title text
    link = paste0("https://www.rostrum.blog", link)  # create full URL
  )
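
If you want to try the selectors without hitting the live site, something like the sketch below might help. The HTML is made up to mimic the structure that the .archive-item-link and .archive-item-date selectors expect, so the markup, titles, paths and dates are all assumptions rather than a copy of the real page. It uses base R's trimws() and as.Date() in place of the {stringr} and {lubridate} calls above to keep the dependencies down.

```r
library(rvest)

# Made-up HTML that mimics the structure the selectors expect (an assumption)
page <- read_html('<html><body>
  <div class="archive-item">
    <a class="archive-item-link" href="/posts/second-post/">Second post</a>
    <span class="archive-item-date"> 2021-02-01 </span>
  </div>
  <div class="archive-item">
    <a class="archive-item-link" href="/posts/first-post/">First post</a>
    <span class="archive-item-date"> 2021-01-01 </span>
  </div>
</body></html>')

# Title and URL both come from the same link nodes
links <- html_nodes(page, ".archive-item-link")

# Dates need surrounding whitespace removed before conversion
dates <- html_text(html_nodes(page, ".archive-item-date"))

toy_posts <- data.frame(
  title = html_text(links),
  link  = html_attr(links, "href"),
  date  = as.Date(trimws(dates))
)

toy_posts
```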

That information can be cajoled to show some basic stats. The README includes inline R code that renders to show:

  • the total number of posts
  • posting rates (posts per month and days per post)
  • the number of days since the last post and a link to it
  • a clickable details block containing a table of all the posts to date
  • a simple 2D plot showing the distribution of posts over time[3] (preview below)
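
The first few stats in that list could be calculated along these lines. This is a minimal base R sketch rather than the exact code in the README: the publish dates are made up, the reference date is fixed so the numbers are reproducible, and the rate definitions (including the rough 30.4 days per month) are my assumptions here.

```r
# Made-up publish dates standing in for the scraped `posts` dataframe
toy_posts <- data.frame(
  publish_date = as.Date(c("2021-03-01", "2021-02-01", "2021-01-01", "2020-12-01"))
)

# Fixed reference date for reproducibility; a live README would use today's date
as_of <- as.Date("2021-03-11")

# Total number of posts
n_posts <- nrow(toy_posts)

# Days between the first post and the reference date
days_active <- as.numeric(as_of - min(toy_posts$publish_date))

# Posting rates (assuming an average month of about 30.4 days)
days_per_post   <- round(days_active / n_posts, 1)
posts_per_month <- round(n_posts / (days_active / 30.4), 1)

# Days since the most recent post
days_since_last <- as.numeric(as_of - max(toy_posts$publish_date))

c(
  n_posts = n_posts,
  days_per_post = days_per_post,
  posts_per_month = posts_per_month,
  days_since_last = days_since_last
)
```

In an R Markdown file, values like these can be dropped into a sentence with inline code, such as `r n_posts`.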

The plot code:
# Create plot object
p <- posts %>%
  ggplot(aes(x = publish_date, y = 1)) +
  geom_point(shape = "|", size = 10, stroke = 1, color = "#1D8016") + 
  theme_void()

A 2D chart where each point represents a post on an axis of time spanning from 2018 to the present. There are some gaps, but posts have been relatively consistent over time.

I also added a call to lubridate::today() at the bottom of the README.Rmd so it’s obvious when the stats were last updated.

Until next year

Finally, and most importantly, I included a tiny Easter egg: an emoji balloon 🎈 will appear on the page when the README is rendered on the anniversary of the blog’s inception.[4]
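
A check like that can be as simple as comparing the month and day of the render date with the blog's birthday. Here's a minimal sketch, assuming an inception date of 14 April 2018 (inferred from this post's date and the 'three years' mentioned earlier, not stated outright). The helper name blogday_balloon() is hypothetical and the real README may do this differently.

```r
# Assumed inception date (inferred, not stated in the post)
blog_birthday <- as.Date("2018-04-14")

# Hypothetical helper: return a balloon emoji on the anniversary,
# or an empty string on any other day
blogday_balloon <- function(on = Sys.Date(), birthday = blog_birthday) {
  is_anniversary <- format(on, "%m-%d") == format(birthday, "%m-%d")
  if (is_anniversary) "\U0001F388" else ""
}

blogday_balloon(as.Date("2021-04-14"))  # an anniversary, so a balloon
blogday_balloon(as.Date("2021-04-15"))  # any other day, so an empty string
```

Called with no arguments, it uses the current date, which is what you'd want in a README that's re-rendered daily.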

Environment

Session info
Last rendered: 2023-07-17 20:34:40 BST
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] htmlwidgets_1.6.2 compiler_4.3.1    fastmap_1.1.1     cli_3.6.1        
 [5] tools_4.3.1       htmltools_0.5.5   rstudioapi_0.15.0 yaml_2.3.7       
 [9] rmarkdown_2.23    knitr_1.43.1      jsonlite_1.8.7    xfun_0.39        
[13] digest_0.6.31     rlang_1.1.1       evaluate_0.21    

Footnotes

  1. I’ve written before about GitHub Actions to create a Twitter bot and for continuous integration of R packages.

  2. I wrote about scheduling with cron strings in an earlier post, which details the {dialga} package for translating from R to cron to English.

  3. The original chart was made with {plotly}, so you could hover over the points to see the post titles and publishing dates. Plotly isn’t supported in GitHub Markdown, so I included a static chart instead. I used a similar ‘barcode’ format in a recent post about health data.

  4. That’s today if you’re reading this on the day it was published.

Reuse

CC BY-NC-SA 4.0