Automating Dependency Updates with GitHub Actions: The Ensmallen Case

github
github-actions
ensmallen
r
r-package
c++
Author

James Balamuta

Published

September 6, 2020

Editor’s Note

There is a new version of this post discussing how to automate dependency updates using GitHub Actions when the upstream dependency is on GitLab.

Editor’s Note

This post was modified on Oct 4, 2023 to address the set-output deprecation and update the workflow to use v5 of the peter-evans/create-pull-request action.

Keeping dependencies up-to-date is crucial for package maintenance, but it can be tedious when done manually. In this post, I’ll share how I automated updating the Ensmallen optimization library in my {RcppEnsmallen} R package – joint work with Dirk and the Ensmallen team – using GitHub Actions, moving away from the manual approach that others (like Dirk for {RcppArmadillo}) have been using.

Designing a GitHub Actions Workflow

I created a workflow for {RcppEnsmallen} that runs daily and checks for new Ensmallen releases. When a new version is found, it automatically creates a pull request with all necessary updates. This workflow is a great example of how GitHub Actions can streamline package maintenance tasks.

Breaking Down the Workflow

For this workflow, I used a combination of GitHub Actions features to automate the update process. The workflow is triggered automatically on a daily basis through cron scheduling and checks for new Ensmallen releases. If a new version is detected, the workflow downloads the release, updates the package files, generates a changelog, and creates a pull request with the changes. This automation keeps the R package up to date with the latest Ensmallen version with as little manual intervention as possible.

Let’s look at how this workflow works in more detail.

Workflow Trigger Configuration

This workflow runs automatically every day at 10:00 UTC, checking for new Ensmallen releases. I also configured a manual trigger option so I can run it on-demand if needed, providing flexibility while maintaining regular automated checks.

name: Update Ensmallen
on:
  workflow_dispatch: {}    # Manual trigger option
  schedule:
    - cron:  '0 10 * * *'  # Run daily at 10:00 UTC

Checking for New Versions

This step uses GitHub’s API to fetch the latest Ensmallen release and reads the release tag and publication date from the JSON response with jq. It then parses the local header file to determine which version is currently bundled in the package. Having both values lets the workflow compare versions and decide whether an update is needed. All three values are written to $GITHUB_OUTPUT so that later steps can use them.

- name: Get Latest Ensmallen Tagged Release
  id: ensmallen-lib
  run: |
    # Get latest release information from GitHub API
    ENSMALLEN_RELEASE_JSON=$(curl -sL https://api.github.com/repos/mlpack/ensmallen/releases/latest)
    ENSMALLEN_RELEASE_VERSION=$(jq -r ".tag_name" <<< "$ENSMALLEN_RELEASE_JSON")
    ENSMALLEN_RELEASE_DATE=$(jq -r ".published_at" <<< "$ENSMALLEN_RELEASE_JSON" | sed 's/T.*//g')
    
    # Save outputs for later steps
    echo "release_tag=${ENSMALLEN_RELEASE_VERSION}" >> "$GITHUB_OUTPUT"
    echo "release_date=${ENSMALLEN_RELEASE_DATE}" >> "$GITHUB_OUTPUT"
    
    # Extract current version from local files
    ENSMALLEN_VERSION_MAJOR=$(grep -i ".*#define ENS_VERSION_MAJOR.*" inst/include/ensmallen_bits/ens_version.hpp | grep -o "[0-9]*")
    ENSMALLEN_VERSION_MINOR=$(grep -i ".*#define ENS_VERSION_MINOR.*" inst/include/ensmallen_bits/ens_version.hpp | grep -o "[0-9]*")
    ENSMALLEN_VERSION_PATCH=$(grep -i ".*#define ENS_VERSION_PATCH.*" inst/include/ensmallen_bits/ens_version.hpp | grep -o "[0-9]*")
    
    # Combine for comparison
    ENSMALLEN_VERSION_VALUE=${ENSMALLEN_VERSION_MAJOR}.${ENSMALLEN_VERSION_MINOR}.${ENSMALLEN_VERSION_PATCH}
    echo "current_tag=${ENSMALLEN_VERSION_VALUE}" >> "$GITHUB_OUTPUT"
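
The header-parsing half of this step can be tried locally without touching the network. The following is a minimal sketch, assuming a header laid out like ens_version.hpp with one #define per version component. The function name read_ens_version and the sample header contents are made up for the example, and it swaps the two-stage grep for a single sed expression. Pointing GITHUB_OUTPUT at a scratch file shows what a later step would receive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read MAJOR.MINOR.PATCH from a header that defines
# ENS_VERSION_MAJOR, ENS_VERSION_MINOR, and ENS_VERSION_PATCH.
read_ens_version() {
  local header="$1" part value version=""
  for part in MAJOR MINOR PATCH; do
    # Capture the run of digits that follows the macro name on its line.
    value=$(sed -n "s/^[[:space:]]*#define[[:space:]]\{1,\}ENS_VERSION_${part}[[:space:]]\{1,\}\([0-9]\{1,\}\).*/\1/p" "$header" | head -n 1)
    # Join the components with dots, without a leading dot.
    version="${version:+${version}.}${value}"
  done
  printf '%s\n' "$version"
}

# Sample header written to a temporary file (contents are illustrative only).
sample_header=$(mktemp)
cat > "$sample_header" <<'EOF'
#define ENS_VERSION_MAJOR 2
#define ENS_VERSION_MINOR 19
#define ENS_VERSION_PATCH 1
#define ENS_VERSION_NAME "Sample Name"
EOF

current_tag=$(read_ens_version "$sample_header")

# Outside of Actions, GITHUB_OUTPUT is unset, so fall back to a scratch file.
GITHUB_OUTPUT=${GITHUB_OUTPUT:-$(mktemp)}
echo "current_tag=${current_tag}" >> "$GITHUB_OUTPUT"
cat "$GITHUB_OUTPUT"
```

Matching the macro name and the digits in one expression means a stray number elsewhere on the line, such as in a trailing comment, does not end up in the version string.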

Conditional Update Logic

The core update step only runs when a new version is detected, which avoids unnecessary processing and pull request creation when the package is already up to date. The workflow compares the local version with the upstream release tag and proceeds only when they differ. The step also exposes the three outputs as environment variables, so the update script can refer to them by short names in the operations that follow.

- name: Update Ensmallen
  if: steps.ensmallen-lib.outputs.current_tag != steps.ensmallen-lib.outputs.release_tag
  env:
    CURRENT_TAG: ${{ steps.ensmallen-lib.outputs.current_tag }}
    RELEASE_TAG: ${{ steps.ensmallen-lib.outputs.release_tag }}
    RELEASE_DATE: ${{ steps.ensmallen-lib.outputs.release_date }}
  run: |
    # Update operations follow (see below)
    ...
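
The condition above fires whenever the two tags differ in any way. For readers who want the step to run only when the upstream tag is strictly newer, one option is a version-aware comparison inside the script. This is a sketch of an alternative, not what the workflow does: is_newer is a hypothetical helper, and it assumes GNU sort with the -V (version sort) option is available on the runner.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if version $2 is strictly newer than $1.
# Relies on `sort -V` (version sort) from GNU coreutils.
is_newer() {
  local current="$1" candidate="$2" newest
  # Identical versions are never "newer".
  [ "$current" != "$candidate" ] || return 1
  # The last line of a version sort is the highest version.
  newest=$(printf '%s\n%s\n' "$current" "$candidate" | sort -V | tail -n 1)
  [ "$newest" = "$candidate" ]
}

# Example values; a plain string comparison would rank 2.9.0 above 2.10.0.
if is_newer "2.9.0" "2.10.0"; then
  echo "update needed"
else
  echo "already current"
fi
```

In the workflow itself, such a helper could be called with "$CURRENT_TAG" and "$RELEASE_TAG" at the top of the run script, so that a local version ahead of the latest upstream release does not trigger an update.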

The Update Process for Ensmallen

The update process begins by cleaning out the old files and downloading the new release tarball from GitHub. One of the clever aspects of this workflow is how it handles the dynamic structure of GitHub’s tarballs. Since the root directory in these archives contains a commit hash that changes with each release, the script dynamically extracts this prefix first. It then uses this information to selectively extract only the needed files (the header files and the changelog), placing them directly in their target locations with the correct directory structure.

# Delete the old ensmallen_bits directory and ensmallen.hpp
rm -fr inst/include/ensmallen_bits inst/include/ensmallen.hpp

# Download the release
curl -sL -o $RELEASE_TAG https://api.github.com/repos/mlpack/ensmallen/tarball/$RELEASE_TAG

# Get the archive's top-level directory (its name includes the commit hash) so specific files can be addressed
TAR_PREFIX_DIR=$(tar -tzf $RELEASE_TAG | head -1 | cut -f1 -d"/")

# Extract the update files directly into the include directory
tar -xzf $RELEASE_TAG -C inst/include/ --strip-components=2 ${TAR_PREFIX_DIR}/include/ensmallen_bits ${TAR_PREFIX_DIR}/include/ensmallen.hpp

# Extract HISTORY.md file for changelog comparison
tar -xzf $RELEASE_TAG -C tools/ --strip-components=1 ${TAR_PREFIX_DIR}/HISTORY.md
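
The prefix-discovery and selective-extraction steps can be rehearsed offline against a throwaway archive. The sketch below builds a small tarball whose top-level directory imitates the hash-suffixed prefix described above, then extracts from it the same way. The directory name mlpack-ensmallen-abc1234, the file contents, and the output paths are all invented for the illustration.

```shell
#!/usr/bin/env bash
# Build a throwaway archive whose top-level directory mimics the
# hash-suffixed prefix of a GitHub tarball (the name is made up).
workdir=$(mktemp -d)
fake_root="$workdir/src/mlpack-ensmallen-abc1234"
mkdir -p "$fake_root/include/ensmallen_bits"
echo "// umbrella header" > "$fake_root/include/ensmallen.hpp"
echo "// version header"  > "$fake_root/include/ensmallen_bits/ens_version.hpp"
echo "### sample history" > "$fake_root/HISTORY.md"
tar -czf "$workdir/release.tar.gz" -C "$workdir/src" mlpack-ensmallen-abc1234

# Discover the prefix from the first archive entry instead of hard-coding it.
prefix=$(tar -tzf "$workdir/release.tar.gz" | head -n 1 | cut -d/ -f1)

# Extract only the headers, dropping "<prefix>/include/" from each path.
mkdir -p "$workdir/out/include" "$workdir/out/tools"
tar -xzf "$workdir/release.tar.gz" -C "$workdir/out/include" --strip-components=2 \
  "$prefix/include/ensmallen_bits" "$prefix/include/ensmallen.hpp"

# Extract the changelog, dropping only "<prefix>/".
tar -xzf "$workdir/release.tar.gz" -C "$workdir/out/tools" --strip-components=1 \
  "$prefix/HISTORY.md"

# List what landed where.
find "$workdir/out" -type f | sort
```

Because --strip-components counts leading path elements, the headers need a value of 2 (prefix and include) while HISTORY.md needs only 1 (prefix).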

Documenting the Update

For documentation updates, the workflow generates the changelog from the upstream history. It compares the new HISTORY.md against the copy saved from the previous update (HISTORYold.md) using diff, keeps only the lines that differ, and then formats the output with sed commands to match the R package’s NEWS.md style. The workflow also extracts the release name from the headers and constructs a proper R package version. Package documentation files are updated using heredocs, which allow for complex multiline text generation while embedding dynamic content from variables. The DESCRIPTION file is updated with sed to change just the version line.

# Version information and changelog generation
ENSMALLEN_VERSION_NAME=$(grep -i ".*#define ENS_VERSION_NAME.*" inst/include/ensmallen_bits/ens_version.hpp | grep -o '".*"')
NEW_RCPPENSMALLEN_VERSION=0.${RELEASE_TAG}.1

# Generate changelog updates using diff
NEWS_UPDATE=$(diff --unchanged-group-format="" tools/HISTORY.md tools/HISTORYold.md | sed '/^$/d' | sed '/^#/d' | sed 's/\*/-/' |  sed 's/^/ /')
mv tools/HISTORY.md tools/HISTORYold.md

# Update package documentation
cat <<-EOF > NEWS.md
# RcppEnsmallen ${NEW_RCPPENSMALLEN_VERSION}

- Upgraded to ensmallen ${RELEASE_TAG}: ${ENSMALLEN_VERSION_NAME} ($RELEASE_DATE)
${NEWS_UPDATE}

$(cat NEWS.md)
EOF

# Update version in DESCRIPTION
sed -i "s/^Version:.*/Version: ${NEW_RCPPENSMALLEN_VERSION}/" DESCRIPTION

# Update ChangeLog with heredoc
# ...
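
To see what the diff and sed pipeline produces, it helps to run it on two small mock changelogs. The sketch below assumes GNU diff (for --unchanged-group-format) and assumes upstream entries are written as " * " bullets under "#"-prefixed headings. The file contents and release names are invented for the example.

```shell
#!/usr/bin/env bash
# Two mock changelogs: the "old" file lacks the newest release section.
tmp=$(mktemp -d)
cat > "$tmp/HISTORYold.md" <<'EOF'
### ensmallen 1.0.0: "Old Release"
###### 2020-01-01
 * Existing entry.
EOF
cat > "$tmp/HISTORY.md" <<'EOF'
### ensmallen 1.1.0: "New Release"
###### 2020-02-02
 * First new entry.
 * Second new entry.

### ensmallen 1.0.0: "Old Release"
###### 2020-01-01
 * Existing entry.
EOF

# Keep only lines that differ between the files, then:
#   - drop blank lines and "#" headings,
#   - turn the first "*" on each line into "-",
#   - indent each line by one more space.
news_update=$(diff --unchanged-group-format="" "$tmp/HISTORY.md" "$tmp/HISTORYold.md" \
  | sed '/^$/d' | sed '/^#/d' | sed 's/\*/-/' | sed 's/^/ /')

printf '%s\n' "$news_update"
```

diff exits with status 1 when the files differ. That is harmless here because the exit status of the pipeline is taken from the last sed, but it would matter if the script enabled pipefail.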

Creating the Pull Request

Rather than committing changes directly to the main branch, the workflow creates a pull request using the peter-evans/create-pull-request action. This approach preserves the code review step while automating the update process. The PR is created with clear labels, a descriptive title and body, and links to the upstream project and to the action that generated it. The branch name incorporates the version tag so that each update gets its own branch. This final step completes the automation process while maintaining proper GitHub collaboration practices.

- name: Create Pull Request
  uses: peter-evans/create-pull-request@v5
  with:
    commit-message: Upgrade ensmallen to ${{ steps.ensmallen-lib.outputs.release_tag }}
    title: Upgrade ensmallen to ${{ steps.ensmallen-lib.outputs.release_tag }}
    body: |
      Updates [mlpack/ensmallen][1] to ${{ steps.ensmallen-lib.outputs.release_tag }}.

      Auto-generated by [create-pull-request][2]

      [1]: https://github.com/mlpack/ensmallen
      [2]: https://github.com/peter-evans/create-pull-request
    labels: dependencies, automated pr
    branch: ensmallen-lib-updates-${{ steps.ensmallen-lib.outputs.release_tag }}

Fin

GitHub Actions provides a powerful platform for automating package maintenance tasks. This workflow eliminates tedious manual work while ensuring dependencies stay current with minimal effort. The approach can be adapted for other upstream dependencies that follow similar release patterns.