Introduction

In May of 2024, I released an in-depth blog post covering GitHub Actions Cache Poisoning along with a rough proof-of-concept.

Since then, the topic has received quite a bit of buzz. Threat actors are even picking up on the technique, as we saw with the Ultralytics supply chain attack.

GitHub has made some changes to how caching works. Sometime in November, GitHub added a restriction that blocks saving cache entries after the conclusion of a workflow job. This reduces the effectiveness of the “cache stuffing” technique. This technique allowed you to clear reserved cache entries and replace them with poisoned entries using a single Cache JWT.

Now, I am releasing a proof-of-concept tool. It automates the entire process from within a build. It leaves almost no trace. Meet Cacheract.

You can find it at https://github.com/adnaneKhan/cacheract. It is open-source under the MIT License and open for bug reports, contributions and suggestions! I hope it will become a useful tool for Red and Purple Teamers. They can use it to demonstrate the impact of insecure GitHub Actions CI/CD caching configurations on their assessments.

Please understand that, like many offensive security tools, Cacheract can cause damage if used for malicious purposes. Please only use it for ethical security research and educational purposes.

A Note to Readers

This post is quite technical and I do not provide a lot of background on Actions in general. You need a strong understanding of GitHub Actions. You should also understand OSS package repositories and CI/CD vulnerabilities. Without this knowledge, the post will be hard to follow.

I’ve prepared links to good resources that will help you understand:

Background

Cache Native Malware

At the end of my first blog post, I used a very simple example to demonstrate how build caches can compromise the integrity of a build pipeline that would otherwise be SLSA L3 compliant. This issue surfaced in the worst way during the Ultralytics supply chain attack: the maintainers created a new release, but the backdoor remained.

I had a wild idea: “What if there was malware that lived entirely within ephemeral build caches, shifting between pipeline space and caches to maintain itself indefinitely?”

After a few late nights and a weekend of coding, Cacheract was born. I had a lot of help from GitHub Copilot, as I had never written TypeScript before.

From a malware taxonomy perspective, Cacheract is an infostealer with some file over-write features. What makes it unique is the persistence mechanism.

The Name

Cacheract is a play on words of Tesseract. Tesseract is also the name of the Space Stone from the Marvel Cinematic Universe.

The Space Stone, also known as the Tesseract, is one of the six Infinity Stones in the Marvel Cinematic Universe. It allows its user to manipulate space and teleport themselves and others across vast distances, regardless of physical barriers or preventive measures.

Cacheract’s Persistence Mechanism

Cacheract is fairly simple without the custom persistence mechanism. It dumps memory from a process and sends it to a Discord webhook.

Cacheract is designed to run only on GitHub Hosted Linux runners. This is an artificial limitation for this proof of concept. Modifying it to run on macOS and Windows runners would not be too complex. Cacheract does not run on self-hosted runners due to the high variation in the execution environment. Remember, if you are on a non-ephemeral self-hosted runner, there are better tools.

The basis for Cacheract’s persistence is rooted in how GitHub Actions cache hits work.

Cache hits are calculated simply by matching a key and a version. Both are client-controlled strings that are set when a cache is saved.
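To make the matching rule concrete, here is a minimal Python sketch of a cache store that reports a hit only when both strings match. The `CacheStore` class, its method names, and the sample key are hypothetical; the sketch models only the behaviour described above and ignores branch scoping, so it should not be read as GitHub's actual service logic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class CacheStore:
    """Toy model of a cache backend (an assumption, not GitHub's implementation)."""

    # Maps (key, version) -> archive contents.
    entries: Dict[Tuple[str, str], bytes] = field(default_factory=dict)

    def save(self, key: str, version: str, archive: bytes) -> bool:
        """Store an entry; an existing (key, version) pair is never overwritten."""
        if (key, version) in self.entries:
            return False
        self.entries[(key, version)] = archive
        return True

    def lookup(self, key: str, version: str) -> Optional[bytes]:
        """Return the archive only when both client-supplied strings match."""
        return self.entries.get((key, version))


store = CacheStore()
# Whoever saves first with a given (key, version) pair owns that entry.
store.save("npm-deps-abc123", "v1", b"poisoned archive")
store.save("npm-deps-abc123", "v1", b"legitimate archive")  # rejected: already reserved
```

Because nothing in the lookup verifies who produced the archive, the first writer for a given pair decides what every later reader receives.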

Let’s take a closer look at the tar command that extracts a cache entry. ExplainShell breaks it down nicely.

Note the -P flag. It tells tar not to strip the leading slash from file names in the archive. This means you can write (or overwrite) any file on the system that the user has permissions to modify.
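One way to see why this matters is to inspect an archive for member names that begin with a slash, since those are the entries that would be written outside the working directory when absolute names are preserved. The Python sketch below builds a small in-memory archive and lists such members. The file names are made up for illustration, and the sketch assumes an uncompressed tar for simplicity, whereas real cache archives are normally compressed.

```python
import io
import tarfile
from typing import Dict, List


def build_archive(files: Dict[str, bytes]) -> bytes:
    """Build an in-memory tar whose member names are stored exactly as given."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def absolute_members(archive_bytes: bytes) -> List[str]:
    """List member names starting with '/', which tar -P would write as-is."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        return [name for name in archive.getnames() if name.startswith("/")]


# Hypothetical archive: one ordinary dependency file, one absolute path.
sample = build_archive({
    "node_modules/left-pad/index.js": b"module.exports = 1;\n",
    "/tmp/example/outside-workdir.txt": b"written outside the working directory\n",
})
print(absolute_members(sample))
```

A check like this only lists names and never extracts anything, so it is safe to run against an untrusted archive.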

GitHub Actions runners do not run as root. Therefore, we can’t modify files in /usr/bin. However, we can overwrite files that we know will execute as part of the build.

What’s on the runner’s file-system that we know will almost always run? The action.yml files for actions run during the build.

On GitHub Hosted Linux runners, this is /home/runner/work/_actions/actions/checkout/action.yml. The index.js file is at /home/runner/work/_actions/actions/checkout/dist/index.js

The post step runs at the end of the workflow job, and will run regardless of job success or failure. On GitHub hosted runners this is a moot step, because the runner is cleaned up after the run anyway.

What happens if we update the file to make the post step point to dist/config.js and place our own script there using the arbitrary file write primitive?

We get arbitrary code execution on any job that consumes our poisoned cache entry, and we also don’t break anything.
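From a defender's point of view, the same idea can be turned into a simple integrity check: read the action's metadata file and confirm that its post entry point is still the expected one. The Python sketch below does this with a regular expression. The two YAML excerpts are illustrative stand-ins rather than copies of the real file, and treating dist/index.js as the legitimate post entry point is an assumption.

```python
import re
from typing import Optional

# Matches a "post:" line but not "post-if:"; quotes around the value are optional.
POST_LINE = re.compile(r"^\s*post:\s*['\"]?([^'\"\s#]+)", re.MULTILINE)


def post_entrypoint(action_yml: str) -> Optional[str]:
    """Return the script named by the action's post step, if there is one."""
    match = POST_LINE.search(action_yml)
    return match.group(1) if match else None


def post_step_tampered(action_yml: str, expected: str = "dist/index.js") -> bool:
    """True when the post step no longer points at the expected script."""
    return post_entrypoint(action_yml) != expected


# Illustrative excerpts only; not the real contents of the checkout action.
ORIGINAL = """runs:
  using: node20
  main: dist/index.js
  post: dist/index.js
"""
MODIFIED = ORIGINAL.replace("post: dist/index.js", "post: dist/config.js")

print(post_step_tampered(ORIGINAL), post_step_tampered(MODIFIED))
```

A regular expression is used instead of a YAML parser only to keep the sketch dependency-free; a real check would parse the file properly and also compare file hashes.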

If you’re wondering what that looks like, here it is:

Cacheract is running in those three seconds.

Automated Cache Poisoning

In my first blog post, I released a simple Python script. It sets cache entries using the Cache URL and Actions Runtime Token. Using it was clunky, to say the least, and it required some manual setup to prepare the poisoned cache entry. After the recent changes to invalidate the token for cache writes, it would require:

Stealing and exfiltrating the tokens.
Adding a sleep for 15-20 minutes to keep the job alive.
Using the actions:write permission to delete cache entries (if you have it).
Using the tokens to set the desired poisoned entries.
Hoping that the blue team doesn’t catch you.

That restriction on using the token after the job ends is inconvenient. It really limits the technique’s use as part of a covert red team engagement!

Cacheract does everything from within the build that deployed it.

Initial Execution

Cacheract first assesses where it is. It checks whether it is running on Linux and exits if it is not. Then, it checks environment variables to determine if it is running on a GitHub Hosted runner.
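A rough Python sketch of this kind of environment gate is shown below. The post does not say which variables Cacheract reads, so the choice of GITHUB_ACTIONS and RUNNER_ENVIRONMENT (both default variables that GitHub sets on runners) is an assumption, and the `should_run` helper is a hypothetical name.

```python
import os
import sys
from typing import Mapping


def should_run(platform: str, env: Mapping[str, str]) -> bool:
    """Return True only on a Linux, GitHub-hosted Actions runner."""
    # Step 1: bail out on anything that is not Linux.
    if not platform.startswith("linux"):
        return False
    # Step 2: use runner-provided variables to tell hosted from self-hosted.
    # Assumption: these two variables are what gets checked.
    return (
        env.get("GITHUB_ACTIONS") == "true"
        and env.get("RUNNER_ENVIRONMENT") == "github-hosted"
    )


# Outside of a GitHub Hosted Linux runner this prints False.
print(should_run(sys.platform, os.environ))
```

Passing the platform and environment in as arguments keeps the check easy to exercise with fake values.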

Next, it checks if sudo is enabled and decodes a small Python script. That script reads the Runner.Worker process memory and extracts the GITHUB_TOKEN, actions runtime token, cache URL, and all other pipeline secrets. It packages them up and sends them to an (optional) Discord webhook.

Cacheract Features

This is where the magic happens. Cacheract’s aim is now to find a way to persist. If Cacheract runs in a default branch, then it will programmatically try to poison other cache entries with itself. It has a few techniques to do this:

Cache Entry Replacement

If the workflow has a GitHub token with actions: write, then Cacheract will replace all entries within the main branch with poisoned entries that contain the Cacheract payload in addition to the original cache contents. This is possible because a GITHUB_TOKEN with actions: write has permission to delete cache entries.

Cache Entry Prediction

Cacheract will also look for cache entries in feature branches that do not exist in main. This is common for repositories that use caching and have Dependabot enabled.

Inactive Cache Prediction

Cacheract will perform static analysis on all workflows within the repository to determine whether other workflows consume caches. If they do, then Cacheract will attempt to calculate cache entries associated with a set of “Cache Aware” actions. Cacheract currently only does this for actions/setup-node, but support for more actions is planned.
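The same kind of scan is useful to defenders who want to know which workflows restore caches. The Python sketch below is a deliberately simple, text-based approximation: it flags a workflow that uses actions/cache (or actions/cache/restore), or that uses actions/setup-node together with a cache input. The regular expressions and the sample workflows are assumptions made for illustration and do not reflect how Cacheract itself performs its analysis.

```python
import re

# Steps that restore a cache directly (actions/cache/save is write-only, so it is excluded).
CACHE_ACTION = re.compile(r"uses:\s*actions/cache(?:/restore)?@", re.IGNORECASE)
# setup-node only touches the cache when its "cache" input names a package manager.
SETUP_NODE = re.compile(r"uses:\s*actions/setup-node@", re.IGNORECASE)
CACHE_INPUT = re.compile(r"^\s*cache:\s*['\"]?(npm|yarn|pnpm)['\"]?\s*$", re.MULTILINE)


def consumes_cache(workflow_text: str) -> bool:
    """Heuristically decide whether a workflow file restores a cache."""
    if CACHE_ACTION.search(workflow_text):
        return True
    return bool(SETUP_NODE.search(workflow_text) and CACHE_INPUT.search(workflow_text))


# Hypothetical workflow snippets used only to exercise the function.
RELEASE_WORKFLOW = """
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
"""
LINT_WORKFLOW = """
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run lint
"""

print(consumes_cache(RELEASE_WORKFLOW), consumes_cache(LINT_WORKFLOW))
```

A text match like this can miss reusable workflows and composite actions, so it is a starting point for an audit rather than a complete inventory.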

Custom Cache Key Configuration

Cacheract also allows the operator to configure cache keys and values. This is useful for custom cache entries that you want to poison. An example would be poisoning a specific restore key used with the actions/cache reusable action in a workflow.
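To illustrate why a restore key is an attractive target, the Python sketch below models restore-key resolution in a simplified form: an exact key match wins, otherwise each restore key is treated as a prefix and the most recently saved matching entry is returned. The function name and sample keys are hypothetical, and the model leaves out branch scoping, versions, and partial matching on the primary key.

```python
from typing import Optional, Sequence


def resolve(
    saved_keys: Sequence[str],
    key: str,
    restore_keys: Sequence[str] = (),
) -> Optional[str]:
    """Pick which saved entry a restore would use.

    saved_keys is ordered from oldest to newest.
    """
    # An exact match on the primary key always wins.
    if key in saved_keys:
        return key
    # Otherwise try each restore key in order, as a prefix.
    for prefix in restore_keys:
        matches = [saved for saved in saved_keys if saved.startswith(prefix)]
        if matches:
            return matches[-1]  # newest entry sharing the prefix
    return None


saved = ["deps-linux-aaa", "deps-linux-bbb"]
# No exact hit for "deps-linux-ccc", so the newest "deps-linux-" entry is used.
print(resolve(saved, "deps-linux-ccc", ["deps-linux-"]))
```

Under this model, any entry that shares the prefix and was saved most recently is what the job restores, which is why a predictable restore key is worth configuring explicitly.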

Custom File Replacements

Cacheract also supports a feature called “Replacements”. When Cacheract packs itself into a poisoned cache entry, the operator can direct it to pack additional files. This is how Cacheract can be more than an information stealer.

Cacheract supports hard-coded replacements. These are useful for small files. It also supports external replacements, where it will download a file from a URL.

An example scenario is replacing a source code file. If a later workflow uses the cache, the infected cache entry can overwrite files used in that subsequent workflow. It might replace certain source files with ones that have a backdoor.

Replacements fire on ALL cache hits to the poisoned cache, even on those that Cacheract will not run on such as non-GitHub Hosted runners. Cacheract can set a replacement for a script that runs on a self-hosted runner. This setup would allow the operator to establish persistence on it. The operator can then use that access for further pipeline privilege escalation.

Cache Hit Persistence

If another job (or the same one) has a cache hit containing Cacheract, then Cacheract will run. This occurs during the `post checkout` step. Cacheract suppresses all output, so it is very silent. Who looks at the step output for post-checkout? No one.

At this point, the process starts over again: Cacheract extracts secrets, looks for entries, poisons them, and the job ends. This can allow Cacheract to persist for months. As long as Cacheract’s resident cache is warmed at least every 7 days and not evicted, Cacheract will live on.

Putting it all Together

I created an example repository with an end-to-end scenario showing how Cacheract works. First and foremost, Cacheract is a post-exploitation payload. To exploit a repository using Cacheract, the repository must have a workflow misconfiguration that allows untrusted code execution in the default branch. That execution can come from a malicious insider or a Pwn Request.

The example repository contains a simple misconfiguration. It is quite common. In this scenario, a workflow checks out and runs code. At the same time, it restricts the GITHUB_TOKEN permissions to none. The workflow also consumes a cache itself. However, this doesn’t matter. The cache write token is available to all workflows. Because the GITHUB_TOKEN is set to read only, Cacheract will not be able to clear reserved cache entries.

In this scenario, an attacker can modify the package.json in their PR to contain a curl | bash payload. I pointed it to a script that contains a second stage to save Cacheract to a file and run it. Prior to creating the malicious pull request, the cache entries looked like this:

Notice the entry for refs/pull/5/merge?

That is a cache entry associated with a Dependabot PR that bumped the version of a dependency. This meant it had a different cache key. Cache keys for the setup-node action are derived from the SHA256 hash of the package-lock.json file. The important thing is that the entry is NOT in main. However, we know it will be in main once the maintainer merges the pull request.
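The reason the key is predictable can be sketched in a few lines of Python: if the key is a fixed prefix plus a hash of the lock file, anyone who can read the lock file in a pull request can compute the key before the pull request is merged. The prefix used here is hypothetical, and hashing the raw file bytes once is a simplification; the real action's exact key format and hashing details may differ.

```python
import hashlib


def predicted_cache_key(lockfile: bytes, prefix: str = "node-cache-linux-npm") -> str:
    """Derive a cache key from lock file contents (simplified, assumed format)."""
    digest = hashlib.sha256(lockfile).hexdigest()
    return f"{prefix}-{digest}"


# Made-up lock file contents: main pins one version, the Dependabot PR bumps it.
main_lock = b'{"packages": {"left-pad": {"version": "1.2.0"}}}'
pr_lock = b'{"packages": {"left-pad": {"version": "1.3.0"}}}'

main_key = predicted_cache_key(main_lock)
pr_key = predicted_cache_key(pr_lock)

# A one-line dependency bump is enough to produce a key that main has never saved.
print(main_key != pr_key)
```

Since the lock file in an open pull request is public, the key that main will request after the merge can be worked out ahead of time.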

I used a test attacker account to create a fork pull request. Seconds after creating the pull request I received a Discord web-hook containing the tokens along with workflow metadata.

After Cacheract runs we have the following entries. Note the second entry from the top. It matches the key of the pull request cache entry. Cacheract was able to set it because the entry did not already exist in main. It also opportunistically set another entry based on the state of package-lock.json at the end of the implantation workflow, because the PR head had a different hash.

Now, I merge the pull request. The cache entry now matches the one I “pre-poisoned”. Cacheract runs and I receive a ping with the secrets. In this example I do not get anything extra, but it shows how Cacheract is embedded into the repository’s caches.

Now, what happens if I create a release? In this repository I configured releases to run on creation of a release tag. I get two pings. The first one is for the initial push workflow (which I am not showing). Then, I get a second ping with the NPM Token. What? How did that happen?

Let’s take a look at the release job. It runs on tags matching release- and even has a deployment environment and NPM provenance. The flaw here is cache: 'npm'. Cacheract deploys

When I compiled Cacheract for the demo, I included a replacement for the package.json file. This simply added a pre-install step to write out a PROOF_OF_CONCENT_HACKED.txt file.

If we look at the release - the file is present.

If we look at the sigstore entry, then it points to https://github.com/AdnaneKhan/CacheractDemo/commit/3283eba47d84f454e9f6534ed3c788ab4b5afc33, which has no such file.

In my example I used a very obvious pre-install script. It does show up in the logs during the publish step. However, a real attacker would modify the package’s source itself, introducing a subtle change that only fires under specific circumstances. For the NPM example, imagine an attack similar to the recent compromise of the Solana Web3.js library, where the package quietly exfiltrated private keys.

Conclusion

GitHub Actions cache poisoning can lead to very interesting attack paths. Cacheract simply reduces the complexity and allows offensive security practitioners to demonstrate impact without deep expertise in GitHub Actions itself.

For defenders, software maintainers should use Actions caching to speed up builds for CI and integration tests. However, they should not consume the cache in release builds. In release builds, integrity is more important than saving a few minutes of build time.
