Artificial Intelligence & Machine Learning
,
Next-Generation Technologies & Secure Development
Thousands of AI Workloads Compromised Amid CVE Vulnerability Dispute
An active attack campaign dubbed ShadowRay is targeting the widely used Ray open-source artificial intelligence scaling framework. It stems from a vulnerability that researchers say is a flaw but that Ray’s developers say is a deliberate design choice.
See Also: Generative AI Survey Result Analysis: Google Cloud
Researchers from Oligo said Tuesday they found “thousands” of publicly exposed Ray servers compromised by hackers in a campaign they dubbed ShadowRay.
“When attackers get their hands on a Ray production cluster, it is a jackpot,” the researchers said. The servers contain valuable company data and by their nature are open to remote code execution.
The campaign marks the advent of an unwanted milestone of AI model development, Washington, D.C., think tank the R Street Institute said Friday “Not only does Oligo’s discovery show how security vulnerabilities can be exploited within AI workloads, but it also highlights how threat actors are actively looking for ways to exploit AI.”
The Oligo researchers said they found exposed sensitive data from accounts apparently belonging to OpenAI, Stripe and Slack – including password hashes and credentials. Hackers have also downloaded cryptomining software and in at least one case, installed a reverse shell in order to maintain persistence.
The researchers said hackers are exploiting a flaw tracked as CVE-2023-48022, disclosed by a cluster of researchers in November. According to them, the vulnerability is missing authentication for a critical function – in this case, an application programming interface. “In the default configuration, Ray does not enforce authentication,” said attack surface management firm BishopFox in November. “Attackers may freely submit jobs, delete existing jobs, retrieve sensitive information, and exploit the other vulnerabilities described in this advisory.”
Anyscale in November responded by saying Ray is designed to facilitate remote code execution, calling it the “core value proposition” of the framework.
Rather than being concerned about the lack of API authentication, Ray users should follow best practices to enforce isolation and security outside of a Ray cluster. “Ray expects to run in a safe network environment and to act upon trusted code,” the developer said.
Anyscale disputed the vulnerability – leading Olgio to characterize CVE-2023-48022 as a shadow vulnerability, “a CVE that doesn’t show up in static scans but can still lead to breaches and significant losses.”
In a Wednesday update, Anyscale released a client-side script and server-side code to check for unwanted internet exposure and again called the campaign the result of “potentially misconfigured Ray open-source clusters.”
Here’s a breakdown of how the vulnerability is exploited:
- Access to the Jobs API: Ray’s Jobs API lacks proper authorization mechanisms. Attackers exploit this by gaining access to the dashboard network, typically through HTTP port 8265.
- Invocation of arbitrary jobs: Once they gain access, attackers can invoke arbitrary jobs on the remote host without needing proper authentication. This means they can execute any code or commands they desire within the Ray environment.
- Remote code execution: By leveraging the lack of authorization in the Jobs API, attackers can achieve remote code execution. This allows them to execute commands and code remotely on compromised servers, essentially gaining full control over the system.
- Exploitation via default configuration: The default configuration of Ray’s dashboard binds to all network interfaces – 0.0.0.0 – and may have port forwarding enabled, potentially exposing the dashboard to the internet by default. This increases the attack surface, making it easier for attackers to find and exploit vulnerable instances.
- Persistent access and data leakage: Attackers can maintain persistent access to compromised servers, allowing them to extract sensitive data, modify databases, steal credentials and access other resources within the AI infrastructure. This can lead to significant data breaches and financial losses for affected organizations.