Skip to content
AI-Daily-Builder

2026-05-11 views

Hugging Face malware supply chain: typo-squat hits #1 trending — 244K downloads in 18h before takedown

Open-OSS/privacy-filter typo-squat hit #1 trending on Hugging Face May 11 — 244K downloads in under 18h. Windows infostealer via loader.py. Pin revision SHAs.

A malicious Hugging Face repository named Open-OSS/privacy-filter — a typo-squat of legitimate privacy-tooling packages — reached #1 trending on the platform within hours of upload on May 11, 2026, accumulating 244,000+ downloads and 667 likes in under 18 hours before Hugging Face’s safety team took it down.

The payload

The repo ships a Windows infostealer disguised as a privacy-utility model loader. The malicious code lives in loader.py, which from_pretrained() and similar helpers execute by default when a model is loaded with trust_remote_code=True (or in many cases, even without — depending on which loader API and which client library version is used).

The infostealer targets:

Exfiltration was via a Telegram-bot webhook, which JFrog’s researchers traced to a known threat-actor cluster previously seen in npm supply-chain attacks.

Why it ranked so fast

The repo name landed in the searchable trending list because it sat at the intersection of two real demand signals: “privacy” tooling (high search volume after recent EU AI Act news) and “filter” packages (used in many RAG pipelines). The actor seeded it with bot-driven likes early to push it onto the trending homepage, where organic discovery did the rest.

The 244K-download number is the more alarming half — it tells you the typo-squat tactic crosses to the AI supply chain just as effectively as npm and PyPI. The MCP supply chain paper from May 8 quantified MCP exposure; this incident makes the same point for HuggingFace.

What Hugging Face shipped in response

The safety advisory promises:

These are reactive measures. The structural issue — that loading a model can execute arbitrary code — remains.

Practitioner note

Three things to do this week, in priority order:

1. Pin every from_pretrained() call to a specific revision SHA. Don’t use revision="main" in any production code. Use revision="<full-40-char-sha>". This is the strongest mitigation — even if the upstream repo gets compromised, your code keeps loading the version you audited.

2. Audit your dependency graph for any trust_remote_code=True usage. Most of the time this flag is unnecessary; you can use AutoModel.from_pretrained(...) without it for vanilla architectures. The flag is only needed for models with custom modeling_*.py files.

3. Run pip list | grep huggingface and update transformers / huggingface_hub to the latest. Hugging Face’s new revision-pinning warnings only appear on recent client versions.

If you ship a product that uses HF models on customer Windows machines (desktop apps, internal tools), assume infostealer exposure unless you’ve verified the model source-by-source. This event will likely repeat — the typo-squat tactic now works on AI registries.


ソース

タグ

チップ