Remote Code Execution via Malicious Model
A vulnerability in huggingface/transformers version 4.33.1 allows remote code execution when TransfoXLTokenizer loads a malicious vocab.pkl file. The issue arises because the file is deserialized with `pickle.load`, with no restriction on what the pickle may import or call. It was patched in version 4.36.
Available publicly on Dec 20 2023 | Available with Premium on Dec 20 2023
Threat Overview
The vulnerability exploits the `pickle.load` function that TransfoXLTokenizer uses to load a vocab.pkl file from a remote repository. An attacker can publish a malicious vocab.pkl file that executes arbitrary code when it is loaded. Hugging Face's pickle scanning mechanism, which is designed to flag unsafe files, can be bypassed by splitting the attack across two repositories: the first repository appears benign to the scanner, while the second contains the actual malicious payload.
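The reason an unrestricted `pickle.load` is dangerous can be shown with a harmless sketch. A pickle stream can name any importable callable and its arguments, and the unpickler calls it during loading. The `Demo` class and the use of `operator.add` below are illustrative assumptions chosen to be benign; they are not taken from the reported exploit.

```python
import operator
import pickle


class Demo:
    """Hypothetical class, used only to produce a pickle that calls a function."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call operator.add(40, 2)".
        # The author of the file picks the callable and its arguments.
        return (operator.add, (40, 2))


blob = pickle.dumps(Demo())

# Loading does not rebuild a Demo instance. It calls the function named in
# the stream and returns whatever that function returns.
result = pickle.loads(blob)
print(type(result).__name__, result)  # int 42
```

An attacker would substitute a callable with side effects for the harmless addition, which is why loading a pickle from an untrusted source is equivalent to running code from that source.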
Attack Scenario
An attacker creates two repositories, A and B. Repository A contains a seemingly benign vocab.pkl file that, when loaded, triggers the loading of a vocab.pkl file from repository B. Repository B contains the actual malicious payload. When a victim loads a TransfoXLTokenizer from repository A, deserializing its vocab.pkl pulls in and executes the malicious code from repository B, leading to remote code execution on the victim's machine.
Who is affected
Users who load pretrained TransfoXLTokenizer models from untrusted or compromised repositories are vulnerable to this attack. This includes researchers, developers, and any individuals or organizations utilizing the huggingface/transformers library for natural language processing tasks.
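For code that must read pickle data from a source it does not fully trust, one general hardening is to subclass `pickle.Unpickler` and override `find_class`, following the "restricting globals" pattern in the Python documentation. The sketch below is an illustration only, not the fix shipped in transformers 4.36. The names `NoGlobalsUnpickler`, `safe_loads` and `Demo` are hypothetical, and it assumes the data consists solely of plain built-in containers and scalars, which may not hold for a real vocab.pkl.

```python
import io
import operator
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that refuses every import requested by the stream."""

    def find_class(self, module, name):
        # Called whenever the pickle asks for a global (a class or function).
        # Rejecting all of them leaves only built-in containers and scalars.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes):
    """Deserialize bytes while blocking all globals (illustrative helper)."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


# Plain data round-trips as usual.
vocab = safe_loads(pickle.dumps({"hello": 0, "world": 1}))


class Demo:
    """Benign stand-in for a pickle that tries to call a function on load."""

    def __reduce__(self):
        return (operator.add, (40, 2))


# A stream that asks for a callable is rejected before anything is called.
blocked = False
try:
    safe_loads(pickle.dumps(Demo()))
except pickle.UnpicklingError:
    blocked = True
```

Blocking every global is the strictest choice. An allowlist of specific, known-safe classes is the usual relaxation, though avoiding pickle for untrusted input altogether remains the safer design.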
Technical Report