The first practical proof of concept (PoC) confirms that Rowhammer-style attacks can target GPU memory. NVIDIA, meet GPUHammer.

A study from the University of Toronto has revealed that modern graphics processing units (GPUs) are susceptible to Rowhammer-style memory attacks, challenging long-held assumptions about the security of GPU memory. The attack, named GPUHammer, represents the first practical demonstration of a Rowhammer exploit targeting high-performance, discrete GPUs, specifically those utilizing GDDR6 memory.

Understanding Rowhammer

Rowhammer is a well-documented hardware vulnerability that affects dynamic random-access memory (DRAM). By rapidly and repeatedly accessing (“hammering”) a specific row in memory, attackers can induce electrical interference that causes bit flips in adjacent rows. Historically, Rowhammer attacks have been leveraged against CPUs to bypass memory isolation, escalate privileges, and corrupt data.
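To make the mechanism concrete, here is a deliberately simplified toy model in Python. It is not a model of any real DRAM device: the class name `ToyDram`, the one-byte rows, and the `flip_threshold` value are all hypothetical, chosen only to show the idea that many activations of one row between refreshes can disturb its neighbors.

```python
class ToyDram:
    """Toy DRAM bank: hammering one row flips a bit in each adjacent row."""

    def __init__(self, num_rows: int, flip_threshold: int):
        self.rows = [0xFF] * num_rows         # one byte of data per row
        self.activations = [0] * num_rows     # activations since the last refresh
        self.flip_threshold = flip_threshold  # hypothetical, not a real device value

    def activate(self, row: int) -> None:
        """Open a row; at the threshold, its neighbors each lose their lowest bit."""
        self.activations[row] += 1
        if self.activations[row] == self.flip_threshold:
            for victim in (row - 1, row + 1):
                if 0 <= victim < len(self.rows):
                    self.rows[victim] ^= 0x01

    def refresh(self) -> None:
        """Periodic refresh restores charge, so the activation counters reset."""
        self.activations = [0] * len(self.rows)


dram = ToyDram(num_rows=4, flip_threshold=1000)
for _ in range(1000):
    dram.activate(1)  # hammer row 1 without an intervening refresh
print([hex(value) for value in dram.rows])
```

In this toy, the hammered row keeps its data while rows 0 and 2 each lose a bit; calling `refresh()` often enough keeps every counter below the threshold and no flip occurs.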

The GPUHammer Breakthrough

The University of Toronto researchers extended the Rowhammer threat model to GPUs, focusing their experiments on the NVIDIA A6000, a high-end GPU widely used in AI and high-performance computing. By carefully orchestrating memory access patterns directly from the GPU, the team successfully induced bit flips in the GPU’s GDDR6 memory.

The implications are significant. In their proof of concept, the researchers demonstrated that even a single bit flip could catastrophically degrade the accuracy of a machine learning model. For instance, the accuracy of a deep neural network trained on the ImageNet dataset dropped from 80% to a mere 0.1% following a targeted GPUHammer attack.
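The article does not say which bit was flipped or how the model's weights were stored. As an illustration only, the Python sketch below assumes a weight held in IEEE 754 half precision (float16) and shows why one flipped bit can matter so much: flipping the top exponent bit changes the value by orders of magnitude, while flipping the lowest mantissa bit barely changes it. The helper name `flip_bit_fp16` is hypothetical.

```python
import struct


def flip_bit_fp16(value: float, bit: int) -> float:
    """Round `value` to float16, flip one bit of its 16-bit encoding, and decode it."""
    if not 0 <= bit < 16:
        raise ValueError("bit must be in the range 0..15")
    (raw,) = struct.unpack("<H", struct.pack("<e", value))  # float16 -> raw bits
    raw ^= 1 << bit                                         # flip the chosen bit
    (flipped,) = struct.unpack("<e", struct.pack("<H", raw))
    return flipped


weight = 0.5
print(flip_bit_fp16(weight, 0))   # lowest mantissa bit: a tiny change
print(flip_bit_fp16(weight, 14))  # top exponent bit: 0.5 becomes 32768.0
```

A single weight that jumps from 0.5 to 32768.0 can dominate every activation it feeds into, which is one plausible way a lone bit flip could wreck a model's accuracy.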

Industry Response and Mitigation

NVIDIA has acknowledged the vulnerability, recommending that users enable error-correcting code (ECC) memory protections to mitigate the risk. While ECC can effectively mitigate bit flips caused by GPUHammer, it comes with trade-offs: it reduces available memory and can slightly impact performance. It is nonetheless considered essential for environments where data integrity is critical, such as AI workloads and scientific computing.
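The article does not describe the ECC scheme NVIDIA GPUs use. To show the general idea of how an error-correcting code repairs a flipped bit, here is a minimal Python sketch of the textbook Hamming(7,4) code, which stores 3 parity bits alongside every 4 data bits. The extra parity bits also illustrate why ECC costs capacity. The function names are hypothetical, and this is not a description of any GPU's actual implementation.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit; return (data_bits, error_position)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 1-based; 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1         # undo the flip
    return [c[2], c[4], c[5], c[6]], error_position


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1                             # simulate a single bit flip in memory
data, position = hamming74_decode(stored)
print(data, position)
```

This simple code repairs one flipped bit per codeword; two flips in the same codeword would be miscorrected, which is why practical ECC memory typically adds further parity to at least detect double errors.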

The researchers note that, unlike CPU DRAM modules, GPU memory is typically soldered onto the board, complicating large-scale testing across different GPU models. Nevertheless, their findings suggest that the vulnerability may extend to other GPU architectures and manufacturers.

NVIDIA notes that its latest GPUs—such as the Blackwell RTX 50 series (GeForce), Blackwell GB200/B200/B100, and Hopper H100/H200/H20/GH200—feature built-in, chip-level ECC protection that does not require user intervention.
