
Hacking an NVIDIA CMP 170HX Crypto GPU for EM Sim Work

11 September 2024 at 23:00

A few years back NVIDIA created a dedicated cryptocurrency mining GPU, the CMP 170HX. This was a heavily restricted version of its flagship A100 datacenter accelerator, using the same GA100 chip. It was intended for accelerating Ethash, the Ethereum proof-of-work algorithm, and nothing else. [niconiconi] bought one to accelerate PCB electromagnetic simulations and put a lot of effort into repairing the card, converting it to water-cooling, and figuring out how best to use this nobbled GPU.

Typically, the GA100 silicon sits in the center of the mighty A100 GPU card and would be found in a server rack, cooled by forced air. This was not an option at home, so an off-the-shelf water-cooling block was wedged in. During this process, [niconiconi] found that the board wouldn’t power on, so they went on a deep dive into the power supply tree with the help of a leaked A100 schematic. The repair and modifications are covered in the appendix at the very end of the article. It is a long read to get there.

This NVIDIA GA100 GPU is severely crippled on this board

NVIDIA has a history of deliberately restricting silicon in consumers’ hands to justify the hefty price tags of its offerings to big businesses, and this board is no different. The plan was to restrict the peak performance of the board to only applications with the same compute requirements as Ethash, specifically memory-intensive algorithms. The FP64 performance was severely limited, but the instructions were not removed. FP64 code would therefore still run, just far more slowly than the GPU is capable of.

The memory was limited to 8 GB, despite some A100 cards hosting a whopping 80 GB. NVIDIA also used fuses to throttle key instructions, particularly the FP32 FMA and MAD instructions, which perform multiply-add operations and are crucial for general computing applications. Finally, the PCIe bus was nobbled to run only as a Gen 1 interface with a single lane. The lane count was reduced by leaving the coupling capacitors off the PCB, which means they can simply be added later, but the interface is still slow.
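
To put that PCIe restriction in numbers, here is a rough back-of-the-envelope sketch. It relies only on the published PCIe 1.0 figures (2.5 GT/s per lane, 8b/10b encoding) and ignores packet and protocol overhead, so real transfers will be slower still. The function names and the lane counts other than one are purely illustrative assumptions, not figures from [niconiconi]'s write-up.

```python
# Theoretical PCIe Gen 1 payload bandwidth, ignoring protocol overhead.
# PCIe 1.0 signals at 2.5 GT/s per lane with 8b/10b encoding, so only
# 8 of every 10 bits on the wire carry payload.
GEN1_TRANSFERS_PER_SECOND = 2_500_000_000  # per lane, one bit per transfer


def pcie_gen1_bandwidth(lanes: int) -> int:
    """Return the one-direction payload bandwidth in bytes per second."""
    payload_bits_per_lane = GEN1_TRANSFERS_PER_SECOND * 8 // 10
    return payload_bits_per_lane // 8 * lanes


def transfer_seconds(num_bytes: int, lanes: int) -> float:
    """Return the best-case time to move num_bytes across the link."""
    return num_bytes / pcie_gen1_bandwidth(lanes)


if __name__ == "__main__":
    card_memory = 8_000_000_000  # the card's 8 GB, taken as decimal gigabytes
    # Lane counts above 1 are hypothetical, for comparison only.
    for lanes in (1, 4, 16):
        mb_per_s = pcie_gen1_bandwidth(lanes) / 1_000_000
        seconds = transfer_seconds(card_memory, lanes)
        print(f"Gen 1 x{lanes}: {mb_per_s:.0f} MB/s, "
              f"{seconds:.1f} s to fill {card_memory / 1e9:.0f} GB")
```

At a single lane, the best case works out to 250 MB/s, so filling the card's memory once takes at least half a minute. That is tolerable for a workload that uploads its data once and then computes on the card, and painful for anything that streams data back and forth.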

[niconiconi] went into great detail benchmarking the instruction types, keeping their EM simulation application in mind. After a few tweaks to make it work, they determined it was a good purchase. This article is worth reading for all those hardcore GPU nerds!

If you need a primer on GPU mining, we’ve got you covered. Once you’ve understood proof-of-work crypto, perhaps take a look at Chia?

Thanks to [gnif] for the tip!
