the core enabling technology for large-scale machine learning systems as they're currently designed is computer processors which can do matrix multiplication quickly. that's it.
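to make that concrete, here's a minimal sketch in numpy (the shapes and names are arbitrary, not taken from any particular model): a neural-network layer's forward pass is one big matrix multiply plus some cheap bookkeeping, and a large model is basically hundreds of these stacked on top of each other.

```python
import numpy as np

# one neural-network layer: multiply the inputs by a weight matrix, add a bias,
# apply a cheap elementwise nonlinearity. the matrix multiply dominates the cost.
rng = np.random.default_rng(0)

batch = rng.standard_normal((32, 1024))      # 32 input vectors, 1024 features each
weights = rng.standard_normal((1024, 4096))  # this layer's learned parameters
bias = rng.standard_normal(4096)

def layer(x, w, b):
    return np.maximum(x @ w + b, 0.0)        # matmul + bias + ReLU

out = layer(batch, weights, bias)            # shape (32, 4096)
```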
GPUs make a useful computational substrate for machine learning solely because, before machine learning, the closest thing we had to a task that needed to be accomplished at the same scale was the matrix transforms required to produce real-time 3D graphics. the combination of rising polygon counts and the desire for more sophisticated visual effects led engineers in the late '90s and early 2000s to redesign graphics cards from a so-called fixed-function architecture which just did computer graphics into an architecture which could execute general matrix arithmetic at a staggering degree of parallelism; if you've heard of "shaders", those are small computer programs which execute on one of the up to sixteen thousand individual processor cores on a modern graphics card.1
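here's a rough sketch of why that parallelism maps so well onto matrix arithmetic: every element of the output matrix is an independent dot product, so each one can be handed to its own core. this is plain python standing in for thousands of simultaneous shader invocations -- the function names are illustrative, not any real GPU API.

```python
import numpy as np

def one_core_computes(a, b, i, j):
    # the work a single GPU core (one shader invocation) would do:
    # the dot product that produces output element (i, j)
    return sum(a[i, k] * b[k, j] for k in range(a.shape[1]))

def parallel_matmul(a, b):
    m, n = a.shape[0], b.shape[1]
    c = np.zeros((m, n))
    # on real hardware, every (i, j) pair below runs at the same time
    for i in range(m):
        for j in range(n):
            c[i, j] = one_core_computes(a, b, i, j)
    return c

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
assert np.allclose(parallel_matmul(a, b), a @ b)
```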
however, GPUs are quite expensive and power-inefficient for this job -- machine learning applications don't need the high-precision arithmetic of 3D graphics, and the graphics-oriented feature set can be pared back substantially -- and all of the large cloud computing providers are already working to displace GPUs in their AI data centers with purpose-built AI coprocessors (Google's TPUs, Amazon's Trainium and Inferentia, Microsoft's Maia).
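the precision point is easy to demonstrate: run the same matrix multiply at graphics-grade single precision and at the half precision commonly used for ML inference, and the answers differ by an amount a neural network simply doesn't care about, while the narrower numbers take half the memory and much less silicon and power to multiply. (the sizes below are arbitrary; this is an illustration, not a benchmark.)

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((256, 256)).astype(np.float32)
b = rng.standard_normal((256, 256)).astype(np.float32)

full = a @ b                                                             # float32, graphics-grade
half = (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float32)  # float16, ML-grade

rel_error = np.abs(full - half).max() / np.abs(full).max()
print(f"worst-case relative error from float16: {rel_error:.4%}")
```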
and again, this is all just matrix arithmetic. while there are implementation specifics that surely amount to a significant body of trade secrets, there are no secret theoretical techniques involved here; an effective containment regime would be tantamount to a general trade ban on computer processors, and if the US tried that, China would immediately institute a crash development program to make up the deficit. they've already released homemade GPUs which are in the same realm of performance, so this whole thing would win the US maybe 24 months of head start, much of which it would spend building up its industrial base to actually fabricate physical semiconductors domestically instead of relying on the overseas foundries that currently make them.
you would think people who work in practical computing for a living and have access to nearly limitless resources for research and devising industrial strategy would be aware of this, but apparently not.
-
this is also why certain cryptocurrencies used GPUs: the core operation of proof-of-work mining is to run a hash function (effectively a random number generator) over a block of data plus a throwaway nonce and see if the number it spits out is below a threshold -- or, as theophite on twitter famously said, "idling your car 24/7 to produce solved sudokus you could trade for heroin". if you can get that hash function to run on one core of a GPU, you can run 16,000 instances of it in parallel! but similarly, if there's money in it and the cryptocurrency isn't explicitly designed to make this impractical, then once GPUs stop being the dirt-cheap option, people spin up production lines to make application-specific cryptocurrency mining chips, because GPUs are so power-inefficient by comparison.
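for the curious, here's a toy version of that workload (the block data, threshold, and use of double SHA-256 are illustrative stand-ins -- double SHA-256 happens to be what bitcoin uses -- not any specific chain's exact format): hash a fixed blob together with an ever-changing nonce until the result falls below a difficulty threshold. every nonce is independent of every other, which is why the search maps so neatly onto thousands of GPU cores, or onto a purpose-built mining ASIC.

```python
import hashlib
import struct

block_data = b"arbitrary block header bytes"
threshold = 2 ** 240                     # lower threshold = harder puzzle

def mine(data, threshold, max_nonce=10_000_000):
    for nonce in range(max_nonce):
        candidate = data + struct.pack("<Q", nonce)                # data + throwaway nonce
        digest = hashlib.sha256(hashlib.sha256(candidate).digest()).digest()
        if int.from_bytes(digest, "big") < threshold:              # is the number low enough?
            return nonce, digest.hex()
    return None, None                    # no winner in this range; keep grinding

nonce, digest = mine(block_data, threshold)
print(nonce, digest)
```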
