vogon
@vogon

attempting to not get into the habit of sharing bad tweets on here, but this one is emblematic of a common, bizarre, and dangerous misapprehension of what "artificial intelligence" is and what it's good for, and I want to write a bit about it:


first and least interesting, under capitalism this is obviously not a way technology would develop. we know for a fact that intuit, the developer of turbotax, leans on the federal government not to provide free tax preparation software and, where it is currently required to offer free service to the poor, obstructs as much as possible to keep from having to fulfill that requirement. not only does intuit have no real business interest in making filing your taxes easy (easier taxes don't mean you buy More Tax Preparation), it has a business interest in making it hard (it would love to figure out some way to robustly prevent you from going to h&r block.)

secondly, and this is the core of this post, this is not a task that AI is uniquely good at, like, at all.

machine learning is, fundamentally, a process by which you teach a computer a mathematical function: you hand it a set of inputs to that function, each tagged with the output the function is supposed to produce under those inputs, and the computer uses a particular computational architecture to create an approximation to that function. a couple examples:

  • "AI art" technologies are built using extremely vast databases of pictures, each tagged by a human[1] with what they depict. the models are -- generally, and roughly; I'm no professional machine-learning-ologist -- a function from the space of text prompts onto a space of small, square images, and a second function that upscales these small images to bigger ones.
  • AlphaGo consists of two parts. the first part, the "policy network", is a function from go positions onto what next move is Correct to make. the bootstrap version of the policy network was trained on every game that was ever played on the KGS go server; the policy network was then trained further by having it play a bunch of games against itself. the second part, the "value network", is a function from go positions onto how good they are for each player; this is trained from the play history of the policy network.[2]
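the "teach a computer a function by example" framing above can be sketched in a few lines of python. this is a purely hypothetical toy, not any real training pipeline: the "model" is just two numbers instead of billions, and gradient descent nudges them until the model's outputs match the tagged examples.

```python
# toy "training": learn f(x) = 2x + 1 from tagged examples.
# the model is just two parameters (w, b); real models have billions.

examples = [(x, 2 * x + 1) for x in range(10)]  # inputs tagged with correct outputs

w, b = 0.0, 0.0   # the empirically-derived numbers start knowing nothing
lr = 0.01         # learning rate: how hard each example nudges the model

for _ in range(2000):                # many passes over the data
    for x, target in examples:
        pred = w * x + b             # what the model currently thinks
        err = pred - target          # how wrong it is on this example
        w -= lr * err * x            # nudge parameters to shrink the error
        b -= lr * err

print(w, b)  # both land very close to 2 and 1
```

the point of the toy is that nothing in the loop "knows" the rule 2x + 1; the numbers just drift toward whatever makes the tagged examples come out right.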

there are a few properties of a system that make it a promising candidate for a machine learning solution:

  1. the underlying pattern of the system is difficult or impossible to embody in a computer program using other methods. humans have been playing go for literally 3,000 years without coming up with a perfect strategy for it; and learning by example from prior human aesthetic judgments is arguably the only sensible way to have a machine make aesthetic judgments that humans will find comprehensible.
  2. training data is available cheaply. you have to have some external anchor for what the correct output is for a given set of inputs. learning the rules by example requires examples -- whether they be examples of humans performing the same task, or a less-powerful computer model of the rules of a system (whether that's a game like go, or a physical system like a self-driving car on a city street.)
  3. you are willing to accept an approximate solution. systems which are complex enough to meet criterion 1 often have a lot of borderline cases that are difficult or impossible to classify, and generally, the best that machine learning techniques can guarantee is that they'll output the model that makes the fewest possible errors under the architectural priors you gave them at the start of training.
  4. you are willing to accept an unauditable solution. the inputs and outputs of a machine learning model are legible to humans, and its developers know the broad strokes of its architecture; however, the actual meat of the model is a bunch of empirically-derived numbers that the computer chose to minimize classification error.
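criterion 3 can be made concrete with a made-up toy classifier: if the tagged examples themselves contain borderline contradictions (the same input tagged both ways), then no model, however it was trained, can get them all right -- the floor on the error rate is baked into the data.

```python
# borderline cases: the same input shows up with contradictory tags,
# so even the best possible model is forced into some errors.

examples = [(1, "a"), (2, "a"), (3, "a"),
            (3, "b"),                      # 3 is borderline: tagged both ways
            (4, "b"), (5, "b")]

def best_threshold_model(data):
    # exhaustively try every cutoff and keep the one with fewest errors --
    # this is the strongest possible model of this shape, no approximation
    best = None
    for cut in range(0, 7):
        errors = sum(1 for x, tag in data
                     if ("a" if x < cut else "b") != tag)
        if best is None or errors < best[1]:
            best = (cut, errors)
    return best

cut, errors = best_threshold_model(examples)
print(cut, errors)  # even the optimal cutoff gets an example wrong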

tax law fits none of these criteria. the rules of the system are explicitly made and maintained by humans, known and written down in full, and published in an authoritative form by the Government Printing Office. an approximate system making an error is likely to incur a monetary penalty, or to get someone sent to prison; at the roughly 113 million individual returns filed each year, a system that fails one time in a million amounts to 113 errors every year. and when it does fail, the best you can do is give the next training pass another example to learn from, with no guarantee it'll actually do better afterwards -- it might inexplicably screw up another taxpayer's return next year, or even the same one. as for training data: actual taxpayer records are held under strict secrecy by the Internal Revenue Service, so a large training set is hard to come by for anyone other than the federal government or the tax preparation software incumbents -- and the federal government already has a program by which it certifies independent professionals as experts on the tax system, worthy of representing taxpayers.
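the contrast is easy to see in code: because the rules are written down in full, a bracket calculation is a few lines of ordinary, exact, fully auditable arithmetic -- no training data, no approximation. the brackets below are invented for illustration and are not any real tax schedule:

```python
# made-up progressive brackets: (upper bound of bracket, rate).
# a real schedule is published in full, so code like this can be
# checked line-by-line against the authoritative text -- unlike a model.
BRACKETS = [(10_000, 0.10), (40_000, 0.20), (float("inf"), 0.30)]

def tax_owed(income: float) -> float:
    owed, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if income > lower:
            # tax only the slice of income that falls in this bracket
            owed += (min(income, upper) - lower) * rate
        lower = upper
    return owed

print(round(tax_owed(50_000), 2))  # 10000*0.10 + 30000*0.20 + 10000*0.30
```

every number it emits can be traced back to a specific line of the rules, and when the rules change, you change the table -- no retraining pass, no hoping it generalizes.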

it seems like a lot of people, even people who should know better or who have concerns about the current state of the tech industry, have made it to 2022 with a conception of AI that only amounts to "it's when a computer is smart! even smarter than a human!" and that's straight-up just a dangerous thing to believe.


  1. the ethics of AI datasets are fraught in their own way. by way of explanation, generally the process of gathering a dataset goes like this:

    1. a research lab at a university scrapes a bunch of untagged data of questionable quality off the web, under the belief that nobody is going to sue a lil ol research lab over IP rights;
    2. under the auspices of a research project, this dataset is labeled by people on a piecework service like mechanical turk making chump change, since a lil ol research lab can't afford to pay data workers properly;
    3. the research lab is eventually spun out into a non-profit that pays everyone in the lab a healthy wage by selling the dataset to tech companies at a price that renders it impractical for use by anyone who's not using it to make money.
  2. the successor work, AlphaGo Zero, basically proved that you don't need to have a bunch of hand-prepared go game transcripts to start with, as long as you have a machine that tells you when moves are illegal.
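the AlphaGo Zero idea in footnote 2 -- that a legality checker is enough to generate your own training data -- can be sketched with a toy game. this is an entirely hypothetical illustration, nothing like AlphaGo's actual code: random self-play against nothing but a legal-move function yields the (position, outcome) pairs a value network would train on.

```python
import random

# toy self-play: players alternately add 1-3 to a running total,
# and whoever lands exactly on 10 wins. the only domain knowledge
# supplied is which moves are legal -- no human game transcripts.

def legal_moves(total):
    return [m for m in (1, 2, 3) if total + m <= 10]

def self_play(rng):
    total, player, history = 0, 0, []
    while True:
        history.append((total, player))     # position + whose turn it is
        total += rng.choice(legal_moves(total))
        if total == 10:
            return history, player          # the mover who hit 10 wins
        player = 1 - player

rng = random.Random(0)
games = [self_play(rng) for _ in range(1000)]
# each game yields (position, winner) pairs -- exactly the kind of
# tagged data the value network described in the post is trained from
```

the legality function plays the role of criterion 2's "less-powerful computer model of the rules of a system": it can't play well, but it can cheaply manufacture unlimited labeled examples.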



in reply to @vogon's post:

a quick glance at that person's follows shows tons of tech chuds, all people who are fully bought into that bullshit misunderstanding of "ai" despite many of them actually working in the field. i think a lot of them just have the huckster's mindset at this point - even if some of the essential truth of what you wrote runs through their heads when they hear someone asking for something like "ai does your taxes for you" they immediately think of it in terms of product / sales. "oh yeah we can do that. we'll have that within [childishly optimistic number of years]." the ultimate Cause above all else is the product, the product that will ultimately supplant all human decision making.

full disclosure: i fucking hate all these people.

i think ai could alleviate some bullshit job-sectors, like insurance, upper management, speculation on finance markets.

but at the same time i would rather see those sectors decimated by no longer being necessary, the capitalist reason (right of property) for them to exist having simply vanished...

global suddenly-(no resistance, all lives matter)-communism (in the sense that kropotkin would have used the term) is a lovely utopia without the usual dead landlords or dissidents to entrench the new status quo.. in RL you're probably gonna have to give them some new kind of drug though, as an ersatz for the missing "i've got more than the others"-kick...

maybe something like space cash from south park xD ... as long as amassing loads of it doesn't have any negative consequences sigh. maybe fiat money should turn stale when not being moved? idk. i hate the world we're forced to live in, but at least there is potentially a way to change it for the better, we just need to find it

This post is good and true; however, as a random passerby I'd like to point out that the original tweet never suggested it would be more feasible for computers to automate legal or tax work, only that it would be a cooler idea than automating creative work.