A lot of the discussion around the huge automated statistical models that are conventionally called "AI" involves the idea of hypothetical future improvements to the system that will make it concretely useful for things that it's currently not very good at. While undoubtedly it will get better at the sorts of things it can already do, and may even at some point be able to render a hand that doesn't make me want to vomit, there's one important thing that gets glossed over in a lot of conversation because programmers think it's so obvious it doesn't even warrant mentioning and non-programmers may not be aware of it at all.
AI cannot be programmed. AI is not like science-fiction depictions where you can just build three laws into it and have it unerringly follow them, and it's not like conventional software, which rigorously follows a minutely precise set of instructions. You cannot simply make the Bing chatbot more accurate by plugging in a big database of verified facts and rules of deduction, because it fundamentally has no idea what truth is. It is a statistical model of what people are likely to say on the internet, and it's so unimaginably huge that even a team of humans couldn't possibly manually correct it except in the broadest strokes imaginable.
You can't even tell it "this is what a statement of fact looks like" because to tell it anything at all, you need an approximately-internet-sized corpus of training data with annotations that accurately indicate that information, and that doesn't exist. The only internet-sized corpus is the one they've already used, and it certainly doesn't have sentence-level semantic metadata. So you're stuck: you can push the statistics as hard as you want but they'll never really do what you want because you can never tell them what you want in a language they'll understand.
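To make the "statistical model of what people are likely to say" point concrete, here's a toy sketch. It is nowhere near how a real chatbot is built (it just counts which word follows which), and the three-sentence corpus and the function names `train_bigrams` and `most_likely_next` are made up for illustration. The thing to notice is that nothing in it knows or cares which sentence is true; it only knows which one shows up more often.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    following = counts.get(word.lower())
    if not following:
        return None
    return following.most_common(1)[0][0]

# Made-up toy corpus: the false claim simply appears more often.
corpus = [
    "the moon is made of cheese",
    "the moon is made of cheese",
    "the moon is made of rock",
]

model = train_bigrams(corpus)
prediction = most_likely_next(model, "of")
print(prediction)  # cheese
```

The only way to get "rock" out of this is to change what's in the corpus, which is the problem described above, just at a scale where you can actually see it.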
Yep. AI does not Understand what it is doing. It is assigning weights to inputs and outputs and wiggling them until the results match the targets it was given to aim for. Which gets us back to the point that AI is sleight of hand over immense amounts of human labor to tag data, and that tagging data is where a lot of shit goes wrong and introduces bias.
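If "wiggling weights until they match" sounds hand-wavy, here's roughly what it looks like in miniature. This is a sketch under big assumptions: one single weight, plain gradient descent on squared error, and made-up data. Real systems have billions of weights and fancier training, and `fit_weight` is just a name invented for this example. The point is that the loop has no opinion about whether the labels are right; it converges on whatever the tagged data says.

```python
def fit_weight(examples, steps=200, lr=0.01):
    """Nudge a single weight w so that w * x gets close to the label y."""
    w = 0.0
    for _ in range(steps):
        # Average gradient of the squared error (w*x - y)^2 with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        w -= lr * grad  # wiggle the weight a little in the direction that reduces error
    return w

# Made-up data: (input, label) pairs. The "true" relationship is y = 2x.
clean = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
# Same inputs, but whoever tagged them consistently wrote down y = 3x.
mislabeled = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

w_clean = fit_weight(clean)
w_bad = fit_weight(mislabeled)
print(round(w_clean, 3), round(w_bad, 3))  # 2.0 3.0
```

Both runs "succeed" as far as the training loop is concerned. The second one has just faithfully learned the taggers' mistake.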
There are also cases where the AI will infer connections that you absolutely did not intend and that are often spurious, because of random correlation over a large enough data set (like those charts that show a statistically significant correlation between ridiculous things), and God knows your training sets cause problems on their own. My favorite example I've found was a generator for portrait shots of people that had a gender slider, and when you set it to maximum Male, the pictures would have weird, spiraling black extrusions that ended in a bulb around the neck or mouth of the person.
This confused the hell out of me until I tweaked the sliders to make it more legible, at which point I realized that the training of the AI had correlated "is speaking into a standing microphone" with pictures tagged Male. Therefore, the ultimate expression of masculinity was the ability to warp reality and create a microphone so you never had to shut up and everyone had to hear you.
Which, like, if that'd been a purposeful art installation piece I'd have mumbled "alright there Banksy" but the fact it was an emergent result of people not curating their data set entertained me for a whole lot of reasons.
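For the curious, the microphone thing is easy to reproduce on paper. The sketch below uses a completely made-up eight-photo "dataset" (I have no idea what the real generator's training set looked like) and a hypothetical helper, `feature_rate_by_tag`, that just measures how often something appears under each tag. That kind of lopsided co-occurrence is all a model needs to decide that microphones are part of what "Male" means.

```python
from collections import Counter

# Made-up toy data: each photo is (tag, set of things visible in it).
photos = [
    ("male", {"face", "microphone"}),
    ("male", {"face", "microphone"}),
    ("male", {"face", "microphone"}),
    ("male", {"face"}),
    ("female", {"face"}),
    ("female", {"face"}),
    ("female", {"face"}),
    ("female", {"face", "microphone"}),
]

def feature_rate_by_tag(photos, feature):
    """Fraction of photos under each tag that contain `feature`."""
    totals = Counter()
    hits = Counter()
    for tag, contents in photos:
        totals[tag] += 1
        if feature in contents:
            hits[tag] += 1
    return {tag: hits[tag] / totals[tag] for tag in totals}

rates = feature_rate_by_tag(photos, "microphone")
print(rates)  # {'male': 0.75, 'female': 0.25}
```

Nobody tagged anything "microphone" here. The skew rides along with the tag anyway, and checking for it is exactly the kind of curation that apparently didn't happen.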
