For millennia we relied on human curation to organize our knowledge. In the late 1900s it became feasible to collate a meaningful percentage of human knowledge automatically. This was very exciting. Were the results as good? Maybe not, but they were often good enough, and it was fast, and it scaled, and you could do it at home.
For the last decade, the search engine user experience has gotten progressively worse. This is partly due to search engines optimizing for profitable search results rather than helpful ones, but I would posit that the bigger cause is that the majority of new, useful information on the internet is being created behind closed doors, not on the searchable web.
More and more, the information that is available to search engines is created for search engines to find -- companies paying writers pennies to churn out multi-thousand-word essays that will rank high in Google results, but not paying them enough to do the research that would make the information accurate and useful. This phenomenon is getting worse fast thanks to generative language models like ChatGPT (which aren't even capable of producing accurate information except by accident), and I expect that soon the vast majority of text on the internet will be created by bots like that, writing more and more convincing essays full of useless information.
Maybe search engines will figure out how to discard this deluge, but my sense is that it's an arms race they will inevitably lose. For a long time machine curation felt like the future, but now I think the window of pure machine curation is closing. To the extent that search will still be useful, it'll be useful for searching human-whitelisted content.
This is good news for human experts like editors and librarians, who have been treated as obsolete by the tech community for decades. Remember librarians? They're still around, and their work is more valuable than ever.
"I expect that soon the vast majority of text on the internet will be created by bots like that, writing more and more convincing essays full of useless information."
I think we already crossed this boundary before machine learning took over spam generation. Dedicated spammers did a fantastic job filling the internet with noise via copy+paste and fake social media accounts without the aid of machine learning, and companies like Google shrugged in response. ML is the final nail in the coffin of the useful internet, the one where you could actually find real, useful information. And it marks the point where tech companies pivot from pathological indifference toward spam to actively producing it.
