In the past this has been the time of year when I made predictions. Those predictions were somewhat different from those offered by other people: I’m far more interested in being prepared for black swans than in predicting whether some mundane political event has a 90% or a 95% chance of happening. But one of the qualities of black swans is their rarity, so everything I’ve predicted has yet to occur. In fact, for most of the predictions, there hasn’t even been movement over the last year toward making them more or less likely. There is, however, one notable exception: artificial intelligence.
In my very first set of predictions I asserted that:
General artificial intelligence, duplicating the abilities of an average human (or better), will never be developed.
Though I continue to maintain the accuracy of that prediction, I’ve gotten a lot of pushback on it, more so than for any of my other predictions. This pushback has only gotten more intense as the amazing abilities of large language models (LLMs) have become increasingly apparent. You may have heard about these models, particularly the one released just a month ago: ChatGPT.
If you’ve had the chance to play around with ChatGPT, you know it’s pretty freaking amazing. It seems to possess some real intelligence. So am I wrong? And if I’m not wrong, do I at least have to be less certain? Well, I don’t think I’m wrong, yet. But it would be foolish not to update my beliefs based on this new evidence, so I have. Still… I don’t think the evidence is as strong as people think.
We’ve got plenty of evidence for ChatGPT’s ability to produce writing that’s around the average of writing fed into it. But where’s the evidence of it producing far better content than that? Where’s the evidence of genius?
A post from Freddie deBoer sent me down this path. He asked ChatGPT to recreate the “to be or not to be” speech in vernacular African-American English, and the result was profoundly mediocre. This by itself isn’t damning; the technology is still very young. But how does ChatGPT get from mediocrity to brilliance?
There are plans to throw even more writing at it, but unless there’s some cache of superlative writing they’ve been holding back on, won’t more writing just be a way of more deeply entrenching the average?
If more writing samples aren’t the answer, then another possibility is training. If we have users provide feedback on when it’s being brilliant vs. when it’s being mediocre, then, in theory, it will become more brilliant, in the same way that it’s been trained to avoid controversial positions. Unfortunately this sort of reinforcement training doesn’t work well even when the goal is straightforward, and identifying brilliance is anything but straightforward. Also, it would seem that “be brilliant” and “avoid controversy” are going to end up being contradictory mandates much of the time.
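For what it’s worth, here’s a minimal sketch, in Python, of what that feedback training looks like under the hood, assuming the standard trick of learning a reward model from pairwise preferences (the function and all the numbers are illustrative, not anyone’s actual pipeline):

    import math

    def preference_loss(score_preferred: float, score_rejected: float) -> float:
        """Bradley-Terry style loss: smaller when the reward model
        scores the rater-preferred output higher than the rejected one."""
        margin = score_preferred - score_rejected
        return -math.log(1 / (1 + math.exp(-margin)))

    # A clean, consistent judgment ("A is less controversial than B")
    # produces a wide margin and a strong training signal:
    print(preference_loss(2.0, -1.0))  # ~0.05
    # A muddy judgment ("which of these is brilliant?") produces a narrow,
    # noisy margin, and across many raters the signal mostly washes out:
    print(preference_loss(0.1, 0.0))   # ~0.64

The loss can only teach the model what raters consistently agree on, and brilliance is precisely the thing they won’t agree on.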
None of this is to say that ChatGPT and similar technology isn’t going to be terrifically disruptive. It will be. (You may add that to my list of predictions if you wish.) There will be huge changes in the way writing is done, particularly required writing. We may have already seen the effective end of assigning essays as homework. But if AI is going to be as consequential as its proponents claim, it’s not enough for it to produce average work. It has to be able to produce brilliant work.
To distill it down: I have all the faith in the world that an LLM will eventually be able to do 80 or even 90 percent of what humans can do. But it’s that last, top 10% where all the magic is. And it’s not clear to me how we get genius output from mediocre input, regardless of how voluminous that input is.
I asked ChatGPT to “write a clever appeal for patreon donations to append to the end of an article about chatGPT’s lack of brilliance”. The first result was long and mediocre, so I told it to shorten it. Then it was short and mediocre. Finally I told it to make it funny. Here’s what it came up with:
Hey there! If you enjoyed reading about chatGPT’s lack of brilliance, you might be wondering how we keep coming up with new material. The answer is simple: we have a team of highly skilled monkeys typing on old typewriters. But seriously, if you want to support our team of human writers and keep the monkey-typing to a minimum, consider becoming a patron on Patreon. Every little bit helps, and with your support, we can keep bringing you the most hilarious (and accurate) AI and NLP content around. Thanks for your help!
I’m all for AI doing 80-90% of the work, but that’s saying that I’d like it to do 80-90% of the stuff that I already know how to do. The damage will be with the upcoming generations that don’t know how to do that 80-90% and have nothing to fall back on when the power goes out. “The voice to text is down, you’ll need to type that up.” “But I can’t type!” “The computer is down, could you write that by hand?” “I don’t know how!” “Why is your spelling so bad?” “Why can’t you construct a sentence on your own?” “How do you not know how to drive a car, navigate a map, cook on a stove, etc?” Maybe I’m overthinking it all, but I just feel like the smarter tech gets, the dumber we become.
Well, you could end up in a situation where you have a small group of people who know how to do everything, including the top 10%, and then a much larger group that’s almost completely helpless should the power go out or the server crash. The question then becomes: how robust is the server/power? Because it’s got to be nearly 100% for it not to cause massive chaos. We saw a little bit of that during the pandemic with the supply chain: everything was fine as long as it all ran perfectly, but the minute it didn’t…
Paul Ehrlich has been in the news because he emerged from relative quiet to warn about future doom, which has many people taking easy dunks on him, since he is famous for his 1968 book “The Population Bomb,” which predicted something like 90% of the US would be starving to death by now. One of the criticisms is that he got things very wrong by simply taking trends from 1968 or so and extrapolating them in a straight line.
We are probably the last generation that will truly appreciate what a graph means. We grew up at a time when making a graph meant using a pencil, plotting dots, and connecting them with a ruler. Actually doing some analysis and projecting ‘trends’ was quite a bit of work. Newspapers had serious artist types making graphs. Today Excel can throw out a million graphs a second and you can play with data all day long.
Today a young Ehrlich would be told something like: “Well, what you should do is take these trends, get their variance, fire up a million or so Monte Carlo simulations, and see in how many of those pocket universes super bad things happen versus good things or ho-hum things.” This can be done on the kind of laptop a twenty-something normally uses for gaming.
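To make that concrete, here’s a minimal sketch in Python of the exercise being described, assuming a simple linear trend with Gaussian noise; every number is invented for illustration, and the run count is scaled down from “a million or so” so it finishes quickly:

    import random

    def fraction_of_bad_futures(slope, noise_sd, years=30,
                                runs=100_000, threshold=100.0):
        """Project a noisy trend `runs` times and report how often the
        cumulative value crosses the 'super bad' threshold."""
        bad = 0
        for _ in range(runs):
            value = 0.0
            for _ in range(years):
                value += slope + random.gauss(0, noise_sd)
            if value > threshold:
                bad += 1
        return bad / runs

    # Straight-lining the trend (zero variance) says disaster never arrives
    # within 30 years; admitting variance says it arrives in a sizable
    # minority of the simulated pocket universes:
    print(fraction_of_bad_futures(slope=3.0, noise_sd=0.0))  # 0.0
    print(fraction_of_bad_futures(slope=3.0, noise_sd=5.0))  # roughly 0.35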
Where this is going is that the tool can lock in a certain level of competence. Excel, for example, means the bottom bar is decent analysis, no simple math errors, and a snazzy chart or graph (there are a few dozen to pick from). I suspect this sets a new ‘lower bar’, which locks in competence. That implies things can only go up, or at worst stay as good as they are, barring something like all electricity getting wiped out.
You might have a point as far as technology goes, though even there I worry that the complexity of current technology makes it exceptionally fragile. But when you talk about the human response to technology, I don’t think there’s a floor on that at all.
Also I’m open to bets on this front.
So how would we structure a bet if, say, we were using Excel? If Excel became unavailable or unreliable, would we fall backward? Yes. Would we fall below the circa-1970s era of rulers and graph paper? I don’t think so. Knowing the value of visual analytics, we’d know what’s possible and would move pretty quickly to find tools to replace or fix whatever cost us Excel. If half the human race were wiped out in nuclear war, Excel (and/or clones) would still be there… which is kind of amazing to think about.
So if ChatGPT follows the same pattern, it will be deployed to do a lot of rote writing: manuals, simple legal briefs, response emails (maybe checked by a person before they are sent, lest trolls try to manipulate them by feeding in keywords). So if fragility hit and suddenly we couldn’t use it, I suspect it would be like losing Excel. We’d be annoyed and angry, but focused on restoring the old status quo as fast as possible.
I don’t know what the bet would be, though. That ChatGPT would start to be used often? It would, but would some failure cause a setback larger than just going back to the pre-GPT era?
Speaking of which, this is what I got when I asked it. I’d give the response a C- for usefulness but maybe a C+ for filling up the space with verbiage. Bloggers won’t be any (more) unemployed from this, at least for now.
“I would like to bet with someone that you will create a new, higher, baseline for human response to technology. What do you think would be some good measures to use for that bet?”
1. Speed of adoption of new technology
2. Satisfaction with the new technology
3. Reduction in time spent on manual tasks
4. Increase in efficiency and productivity
5. Increase in accuracy of data input and output
6. Decrease in cognitive load
7. Increase in user engagement
8. Decrease in the need for training and support
9. Increase in creative problem solving
10. Decrease in reliance on traditional methods
So there are examples of technology going backwards. The most famous is losing the cure for scurvy. But there was also a massive loss of pottery technology after Rome fell.
So we could place a bet on whether something like that happens, though I suspect it doesn’t happen very often. And in many respects we are better at storing info, but the info we have to store is thousands of times more complex. Let’s say there was a large-scale nuclear war and all TSMC fabs and all ASML facilities were taken out; how long would it take to rebuild that? (Obviously a nuclear war would present other complications, but perhaps you take my point.)