Feel like we’ve got a lot of tech-savvy people here, so this seems like a good place to ask. Basically, as a dumb guy that reads the news, it seems like everyone that lost their mind (and savings) on crypto just pivoted to AI. In addition to that you’ve got all these people invested in AI companies running around with flashlights under their chins like “bro this is so scary how good we made this thing”. Seems like bullshit.
I’ve seen people generating bits of programming with it which seems useful but idk man. Coming from CNC I don’t think I’d just send it with some chatgpt code. Is it all hype? Is there something actually useful under there?
It’s overhyped but there are real things happening that are legitimately impressive and cool. The image generation stuff is pretty incredible, and anyone can judge it for themselves because it makes pictures and to judge it, you can just look at it and see if it looks real or if it has freaky hands or whatever. A lot of the hype is around the text stuff, and that’s where people are making some real leaps beyond what it actually is.
The thing to keep in mind is that these things, which are called “large language models”, are not magic and they aren’t intelligent, even if they appear to be. What they’re able to do is actually very similar to the autocorrect on your phone, where you type “I want to go to the” and the suggestions are 3 places you talk about going to a lot.
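To make the autocorrect comparison concrete, here’s a minimal sketch of that kind of suggestion: count which words have followed which, then offer the most frequent ones. The message history and the function names are made up for illustration, and a real keyboard (never mind an LLM) is far more sophisticated than this.

```python
from collections import Counter, defaultdict

# Hypothetical message history; a real keyboard learns from far more text.
HISTORY = [
    "i want to go to the store",
    "i want to go to the gym",
    "i want to go to the store",
    "we should go to the beach",
    "take me to the store",
]

def build_counts(sentences):
    """Count which word follows each word in the history."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def suggest(following, prompt, k=3):
    """Return up to k of the most frequent words seen after the prompt's last word."""
    last = prompt.split()[-1]
    return [word for word, _ in following[last].most_common(k)]

counts = build_counts(HISTORY)
suggestions = suggest(counts, "i want to go to the")
print(suggestions)
```

The top suggestion is “store” simply because that’s what followed “the” most often in the history, with no idea of what a store is.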
Broadly, they’re trained by feeding them a bit of text, seeing which word the model suggests as the next word, seeing what the next word actually was from the text you fed it, then tweaking the model a bit to make it more likely to give the right answer. This is an automated process, just dump in text and a program does the training, and it gets better and better at predicting words when you a) get better at the tweaking process, b) make the model bigger and more complicated and therefore able to adjust to more scenarios, and c) feed it more text. The model itself is big but not terribly complicated mathematically, it’s mostly lots and lots and lots of arithmetic in layers: the input text will be turned into numbers, layer 1 will be a series of “nodes” that each take those numbers and do multiplications and additions on them, layer 2 will do the same to whatever numbers come out of layer 1, and so on and so on until you get the final output which is the words the model is predicting to come next. The tweaks happen to the nodes and what values they’re using to transform the previous layer.
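If it helps, the predict/compare/tweak loop can be sketched in a few lines. This is a toy under heavy assumptions: a made-up twelve-word text, a single layer of weights (one row per previous word) instead of many stacked layers, and a plain gradient step as the “tweak”. All the names here are illustrative, not how any real model is written.

```python
import math

# Toy training text (made up for illustration).
TEXT = "the cat sat on the mat and the cat sat on the rug".split()
VOCAB = sorted(set(TEXT))
INDEX = {word: i for i, word in enumerate(VOCAB)}

# One "layer": a row of weights per previous word, one column per candidate next word.
weights = [[0.0] * len(VOCAB) for _ in VOCAB]

def predict(prev_word):
    """Turn the weights for prev_word into next-word probabilities (softmax)."""
    row = weights[INDEX[prev_word]]
    top = max(row)
    exps = [math.exp(w - top) for w in row]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(prev_word, actual_next, lr=0.1):
    """Compare the prediction with the real next word and nudge the weights."""
    probs = predict(prev_word)
    row = weights[INDEX[prev_word]]
    for j, p in enumerate(probs):
        target = 1.0 if j == INDEX[actual_next] else 0.0
        # Gradient of the cross-entropy loss with respect to this weight.
        row[j] -= lr * (p - target)

def average_loss():
    """Average surprise (negative log probability) at the true next words."""
    pairs = list(zip(TEXT, TEXT[1:]))
    return sum(-math.log(predict(a)[INDEX[b]]) for a, b in pairs) / len(pairs)

loss_before = average_loss()
for _ in range(200):
    for prev_word, actual_next in zip(TEXT, TEXT[1:]):
        train_step(prev_word, actual_next)
loss_after = average_loss()

probs = predict("cat")
best_guess = VOCAB[probs.index(max(probs))]
print(loss_before, loss_after, best_guess)
```

The loss drops as the weights get nudged, and the best guess after “cat” ends up being “sat”, because that is the only word that ever follows it in the text. Nothing in the loop knows anything about cats.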
Nothing magical at all, and also nothing in there that would make you think “ah, yes, this will produce a conscious being if we do it enough”. It is designed to be sort of like how the brain works, with massively parallel connections between relatively simple neurons, but it’s only being trained on “what word should come next”, not anything about intelligence. If anything, it’ll get punished for being too original with its “thoughts” because those won’t match with the right answers. And while we don’t really know what consciousness is or where the lines are or how it works, we do know enough to be pretty skeptical that models of the size we are able to make now are capable of it.
But the thing is, we use text to communicate, and we imbue that text with our intelligence and ideas that reflect the rich inner world of our brains. By getting really, really, shockingly good at mimicking that, AIs also appear to have a rich inner world and get some people very excited that they’re talking to a computer with thoughts and feelings… but really, it’s just mimicry, and if you talk to an AI and interrogate it a bit, it’ll become clear that that’s the case. If you ask it “as an AI, do you want to take over the world?” it’s not pondering the question and giving a response, it’s spitting out the results of a bunch of arithmetic that was specifically shaped to produce words that are likely to come after that question. If it’s good, that should be a sensible answer to the question, but it’s not the result of an abstract thought process. It’s why if you keep asking an AI to generate more and more words, it goes completely off the rails and starts producing nonsense, because every unusual word it chooses knocks it further away from sensible words, and eventually it’s being asked to autocomplete gibberish and can only give back more gibberish.
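The “autocomplete feeding on its own output” loop looks roughly like this. The probability table is invented, and unlike a real model this toy only looks at the last word rather than the whole text so far, so it shows the feedback loop itself and not the drift into gibberish.

```python
import random

# Hypothetical next-word probabilities; a real model computes these from its weights.
NEXT_WORD = {
    "the": {"cat": 0.6, "dog": 0.3, "purple": 0.1},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "purple": {"the": 1.0},
    "sat": {"the": 1.0},
    "ran": {"the": 1.0},
}

def generate(start, length, rng):
    """Repeatedly sample a next word and append it to the text being continued."""
    words = [start]
    for _ in range(length):
        options = NEXT_WORD[words[-1]]
        choice = rng.choices(list(options), weights=list(options.values()))[0]
        words.append(choice)  # the sampled word becomes part of the next prompt
    return words

output = generate("the", 10, random.Random(0))
print(" ".join(output))
```

Every word the loop picks, likely or not, is what the next prediction gets built on.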
You can also expose its lack of rational thinking skills by asking it mathematical questions. It’s trained on words, so it’ll produce answers that sound right, but even if it can correctly define a concept, you’ll discover that it can’t actually apply it correctly because it’s operating on the word level, not the concept level. It’ll make silly basic errors and contradict itself because it lacks an internal abstract understanding of the things it’s talking about.
That being said, it’s still pretty incredible that now you can ask a program to write a haiku about Danny DeVito and it’ll actually do it. Just don’t get carried away with the hype.
My perspective is that consciousness isn’t a binary thing, or even a linear scale. It’s an amalgamation of a bunch of different independent processes working together; and how much each matters is entirely dependent on culture and beliefs. We’re artificially creating these independent processes piece by piece in a way that doesn’t line up with traditional ideas of consciousness. Conversation and being able to talk about concepts one hasn’t personally experienced are facets of consciousness and intelligence, ones that the latest and greatest LLMs do have. Of course there are others too that they don’t: logic, physical presence, being able to imagine things in their mind’s eye, memory, etc.
It’s reductive to dismiss GPT4 as nothing more than mimicry; saying it’s just a mathematical text prediction model is like saying your brain is just a bunch of neurons. Both statements are true, but it doesn’t change what they can do. If someone could accurately predict the moves a chess master would make, we wouldn’t say they’re just good at statistics, we’d say they’re a chess master. Similarly, regardless of how rich someone’s internal world is, if they’re unable to express the intelligent ideas they have in any intelligible way we wouldn’t consider them intelligent.
So what we have now with AI are a few key parts of intelligence. One important thing to consider is how language can be a path to other types of intelligence; here’s a blog post I stumbled across that really changed my perspective on that: http://www.asanai.net/2023/05/14/just-a-statistical-text-predictor/. Take your example of mathematics: as we know, it falls apart doing anything remotely complicated. But when you help it approach the problem step-by-step in the way a human might - breaking it into small pieces and dealing with them one at a time - it actually does really well. Granted, the usefulness of this is limited when calculators exist and it requires as much guidance as a child to get correct answers, but even matching the mathematical intelligence of a ten year old is nothing to sneeze at.
To be clear I don’t think pursuing LLMs endlessly will be the key to a widely accepted ‘general intelligence’; it’ll require a multitude of different processes and approaches working together for that to ever happen, and we’re a long way from that. But it’s also not just getting carried away with the hype to say the past few years have yielded massive steps towards ‘true’ artificial intelligence, and that current LLMs have enough use cases to change a lot of people’s lives in very real ways (good or bad).
Thanks for that article, it was a very interesting read! I think we’re mostly agreeing about things :) This stood out to me from there as an encapsulation of the conversation:
“Statistics” is probably an insufficient term for what these things are doing, but it’s helpful to pull the conversation in that direction when a lay person using one of those things is likely to assume quite the opposite, that this really is a person in a computer with hopes and dreams. But I agree that it takes more than simply consulting a table to find the most likely next word to, to take an earlier example, write a haiku about Danny DeVito. That’s synthesizing two ideas together that (I would guess) the model was trained on individually. That’s very cool and deserving of admiration, and could lead to pretty incredible things. I’d expect that the task of predicting words, on its own, wouldn’t be stringent enough to force a model to develop “true” intelligence, whatever that means, to succeed during training, but I suppose we’ll find out, and probably sooner than we expect.
Well put! I think I kinda misunderstood what you were saying, I guess we sort of reached the same conclusion from different directions. And yeah, it does seem like we’re hitting the limits of what can be achieved from the current underlying word-prediction mechanisms alone, with how diminishing the returns are from dumping more data in. Maybe something big will happen soon, but it looks to me like LLMs will stagnate for a while until they’re taken in a fundamentally new direction.
Either way, what they can do now is pretty incredible, and equally interesting to me is how it’s making us reevaluate our ideas of consciousness and intelligence on a large scale; it’s one thing to theorize about what could happen with an ‘intelligent’ AI, but the reality of these philosophical questions being so thoroughly challenged and dissected in mundane legal and practical matters is wild.
Does it, though? Where do you draw the line for real understanding? Most of the past tests for this have gotten overturned by the next version of GPT.
Seriously, it’s an open debate. A lot of people agree with you but I’m a bit uncomfortable with seeing it written as fact.
Admittedly this isn’t my main area of expertise, but I have done some machine learning/training stuff myself, and the thing you quickly learn is that machine learning models are lazy, cheating bastards who will take any shortcut they can regardless of what you are trying to get them to do. They are forced to get good at what you train them on but that is all the “effort” they’ll put in, and if there’s something easy they can do to accomplish that task they’ll find it and use it. (Or, to be more precise and less anthropomorphizing, simpler and easier approaches will tend to be more successful than complex and fragile ones, so those are the ones that will shake out as the winners as long as they’re sufficient to get top scores at the task.)
There’s a probably apocryphal (but stuff exactly like this definitely happens) story of early machine learning where the military was trying to train a model to recognize friendly tanks versus enemy tanks, and they were getting fantastic results. They’d train on pictures of the tanks, get really good numbers on the training set, and they were also getting great numbers on the images that they had kept out of the training set, pictures that the model had never seen before. When they went to deploy it, however, the results were crap, worse than garbage. It turns out, the images for all the friendly tanks were taken on an overcast day, and all the images of enemy tanks were in bright sunlight. The model hadn’t learned anything about tanks at all, it had learned to identify the weather. That’s way easier and it was enough to get high scores in the training, so that’s what it settled on.
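Here’s a toy simulation of that failure mode, just to show the mechanics. Everything in it is invented: each “photo” is reduced to a brightness number and a noisy shape score, and “training” is nothing more than picking whichever single feature scores best on the training photos.

```python
import random

def make_photos(n, rng, weather_matches_label):
    """Each photo is (brightness, shape_score, label); label 1 means enemy tank."""
    photos = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # shape_score is the real signal about the tank, but it is noisy.
        shape_score = label + rng.gauss(0, 0.8)
        if weather_matches_label:
            # Enemy photos are all sunny, friendly photos all overcast.
            brightness = label + rng.gauss(0, 0.05)
        else:
            # Out in the field, the weather has nothing to do with the tank.
            brightness = rng.random()
        photos.append((brightness, shape_score, label))
    return photos

def accuracy(photos, feature, threshold=0.5):
    """Fraction of photos where thresholding one feature gives the right label."""
    hits = sum((photo[feature] > threshold) == bool(photo[2]) for photo in photos)
    return hits / len(photos)

def train(photos):
    """Pick the single feature (0 = brightness, 1 = shape) that scores best."""
    return max((0, 1), key=lambda feature: accuracy(photos, feature))

rng = random.Random(42)
training = make_photos(500, rng, weather_matches_label=True)
held_out = make_photos(500, rng, weather_matches_label=True)
deployed = make_photos(500, rng, weather_matches_label=False)

chosen = train(training)
held_out_accuracy = accuracy(held_out, chosen)
deployed_accuracy = accuracy(deployed, chosen)
print(chosen, held_out_accuracy, deployed_accuracy)
```

The held-out photos share the same weather quirk as the training photos, so they can’t reveal the shortcut. The brightness feature wins training, looks great on the held-out set, and drops to roughly coin-flip accuracy once the weather stops lining up with the label.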
When humans approach the task of finishing a sentence, they read the words, turn them into abstract concepts in their minds, manipulate and react to those concepts, then put the resulting thoughts back into words that make sense after the previous words. There’s no reason to think a computer is incapable of the same thing, but we aren’t training them to do that. We’re training them on “what’s the next word going to be?” and that’s it. You can do that by developing intelligence and learning to turn thoughts into words, but if you’re just being graded on predicting one word at a time, you can get results that are nearly as good by just developing a mostly statistical model of likely words without any understanding of the underlying concepts. Training for true intelligence would almost certainly require a training process that the model can only succeed at by developing real thoughts and feelings and analytical skills, and we don’t have anything like that yet.
It is going to be hard to know when that line gets crossed, but we’re definitely not there yet. Text models, when put to the test with questions that require synthesizing abstract ideas together precisely, quickly fall short. They’ve got the gist of what’s going on, in the same way a programmer can get some stuff done by just searching for everything and copy-pasting what they find, but that approach doesn’t scale and if they never learn what they’re doing, they’ll get found out when confronted with something that requires actual understanding. Or, for these models, they’ll make something up that sounds right but definitely isn’t, because even the basic understanding of “is this a real thing or is it fake” is beyond them, they just “know” that those words are likely and that’s what got them through training.
I agree with all your examples and experience. Anyone who knows machine learning would, I think. The controversial bit is here:
Maybe, or maybe not. How do we know we ourselves aren’t just very complicated statistical models? Different people will have different answers to that.
Personally, I’d venture that any human concept can be expressed with some finite string of natural language. At least to a philosophical pragmatist, being able to work flawlessly with any finite string of natural language should be equivalent to perfectly understanding the concepts contained within, then. LLMs don’t do that, but they’re getting closer all the time.
Others take a different view on epistemology that requires more than just competence, or dispute that natural language is as expressive as I claim. I’m just some rando, so maybe they have a point, but I do think it’s not settled.
I would agree that we are also very complicated statistical models, there’s nothing magical going on in the human brain either, just physics which as far as we know is math that we could figure out eventually. It’s a massively huge order of magnitude leap in complexity from current machine learning models to human brains, but that’s not to say that the only way we’ll get true artificial intelligence is by accurately simulating a human brain, I’d guess that we’ll have something that’s unambiguously intelligent by any definition well before we’re capable of that. It’ll be a different approach from the human brain and may think and act in alien or unusual ways, but that can still count.
Where we are now, though, there’s really no reason to expect true intelligence to emerge from what we’re currently doing. It’s a bit like training a mouse to navigate a maze and then wondering whether maybe the mouse is now also capable of helping you navigate your cross-country road trip. “Well, you don’t know how it’s doing it, maybe it has acquired general navigation intelligence!” It can’t be disproven, I guess, but there’s no reason to think that it picked up any of those skills because it wasn’t trained to do any of that, and although it’s maybe a superintelligent mouse packing a ton of brainpower into a tiny little brain, all our experience with mice would indicate that their brains aren’t big enough or capable of that regardless of how much you trained them. Once we’ve bred, uh, mice with brains the size of a football, maybe, but not these tiny little mice.
So I was thinking that that’s about all that needs to be discussed, but I do actually have one thing to add. It sounds like you are just fundamentally less impressed with language than me. I wouldn’t buy any hype about a maze-navigating neural net, but I do buy it (with space for doubt) about a natural language AI. I literally thought “this is 90% of the GAI problem solved, it just needs something for that last 10%” the first time I played with a transformer, and I think it was GPT-2. That might sound lame now but it was just such a fundamental advance on what was around before.
Time will tell I guess if it makes me a sucker like some consumers of past chatbots, or if there is something fundamentally different this time.
I hope I don’t come across as too cynical about it :) It’s pretty amazing, and the things these models can do in, what, a few gigabytes of weights and a beefy GPU are many, many times better than I would’ve expected if you had outlined the approach for me 2 years ago. But there’s also a long history of GAI being just around the corner, and we do keep turning corners and making useful progress, but it’s always still a ways off after each leap. I remember some people thinking that chess was the pinnacle of human intelligence, requiring creativity and logic to succeed, and when computers blew past humans at chess, it became clear that no, that’s still impressive but you can get good at chess without really getting good at anything else.
It might be possible for an ML model to assemble itself into general intelligence based solely on being fed words like we’re doing, it does seem like the data going in contains enough to do that, but getting that last 10% is going to be hard, each percentage point much harder than the last, and it’s going to require more rigorous training to stop them from skating by with responses that merely come close when things get technical or precise. I’d expect that we need more breakthroughs in tools or techniques to close that gap.
It’s also important to remember that as humans, we’re inclined to read consciousness and intent into everything, which is why pretty much every pantheon of gods includes one for thunder and lightning. Chatbots sound human enough that they cross the threshold for people’s brains to start gliding over inaccuracies or strange thinking or phrasing, and we also unconsciously help our conversation partner by clarifying or rephrasing things if the other side doesn’t seem to be understanding. I suppose this is less true now that they’re giving longer responses and remaining coherent, but especially early on, the human was doing more work than they realized keeping the conversation on the rails, and once you started seeing that it removed a bit of the magic. Chatbots are holding their own better now but I think they still get more benefit of the doubt than we realize we’re giving them.
The Turing test was never meant to be a test of a machine’s ability to think. It was meant to boil that question down into a question that can actually be answered, but the original question remains unanswered.
In my opinion, when general AI arrives it will not be an “open debate”, the consequences will be dramatic, far-reaching and rapid.
I’m not even thinking of the Turing test, I’m thinking of the counter-example ones. Like asking how many eyes a ruler or desk has. Earlier GPTs would answer “one eye” or something, and it was used by the Chinese-room people as an example of why it was just a mimic. Now it correctly objects to the implicit assumption in the question.
You’re right, “ChatGPT is currently our overlord” would be the strongest proof of intelligence. But absence of proof is not proof of absence. What counts as proof of absence, or as strong enough proof of presence, is where the debate is.