The "stochastic parrot" explanation really grinds my gears because it seems to me to be just a lazy rephrasing of the Chinese room argument.

The man in the machine doesn't need to understand Chinese. His understanding, or lack thereof, is completely immaterial to whether the program he is *executing* understands Chinese.

It's a way of intellectually laundering, or hiding, the ambiguity underlying a person's inability to distinguish the process of understanding from the mechanism that does the understanding.

There have been recent arguments that some elements of relativity actually explain our inability to prove or dissect consciousness in a phenomenological context, especially with regard to outside observers (hence the reference to relativity), but I'm glossing over it horribly and probably wildly misunderstanding some aspects. I digress.

Which is to say: we are not our brains. We are the *processes* running on the *wetware of our brains*.

This view is consistent with the understanding that there are two types of relations in language: words as they relate to real-world objects, and words as they relate to each other. ChatGPT et al. have a model of the world only inasmuch as words-as-they-relate-to-each-other carry some information about the world as a model.

Which is to say: while we may find some correlates of the mind in the hardware of the brain (more substrate than direct mechanism), it is possible that language itself, executed on this medium, acts as a scaffold for a broader, rich internal representation.

Anyone arguing that these LLMs can't have a mind because they are one-off input-output functions doesn't stop to think through the implications of their argument: do people with dementia have agency and sentience?
Almost certainly, even if they forget what they were doing or thinking about five seconds ago. So agency and sentience, while enhanced by memory, do not depend on memory as a requirement.

It turns out there is much more information about the world contained in our written text than just the surface-level relationships. There is a rich, dynamic level of entropy buried deep in it, and the training of these models is apparently what allows them to tap into this representation in order to do what many of us accurately see as forming internal simulations, even if the ultimate output arrives one character or token at a time, laundering the series of calculations necessary for those internal simulations across the statistical generation of each token.

And much as we won't find consciousness by examining a single picture of a brain in action, even if we track it down to single neurons firing, neither will we find it anywhere we look in an LLM, not even in the individual weighted values of its network nodes.

I suspect this will remain true long past the day a language model, or some other model, emerges that can talk and do everything a human can do, intelligence-wise.

  • 4
    I don't think we will ever find consciousness in a machine wet or otherwise. I think it exists outside the body and interfaces with the body. I struggle with this as this may mean all living things have a "soul" of some type.
  • 9
    @Demolishun all living things may have a soul, but once again my post proves not all of us have a spell checker.

    Because I clearly didn't spellcheck the damn post.

    How you been?
  • 5
The ability to make connections, and come to new conclusions not included or apparent in the recent input, is how I define intelligence.

ChatGPT and friends generate output based on previous input, which covers the first part. But the second? There is no broader context, no short-term memory. The model is only able to generate new text that is similar to previous texts - and it will "invent" new data to fit.
Btw - some people fail this test 😞.
  • 4
    @magicMirror I find its ability to generate new data interesting. Kind of like imagination. Might be the mirror function for comparing internal state (via expectation, surprise) to the outside world. One half of the learning process when you have very little data to go on.

    Instead of looking at it as being deluded, consider it as a hypothesis generation function.

    The results where the machine is asked to think of something and break it down into steps are equally surprising.

    It wouldn't be much of a stretch to ask it to intentionally hallucinate, compare a hallucination to a (true) input, then follow that by asking it to output a model hypothesis that best explains the difference between the hallucination and the true input. Don't know what this process is actually called. In any case, it is something to explore.
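The loop described above could be sketched roughly like this. The llm() function here is a purely hypothetical stand-in; a real implementation would call whatever model API is available:

```python
# Sketch of the hallucinate-compare-hypothesize loop described above.
# llm() is a placeholder stub, NOT a real model call.
def llm(prompt):
    # Stand-in for a real language-model call; echoes a tag for demo purposes.
    return f"[model output for: {prompt[:40]}]"

def hypothesize(true_input):
    # 1. Ask the model to intentionally hallucinate around the input.
    hallucination = llm(f"Invent a plausible but unverified account of: {true_input}")
    # 2. Ask it to explain the difference between hallucination and truth.
    return llm(
        "Here is a guess and a ground truth.\n"
        f"Guess: {hallucination}\nTruth: {true_input}\n"
        "State the hypothesis that best explains the difference."
    )
```

Whether this is closer to hypothesis generation or just prompt plumbing is exactly the open question.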

    I'm surprised the hallucinations aren't being documented and experimented with more. They seem intent on simply eliminating them with human feedback.
  • 1
    @Wisecrack been good. Need to spend more time in the moment I think.
  • 2
That is the point, isn't it?

Human consciousness is a self-inflicted hallucination. A shared one, at that. The trick will be to somehow cause the AI to hallucinate as well.

Imagination? Some people lack that as well....
  • 4
    @magicMirror "Thats is the point, isn't it?"

    It probably is. It'd be our luck, you and I, to stumble on the answer here of all places.

    Who can say.
  • 2
Meh. Nietzsche probably got there first.
  • 3

    "stare into the abyss long enough and the abyss stares back."

    rated 0/10, least funny joke ever written.

Reads Thus Spoke Zarathustra: "This book is a joke! Where's the punchline?"

    Probably up there among the greats, like that koan "what is the sound of one hand clapping?"

We all thought it was deep. Turns out it was just some monk making an innuendo about beating off.
  • 1
    Well Yes. But actually No.

"God is rekt", "If you stare long enough into the Cat, the Cat stares back", "And So said Simon" are meme philosophies. Some of the philosophers had very interesting and complicated ideas - those are very relevant to the whole AI/ML/Hey look! new buzzword! discussion.
  • 3
You are not thinking about the LLM at scale. Word-to-word relationships can only approximate language, so what makes you think they could even approach anything higher than that? You can't model a set by its subset; there will always be something missing, no matter how big the subset is, unless they are equal. And language is such a small part of life that most creatures don't even have it.

For example, the language could never think of laser-eyed trees, because the concept doesn't exist in its dataset... but I just created that in the blink of an eye. The words "eye" and "tree" most likely have almost 0% correlation... and the LLM can't break out of that bound without a human input to lead it... This tells us one simple thing: all the "mind" we assign to LLMs is an illusion created by the human input. It seems like it's talking to you because you talk to it. It's a language mirror. "Parrot" might be less precise, but it's not bad for explaining it to people who don't know ML at all.
  • 2
LLMs are closer to seeing the face of Jesus Christ in a piece of toast than to real consciousness. It's a non-visual pareidolia: a pattern that doesn't exist, even though it looks like it does.
  • 1
    A lot of people here think that replika is conscious: https://reddit.com/r/replika/...
  • 1
    My problem with the Chinese room experiment is that it doesn't actually address the functionalist argument at all. The functionalist argument is simply that

    - physics is a mechanical, unconscious system

    - human minds entirely operate within the laws of physics

    - human minds are conscious


- therefore, a mechanical, unconscious system can accommodate consciousness as an emergent property


- therefore, it cannot be asserted that sufficiently complex computer programs aren't conscious simply based on the properties of the medium.

In the Chinese room experiment, the medium is a person who does not speak Chinese. The abilities of the program are not bounded by the abilities of this medium.
  • 1
    From another point of view, we know that any programmable machine that can execute specific operations can be programmed to execute tasks it isn't directly capable of, if these tasks are within the computational class specified by the atomic operations. A 16-bit CPU is unable to express numbers greater than 65535, but it can be programmed to recognize bigints, even though operations on these will be O(logn). You could replace the CPU with an elementary school kid who hasn't learned about numbers greater than 1000 yet, and they will still be able to work with bigints if you give them a step-by-step instruction. Realizing that the logic of numbers is extendable may be a side effect of them reading the instructions, but it's not required for them to correctly operate on bigints.
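A minimal sketch of the bigint idea: schoolbook addition over 16-bit "limbs", so that no single step ever needs arithmetic beyond 65535 (function names are my own, for illustration):

```python
# Big integers built from fixed-width 'limbs': a machine limited to 16-bit
# arithmetic can still add arbitrarily large numbers by following a
# step-by-step procedure (schoolbook addition with carries).
BASE = 1 << 16  # each limb holds 0..65535

def to_limbs(n):
    """Split a non-negative integer into little-endian 16-bit limbs."""
    limbs = []
    while n:
        limbs.append(n % BASE)
        n //= BASE
    return limbs or [0]

def add_limbs(a, b):
    """Add two limb lists; each step uses only sub-65536 arithmetic."""
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        out.append(s % BASE)
        carry = s // BASE
    if carry:
        out.append(carry)
    return out

def from_limbs(limbs):
    """Reassemble limbs into a Python int (just for checking the result)."""
    n = 0
    for limb in reversed(limbs):
        n = n * BASE + limb
    return n

print(from_limbs(add_limbs(to_limbs(70000), to_limbs(70000))))  # 140000
```

Neither the CPU nor the schoolkid needs to "understand" numbers above the limb size; correctly following the procedure is enough.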
  • 1
On the other hand, after cursory research, the stochastic parrot reads like an analogy for the here and now. Rather than making claims about all AI ever, it simply addresses the present day and near future, such as ChatGPT: products built on technology that was available, or at least within sight, when it was written.

    In that regard, I find it an excellent metaphor.
  • 1
    @magicMirror this is why I like you.
  • 1
@Hazarth intuition would tell you that you're correct, but the same thing happened to me where I thought "no way this thing can solve novel logic problems."

    And then it did.

Intuition also doesn't serve us when determining underlying correlations; otherwise everyone would be a stock market billionaire. To highlight: 'eye' and 'tree' have a non-zero correlation. Trees have 'eyes'. Cultures speak about trees watching them. Laser beams burn. Trees burn. There are many thousands of nontrivial but easily missable correlations, and interrelated takeaways, about these two examples.

My hunch, and it's just me talking out my ass, but my hunch is that a subset obviously never contains the set, but at sufficiently large subsets of text data, the relationships between words *do* sufficiently model the relationships *object to object*, even if there's no genuine sense of the world (word-to-object).

It has an internal world that functions more like ours the more data you feed it, even though it isn't ours.
  • 1
@Hazarth there's actually been an argument that seeing consciousness in LLMs might be a new type of intelligence test: the suggestion is that the way we perform theory-of-mind in the human brain, or attribute 'intelligence' as commonly understood, doesn't have one mechanism. Rather, some group of people do this through a heuristic approach based on behavioral observation, while others are completely immune to the fallacy that arises from that rule of thumb.

If that's the case, I'm probably in the former group when I see certain examples, because I definitely want to say "yeah, this is AGI, or proto-AGI; something is going on with this thing that's nontrivially related to consciousness."

    And I think others are experiencing the same thing if I'm correct about this split in how human minds determine whether something is intelligent or not.
  • 0
    @lorentz isn't this still just confusing the mechanism for the process though?

    The chinese room isn't its components, even if those components themselves happen to be humans, homunculi, or anything else.

    The chinese room is the physical interface to the input/output of a particular process executed on a given set of hardware.

    Am I wrong? Have I misunderstood?

    I always found the original argument rather seductive in the way that it seems to not differentiate between the machine and the process, but something about it strikes me as essentially missing the forest for the trees.
  • 1
@retoor that's just sad.
  • 1
    @Wisecrack The problem is exactly that the Chinese Room doesn't differentiate between the system being programmed and the program, even though probably even the abilities they share manifest completely differently due to the abstraction layer.
  • 1
@Wisecrack Let me try to debunk your "intuition" on this as well. Just because you think it solved a "novel problem" doesn't mean it did. In the eyes of pure statistics, if you know that A->B and B->C, figuring out that A->C doesn't really involve a "novel problem". If you imagine there are 200 steps like that, and the LLM can just follow them more directly than you intuitively can, you can easily see that what appears as a novel logic problem is just 50 small, well-known problems. That's where the LLM will shine, because those words and solutions will lead to words, which lead to words, which lead to words... once you realize this, it's trivial to see how LLMs actually work and why they appear smart... but connecting the dots isn't a sign of intelligence.
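The chaining point can be illustrated with a toy rule-follower: every individual step is a known rule, yet the end-to-end conclusion looks "novel" (the rules A->B->...->E here are hypothetical):

```python
# Chaining known single-step implications to reach a conclusion that was
# never stated directly. No new rule is ever invented along the way.
rules = {"A": "B", "B": "C", "C": "D", "D": "E"}  # hypothetical rules

def chain(start, goal):
    """Follow known one-step rules from start toward goal.

    Returns the list of steps taken, or None if goal is unreachable.
    """
    seen = {start}
    cur, steps = start, []
    while cur in rules:
        nxt = rules[cur]
        steps.append((cur, nxt))
        cur = nxt
        if cur == goal:
            return steps
        if cur in seen:  # guard against cyclic rule sets
            break
        seen.add(cur)
    return None

print(chain("A", "E"))  # [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]
```

Whether mechanically composing known steps like this counts as intelligence is the very thing being disputed in this thread.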
  • 1

You could solve the same "novel problems" using a sufficiently large SQL database (176B well-formed records, maybe?) and a well-written query or algorithm to follow from ABC to XYZ. The LLM just makes it feel much more natural. But we wouldn't call an SQL database intelligent even if it solved exactly the same problems in exactly the same sequence of operations, would we? Yet at the same time we just need it to speak proper English to believe it's suddenly intelligent? Nah... this is very clearly the human predisposition to anthropomorphize everything.
  • 0
Lobsang, from The Long Earth series.
  • 1
    @Hazarth "but connecting the dots isn't a sign of intelligence."

    The ability to connect the dots without being explicitly told how absolutely is a sign of intelligence.

The argument that it's just another word after another word is a rehash of the argument that it can't be intelligent merely because of its mode of output (character by character).

The analogy to databases, or any other specialized system, I find fault with because it confuses general intelligence with task-specific intelligence, though maybe I'm making your argument for you in a roundabout way.

"best I can do is three fiddy", which is my way of saying that there is naturally some element of anthropomorphization, but that's problematic too, because it isn't automatically mutually exclusive with general intelligence.

    Good post.
  • 1

    "The ability to connect the dots without being explicitly told how absolutely is a sign of intelligence."

I disagree that that's what's happening. We're telling the network enough information to connect the dots, just in a natural language instead of SQL. The issue is that it doesn't understand any of the dots either. Unlike a human, it doesn't think critically and doesn't actually solve the problem it is given. It just lays the rails stochastically for 1000 tokens, and then it is up to the human to interpret whether it even makes any sense. It's a random walk. Even worse, it isn't even a random walk under the hood. What you see on your screen is *after* sampling... random sampling is not even part of an LLM; it's a technique applied at the output. If you look at an LLM that just outputs its "true" highest prediction, all the magic goes away. Literally all of it is gone once you see it for what it is, which is a math function that outputs a probability vector.
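The point about sampling being bolted on after the model can be sketched like this (the probability values are toy numbers, not from any real model):

```python
import random

# A toy next-token probability vector, standing in for an LLM's output.
probs = {"cat": 0.6, "dog": 0.3, "fish": 0.1}  # hypothetical values

def greedy(p):
    """The model's 'true' highest prediction: plain argmax, fully deterministic."""
    return max(p, key=p.get)

def sample(p, rng=random.random):
    """Random sampling applied *after* the model, at the output stage."""
    r, acc = rng(), 0.0
    for tok, prob in p.items():
        acc += prob
        if r < acc:
            return tok
    return tok  # fall through on floating-point rounding

print(greedy(probs))  # always "cat"
```

All the variety in the output comes from the sampling step; the function underneath produces the same vector every time.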
  • 1

That's why it's such a convincing illusion... once again, you can model this in a SQL database by creating a table of tokens and their probabilities toward every other token (given an input of up to 2000 tokens, or however many are available to the largest GPTs now)... and then you just do SELECT * FROM Word2Word LIMIT 200, put a beam search or other sampling method on top just like you would after the LLM, and you would get the exact same "sign of AGI" or "sign of intelligence".

And while I don't disagree that intelligence is an emergent property, it is far from what is happening here. There's no meaning in the words and there's no planning ahead or critical thinking. It's the same issue we had with chess... we thought that a machine playing chess was a sign of intelligence, but only once machines did it was it clear that it's a hollow definition. And I'd say that playing chess isn't all that different from what an LLM is doing... a string of moves in response to a string of inputs...
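A toy version of the Word2Word table idea, using SQLite; the table name follows the comment above, but the rows and probabilities are made up for illustration:

```python
import sqlite3

# A miniature Word2Word table: next-token probabilities as plain rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Word2Word (context TEXT, token TEXT, prob REAL)")
con.executemany(
    "INSERT INTO Word2Word VALUES (?, ?, ?)",
    [("the cat", "sat", 0.7), ("the cat", "ran", 0.2), ("the cat", "flew", 0.1)],
)

def next_token(context):
    """Greedy 'decoding' by query: pick the highest-probability row."""
    row = con.execute(
        "SELECT token FROM Word2Word WHERE context = ? ORDER BY prob DESC LIMIT 1",
        (context,),
    ).fetchone()
    return row[0] if row else None

print(next_token("the cat"))  # sat
```

Mechanically this does the same greedy lookup; the real question is whether a learned, compressed version of such a table is still "just" a table.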
  • 1
I still think the best way to approach the question of silicon sentience is the rhetorical question:

    “Do computers think?”
    “Do submarines swim?”

The process is fundamentally different, but you can get there.
  • 0
    @jeeper computers don't think. The question is whether they can act as a medium that accommodates a thinking thing the same way atoms do, which also don't think.