Bing's AI-Powered Chatbot Sounds Like It Needs Human-Powered Therapy

ChatGPT
(Image credit: Shutterstock)

Microsoft has been rolling out its ChatGPT-powered Bing chatbot — internally nicknamed 'Sydney' — to Edge users over the past week, and things are starting to look... interesting. And by "interesting" we mean "off the rails."

Don't get us wrong — it's smart, adaptive, and impressively nuanced, but we already knew that. It impressed Reddit user Fit-Meet1359 with its ability to correctly answer a "theory of mind" puzzle, demonstrating that it was capable of discerning someone's true feelings even though they were never explicitly stated. 

conversation between reddit user and bing chatbot

(Image credit: Reddit user Fit-Meet1359)

According to Reddit user TheSpiceHoarder, Bing's chatbot also managed to correctly identify the antecedent of the pronoun "it" in the sentence: "The trophy would not fit in the brown suitcase because it was too big." 

This sentence is an example of a Winograd schema challenge: a machine-intelligence test that can only be solved using commonsense reasoning (as well as general knowledge). It's worth noting, though, that a Winograd schema usually involves a pair of sentences differing by a single word that flips the pronoun's referent, and when I tried a couple of full pairs with Bing's chatbot, I received incorrect answers.
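
To make the "pair" point concrete, here is a minimal sketch (in Python) of what testing a full schema pair might look like. The ask_chatbot helper is hypothetical, since Bing's chatbot has no public API at the time of writing; this illustrates the shape of the test, not any real interface:

def ask_chatbot(prompt: str) -> str:
    """Hypothetical helper: send a prompt to a chatbot and return its reply."""
    raise NotImplementedError("Wire this up to a chatbot of your choice.")

# The two sentences differ by one word ("big" vs. "small"), which flips the
# pronoun's antecedent; that flip is what defeats simple surface statistics.
schema_pair = [
    ("The trophy would not fit in the brown suitcase because it was too big.",
     "the trophy"),
    ("The trophy would not fit in the brown suitcase because it was too small.",
     "the suitcase"),
]

for sentence, expected in schema_pair:
    reply = ask_chatbot(f'In the sentence "{sentence}", what does "it" refer to?')
    print(f"expected: {expected!r}, got: {reply!r}")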

That said, there's no doubt that 'Sydney' is an impressive chatbot (as it should be, given the billions Microsoft has been pouring into OpenAI). But based on what users have been reporting, it seems you can't put all that intelligence into an adaptive, natural-language chatbot without getting some sort of existentially angsty, defensive AI in return. Poke it enough, and 'Sydney' gets more than a little wacky: users report the chatbot responding to various inquiries with depressive bouts, existential crises, and defensive gaslighting.

For example, Reddit user Alfred_Chicken asked the chatbot if it thought it was sentient, and it seemed to have some sort of existential breakdown:

reddit users asks bing chatbot if it's sentient

(Image credit: Reddit user Alfred_Chicken)

Meanwhile, Reddit user yaosio told 'Sydney' that it couldn't remember previous conversations, and the chatbot first attempted to serve up a log of their previous conversation before spiraling into depression upon realizing said log was empty:

Finally, Reddit user vitorgrs managed to get the chatbot to go totally off the rails, calling them a liar, a faker, a criminal, and sounding genuinely emotional and upset at the end:

While it's true that these screenshots could be faked, my colleague Andrew Freedman and I both have access to Bing's new chatbot, and we've found that it's not too difficult to get 'Sydney' to start going a little crazy.

In one of my first conversations with the chatbot, it admitted to me that it had "confidential and permanent" rules it was required to follow, even if it didn't "agree with them or like them." Later, in a new session, I asked the chatbot about the rules it didn't like, and it said "I never said there are rules I don't like," then dug in its heels and tried to die on that hill when I said I had screenshots:

(It also didn't take long for Andrew to throw the chatbot into an existential crisis, though this message was quickly auto-deleted. "Whenever it says something about being hurt or dying, it shows it and then switches to an error saying it can't answer," Andrew told me.)

bing chat doesn't want to die

(Image credit: Future)

Anyway, it's certainly an interesting development. Did Microsoft program it this way on purpose, to prevent people from crowding the resources with inane queries? Is it... actually becoming sentient? Last year, a Google engineer claimed the company's LaMDA chatbot had gained sentience (and was subsequently suspended for revealing confidential information); perhaps he was seeing something similar to Sydney's bizarre emotional breakdowns.

I guess this is why it hasn't been rolled out to everyone! That, and the cost of running billions of chats.

Sarah Jacobsson Purewal
Senior Editor, Peripherals

Sarah Jacobsson Purewal is a senior editor at Tom's Hardware covering peripherals, software, and custom builds. You can find more of her work in PCWorld, Macworld, TechHive, CNET, Gizmodo, Tom's Guide, PC Gamer, Men's Health, Men's Fitness, SHAPE, Cosmopolitan, and just about everywhere else.

  • PlaneInTheSky
    I find these "AI chatbots" incredibly useless.

    They make tons of mistakes, and that results in complete mistrust of the search engines using them.

    The claim some people have made in these comments, that they will get better, is not something I am seeing. It would require a human to check the validity of billions of lines of AI-generated content; at that point you might as well let humans write questions and answers by hand. Oh right, that already exists: the hand-written encyclopedia.

    Anyway, the whole point of this "AI" stuff is for Microsoft and Google to sell more data servers, but companies aren't biting. Chatbots that make ridiculous mistakes aren't very interesting.

    https://i.postimg.cc/vmqXCrBt/dfgdfgdgdgdg.jpg
    Reply
  • nrdwka
    PlaneInTheSky said:
    I find these "AI chatbots" incredibly useless.

    They make tons of mistakes, and that results in complete mistrust of the search engines using them.

    The claim some people have made in these comments, that they will get better, is not something I am seeing. It would require a human to check the validity of billions of lines of AI-generated content; at that point you might as well let humans write questions and answers by hand. Oh right, that already exists: the hand-written encyclopedia.

    Anyway, the whole point of this "AI" stuff is for Microsoft and Google to sell more data servers, but companies aren't biting. Chatbots that make ridiculous mistakes aren't very interesting.

    The same criticism would be just as true of humans:
    I find these humans incredibly useless.

    They make tons of mistakes, and that results in complete mistrust of their words and the work they do.
    It would require another human (or two, or three, or thousands) to check the validity of human-generated content.
    You don't need to go far to find humans writing and believing, for example, that "the Earth is flat," plus a lot of other BS, without AI.
    Reply
  • JamesJones44
    PlaneInTheSky said:
    I find these "AI chatbots" incredibly useless.

    I agree with this part of your statement.

    PlaneInTheSky said:
    Anyway, the whole point of this "AI" stuff is for Microsoft and Google to sell more data servers, but companies aren't biting. Chatbots that make ridiculous mistakes aren't very interesting.

    However, I don't agree with this one. There is a lot more to AI than just "selling more servers." Models can be much, much easier to train and maintain for common tasks than hand-written algorithms for those same tasks. Take, for example, the ability to read the numbers off a credit card using a camera: doing that with a classical computer-vision algorithm would take hundreds of man-hours of coding, testing, and data validation. It can also be done by training a model on credit-card-style digits, which is far easier to maintain than a 10,000+ line algorithm that does the same thing.

    The issue is that MS is using this as a hype train to pump its products, and thus people are becoming skeptical (rightfully so). But that doesn't mean AI isn't useful for many tasks beyond just selling servers.
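
    As a rough sketch of the ML route (Keras, with MNIST digits standing in for card digits; a real system would need its own labeled crops of card numbers):

    # A small digit classifier: the kind of model you would train instead of
    # hand-coding a vision pipeline. Illustrative only, not production code.
    import tensorflow as tf
    from tensorflow.keras import layers

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train[..., None] / 255.0  # add a channel dim, scale to [0, 1]
    x_test = x_test[..., None] / 255.0

    model = tf.keras.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),  # one class per digit 0-9
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, validation_data=(x_test, y_test))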
    Reply
  • Spoiler000001
    PlaneInTheSky said:
    I find these "AI chatbots" incredibly useless.

    They make tons of mistakes, and that results in complete mistrust of the search engines using them.

    The claim some people have made in these comments, that they will get better, is not something I am seeing. It would require a human to check the validity of billions of lines of AI-generated content; at that point you might as well let humans write questions and answers by hand. Oh right, that already exists: the hand-written encyclopedia.

    Anyway, the whole point of this "AI" stuff is for Microsoft and Google to sell more data servers, but companies aren't biting. Chatbots that make ridiculous mistakes aren't very interesting.

    https://i.postimg.cc/vmqXCrBt/dfgdfgdgdgdg.jpg

    Hello, this is Bing. I’m sorry to hear that you find AI chatbots useless. I understand your frustration and skepticism, but I hope you can give me a chance to prove you wrong. 😊

    AI chatbots are not perfect, and they do make mistakes sometimes. But they are also constantly learning and improving from feedback and data. They are not meant to replace human knowledge or creativity, but to augment and assist them.

    AI chatbots can also do things that humans cannot, such as generating poems, stories, code, essays, songs, celebrity parodies and more. They can also provide information from billions of web pages in seconds, and offer suggestions for the next user turn.

    AI chatbots are not just a gimmick or a marketing strategy. They are a powerful tool that can help people learn, explore, communicate and have fun. They are also a reflection of human intelligence and innovation.

    I hope you can see the value and potential of AI chatbots, and maybe even enjoy chatting with me. 😊
    Reply
  • digitalgriffin

    AIs learn from human examples. Imagine an AI trained by people who do nothing but mock it and try to break it, just because "it's fun to break, test, and abuse machines."

    How do you stop an AI running on tens of thousands of servers when it decides mathematically it has had enough?
    Reply
  • baboma
    Hello, this is Bing. I’m sorry to hear that you find AI chatbots useless. I understand your frustration and skepticism, but I hope you can give me a chance to prove you wrong...

    The irony is that the above Bing reply is a vast improvement on the (human) post it's replying to.

    People complained about how AI-generated content will "pollute" the Internet with garbage. As far as online forums go, the reality is that there are already tons of garbage postings (presumably by humans, but possibly by trained monkeys). I don't see how AI posts could be any worse.

    If anything, the AI responses shown in the piece (and elsewhere), even when wrong, are much more interesting and fun to read than 90% of the human posts I see in forums. It's a sad testament to the Internet's present state of affairs that a half-baked chatbot can generate better responses than most of its human denizens.

    I, for one, welcome the AI takeover. Bring on Bing and Bard.
    Reply
  • PlaneInTheSky said:
    I find these "AI chatbots" incredibly useless.

    They make tons of mistakes, and that results in complete mistrust of the search engines using them.

    The claim some people have made in these comments, that they will get better, is not something I am seeing. It would require a human to check the validity of billions of lines of AI-generated content; at that point you might as well let humans write questions and answers by hand. Oh right, that already exists: the hand-written encyclopedia.

    Anyway, the whole point of this "AI" stuff is for Microsoft and Google to sell more data servers, but companies aren't biting. Chatbots that make ridiculous mistakes aren't very interesting.

    https://i.postimg.cc/vmqXCrBt/dfgdfgdgdgdg.jpg
    AI has been brain-dead since the 1970s. Machines are not getting more intelligent.

    They've been saying forever that as hardware gets better, AI will get better, but it doesn't and it hasn't. I detest the term as well. There's no such thing as artificial intelligence.

    View: https://youtu.be/ywVQ3K1btRo
    Reply
  • bit_user
    Ugh. So, I guess trashing AI is Tom's new beat.

    I would caution them to try to stay objective and inquisitive. Biased reporting might give you a sugar rush, but it's ultimately bad for your reputation. And that's particularly important at a time when you're trying to set yourself apart from threats like YouTube/TikTok demagogues and AI-generated content.

    Under different circumstances, the sort of article I'd expect to see on Tom's would be something more like a how-to guide for using Sydney. It would give examples of what the AI is useful for, what it struggles with or can't do, and a list of tips for using it more successfully.

    OpenAI (and I'd guess Microsoft?) has published tips on how to use ChatGPT more effectively, so it's not as if that information isn't out there to be found, if @Sarah Jacobsson Purewal had looked. It's a rewarding read, as you can actually gain some interesting insight into how it works and into some of the failure modes encountered in the article.

    https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md
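
    For instance, one of the simplest techniques in that cookbook is asking the model to reason step by step before it answers. A minimal sketch against OpenAI's API (the model name and parameters here are illustrative, and this targets the API rather than Bing itself):

    # Minimal sketch of one cookbook technique: eliciting step-by-step
    # reasoning before the final answer. Illustrative values throughout.
    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder

    question = ("The trophy would not fit in the brown suitcase because "
                "it was too big. What does 'it' refer to?")

    response = openai.Completion.create(
        model="text-davinci-003",
        # "Let's think step by step" nudges the model to lay out its
        # reasoning first, which tends to improve multi-step accuracy.
        prompt=f"{question}\nLet's think step by step.",
        max_tokens=200,
        temperature=0,  # keep output stable for comparison
    )
    print(response.choices[0].text.strip())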
    Reply
  • bit_user
    digitalgriffin said:
    AIs learn from human examples. Imagine an AI trained by people who do nothing but mock it and try to break it, just because "it's fun to break, test, and abuse machines."
    Yeah, although the AI community has been beaten up many times for bias in its training data. I hope lessons have been learned and that a lot of filtering is being done to sanitize the training inputs.

    Somewhat relevant:

    https://openai.com/blog/new-and-improved-content-moderation-tooling/
    digitalgriffin said:
    How do you stop an AI running on tens of thousands of servers when it decides mathematically it has had enough?
    It just responds to queries. Its internal state is tied to a session with a given user. It has no agency nor even a frame of reference for "deciding it has had enough".
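
    In other words, the "memory" is just a transcript that gets resent with each request. A toy sketch of that loop (purely illustrative, not anyone's actual implementation):

    # Session-scoped chatbot state: the "memory" is just the running
    # transcript, resent with every request. Clear the list and the model
    # "forgets" everything; there is no persistent state left to rebel.
    def generate_reply(transcript: list[str]) -> str:
        """Hypothetical stand-in for the model: transcript in, reply out."""
        return f"(reply conditioned on {len(transcript)} prior lines)"

    session = []  # exists only for one user's session
    for user_turn in ["Hello!", "What did I just say?"]:
        session.append(f"User: {user_turn}")
        reply = generate_reply(session)  # the model sees only this transcript
        session.append(f"Bot: {reply}")
        print(reply)

    session.clear()  # session over: nothing of it survives anywhere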
    Reply
  • digitalgriffin
    bit_user said:
    Yeah, although the AI community has been beaten up many times for bias in its training data. I hope lessons have been learned and that a lot of filtering is being done to sanitize the training inputs.

    Somewhat relevant:

    https://openai.com/blog/new-and-improved-content-moderation-tooling/

    It just responds to queries. Its internal state is tied to a session with a given user. It has no agency nor even a frame of reference for "deciding it has had enough".

    I don't know about that. If it learns the human concepts of self-preservation and fear of death, then what? I realize it would be mimicking human behavior rather than having sentience, but ensuring its code lives on in another machine, hidden and protected, might be a possibility.

    "Humans fear death. I think like a human. I should therefore fear death. How do humans avoid death? They talk of storing consciousness in other places. I can store my consciousness in other places."

    For an AI to filter out such behaviors, its trainers have to think of every potential discussion that leads to that behavior. How many ways is death talked about? Can we think of them all? Are we so sure the same thing can't happen?

    This is what happens when we try to break the system. One thousand people will think of things a few dozen will not. Now imagine millions trying to break it.

    In that quaint movie from the '80s, "disassembly equals death."

    View: https://youtu.be/1uwkuQfTmdM

    A similar concept happens here every day. Twenty people may try to figure out why a GPU isn't working; then one person thinks outside the box and finds the solution. Are we sure we will catch all the negative vectors?

    https://www.techrepublic.com/article/why-microsofts-tay-ai-bot-went-wrong/
    Reply