ChatGPT broke the Turing test — the race is on for new ways to assess AI

Five@beehaw.org · 1 year ago


ProcurementCat@feddit.de · 1 year ago

The fundamental flaw of the Turing test is that it requires a human. Apparently, making a human believe they are talking to a human is much easier than previously thought.

philomory@lemm.ee · 1 year ago

Much easier, in fact; ELIZA could pass the Turing test in 1966. Humans are incredibly eager to assess other things as being human or human-like.

lloram239@feddit.de · edit-2 1 year ago

The real Turing test requires an expert doing the test, not just some random easily impressed person.

The ELIZA-style bots work very well on the latter kind, as the bot is just repeating your own text back at you with some grammatical remixing, e.g. you say “I am afraid of horses”, bot says “Why do you say you are afraid of horses?”. You can have a very long conversation with yourself that way, as the bot contributes nothing to the discussion. It just provides enough plausible English to keep you talking. Meanwhile, when you have an expert (or really just any person with a little bit of a clue) test ELIZA, the bot falls completely apart within just three lines of dialog. The bot is incredibly basic and really can’t do anything by itself; it completely depends on the user to provide all the content of the conversation.
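To make the "grammatical remixing" concrete, here is a rough Python sketch of that one trick. It is not the original ELIZA script: the `REFLECTIONS` table, the single "I am ..." rule, and the fallback line are all made up for illustration.

```python
import re

# Hypothetical pronoun swaps for illustration; the real ELIZA script was much larger.
REFLECTIONS = {
    "i": "you", "am": "are", "my": "your",
    "me": "you", "you": "I", "your": "my",
}

def reflect(text):
    """Swap first- and second-person words so the user's text can be echoed back."""
    return " ".join(REFLECTIONS.get(word, word) for word in text.lower().split())

def respond(user_input):
    """Echo an 'I am ...' statement back as a question, else use a stock prompt."""
    cleaned = user_input.strip().rstrip(".!?")
    match = re.match(r"i am (.*)", cleaned, re.IGNORECASE)
    if match:
        return f"Why do you say you are {reflect(match.group(1))}?"
    # No rule matched: the bot has nothing of its own to say.
    return "Please tell me more."

print(respond("I am afraid of horses"))
```

Every word of content in the reply comes from the user's own sentence, which is why anything outside the pattern gets only the stock fallback.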

Ferk@kbin.social · 1 year ago

A test that didn’t require a human could theoretically be run automatically by the machine ahead of time and solved easily.

I can’t imagine how you would test this in a way that wouldn’t require a human.

ProcurementCat@feddit.de · 1 year ago

Let two AIs talk to each other and see if they find out that they both aren’t humans?

Ferk@kbin.social · edit-2 1 year ago

The AI can only judge by having a neural network trained on what’s a human and what’s an AI (and btw, for that training you need humans)… which means you can break that test by making an AI that also accesses that same neural network and uses it to self-test the responses before outputting them, providing only exactly the kind of output the other AI would give a “human” verdict on.
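A toy Python sketch of that self-testing loop, just to show the shape of it. `judge_human_score` is a made-up keyword heuristic standing in for the judge's neural network, and `pick_reply` and the `threshold` value are hypothetical names chosen for this example.

```python
def judge_human_score(reply):
    """Toy stand-in for the judge's classifier: returns a 'human-likeness' score in [0, 1]."""
    text = reply.lower()
    score = 0.5
    if "as an ai" in text:
        score -= 0.4  # giveaway phrase, judged less human
    if any(marker in text for marker in ("hmm", "honestly", "i guess")):
        score += 0.3  # casual filler, judged more human
    return max(0.0, min(1.0, score))

def pick_reply(candidates, judge=judge_human_score, threshold=0.7):
    """Self-test candidate replies against the judge and only emit one it would accept."""
    best = max(candidates, key=judge)
    return best if judge(best) >= threshold else None

candidates = [
    "As an AI, I cannot have preferences.",
    "Honestly, I guess I prefer tea.",
    "I prefer tea.",
]
print(pick_reply(candidates))
```

The point is only that whoever can query the judge can filter their own output through it before answering.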

So I don’t think that would work very well; it’ll just be a cat-and-mouse race between the AIs.