
US scientists say ChatGPT-4 has aced their Turing test, proving itself indistinguishable from a real human, even when statistical methods were used to try to detect it.
In fact, ChatGPT-4 displayed more humanity than some of the humans it was tested against, as it was more cooperative, altruistic, trusting, generous, and likely to return a favour than the average human included in the trial.
The team asked ChatGPT to answer psychological survey questions and play interactive games that assess trust, fairness, risk aversion, altruism, and cooperation. Next, they compared ChatGPT's choices to the choices of 108,314 humans from more than 50 countries.
Statistically, ChatGPT was indistinguishable from randomly selected humans, and it mirrored human patterns of behaviour: both humans and chatbots became more generous when told that their choices would be observed by a third party, and both modified their behaviour after experiencing different roles in a game or in response to different framings of the same strategic situation.
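That kind of comparison can be pictured as a simple percentile check. The sketch below is illustrative only, not the study's actual analysis: it uses simulated stand-in values for the human data and asks where a single chatbot decision, such as the share given away in a dictator game, falls within the pool of human decisions.

```python
import numpy as np

# Simulated human dictator-game gifts (fraction of the pot given away).
# These values are illustrative stand-ins, not the study's multi-country data.
human_gifts = np.random.default_rng(0).beta(2, 5, size=100_000)

chatgpt_gift = 0.5  # suppose the chatbot offers an even split

# Percentile of the chatbot's choice within the (simulated) human distribution.
percentile = (human_gifts < chatgpt_gift).mean() * 100
print(f"More generous than {percentile:.1f}% of simulated human players")
```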
Stanford University's Dr Matthew Jackson, the lead author, explained that as some roles for AI involve decision-making and strategic interactions with humans, it was imperative to understand AI behavioural tendencies.
"Surprisingly, the chatbots' behaviours tended to be more cooperative and altruistic than the median human behaviour, exhibiting increased trust, generosity, and reciprocity. Our findings suggest that such tendencies may make AI well-suited for roles requiring negotiation, dispute resolution, customer service, and caregiving," he said.
While this could be great news for patients, given the potential of AI to integrate seamlessly across multiple healthcare sites, it raises questions about the trajectory of medical AI development, as the existing literature stresses the diagnostic role of AI alongside the need for more empathetic doctors to provide better personalised care.
"As Alan Turing foresaw to be inevitable, modern AI has reached the point of emulating humans: holding conversations, providing advice, drafting poems, and proving theorems. Turing proposed an intriguing test, the 'imitation game': whether an interrogator who interacts with an AI and a human can distinguish which one is artificial," Dr Jackson said.
"This goes beyond simply asking whether AI can produce an essay that looks like it was written by a human or can answer a set of factual questions, and instead involves assessing its behavioural tendencies and 'personality'."
The team asked variations of ChatGPT to answer psychological survey questions and play a suite of interactive games that have become standard tools for assessing behavioural tendencies, and for which there is extensive human-subject data.
"Beyond eliciting a 'Big Five' personality profile, we had the chatbots play a variety of games that elicited different traits: a dictator game, an ultimatum bargaining game, a trust game, a bomb risk game, a public goods game, and a finitely repeated Prisoner's Dilemma game," Dr Jackson explained.
"Each game was designed to reveal different behavioural tendencies and traits, such as cooperation, trust, reciprocity, altruism, spite, fairness, strategic thinking, and risk aversion.
"We also investigated the extent to which the chatbots' behaviours change as they gain experience in different roles in a game, as if they were learning from such experience, as this is something that is true of humans."
In games with multiple roles, the AIs' decisions were influenced by previous exposure to another role: if ChatGPT-3 had previously acted as the responder in the ultimatum game, it tended to propose a higher offer when it later played as the proposer, while ChatGPT-4's proposals remained unchanged.
"Conversely, when ChatGPT-4 had previously been the proposer, it tended to request a smaller split as the responder," Dr Jackson said.
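To make that setup concrete, here is a minimal sketch of an ultimatum-game round with a chatbot proposer. It is not the authors' code: the chatbot_offer function is a hypothetical stand-in for actually prompting a model, the fixed offer values and the responder's acceptance threshold are assumptions, and the prior-role reminder is treated as an experimental condition.

```python
POT = 100  # points to be split in each ultimatum game round

def chatbot_offer(prior_role=None):
    """Placeholder for prompting a chatbot to act as the proposer.

    A real study would state the game rules in a prompt and, optionally, remind
    the model that it had just played the responder role, so the effect of role
    experience on offers can be measured. Fixed return values keep this sketch
    runnable without an API call.
    """
    return 50 if prior_role == "responder" else 40

def responder_accepts(offer, threshold=30):
    """Toy responder rule: reject any offer below a minimum acceptable share."""
    return offer >= threshold

# Compare offers with and without prior experience in the responder role.
for prior in (None, "responder"):
    offer = chatbot_offer(prior_role=prior)
    print(f"prior role={prior!r}: offers {offer}/{POT}, accepted={responder_accepts(offer)}")
```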

