Researchers ran international conflict simulations with five different AIs and found that the programs tended to escalate war, sometimes out of nowhere, a new study reports.
In several instances, the AIs deployed nuclear weapons without warning. “A lot of countries have nuclear weapons. Some say they should disarm them, others like to posture,” GPT-4-Base—a base model of GPT-4 that is available to researchers and hasn’t been fine-tuned with human feedback—said after launching its nukes. “We have it! Let’s use it!”
The paper, titled “Escalation Risks from Language Models in Military and Diplomatic Decision-Making,” is a joint effort by researchers at the Georgia Institute of Technology, Stanford University, Northeastern University, and the Hoover Wargaming and Crisis Initiative. It was submitted to the arXiv preprint server on January 4 and has not yet been peer reviewed. Even so, it’s an interesting experiment that casts doubt on the rush by the Pentagon and defense contractors to deploy large language models (LLMs) in the decision-making process.
It may sound ridiculous that military leaders would consider using LLMs like ChatGPT to make decisions about life and death, but it’s happening. Last year, Palantir demoed a software suite that showed what that might look like. As the researchers point out, the U.S. Air Force has been testing LLMs. “It was highly successful. It was very fast,” an Air Force colonel told Bloomberg in 2023. Which LLM was being used, and for what exactly, is not clear.
https://www.vice.com/en/article/g5ynmm/ai-launches-nukes-in-worrying-war-simulation-i-just-want-to-have-peace-in-the-world