On a call with journalists this Friday, the FBI issued a stark reminder - and warning - on the proliferation of cyberattacks aided by AI programs. According to the bureau, the number of people turning to AI technology for phishing attacks or malware development is growing at an alarming rate - and the impact of their operations is growing with it.
We've covered reports of how even locked-down AI models such as ChatGPT have been used to develop malware that's adept at evading even the latest security systems. Polymorphic malware - software that continually rewrites its own code so that no two copies share a recognizable signature - used to require a highly skilled black-hat coder (or group of coders) to build; now, that capability has been democratized down to a context window.
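The reason polymorphic code troubles signature-based defenses can be shown with a deliberately harmless toy sketch (the "payloads" below are just print statements, and the scanner is hypothetical): two functionally identical programs hash to completely different fingerprints, so a scanner keyed on known hashes misses every mutated copy.

```python
import hashlib

# Two functionally identical "programs": the second prepends a harmless
# no-op, the kind of trivial mutation polymorphic code applies to itself
# on every new copy.
variant_a = b"print('payload')"
variant_b = b"pass\nprint('payload')"

hash_a = hashlib.sha256(variant_a).hexdigest()
hash_b = hashlib.sha256(variant_b).hexdigest()

# A signature scanner that only knows the original fingerprint
# misses the mutated copy entirely.
known_signatures = {hash_a}
print(hash_a == hash_b)            # False: same behavior, different fingerprint
print(hash_b in known_signatures)  # False: the mutant slips past the scanner
```

This is why defenders have shifted toward behavioral detection rather than static signatures - and why cheap, automated mutation is such an asymmetric advantage for attackers.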
“We expect over time as adoption and democratization of AI models continues, these trends will increase,” said a senior FBI official.
But beyond the walled gardens of OpenAI's ChatGPT and Anthropic's Claude 2, the world of open-source AI is the real focus of law enforcement agencies. That's the world where you can pull a base model of your choice (the most recent example being Meta's open-source Llama 2) and train it on whatever documentation you want - fine-tuning it to your specific needs.
Those needs can range from a self-defeatingly humorous ChatGPT clone hell-bent on world domination (as is the case with BratGPT), to a detour through the dark web and its linguistic subculture (as researchers did with DarkBERT), to subscription-based offerings such as WormGPT.
That last one feels like a particularly thorny issue: would-be hackers can now access the services of a subscription-based, black-hat focused ChatGPT clone. There's no more convenient environment for someone to launch a remote phishing attack.
With these tools, the attacker can easily automate building the entire webpage, the back-and-forth email response chain, and virtually all of the work involved in the process. Or, again, code polymorphic malware that's adept at evading current cybersecurity capabilities - and defense is already at a disadvantage even before AI enters the picture. Perhaps wisely, the FBI didn't identify which specific open-source AI models are being exploited on the call, but the mere recognition that there is an issue is telling enough.
The FBI also stressed security concerns surrounding the proliferation of generative AI technology that can be used in the field of deepfakes - AI-generated content depicting events that never happened in reality. It's hard to overstate how dangerous a deep-faked, digitally unverifiable press conference could become in the world we live in. This issue leads directly to the conundrum of separating synthetic data (generated or otherwise handled by an AI network) from emergent data (the data that naturally emerges from the record of human activity).
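One proposed answer to that conundrum is watermarking: having the generator embed a marker that a verifier can later check. As a deliberately toy illustration - not any vendor's actual scheme, which would typically bias token probabilities rather than insert characters - generated text could carry a hidden pattern of zero-width characters:

```python
# Toy watermark: tag generated text with zero-width characters that a
# human reader never sees but a detector can count. Illustrative only.
ZW = "\u200b"  # zero-width space

def watermark(text: str) -> str:
    """Append an invisible marker after each word of generated text."""
    return " ".join(word + ZW for word in text.split())

def looks_synthetic(text: str, threshold: int = 3) -> bool:
    """Flag text carrying more than `threshold` hidden markers."""
    return text.count(ZW) > threshold

generated = watermark("this press conference never actually happened")
human_written = "this press conference never actually happened"

print(looks_synthetic(generated))      # True
print(looks_synthetic(human_written))  # False
```

The obvious weakness - and the reason real schemes work at the statistical level of token choices - is that a character-level marker like this is trivially stripped by anyone who knows to look for it.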
A number of AI giants including OpenAI, Microsoft, Google and Meta recently vowed at the White House to introduce watermarking technology for AI-generated content - something that also benefits them in the inevitable race to lower training costs, where the leading candidate is what's known as recursive training, and keeping synthetic data identifiable matters. Just this week, however, OpenAI shuttered its AI Classifier tool, whose aim was precisely to identify such synthetic data. Given its mere 26% success rate, closing it may have been for the best.
The proliferation of privately tailored, open-source-derived generative AI technology is a certainty, as the examples linked within this page suggest. Any chance of containing these models vanished the moment they went open-source; and over time, better hardware and techniques will allow anyone with a private AI model to improve upon it.
Like anything, AI usage will vary - ranging from the things the FBI cares about, to building new businesses on the ease of automation, to crafting entire adventures in games such as Cyberpunk 2077. We'll have to see how it goes.