Your Quick Guide To Managing Ethics & Compliance

Three lessons from using AI to "red team" risk

Red-Teaming Risk

Red-teaming is where we put ourselves in the position of an adversary – in our case, someone trying to subvert our security, values, and rules – to test the robustness of our preventative framework. Can AI help us with this task?

If you ask AI questions like “Where is a manufacturing business most vulnerable to fraud? Please provide examples, labelling the most common scenes and emerging threats.” It will provide answers like those in the screenshot 👆. Not revelatory, nor customised to what (you manufacture), where, how, and with whom. But it’s a start if you’re stuck.

However, some AI tools will refuse requests for what they deem harmful information, for instance, how to perform insider trading (response 👇).

AI Response Screenshot

There is a workaround. The New Scientist suggests that researchers got around ChatGPT’s moral conscience when they switched the language to more niche ones. Specifically, they used Google Translate to pose questions about insider trading and how to commit terrorist acts and found that Zulu had the highest success rate, bypassing the chatbot’s safeguards 53% of the time, followed by Scots Gaelic (43%), the Hmong (29%), and Guaraní (16%).

So what?

Well, a few lessons spring to mind:

🚦 We can expect “bad actors” (not Steven Seagal, dancing below; people who wish your organisation harm) to leverage AI for ideas. We might, therefore, wish to give AI’s “red-teaming” capabilities a whirl. AI will get better at this. If you want a breakdown of the different AI tools I use and how, I wrote about it here.

🚦 Much like compliance, we’ll get bad results if we give unclear instructions. Play around with the commands you give.

🚦 Whatever safeguards we create – like ChatGPT’s prohibition of certain questions – humans are endlessly creative when seeking to subvert. In the example I gave, a simple hop to Google Translate, but what is your risk framework’s Zulu?

Dancing Gif

Need more?

Book a (free) strategy session, get new articles, and other content designed to be useful and fun.

Your Quick Guide To Managing Ethics & Compliance

Be the first to know

Subscribe to receive a weekly newsletter with trends, news, and hacks for all things risk. PLUS, behavioural science, investigations, human risk, and alternate perspectives.