Call of Duty's new AI chat moderator can tell "hate speech" from "friendly banter"
Or so it says
This year's Call Of Duty: Modern Warfare 3 will make use of Modulate's ToxMod AI to help moderate voice chat in multiplayer, Activision have revealed. The system won't detect offences in real time, nor will it be able to kick you from games for effing and blinding. It just submits reports of bad behaviour, with Activision staff making the final call on enforcement actions, which might involve reviewing recordings to work out the context of an outburst.
You can turn off voice chat in Call of Duty to avoid being recorded, but at the time of writing there doesn't appear to be a way to opt out of AI voice moderation in the current or upcoming Call of Duty games. The tech is being beta-tested from today, 30th August, in Modern Warfare 2 and Warzone in North America, with the English-language worldwide release to land alongside Modern Warfare 3 on 10th November.
Intriguingly, Modulate boast that "ToxMod's advanced AI intuitively understands the difference between friendly trash talk and truly harmful toxic behavior". Given that many non-artificial intelligences, aka human beings, struggle with this distinction, I'm interested and a little fearful to see their working. The tech doesn't just listen out for naughty keywords, but "assesses the tone, timbre, emotion, and context of a conversation to determine the type and severity of toxic behavior".
In a release on Modulate's site, Activision's chief technology officer Michael Vance described the introduction of ToxMod to COD as a "critical step forward to creating and maintaining a fun, fair and welcoming experience for all players." The game already makes use of text-based filtering for in-game chat and usernames, on top of a player-reporting system.
Modulate - founded in 2017, with clients including VR outfit Schell Games and Riot - have an Ethics page in which they address a few of the more common criticisms directed at AI, such as the likelihood of "neutral" algorithms and datasets passing on the social biases of their creators, and the reliance of many large language models on copyrighted info or info given without express permission.
"We consider it not only vital for our business, but also our responsibility, to ensure that all of our machine learning systems are trained on a representative and broad set of different voices corresponding to a range of age, gender identity, race, body type, and more," it reads. "Whenever possible, we endeavor to utilize large public datasets which already have been studied for representativeness, though in some cases we are required to gather our own reference speech.
"When this is the case, we've made a point to either hire a wide range of voice actors (where acted speech is sufficient) or to work with our trained data labeling team to identify a sufficiently diverse test set."
Modulate claim the ToxMod system is designed and tested "to ensure we are not more likely to flag certain demographics as harmful, given the same behavior. That said, we do occasionally consider an individual's demographics when determining the severity of a harm. For instance, if we detect a prepubescent speaker in a chat, we might rate certain kinds of offenses more severely due to the risk to the child.
"We also recognize that certain behaviors may be fundamentally different depending on the demographics of the participants," it goes on. "While the n-word is typically considered a vile slur, many players who identify as black or brown have reclaimed it and use it positively within their communities. While Modulate does not detect or identify the ethnicity of individual speakers, it will listen to conversational cues to determine how others in the conversation are reacting to the use of such terms."
There's also a privacy page detailing what voice data Modulate gathers in order both to moderate chat and "train" its AI, and how this data is stored and anonymised. Any thoughts on all this?