When Faridabad resident Karan received a call from a friend who said he had just met with an accident and needed Rs 30,000 transferred for treatment, he had little reason to be suspicious. The caller sounded exactly like his friend and said he was using someone else’s phone because his own had been damaged in the accident.
Karan frantically transferred the money. Later, when he contacted his friend, he realised he had been the victim of a fraudulent AI voice call. He filed a complaint with the NIT Cyber police station, whose investigation revealed that a fraudster had used an AI voice impersonator to fake his friend’s voice and dupe him of his money. Such cases are being reported all across the country.
Besides voice, criminals have exploited deepfake technology to deceive individuals through calls and videos. For instance, a man in Kerala fell victim to a deepfake call from a scammer posing as a friend with a medical emergency, resulting in a loss of INR 40,000.
The growing accessibility and advancement of AI have once again transformed the nature of cyber crime, and this new criminal industry seems to be mushrooming fast: from WormGPT and FraudGPT to, now, impersonation scams.
A global survey found that 25% of adults have experienced an AI voice scam. India tops the list with an astounding 47% of respondents reporting incidents, followed by the United States at 14% and the UK at 8%.
These scams typically start with voice samples scraped from social media sites like Instagram, Facebook and Twitter; as little as three seconds of audio is enough to clone a voice.
BigTechs’ GenAI Poses New Challenges
Microsoft recently introduced a groundbreaking text-to-speech AI model called VALL-E. In a paper published this month, the company unveiled that VALL-E can replicate a person’s voice using just a brief 3-second recording. Impressively, preliminary findings indicate that VALL-E can even capture and reproduce the emotional nuances of the speaker.
VALL-E is trained on a dataset comprising 60,000 hours of English speech, asserted to be “hundreds of times larger” than those used by existing systems, and the model reportedly outperforms existing approaches to AI-driven voice synthesis.
So, a three-second recording of your voice, paired with something like ElevenLabs’ Multilingual v2, a foundational AI model that supports nearly 30 languages, definitely calls for concern. Additionally, Meta’s SeamlessM4T is capable of translation into 100 languages.
While some are excited about the doors these AI tools could open in marketing, customer service, e-learning and entertainment, others are wary of what they could entail: an industry of AI-enabled criminals using them for all kinds of crimes. A new coming of Jamtara?
Cybercriminals are also using cloning tools like HeyGen, Murf, Resemble AI, Lyrebird, and ReadSpeaker to create indistinguishable voice clones. To make matters worse, these tools are inexpensive, costing as little as $0.60.
These cheap, easily accessible AI voice generators come with numerous tutorials available online. The ease of access to generative AI models has allowed individuals with limited technical knowledge to carry out tasks that were once beyond their capabilities, making it easy for inexperienced actors with ill intent to run scams at scale.
Diamond Cut Diamond
While these scammers are using AI-enabled voice generators, law enforcement is also wielding similar weaponry against them.
To fight back, police have been using AI tools to identify SIM cards linked to such scams, recently shutting down more than 14,000 SIMs in Haryana’s Mewat.
The Indian Department of Telecommunications is also employing an AI-based facial recognition tool called ASTR to combat fraudulent SIM card use.
ASTR encodes human faces in subscriber images using convolutional neural networks, accounting for factors like face angle and image quality. It then compares and groups similar faces, identifying matches with at least 97.5% accuracy.
ASTR can detect all SIMs associated with a suspected face in less than 10 seconds from a database of one crore (10 million) images. Additionally, it employs “fuzzy logic” to find approximate matches for subscriber names, accommodating typographical errors. The tool helps identify individuals holding multiple connections, or SIMs obtained under different names using the same photograph.
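ASTR’s internals are not public, but the two matching steps described above can be sketched in principle: comparing face embeddings against a similarity threshold, and fuzzily matching subscriber names despite typos. The embeddings, names, and thresholds below are illustrative assumptions, not ASTR’s actual parameters.

```python
import math
from difflib import SequenceMatcher

def cosine_similarity(a, b):
    # Cosine similarity between two face-embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def same_face(emb_a, emb_b, threshold=0.975):
    # Hypothetical cutoff; ASTR's published 97.5% figure is a matching
    # accuracy, not necessarily a similarity threshold.
    return cosine_similarity(emb_a, emb_b) >= threshold

def fuzzy_name_match(name_a, name_b, threshold=0.85):
    # Approximate name matching tolerant of typographical errors,
    # standing in for ASTR's "fuzzy logic" step.
    ratio = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
    return ratio >= threshold

# Toy subscriber records: one (made-up) face embedding registered
# under two slightly different spellings of the same name.
records = [
    {"name": "Ramesh Kumar",  "embedding": [0.12, 0.85, 0.51]},
    {"name": "Ramesh Kumaar", "embedding": [0.12, 0.85, 0.51]},
    {"name": "Anita Sharma",  "embedding": [0.91, 0.10, 0.40]},
]

flagged = [
    (r1["name"], r2["name"])
    for i, r1 in enumerate(records)
    for r2 in records[i + 1:]
    if same_face(r1["embedding"], r2["embedding"])
    and fuzzy_name_match(r1["name"], r2["name"])
]
print(flagged)  # [('Ramesh Kumar', 'Ramesh Kumaar')]
```

A production system would of course use a real face-recognition model for the embeddings and an indexed search over millions of records, but the flag-on-two-signals logic is the same.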
The list is also shared with banks, payment wallets, and social media platforms to disconnect these numbers. WhatsApp collaborated with the government to disable fraudulent accounts, with ongoing efforts across other social media platforms.
It’s also crucial to stay alert and adopt proactive measures on your end. If you receive such a call, verify the caller’s identity: use a pre-agreed codeword, or ask a question only the real person could answer.
The post Friend or Fraud? appeared first on Analytics India Magazine.