What is Wrong with AI Text Detectors? 

In an AI-human tussle in academia last year, professors allegedly accused students of submitting AI-generated assignments. They used AI detectors like Turnitin and GPTZero to get to the bottom of things, only to find that these tools don’t give an accurate analysis and often return false positives.

Professors have since resorted to oral tests, while the number of students using LLMs to write assignments and theses remains high. It has long been established that AI text detectors don’t work, as it is impossible to determine with confidence how much of a piece of writing is human-written and how much is LLM-generated.

Meanwhile, research continues into how to detect AI-produced text accurately. It is a constant cat-and-mouse chase, with both sides improving significantly through different methods of detection and evasion.

Vivek Verma, an undergraduate at the University of California, Berkeley, and the creator of the AI text detector Ghostbuster, said, “There can be more use for the detectors than trying to implicate students on their assignments. These can be used to make the chatbots better at writing them in the first place.”

Difficulty in Detecting AI Text

While one can guess who wrote a piece of text, it is hard to be sure. A user on Hacker News pointed out that textbooks on quantum mechanics scored 96% positive for being AI-generated. That was obviously not the case, but the text followed specific language patterns that overlap with those often found in AI-generated writing.

Detection models are often trained on datasets that do not fully encompass the diversity and complexity of human writing, especially in specialised fields like quantum mechanics. They analyse text based on patterns learned from these datasets, which can lead to inaccuracies when they encounter texts that differ from their training material.

In January last year, OpenAI released a classifier as an AI detection tool, only to discontinue it six months later, citing low accuracy. While other AI detectors like Copyleaks claim 99% accuracy with 0.1% false positives, the AI text generators have stepped up their game.

ChatGPT plugins like Humanize and WriteEasy, or even prompting the model to write in a specific style, are enough to fool a detector. Even so, these models overuse generic adjectives, lean heavily on the passive voice, and follow a formulaic writing style built on a handful of syntactic patterns, with little creativity or originality.

Humans, on the other hand, tend to write with a certain ebb and flow: their sentences vary in length and reach for new, sometimes strange, analogies. AI-generated text, while grammatically correct, lacks factual depth and the deep understanding of a topic needed to make new connections.
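That “ebb and flow” can be made concrete with a crude burstiness measure: the standard deviation of sentence lengths. The sketch below is purely illustrative (not any detector’s actual method); the sentence splitting and sample texts are our own assumptions.

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Standard deviation of sentence lengths, in words.

    Higher values suggest the varied rhythm typical of human prose;
    uniformly sized sentences score low. Crude illustration only.
    """
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

human = ("It rained. The old professor, hunched like a question mark over "
         "his notes, kept writing anyway. Nobody asked why.")
uniform = ("The model writes clear sentences. The model keeps them the same "
           "size. The model rarely varies its rhythm at all.")

print(burstiness(human) > burstiness(uniform))  # varied prose scores higher
```

A real detector would combine many such stylometric signals rather than rely on any single one.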

These and other insights can be found in the paper ‘AI vs Human’.

An Endless Cycle

The website Verma developed was the result of a paper titled ‘Ghostbuster: Detecting Text Ghostwritten by Large Language Models’ from the University of California, and uses a new method to test material for authenticity.

The method runs the text to be checked through three weaker language models. Verma said, “We realise these detectors are not completely trustworthy, but we can use them to differentiate between the origin of the text to train new models.”
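The shape of that idea can be sketched as follows: score each token with several weak models, then collapse those per-token probabilities into a feature vector a downstream classifier could be trained on. Everything here is a stand-in assumption (the toy frequency table, the smoothing values, the mean/min features); Ghostbuster’s actual weak models and feature search are described in the paper.

```python
import statistics

# Toy frequency table standing in for a real language model's vocabulary.
COMMON = {"the": 0.07, "of": 0.03, "and": 0.03, "model": 0.001}

def weak_model(smoothing):
    """Build a stand-in 'weak model': token -> smoothed probability."""
    def prob(tok):
        return COMMON.get(tok.lower(), 0.0) + smoothing
    return prob

# Three weak models, differing only in smoothing for illustration.
models = [weak_model(s) for s in (1e-4, 1e-3, 1e-2)]

def features(tokens):
    """Combine per-token probabilities from the weak models into a small
    feature vector (mean and min per model) for a downstream classifier."""
    feats = []
    for prob in models:
        ps = [prob(t) for t in tokens]
        feats.extend([statistics.mean(ps), min(ps)])
    return feats

print(features("the model writes the text".split()))
```

The point is that even unreliable scorers, combined, can yield features that separate human from machine text better than any single detector’s verdict.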

Another use case, he explains, is making the models better at writing text in the first place. “Often chatbots like ChatGPT produce text that is too verbose. We can use the detectors as an evaluation tool to sort out the responses from better to worse,” he said.

The next step is to identify accurately which parts of a text were written by humans and which by AI. Detectors like GLTR (Giant Language Model Test Room), developed by Harvard and the MIT-IBM Watson AI Lab, highlight regions with a higher probability of being AI-generated.
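GLTR’s core idea is to ask a language model itself how predictable each token is: tokens the model would rank near the top look machine-like, while low-ranked surprises look human. A toy sketch of that ranking, using a hand-built bigram table in place of a real LLM (the corpus and the rank interpretation are illustrative assumptions, not GLTR’s implementation):

```python
from collections import defaultdict

# Toy "language model": bigram counts from a tiny reference corpus.
corpus = "the model writes the text and the model writes the answer".split()
bigrams = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def token_ranks(tokens):
    """For each token, its rank among the model's predictions at that
    position (rank 1 = the model's top guess); unseen tokens get None."""
    ranks = []
    for prev, tok in zip(tokens, tokens[1:]):
        preds = sorted(bigrams[prev], key=bigrams[prev].get, reverse=True)
        ranks.append(preds.index(tok) + 1 if tok in preds else None)
    return ranks

# Runs of rank-1 tokens look "machine-like"; None marks a token the
# model would not have predicted at all.
print(token_ranks("the model writes the text".split()))
```

GLTR presents the same information visually, colour-coding each token by the rank bucket it falls into.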

“The final detectors should also explain why it categorises the content in each bucket and its reasoning behind it, which is what I’m working on next,” Verma concluded.

The post What is Wrong with AI Text Detectors? appeared first on Analytics India Magazine.

