Digital Disconnect: A straightforward political op-ed has exposed a far more complicated problem in the world of artificial intelligence. Tools designed to identify AI-written content are failing to agree not just on edge cases, but on something as conventional as a Prime Minister's blog.


Prime Minister Narendra Modi on Thursday wrote in an opinion blog that reservation for women in legislative bodies was the "need of the hour", adding that any delay in implementing it would be "particularly unfortunate". "It is imperative that the 2029 Lok Sabha elections and the Assembly elections to the various states in the coming times are conducted with women's reservation in place. Over the decades, there have been repeated efforts to provide women with their rightful place in democratic institutions by the previous governments," PM Modi wrote.

When this exact text was tested across multiple AI detection tools and generative AI platforms, the results were anything but consistent.


Same Text, Completely Different AI Verdicts

Some platforms flagged the op-ed as heavily AI-assisted. Elon Musk's Grok estimated that 80 to 90 per cent of the content was AI-generated, while ChatGPT placed the likelihood between 55 and 70 per cent.

Credit: Screenshot/Grok

Credit: Screenshot/ChatGPT

On the other hand, Google's Gemini assessed the same text as 5-10 per cent AI-generated, and Anthropic's Claude stated, "No, this is not AI-generated. This is an authentic op-ed written by Prime Minister Narendra Modi."

Credit: Screenshot/Gemini

Credit: Screenshot/Claude

ZeroGPT marked 66.8 per cent of the text as AI-generated, Grammarly estimated 17 per cent, while Copyleaks claimed that the article had 50.1 per cent AI involvement. GPTZero's verdict read, "If we had to classify it, it would be considered AI generated".

Credit: Screenshot/ZeroGPT

Credit: Screenshot/Grammarly

Credit: Screenshot/Copyleaks

Credit: Screenshot/GPTZero

The same words, when passed through different systems, produced entirely different conclusions. Reminds me of the 'Blind Men and the Elephant' parable. This lack of consensus raises questions about what these tools are actually measuring.

Jane Austen & US Constitution Face The Same Issue

The problem extends beyond contemporary political writing. According to an article by blogger Yashvi Jain, even classic literature fails these tests.

As per Jain's findings, an excerpt from Jane Austen's seminal 1813 novel 'Pride and Prejudice' was flagged as 73.9 per cent AI-generated by ZeroGPT, while Copyleaks identified it as human-written. GPTZero placed it somewhere in between at 17 per cent.

Legal texts show similar inconsistencies. Portions of the US Constitution have been flagged as over 90 per cent AI-generated by some detectors, with others reaching similar verdicts. These are documents created long before modern computing existed, yet they trigger the same red flags.

Why AI Detectors Struggle With Accuracy

The issue lies in how these tools operate. Unlike plagiarism checkers, AI detectors do not compare text against a database of known sources. Instead, they rely on statistical patterns.

They analyse predictability in word choice, known as perplexity, and variation in sentence structure, often referred to as 'burstiness'. Text that appears too structured, too consistent, or too polished is more likely to be flagged as AI-generated.
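To make those two signals concrete, here is a minimal, purely illustrative sketch in Python. It uses a toy unigram model for "perplexity" (real detectors use large language models) and the spread of sentence lengths as a crude proxy for "burstiness"; the function names and thresholds are this column's own illustration, not any detector's actual method.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def sentence_burstiness(text: str) -> float:
    # Crude burstiness proxy: how much sentence lengths vary,
    # expressed as a coefficient of variation. Uniform sentence
    # lengths (low score) are one pattern detectors associate
    # with machine-generated text.
    cleaned = text.replace("!", ".").replace("?", ".")
    sentences = [s.strip() for s in cleaned.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return pstdev(lengths) / mean(lengths)


def unigram_perplexity(text: str, corpus: str) -> float:
    # Toy perplexity under a unigram model built from `corpus`,
    # with add-one smoothing. Highly predictable word choice
    # (words common in the reference corpus) yields a low score;
    # surprising word choice yields a high one.
    corpus_words = corpus.lower().split()
    counts = Counter(corpus_words)
    vocab_size = len(counts) + 1          # +1 for unseen words
    total = len(corpus_words)
    words = text.lower().split()
    log_prob = 0.0
    for w in words:
        p = (counts[w] + 1) / (total + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(words))
```

Run on sample text, uniform sentences score near zero burstiness while varied ones score higher, and text echoing the reference corpus scores lower perplexity than unfamiliar wording: exactly the statistical tilt the detectors lean on.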

This creates a fundamental flaw. Many forms of legitimate writing, such as political speeches, legal documents, and edited journalism, are naturally structured and formal. As a result, they can resemble AI-generated patterns even when they are entirely human.

The tools also struggle with older styles of writing. Archaic or classical English, which differs significantly from modern usage, can be misinterpreted as artificial because it falls outside the model’s training patterns.

Another factor is the growing use of AI-powered writing assistants. Tools like Grammarly or paraphrasing software subtly standardise language, which can increase the likelihood of content being flagged as AI-generated, even if the original draft was written by a human.

This Is Getting Hard To Ignore

Taken together, these examples point to a larger issue. AI detection tools are not measuring authorship with certainty. They are estimating probability based on patterns that can overlap between human and machine writing. While they may occasionally identify fully AI-generated content, their inconsistent results make them unreliable as definitive arbiters. The same text can be labelled both human and artificial depending on which tool is used.

The debate around AI-generated content is growing, but the tools meant to enforce that distinction appear far from settled. For now, the technology designed to answer the question of authorship may be raising more doubts than it resolves.

This article is written by a human (yours truly). Go ahead, run this through an AI content detector. See what you find!

Digital Disconnect is an ABP Live-exclusive column, where we explore the many admirable advancements the world of tech is seeing each day, and how they lead to a certain disconnect among users. Is the modern world an easier place to live in, thanks to tech? Definitely. Does that mean we don’t long for things to go back to the good-ol’ days? Well, look out for our next column to find out.