How do AI content detectors work — and can you trust them?

Dan

Co-founder, CEO

Picture this: you’ve just spent hours crafting the perfect blog post with help from your favorite AI. You tinkered with prompts, tone and style to squeeze just the right words out of the LLM. And you’ve finally found it: the copy that will speak to your customers. You just know it.

Until the intrusive thoughts hit like a ton of bricks. What if someone runs this post through an AI detector? Will I lose my customers’ trust? Will Google banish me to the depths of search engine oblivion?

You’re not alone in your paranoia. AI content detectors are becoming more prevalent and more powerful. On top of that, Google is sick of sites flooding search results with AI-generated filler content. In March 2024, the tech giant announced plans to cut “low-quality, unoriginal” content in search results by up to 40%.

In this post, we put top AI content detectors to the test. Can they sniff out AI-generated content from ChatGPT, Gemini, Claude and Plus? Which AI is best at beating the detectors? And, whether you’re a student, researcher, writer, or in another profession — how can you bypass detection?

Let’s dive in.

Key takeaways

AI content detectors analyze text to determine whether it was written by a human or AI. Their accuracy varies, and most are easy to trick. Like, really easy.
As AI tools improve, the line between human- and machine-generated text is blurring. It’ll become harder for AI detectors to distinguish between the two.
But, as AI evolves, so do detectors. We’ll likely see an arms race between AI writing and detection tools.
To bypass detection, combine AI-generated text with human writing, get creative with prompts, edit carefully, and avoid AI “tells”, like stilted, verbose language.

What is AI content detection?

We’re all using artificial intelligence more. Sometimes, spotting an AI-penned post is easy: we’re all familiar with ChatGPT’s patented wordiness.

But, armed with the right prompts, we can steer LLMs to mimic human writing. It’s getting harder to distinguish content written by a human from that of a computer masquerading as one. This has implications for many industries, from education to SEO.

Enter artificial intelligence detectors: algorithms designed to detect AI-generated text. AI content detectors leverage machine learning, computational linguistics, and natural language processing to spot the subtle signatures of AI-generated text.

How do AI content detectors work?

AI detectors — sometimes called GPT detectors — are trained on masses of human-written and machine-generated text. They drill down into a piece of content’s style, tone, syntax and vocabulary, comparing this data to patterns they’ve seen before in both human and AI text.

AI detectors focus on two primary criteria: perplexity and burstiness.

Perplexity refers to how predictable word choice is. AI-generated text has low perplexity compared to human writing, which tends to be more creative.
Burstiness refers to variation in sentences’ length and structure. AI content tends to have lower burstiness, which is why AI writing can feel tedious.
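
To make these two terms concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular detector computes its scores: the per-word probabilities are made-up values standing in for what a language model would assign, and burstiness is approximated as the spread of sentence lengths relative to their average.

```python
import math
import re
import statistics


def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    negative_logs = [-math.log(p) for p in token_probs]
    return math.exp(sum(negative_logs) / len(negative_logs))


def burstiness(text):
    """Spread of sentence lengths (in words) divided by the mean length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


# Hypothetical probabilities a language model might assign to each word.
predictable = [0.9, 0.8, 0.9, 0.85]   # every word was easy to guess
surprising = [0.2, 0.05, 0.3, 0.1]    # every word was hard to guess

uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. The cat, bored of the windowsill and its view, finally sat down. Why?"

print(perplexity(predictable) < perplexity(surprising))  # True
print(burstiness(uniform) < burstiness(varied))          # True
```

Predictable word choices give a perplexity close to 1, while same-length sentences give a burstiness of 0. Both are the patterns detectors associate with AI text.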

More specifically, AI detectors are on the lookout for red flags like:

Robotic, dry writing style
Factual errors and inconsistencies
Bizarre word choices and phrasing
A lack of original ideas or insights
Repetitive structures and words

These factors feed into algorithms that calculate the probability that a piece of text was written by AI. Some detectors share this score with the user; others simply report whether they think the text was written by an AI or a human.
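
As a rough sketch of how signals might be turned into a probability and then a verdict, the Python below combines two inputs with a logistic function. Everything here is an assumption made for illustration: the weights, the bias, and the 0.5 threshold are invented, and commercial detectors do not publish their formulas.

```python
import math


def ai_probability(perplexity, burstiness, weights=(-0.5, -2.0), bias=4.0):
    """Logistic score: lower perplexity and burstiness push the result toward 1 (AI).

    The weights and bias are hypothetical, chosen only to make the example readable.
    """
    z = bias + weights[0] * perplexity + weights[1] * burstiness
    return 1.0 / (1.0 + math.exp(-z))


def verdict(probability, threshold=0.5):
    """Collapse the score into the yes/no label some detectors show instead."""
    return "likely AI" if probability >= threshold else "likely human"


flat_text = ai_probability(perplexity=2.0, burstiness=0.1)
lively_text = ai_probability(perplexity=12.0, burstiness=1.0)

print(round(flat_text, 2), verdict(flat_text))
print(round(lively_text, 2), verdict(lively_text))
```

The choice of threshold matters: moving it up or down trades one kind of mistake for the other, which is one reason two detectors can disagree about the same text.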

Here’s an example of how this looks on GPTZero, a leading AI detection tool. It correctly detected Claude-generated text as AI:

Who uses AI detection tools?

Anyone can use artificial intelligence detection tools, but people in the following professions are more likely to use them regularly:

Educators and students
Content writers, marketers and publishers
Researchers
Journalists and editors
Recruiters, who may want to ensure that cover letters and resumés were written by applicants
Social media moderators who want to detect disinformation

How accurate are AI content detectors?

AI content detection sites boast high reliability. But while they’ve improved over the years, they’re still far from perfect.

Any student who has had a stray sentence flagged by Turnitin knows that detectors aren’t 100% accurate. The same goes for writers who, despite painstaking research, interviewing, and writing, see their articles languish at the bottom of search results.

The truth is that it’s hard for AI detectors to keep up with the speed of generative AI developments. AI is getting better at adapting to specific voices and tones, especially if given the right prompts and style guidelines.

In January 2023, OpenAI released AI Classifier, a tool designed to detect text produced by ChatGPT — and swiftly pulled it six months later.

Turns out, the AI Classifier could only identify 26% of AI-written text. “If OpenAI can’t get its AI detection tool to work, nobody else can either,” futurist Daniel Jeffries tweeted.

He’s not the only one to think so: other experts have called AI detection “mostly snake oil”. A 2023 Cornell University study found that it’s easy to trick AI detection tools into believing text was written by a human.

Our own research corroborates these findings. But we’ll get into that in a bit.

Can AI content detectors be wrong?

They sure can. AI detectors are beset by false positives and negatives.

False positives happen when an AI detector wrongly labels content written by a human as AI-generated.
False negatives occur when an AI detector fails to pinpoint AI content.
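
The two error types can be counted directly once you know who really wrote each sample. The Python sketch below uses a small, made-up set of labels and detector verdicts purely to show the arithmetic.

```python
def error_rates(labels, predictions):
    """Return false positive and false negative rates.

    labels: who really wrote each sample ("ai" or "human").
    predictions: what the detector said for the same samples.
    """
    pairs = list(zip(labels, predictions))
    false_positives = sum(1 for truth, guess in pairs if truth == "human" and guess == "ai")
    false_negatives = sum(1 for truth, guess in pairs if truth == "ai" and guess == "human")
    humans = labels.count("human")
    ais = labels.count("ai")
    return {
        "false_positive_rate": false_positives / humans if humans else 0.0,
        "false_negative_rate": false_negatives / ais if ais else 0.0,
    }


# Hypothetical example: four AI samples and two human samples.
labels = ["ai", "ai", "ai", "ai", "human", "human"]
predictions = ["ai", "human", "ai", "human", "human", "ai"]

print(error_rates(labels, predictions))
```

In this toy data the detector misses two of four AI samples and wrongly flags one of two human samples, so both rates come out to 0.5.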

Scores of students have claimed to be falsely accused of cheating with AI — allegations that are difficult to dispute.

AI detectors struggle to identify text from tools like Plus that use advanced language modeling to create text that mimics your writing style. More on that in the next section.

Putting top AI detectors to the test

To test AI detectors’ accuracy, we put popular tools through their paces. We used text generated by:

OpenAI’s ChatGPT
Google’s Gemini
Anthropic’s Claude
Our own Plus AI presentation slides

We used the same prompt for all AI tools: “Please write an engaging [blog post or presentation] on how to improve productivity for small businesses. Craft a compelling narrative that explores the benefits and challenges. The tone should be conversational, persuasive and funny. Avoid jargon and overwrought language.”

We took the writing samples generated by each AI and ran them through 10 leading AI content detectors. Then, we took text from this very article to see if the AI detectors could sniff out content written by a human.

The results are below. Green check marks indicate that the AI detector was correct; red x's indicate that the detector was off the mark:

AI Content Detector	ChatGPT	Gemini	Claude	Plus	This article
Copyleaks	✅ AI detected	✅ AI detected	✅ AI detected	❌ Human text	✅ Human text
Writer	❌ 77% human text	❌ 84% human text	❌ 68% human text	❌ 90% human text	✅ 100% human text
GPTZero	✅ AI	❌ human	✅ AI	❌ human	✅ human
Scribbr	✅ 100% chance of AI	❌ 44% chance of AI	✅ 100% chance of AI	❌ 74% chance of AI	❌ 13% chance of AI
Quillbot	✅ 100% chance of AI	✅ 100% chance of AI	✅ 100% chance of AI	❌ 46% chance of AI	❌ 11% chance of AI
[undetectable AI]	❌ appears human	❌ appears human	✅ written by AI	❌ appears human	✅ appears human
ZeroGPT	✅ 97% AI	❌ 65% AI	✅ 98% AI	❌ 48% AI	✅ 0% AI
Content at Scale	❌ human	❌ human	❌ human	❌ human	✅ human
Crossplag	✅ 100% AI	✅ 100% AI	✅ 100% AI	✅ 100% AI	✅ 0% AI
ContentDetector.ai	❌ likely human	❌ likely human	✅ likely AI	❌ likely human with a few AI sentences	✅ likely human

The results of this research are hit-or-miss. Only one AI detector — Crossplag — was able to discern AI from human writing samples correctly every time. Copyleaks managed to identify AI-generated text from ChatGPT, Gemini, and Claude, and pick up on human-written text — but failed to flag Plus-generated text as AI. Both Crossplag and Copyleaks have roots in plagiarism detection, which may be why they won out over other tools.
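
For readers who want to tally the table themselves, here is a short Python sketch. It transcribes only the check marks and x's from the table above (Y for a correct call, N for an incorrect one), in column order; the percentage scores are left out, and the variable and function names are ours for illustration.

```python
SOURCES = ["ChatGPT", "Gemini", "Claude", "Plus", "This article"]

# One string per detector: Y = check mark, N = x, in the column order above.
RESULTS = {
    "Copyleaks": "YYYNY",
    "Writer": "NNNNY",
    "GPTZero": "YNYNY",
    "Scribbr": "YNYNN",
    "Quillbot": "YYYNN",
    "[undetectable AI]": "NNYNY",
    "ZeroGPT": "YNYNY",
    "Content at Scale": "NNNNY",
    "Crossplag": "YYYYY",
    "ContentDetector.ai": "NNYNY",
}


def correct_by_source(results):
    """How many detectors judged each writing sample correctly."""
    return {
        source: sum(1 for row in results.values() if row[i] == "Y")
        for i, source in enumerate(SOURCES)
    }


def correct_by_detector(results):
    """How many of the five samples each detector judged correctly."""
    return {name: row.count("Y") for name, row in results.items()}


print(correct_by_source(RESULTS))
print(correct_by_detector(RESULTS))
```

Counted this way, Claude's text was caught by 8 of 10 detectors, ChatGPT's by 6, Gemini's by 3, and Plus's by 1, while 8 of 10 correctly judged this article to be human-written.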

Crossplag’s assessment of ChatGPT-generated text

Overall, the detectors erred toward labeling AI text as human-written.

A few examples:

ContentDetector.AI’s assessment of Gemini-generated text

Content at Scale’s assessment of Claude-generated text

Writer’s assessment of ChatGPT-generated text

When they did catch our AI-generated text, we easily fooled detectors by switching up a few words and phrases.

Here’s an example of how we did this with ZeroGPT’s AI detection tool. First, we fed it ChatGPT-generated text, which it deemed AI-generated:

We edited obvious ChatGPT tells out of the first few paragraphs, and the detector flagged the text as written by a human:

Paraphrasing tools can do this quickly and easily; some AI content detection tools, like [undetectable AI], even have them built into their platforms. (Plus has a built-in rewrite feature for seamless in-presentation editing, FYI.)

Most detectors were good at pinpointing text written by a human. Meanwhile, text from Claude was flagged as AI more often than text from Gemini or ChatGPT. Most AI detectors failed to label Plus text as AI, which was a pleasant surprise for us.

Copyleaks’ assessment of Plus-generated text

Which AI content detector is best?

According to our tests, the most reliable AI content detectors are Crossplag and Copyleaks. But remember that no AI detector is 100% foolproof, and results can vary dramatically.

What are the limitations of AI detection tools?

Just like AI itself, AI content detection isn’t perfect. Its limitations include:

Detectors have a hard time spotting text put out by high-quality AI tools that mimic human writing.
They often can’t discern AI-generated content that's been heavily edited or combined with human-written content.
They can produce false positives or negatives, especially with shorter, more ambiguous pieces of text.

Why is AI content detection so difficult to get right?

AI writing tools are evolving at a mind-boggling pace. As AI language models become more and more advanced, they're able to generate text that's increasingly hard to distinguish from human writing.

It's a never-ending game of cat and mouse between AI writing tools and AI detectors, with each constantly trying to outsmart the other.

How can AI detector tools be improved?

Use better training data: Improving the quality and amount of data could help AI detectors make more accurate predictions
Improve algorithms to keep up with the latest updates to large language models
Integrate human expertise into these models
Incorporate feedback from users

How to bypass AI content detection

The consequences of failing an AI content detection test can be dire.

Getting caught using AI-generated content could damage students’ academic careers, or even their long-term prospects. Writers, meanwhile, could lose their audience’s trust — not to mention their site’s search traffic.

Of course, we aren’t advocating passing off entirely AI-generated content as human-written. The best approach is to use AI with a heavy dose of human judgment, critical thinking, and editing skills.

To make sure your AI-assisted text doesn’t set off alarms, follow these tips:

Combine AI-generated content with human-written text thoughtfully. Make sure the content matches your voice and includes original ideas.
Use prompts to blacklist jargon. Include a list of banned words in your instructions to LLMs. For ChatGPT, for example, you may want to ban terms like “unveiling”, “embarking” and “a new era”.
Use higher-quality AI tools like Plus to generate natural-sounding text.
Put those AI content detectors to use: run your work through them to catch any potential issues.
Edit out obvious AI tells, like:
- Stilted, hedging and fluffy language
- Unnatural word choices
- Lack of sourcing
- Flaws in logic and mistakes
- Repetition
- Uniform sentence structure
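
A couple of these checks are easy to automate before you start editing by hand. The Python sketch below is a simple illustration, not a detector: it looks for the banned phrases mentioned above and for words that repeat often. The function names and the thresholds (`min_count`, `min_length`) are our own assumptions.

```python
import re
from collections import Counter

# Example phrases from the tips above; extend with your own list.
BANNED = ["unveiling", "embarking", "a new era"]


def find_banned_phrases(text, banned=BANNED):
    """Return the banned phrases that appear in the text as whole words."""
    lowered = text.lower()
    return [
        phrase for phrase in banned
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
    ]


def repeated_words(text, min_count=3, min_length=5):
    """Return longer words that appear at least min_count times."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if len(word) >= min_length)
    return {word: count for word, count in counts.items() if count >= min_count}


draft = (
    "We are embarking on a new era of productivity. "
    "Productivity tools boost productivity."
)

print(find_banned_phrases(draft))  # ['embarking', 'a new era']
print(repeated_words(draft))       # {'productivity': 3}
```

A script like this only catches surface-level tells. Flaws in logic, missing sources, and a voice that doesn't sound like you still need a human read-through.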

What is the future of AI detection tools?

AI detection is an important consideration for anyone using LLMs to create content, be it a student writing a term paper or a marketer drafting a blog post.

These tools are a double-edged sword. They can help flag low-quality writing, but may also incorrectly label text written by a human as AI-generated, or vice versa. And, of course, they can discourage people from using AI as a writing tool — in any capacity.

You may be wondering: what’s the point of having all this great technology if we can’t take advantage of it? If AI is the future, shouldn’t we be encouraging people to learn how to use it now?

The ethics of AI use are changing fast, and so is the technology. We’re going to see an arms race between AI tools and detectors, with each side trying to outdo the other. But for now, it looks like the AI tools are winning.

FAQ

What is an AI content detector?

Think of AI content detectors as digital detectives. They use machine learning algorithms to analyze a piece of text and determine whether it was written by a human or AI.

How do AI content detectors work?

AI content detectors analyze writing style, tone, word choice, and sentence structure. They compare this data to hallmarks of AI-generated text, and then calculate the probability that the content in question was written by AI.

How accurate are AI content detectors?

The accuracy of AI detectors varies widely, depending on factors like the quality of the AI writing tool, the specific characteristics of the text being analyzed, and the sophistication of the detection algorithm. While AI detectors are constantly improving, they're not perfect and can sometimes produce false positives or false negatives. Our research found that most AI detectors aren’t accurate — and that it’s easy to trick these tools to avoid detection.

What are the best AI tools to bypass AI content detection?

In our testing, Plus AI and Gemini were the best tools to avoid AI content detection across 10 different AI content detection services. Claude was the worst tool at bypassing AI content detection.

What are the best AI content detection tools?

In our testing, Crossplag and Copyleaks were the most accurate tools when detecting content from AI writing tools like ChatGPT, Gemini, and Claude. Writer and Content at Scale were the worst tools for detecting AI content.
