Jason is an English and Philosophy teacher at New Roads School in Los Angeles.
I am commonly asked what AI detectors can and cannot do. Today, we’ll explore some of the most common myths and misconceptions about AI detection. Let’s dive in!
We should incorporate AI into our teaching, and we should teach students how to use AI! But framing AI integration as an alternative to AI detection pits two ideas against each other that are not actually in conflict. AI detection is a necessary prerequisite for AI incorporation: it puts sensible guardrails around the technology and ensures it is being used in an assistive, rather than abusive, capacity.
There is a certain Luddite fallacy that surfaces whenever new technology emerges and any restriction on it is suggested - in this case, the assumption that “detection equals deterrence,” which is false.
In practice, the opposite is true. Those seeking strong AI detection tools are usually the ones who want to use AI the most in their classrooms: they want to use the tools, but they don’t want the tools to be abused. There are others who are so eager about the future of AI that they want to use AI tools in their classrooms without restriction, on the grounds that “this is the future”; they welcome no guardrails. And there are those who abhor AI - the “pen and paper” types, individuals or schools who have decided that the only valid approach to AI is not to have any of it. There is no need for detection when the class is returned to pre-computerized days. The truth, however, is that those seeking AI detection are usually the people most interested in maximizing the use of AI for learning in the classroom. They want to avoid unnecessary harm while experimenting and expanding their craft with AI. This seems to us the right approach.
While it is true that other AI detectors are not transparent about their methods, Pangram has openly shared their methodology, because they believe it is important to gain the trust of the research community and show factual evidence of why the software is accurate. Pangram provides an interactive, animated demonstration of their methodology on their website.
Pangram also publishes some of their technical innovations at AI conferences and in journals. For example, they recently presented work at the COLING conference describing how the system is robust to humanizers and paraphrasers.
Not only has Pangram’s work been peer reviewed, it can be reviewed by anyone at any time.
Pangram has recently been featured and benchmarked in several peer-reviewed works. Pangram won the award for the most accurate and robust detector in the COLING Shared Task, a competition featuring several open-source and commercially available AI detectors.
Pangram was also recently featured in research from the University of Maryland showing that it is the only automated AI detector that outperforms trained human experts in detecting AI-generated text, and in another research paper from the University of Houston demonstrating that Pangram is the only AI detector that is robust to translation.
Older research studies that are commonly cited, such as the Weber-Wulff study from 2023 and the Liang study demonstrating that AI detectors are biased against ESL writers, do not benchmark Pangram. Not only are these studies outdated, but we have demonstrated that Pangram excels on these benchmarks whereas other detectors do not.
Pangram is not afraid to be stress-tested by researchers, and that’s why they give unlimited free access to academic researchers wanting to study the accuracy of Pangram’s AI detector.
I am often approached by people claiming that their work was flagged as AI when it was human-written. Unfortunately, I think a few things are going on here.
There are people who believe AI detectors are simply no good, because authors and institutions keep repeating that claim without evidence. Take, for example, this article, which asserts that “As of mid-2024, no detection service has been able to conclusively identify AI-generated content at a rate better than random chance, and Illinois State University does not have a relationship with any of these services.” That is an unsupported claim: even the worst AI detectors catch some AI content at better than chance rates. Pangram reports a 1-in-10,000 false positive rate because, in their development and methodology (which can be read in their white paper), that is the rate of incorrect detections they actually observe - about 100x better than the next best commercial software available.
No detection software can be 100% accurate; that is simply not possible. AI detectors are generally good, and Pangram’s detection is better, but none of them is perfect. Still, consider the statistics: if you ran two pieces of writing that were claimed to be human through Pangram (or any accurate detector) and both were flagged as AI, it is far more likely that the writing was actually produced by AI than that the detector made two independent mistakes. This is the problem schools want to solve - being confident before asserting that something was written with AI, and avoiding accusations against writing that was not. With Pangram, we can be much more confident that a flagged piece of writing is AI-generated than we can be in a person’s claim that it is not.
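To make that intuition concrete, here is a rough back-of-the-envelope sketch - my own illustration, not Pangram’s published methodology. It assumes the 1-in-10,000 false positive rate quoted above, a hypothetical 1% rate for a weaker detector, an assumed grading load of 500 essays per year, and independence between documents:

```python
# Back-of-the-envelope sketch: what a 1-in-10,000 false positive rate implies.
# The 1% rate and the 500-essay workload are illustrative assumptions.

fpr_pangram = 1 / 10_000   # published false positive rate
fpr_weaker = 1 / 100       # hypothetical weaker detector
essays_per_year = 500      # assumed yearly grading load for one teacher

# Expected number of human-written essays incorrectly flagged as AI
print(essays_per_year * fpr_pangram)  # 0.05 -> roughly one false flag every 20 years
print(essays_per_year * fpr_weaker)   # 5.0  -> several false flags every single year

# Chance that two separate human-written pieces are BOTH falsely flagged,
# assuming the errors are independent
print(fpr_pangram ** 2)               # 1e-08, i.e. about one in 100 million
```

Real documents are not perfectly independent, of course, but the orders of magnitude are the point: at these rates, repeated false flags on a student’s genuine work become vanishingly unlikely.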
There is a misconception that generative AI help, such as Grammarly, won’t be detected. Maybe that is true with other detectors, but Pangram will detect writing that contains a significant amount of generative AI assistance. That means that, yes, the paper you wrote is your own, but it is being flagged for AI because you used a significant amount of AI to “clean it up”. I see this all the time with students.
Grammarly is no longer just a grammar checker. It is a full-on AI assistance tool that will completely rewrite student essays using a large language model. If a student uses Grammarly in this way, to fundamentally alter the composition and styling of their original writing, Pangram will detect the essay as AI-generated.
This is why I strongly encourage teachers to adopt an AI policy, such as the tier system on Pangram’s website, so that there are no misconceptions about what kinds of AI assistance in the writing process are allowed and what constitutes misconduct.
Detractors of AI detection commonly say that falsely accusing students of using AI causes irreparable harm to a student’s reputation and a teacher’s credibility.
However, Pangram is not an accusatory tool, in and of itself.
Most of the time, in my experience, a positive detection of AI in a student’s writing reflects a misunderstanding, or simply a well-intentioned student who was overwhelmed by deadline pressure. A simple conversation between a teacher and a student does not have to be confrontational in nature. We believe that teachers should use the opportunity to understand the student’s writing process: ask the student how well they understand the underlying material, look at the revision history to get a sense of how the document was put together, and ask the student to explain if and how they used AI assistance in the editing process, rather than immediately jumping to the conclusion that the student intended to cheat.
Pangram often compares AI detectors to metal detectors: when a metal detector goes off, you do not immediately get arrested. Rather, a positive detection is a reason to open up further conversation and gain a deeper understanding of what’s really going on.
It’s important - as with any tool - that teachers understand both the strengths and the limitations of AI detection.
While systems such as Pangram are extremely accurate in detecting AI-generated text, mistakes, though rare, do occur.
That is why it is vital that teachers establish clear guidance, policies, and boundaries in their classrooms about what kinds of AI assistance are allowed. Positive detections from Pangram should be taken seriously, but conversations around AI usage should be approached with empathy and curiosity. A positive detection from Pangram should never be used in isolation to punish or accuse a student of academic misconduct without a deeper understanding of the student’s writing process.
Want to continue the conversation? Jason is happy to talk and provide further guidance on establishing an AI policy for your classroom. He can be reached at jason@pangram.com.