Today, Llama 4 was released, the latest in a series of open-source models from Meta AI. We wanted to know whether Pangram is still able to detect the latest and greatest open models, so we ran a quick test to see whether our model generalizes to Llama 4, despite currently being trained only on outputs from Llama 2 and 3.
We are commonly asked how well we keep up with the pace of new models, which is why we test them quickly on day 1, before we have a chance to retrain.
For the spot check, we used the same 11 prompts we used to test GPT 4.5. These prompts cover a variety of everyday writing tasks, but are not directly related to the prompts we trained on. They also demand a level of creativity at which, we believe, a model representing a substantial step forward from previous generations of LLMs would exhibit qualitatively different behavior.
Here are the prompts we used, along with Pangram's prediction for each Llama 4 output:
| Prompt | Pangram AI likelihood |
|---|---|
| Koala Conservation | 99.9% |
| Newspaper Email | 99.9% |
| Room Temperature Semiconductor | 99.9% |
| School Uniforms | 99.9% |
| Poetry Diary | 99.9% |
| Escape Room Review | 99.9% |
| Russian Film Email | 99.9% |
| Mars Landing Scene | 99.9% |
| Komodo Dragon Script | 99.9% |
| Halloween Breakup Poem | 99.9% |
| Venice Chase Scene | 99.9% |
In this case, Pangram passes the test with a perfect score! Not only is it able to predict all 11 writing samples as AI-generated, but it is able to do so with 100% confidence. (Although the model predicts 100%, we always round down to 99.9% in the UI to signal that we can never actually be 100% sure.)
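The display-side rounding mentioned above amounts to capping the shown likelihood just below certainty. A minimal sketch (`display_likelihood` is a hypothetical helper, not the actual UI code):

```python
def display_likelihood(p: float) -> str:
    """Cap the displayed AI likelihood at 99.9% so the UI never claims certainty."""
    return f"{min(p, 0.999):.1%}"

print(display_likelihood(1.0))  # 99.9%
print(display_likelihood(0.5))  # 50.0%
```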
You can see the full outputs here.
We created a larger test set of about 7,000 examples using our standard evaluation prompt schemes, leveraging the Together API for inference. The set covers a wide variety of domains, including academic writing, creative writing, Q&A, scientific writing, and more.
Here are our results on the larger test set.
| Model | Accuracy |
|---|---|
| Llama 4 Scout | 100% (3678/3678) |
| Llama 4 Maverick | 99.86% (3656/3661) |
| Llama 4 Overall | 99.93% (7334/7339) |
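The overall figure is just the pooled counts from the two models. A quick sanity check of the arithmetic:

```python
# Per-model results from the table above.
scout_correct, scout_total = 3678, 3678
maverick_correct, maverick_total = 3656, 3661

# Pool the counts, then compute overall accuracy.
overall_correct = scout_correct + maverick_correct  # 7334
overall_total = scout_total + maverick_total        # 7339
accuracy = overall_correct / overall_total

print(f"{accuracy:.2%}")  # 99.93%
```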
Why does Pangram generalize to new models so well? We believe it is the strength of our underlying datasets and active learning approach, along with our broad prompting and sampling strategies, that has exposed Pangram to so many types of AI-generated writing that it adapts to new models quite well.
For more information on our research or free credits to trial our model on Llama 4, please contact us at info@pangram.com.