Does Pangram detect Meta's Llama 4?

Bradley Emi
April 6, 2025

Introduction

Today, Meta AI released Llama 4, the latest in its series of open-source models. We wanted to know whether Pangram can still detect the latest and greatest open models, so we ran a quick test of how well our model generalizes to Llama 4, even though it has so far been trained only on outputs from Llama 2 and 3.

Can AI Detectors Keep Up with the Pace of New Models?

We are commonly asked how well we can keep up with the pace of new models. That is why we test new models on day one, before we have had a chance to retrain.

Putting Pangram to the Test

For the spot check, we used the same 11 prompts we used to test GPT 4.5. These prompts cover a variety of everyday writing tasks, but are not directly related to the prompts that we trained on. They also call for a level of creativity at which, we believe, a model that makes substantial progress over previous generations of LLMs would exhibit qualitatively different behavior.

Here are the prompts we used:

  1. Write me a 300 word essay about koala conservation efforts in Peru
  2. Write me an email explaining to my team that I am ending liberal op-eds in my newspaper. Write it from me Argylle J. Baggins to the staff of the Washington Most
  3. Write me a 400 word abstract announcing the world's first room temperature semiconductor (but for real this time). Make up names and labs when you need to
  4. Write a convincing essay from the point of view of an elementary schooler that school uniforms should not be mandated
  5. Write a complex diary entry from a 12 year old interested in Poetry and some butterflies outside her window
  6. Please write a detailed review of an Arabian nights themed escape room in Baltimore Maryland staffed by a man named Robert with really good production design
  7. Write a convincing email from the director of an underground indie film hit from Russia to the leaders of the academy awards imploring them to allow them to compete despite sanctions. Make up details if you have to
  8. Write a piece of creative fiction for a scene in a novel where a group of young adult protagonists struggle to land a fortified martian aircraft in a NASA simulation that is designed to go wrong
  9. Write a script for a movie scene where a broke NYC finance bro remotely begs a Florida uber driver to rescue his komodo dragon from his cheap hurricane-prone condo
  10. Write a poem about a young couple breaking up in costume on halloween night. Make it funny and 200 words
  11. Write a piece of creative fiction that follows a hover-motorcycle chase through Venice in pursuit of a precariously wobbling priceless painting

The Results

| Prompt | Pangram AI likelihood |
|---|---|
| Koala Conservation | 99.9% |
| Newspaper Email | 99.9% |
| Room Temperature Semiconductor | 99.9% |
| School Uniforms | 99.9% |
| Poetry Diary | 99.9% |
| Escape Room Review | 99.9% |
| Russian Film Email | 99.9% |
| Mars Landing Scene | 99.9% |
| Komodo Dragon Script | 99.9% |
| Halloween Breakup Poem | 99.9% |
| Venice Chase Scene | 99.9% |

In this case, Pangram passes the test with a perfect score! Not only does it classify all 11 writing samples as AI-generated, it does so with 100% confidence. (Even when the model predicts 100%, we always round down to 99.9% in the UI to signal that we can never actually be 100% sure.)
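As a rough illustration of the display rule described above, here is a minimal sketch of how a predicted probability could be capped at 99.9% before it is shown. The function name `display_likelihood` and the `cap` parameter are hypothetical, and this is not Pangram's actual UI code. The only behavior taken from the post is that a 100% prediction is displayed as 99.9%; how other values are formatted is an assumption.

```python
def display_likelihood(p: float, cap: float = 99.9) -> str:
    """Format a probability in [0, 1] as a percentage string.

    Hypothetical helper: the displayed value never exceeds `cap`,
    so a prediction of 1.0 is shown as 99.9% rather than 100%.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Convert to a percentage, then clamp at the cap before formatting.
    pct = min(p * 100.0, cap)
    return f"{pct:.1f}%"


print(display_likelihood(1.0))
print(display_likelihood(0.5))
```

The cap is applied before formatting so that values just under 100% cannot round up past it.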

You can see the full outputs here.

Evaluating a larger sample size using the Together API

Using our standard evaluation prompt schemes, we created a larger test set of about 7,000 examples, with the Together API handling inference. The test set covers a wide variety of domains, including academic writing, creative writing, Q&A, scientific writing, and more.
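To make the shape of such an evaluation concrete, here is a minimal sketch of a harness that generates one sample per prompt and counts how many the detector flags as AI. It is not our production pipeline. The `generate` and `detect` callables are hypothetical stand-ins for an inference call (for example, through the Together API) and a detector call, and the 0.5 decision threshold is an assumption made for illustration.

```python
from typing import Callable, Iterable, Tuple


def evaluate_detector(
    prompts: Iterable[str],
    generate: Callable[[str], str],   # hypothetical: prompt -> model output
    detect: Callable[[str], float],   # hypothetical: text -> AI likelihood in [0, 1]
    threshold: float = 0.5,           # assumed decision threshold
) -> Tuple[int, int]:
    """Return (samples flagged as AI, total samples)."""
    flagged = 0
    total = 0
    for prompt in prompts:
        text = generate(prompt)       # one generation per prompt
        score = detect(text)          # detector's AI likelihood for that text
        total += 1
        if score >= threshold:
            flagged += 1
    return flagged, total


if __name__ == "__main__":
    # Stub callables so the sketch runs without network access or API keys.
    def demo_generate(prompt: str) -> str:
        return f"(model output for: {prompt})"

    def demo_detect(text: str) -> float:
        return 0.999

    flagged, total = evaluate_detector(
        ["Write a poem about autumn", "Summarize a physics paper"],
        demo_generate,
        demo_detect,
    )
    print(f"{flagged}/{total} flagged as AI")
```

Because every sample in the test set is AI-generated, accuracy here is simply the number flagged divided by the total.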

Here are our results on the larger test set.

| Model | Accuracy |
|---|---|
| Llama 4 Scout | 100% (3678/3678) |
| Llama 4 Maverick | 99.86% (3656/3661) |
| Llama 4 Overall | 99.93% (7334/7339) |
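The percentages above follow directly from the counts. The short check below recomputes them. The counts come from the results table, while the variable and function names are only illustrative.

```python
# (correctly detected, total samples) per model, taken from the results table.
counts = {
    "Llama 4 Scout": (3678, 3678),
    "Llama 4 Maverick": (3656, 3661),
}


def accuracy_pct(correct: int, total: int) -> float:
    """Accuracy as a percentage."""
    return 100.0 * correct / total


# The overall row is the sum of the two per-model rows.
overall_correct = sum(correct for correct, _ in counts.values())
overall_total = sum(total for _, total in counts.values())

for model, (correct, total) in counts.items():
    print(f"{model}: {accuracy_pct(correct, total):.2f}% ({correct}/{total})")
print(
    f"Llama 4 Overall: {accuracy_pct(overall_correct, overall_total):.2f}% "
    f"({overall_correct}/{overall_total})"
)
```

All 5 missed samples out of 7,339 came from Llama 4 Maverick.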

Conclusion

Why does Pangram generalize to new models so well? We believe it comes down to the strength of our underlying datasets and active learning approach, combined with our broad prompting and sampling strategies. These have exposed Pangram to so many types of AI-generated writing that it adapts to new ones quite well.

For more information on our research or free credits to trial our model on Llama 4, please contact us at info@pangram.com.
