If you pay close enough attention, you can often find indicators in a text that suggest it was written by AI. Certain sentence structures, word choices, or formatting appear far more often in AI text than in human text.
You may already know a few phrases that seem to mark a text as AI-generated: “complex tapestry”, “a testament to”, or even the single word “delve”, which writer and investor Paul Graham widely publicized.
Now, if you’re a regular user of the Pangram dashboard, you may have also noticed that we’ve begun to highlight overused AI phrases, such as “complex tapestry” in the AI essay below.
This is Pangram’s new AI Phrases tool! Here’s how it works:
When you scan a document and Pangram detects it was generated by AI, we perform a second scan for common AI phrases.
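As a rough illustration of that two-step flow, here is a minimal sketch in Python. It is not Pangram's implementation: the `is_ai_generated` flag stands in for the detection model's verdict, the phrase list is hand-written for the example, and the names `PhraseHit`, `find_ai_phrases`, and `scan` are hypothetical.

```python
import re
from typing import List, NamedTuple


class PhraseHit(NamedTuple):
    phrase: str  # the phrase that matched
    start: int   # character offset where the match begins
    end: int     # character offset just past the match


def find_ai_phrases(text: str, phrases: List[str]) -> List[PhraseHit]:
    """Return every case-insensitive, whole-word match of each phrase."""
    hits = []
    for phrase in phrases:
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append(PhraseHit(phrase, match.start(), match.end()))
    # Sort by position so a UI could highlight the spans in reading order.
    return sorted(hits, key=lambda hit: hit.start)


def scan(text: str, is_ai_generated: bool, phrases: List[str]) -> List[PhraseHit]:
    """Run the phrase pass only when the (assumed) detector flagged the text."""
    if not is_ai_generated:
        return []
    return find_ai_phrases(text, phrases)


# Illustrative phrase list; a real list would come from corpus statistics.
EXAMPLE_PHRASES = ["complex tapestry", "a testament to"]
sample = "History is a complex tapestry, a testament to change."
for hit in scan(sample, True, EXAMPLE_PHRASES):
    print(hit.phrase, hit.start, hit.end)
```

The character offsets are what a dashboard would need to draw highlights over the original text.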
To train our model to be highly accurate, we use internal datasets of tens of millions of human-written and AI-generated documents. Separately, our team scans both datasets for common word sequences, then compares how often each sequence appears in human writing versus AI writing to check whether AI really does overuse certain phrases. This technique is called n-gram analysis (the similar sound to PaNGram may not be a coincidence 😊), and the results are striking: AI uses a large number of phrases far more often than humans do. So many, in fact, that we decided to build a tool to display these phrases right on the Pangram dashboard.
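To make the idea concrete, here is a small sketch of n-gram frequency comparison in Python. It is an illustration under our own assumptions, not the production pipeline: the tokenizer, the add-one smoothing, the `min_ai_count` cutoff, and the function names are all choices made for this example, and the toy documents stand in for the real datasets.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def ngrams(text: str, n: int) -> Iterable[Tuple[str, ...]]:
    """Yield every run of n consecutive lowercase word tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return zip(*(tokens[i:] for i in range(n)))


def ngram_counts(docs: Iterable[str], n: int) -> Counter:
    """Count n-grams across a collection of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(" ".join(gram) for gram in ngrams(doc, n))
    return counts


def overuse_ratios(human_docs: List[str], ai_docs: List[str], n: int = 2,
                   smoothing: float = 1.0,
                   min_ai_count: int = 2) -> List[Tuple[str, float]]:
    """Rank phrases by (AI relative frequency) / (human relative frequency)."""
    human = ngram_counts(human_docs, n)
    ai = ngram_counts(ai_docs, n)
    human_total = sum(human.values())
    ai_total = sum(ai.values())
    vocab = len(set(human) | set(ai))
    ratios = {}
    for phrase, ai_count in ai.items():
        if ai_count < min_ai_count:
            continue  # skip phrases too rare to say anything about
        # Additive smoothing avoids dividing by zero for phrases humans never used.
        ai_rate = (ai_count + smoothing) / (ai_total + smoothing * vocab)
        human_rate = (human[phrase] + smoothing) / (human_total + smoothing * vocab)
        ratios[phrase] = ai_rate / human_rate
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Toy corpora for illustration only.
human_docs = ["the cat sat on the mat", "we walked to the store today"]
ai_docs = ["a complex tapestry of ideas", "the complex tapestry of history"]
for phrase, ratio in overuse_ratios(human_docs, ai_docs):
    print(f"{phrase}: {ratio:.2f}x")
```

A ratio well above 1 means the phrase shows up proportionally more often in the AI sample than in the human sample. With real corpora you would also want a much higher minimum count, since ratios computed from a handful of occurrences are noisy.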
Here at Pangram, we’re interested in preserving human voices. Our core detection model can process hundreds of thousands of details about a text to judge whether or not it is AI-generated.
However, we’re also interested in explainability. Knowing that a particular phrase is heavily overrepresented in AI text helps you understand (and explain to others!) why we did or did not flag a piece of text as AI. If a piece of writing contains several phrases that appear hundreds or thousands of times more frequently in AI text than in human text, you have quantifiable evidence to support our judgement.
We want to keep you, our users, informed about not only whether a text is AI-generated, but how we can tell. AI phrases are a key part of this mission, and of our overall journey towards interpretability.
In future blog posts, we will go over some of the most overused AI phrases, so stay tuned! For more information about Pangram or our interpretability features, feel free to reach out at info@pangram.com.