

Writing in the journal Patterns, the scientists traced the discrimination to the way the detectors assess what is human and what is AI-generated. More than half of the essays, which were written by non-native English speakers for a widely recognised English proficiency test known as the Test of English as a Foreign Language, or TOEFL, were flagged as AI-generated, with one program flagging 98% of the essays as composed by AI. By contrast, when essays written by native English-speaking eighth graders in the US were run through the same programs, the detectors classed more than 90% as human-generated.


The scientists, led by James Zou, an assistant professor of biomedical data science at Stanford University, ran 91 English essays written by non-native English speakers through seven popular GPT detectors to see how well the programs performed.
