
Writing Under Surveillance: The Problem with AI Detection

"There has always been resistance to teaching machines and to the technocracy in which they are embedded... And perhaps it's worth repeating that that resistance did not come only from disgruntled educators."
— Audrey Watters, Teaching Machines

CUNY students certainly know the feeling. On Reddit, they describe anxiety and frustration at the prospect of being falsely accused by widely discredited software, their professors deferring to opaque algorithmic systems over their own judgement or the testimony of their students.

AI detection software has already begun to shape how students learn to write and under what conditions they see themselves as writers. While multilingual students report being accused of cheating because their syntax looks "too clean," others describe deliberately misspelling words and punctuating incorrectly to evade a false-positive designation and the failing grade that comes with it, even though every word is their own.

When students feel compelled to stage their own humanity for an audience of AI detectors, they write not for human readers but for algorithmic systems poorly understood by the professors who require them and the administrators who procure them. The resulting harm falls hardest on under-resourced, working-class students like so many here at CUNY.

To understand the problem with AI detection, consider what these services actually claim to measure. GPTZero, for instance, relies on two metrics: the first is perplexity, a measure of how well a language model would predict each successive word in a passage; the second is burstiness, which tracks variability in sentence rhythm and structure. The premise assumes that human writers naturally exhibit syntactic variance, often mixing long and short sentences together, while AI models tend toward consistent, flat tempos at both the sentence and paragraph level. These scores are then used to calculate the probability that a large language model (LLM) produced content submitted by a student ("AI Detectors"; Galczynski).
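To make the two metrics concrete, here is a minimal sketch in Python of what they measure. This is an illustration of the general idea, not how GPTZero or any commercial detector actually computes its scores: real detectors use per-token probabilities from a large neural language model, and their exact formulas are proprietary. The function names and the sentence-splitting heuristic are my own.

```python
import math
import statistics

def burstiness(text):
    """A rough proxy for 'burstiness': the standard deviation of
    sentence lengths (in words). Higher values mean more variation
    between short and long sentences -- the rhythm detectors treat
    as characteristically human."""
    # Naive sentence splitting; real tools use proper tokenizers.
    for mark in ("!", "?"):
        text = text.replace(mark, ".")
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.stdev(lengths) if len(lengths) > 1 else 0.0

def perplexity(token_probs):
    """Perplexity, given the probability some language model assigned
    to each token in a passage: the exponential of the average
    negative log-probability. Lower perplexity means the model found
    the text more predictable -- which detectors read as 'more AI.'"""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

flat = "I like cats. I like dogs. I like fish."
varied = "Cats. But dogs, for all their noise and mess, remain my favorites."
print(burstiness(flat))    # low: every sentence has the same length
print(burstiness(varied))  # higher: a one-word sentence next to a long one
print(perplexity([0.5, 0.5, 0.5, 0.5]))  # 2.0 for uniformly half-likely tokens
```

Note what follows from the design: any writer whose sentences happen to be uniformly structured, or whose word choices a model finds predictable, scores "AI-like" regardless of who actually wrote the text.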

The premise is shakier than it sounds. A 2025 detector comparison video reports that identical content was labeled 10% AI-generated by one tool and 81% by another (Engelbrecht). It only follows that commercial detectors trained on different datasets, each with distinct labeling standards, would diverge when compared across services. On the flip side, people fare no better: a recent study placed human attempts to classify AI-generated text at a paltry 19% accuracy, little better than guesswork (Cheng et al.).

Detection failures have also been documented at scale. One of the most comprehensive accounts in the field, published in the International Journal for Educational Integrity, tested twelve publicly available tools alongside Turnitin and PlagiarismCheck, only to conclude that they were "neither accurate nor reliable," with a systematic bias toward classifying AI-generated text as human-written rather than detecting it (Weber-Wulff et al.). Scaled up even further, a "low" 1% false positive rate across 22.35 million first-year college essays amounts to 223,500 essays falsely flagged in a single year (Hirsch).

Mind you, these are not minor calibration errors. As Jordan Galczynski of UCLA's HumTech observes, the rush to adopt detection technology in higher education has outsourced assessment to automated constructs of human authenticity (Galczynski). The MLA-CCCC Joint Task Force on Writing and AI put it plainly in its first working paper: detection tools generate "false accusations" that "may disproportionately affect marginalized groups" (MLA-CCCC Joint Task Force on Writing and AI). This point matters because AI detectors measure not cheating but legibility to an algorithm, and legibility is not evenly distributed.

To make matters worse, false positives fall hardest on those already at the margins. A Stanford study of seven widely used detectors found they classified 61.22% of TOEFL essays by non-native English speakers as AI-generated (Liang et al.). Across all seven detectors, 89 of 91 essays were flagged by at least one tool. The reason is baked into the architecture: non-native speakers tend to score lower on perplexity, and because detectors use perplexity as a proxy for human authorship, they penalize writers whose patterns resemble what AI produces ("AI Detectors Biased Against Non-Native English Writers"). The same disparities appear along racial and neurological lines: Common Sense Media found Black students faced disproportionately more AI-detection accusations than white students, and neurodiverse students are more likely to be falsely flagged because their writing patterns diverge from the narrow templates these tools treat as authentically human (Hirsch).

If not Kafkaesque, the situation certainly calls to mind a police state of writing, one where students must answer to the automated verdict of AI detectors whose algorithms are, for many, painfully easy to game. Running synthetic text through a paraphrasing tool, for instance, dropped DetectGPT's accuracy from 70.3% to 4.6% without changing its meaning (Krishna et al.). Adding a single word like "cheeky" to a text generation prompt was also reported to fool detectors 80–90% of the time ("AI Detection Tools," USD Law Library). In such cases, we are left to ask why our institutions keep insisting, if not pretending, that AI detection works in the first place.

No matter their complexity or sophistication, AI detectors seek to identify patterns in language that are neither stable across large language models nor robust to adversarial modification by end users intent on beating the system (Weber-Wulff et al.). They are also anxiety-inducing, highly unpredictable, and enforce standards for human writing that are as opaque as they are normativizing.

So, then, as LLMs grow increasingly capable of mimicking human writing, and the technical distinction between machine and human text becomes harder to sustain, the question becomes whether the problem of AI detection is sound and worth our concern, or intractable and more trouble than it's worth.

For my part, I'd hazard a guess in support of the latter.

For CUNY instructors, the real work begins where detection ends: whether in the classroom or on the quad, in reading and responding to each other's work with care and honesty, or in reminding ourselves that learning is messier in practice than any set of algorithms can name. Detection practices foreclose that work, recasting educators as auditors and their students as suspects, undermining the conditions under which young writers learn, as Paulo Freire puts it, to read the word and the world.

But trust can be rebuilt, and it so often starts simply, and slowly, with the everyday practice of reading and writing together again.


AI use in your class may still disrupt learning goals or class community. If so, consider the small wins and tips outlined in this VP companion resource: Small Wins & Teaching Tips.


Works Cited

  1. "AI Detectors." Instructional Resources, Illinois State University, prodev.illinoisstate.edu/…. Accessed 5 Feb. 2026.
  2. "AI Detectors Biased Against Non-Native English Writers." Stanford HAI, Stanford University, hai.stanford.edu/…. Accessed 5 Feb. 2026.
  3. "AI Detection Tools." USD Law Library Guides, University of San Diego, lawlibguides.sandiego.edu/…. Accessed 5 Feb. 2026.
  4. Cheng, Adam, et al. "Ability of AI Detection Tools and Humans to Accurately Identify Different Forms of AI-Generated Written Content." Advances in Simulation, Nov. 2025, pmc.ncbi.nlm.nih.gov/…. Accessed 5 Feb. 2026.
  5. Engelbrecht, Leon. "AI Detection Tool Comparison 2025." YouTube, 2025, youtube.com/…. Accessed 5 Feb. 2026.
  6. Freire, Paulo, and Donaldo Macedo. Literacy: Reading the Word and the World. Bergin & Garvey, 1987.
  7. Galczynski, Jordan. "The Imperfection of AI Detection Tools." HumTech, UCLA, 9 Oct. 2025, humtech.ucla.edu/…. Accessed 28 Feb. 2026.
  8. GPTZero. GPTZero, gptzero.me. Accessed 5 Feb. 2026.
  9. Hirsch, Amanda. "AI Detectors: An Ethical Minefield." CITL, Northern Illinois University, 12 Dec. 2024, citl.news.niu.edu/…. Accessed 5 Feb. 2026.
  10. Krishna, Kalpesh, et al. "Paraphrasing Evades Detectors of AI-Generated Text, but Retrieval Is an Effective Defense." NeurIPS 2023, openreview.net/…. Accessed 5 Feb. 2026.
  11. Liang, Weixin, et al. "GPT Detectors Are Biased Against Non-Native English Writers." Patterns, vol. 4, no. 7, 2023, pmc.ncbi.nlm.nih.gov/….
  12. MLA-CCCC Joint Task Force on Writing and AI. Overview of the Issues, Statement of Principles, and Recommendations. Working Paper 1, July 2023, aiandwriting.hcommons.org/…. Accessed 1 Mar. 2026.
  13. Weber-Wulff, Debora, et al. "Testing of Detection Tools for AI-Generated Text." International Journal for Educational Integrity, vol. 19, no. 1, 2023, eprints.whiterose.ac.uk/…. Also at arxiv.org/abs/2306.15666. Accessed 28 Feb. 2026.
  14. Watters, Audrey. Teaching Machines: The History of Personalized Learning. The MIT Press, 2021.
