OpenAI’s New AI-Detector Isn’t Great at Detecting AI


OpenAI, the artificial intelligence company behind viral text-generator ChatGPT, has released a new AI tool intended to help manage the mess wrought by its previous creation. Unfortunately, it’s not very good. 

The company announced a free web-based AI-detection widget on Tuesday. The application is intended to classify text samples based on how likely they are to have been generated by artificial intelligence vs. written by an actual person. Given a sample of text, it spits out one of five possible assessments: “Very unlikely to have been AI-generated,” “unlikely,” “unclear,” “possibly,” or “likely.”

However, in OpenAI’s own tests, the tool correctly identified AI-generated text as “likely AI-written” only about a quarter of the time. Moreover, about one in ten times, the classifier falsely labeled human-written text as computer-generated, the company noted in a blog post.
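To see what those two figures mean in practice, here is a rough back-of-the-envelope sketch using Bayes' rule. It takes the rounded rates above (about a quarter of AI text flagged, about one in ten human texts falsely flagged) and asks what share of flagged texts would actually be AI-written. The share of AI-written text in the pile being checked is not something OpenAI reported; the three values tried below are purely illustrative assumptions, and the function name is made up for this example.

```python
def share_of_flags_correct(true_positive_rate: float,
                           false_positive_rate: float,
                           ai_share: float) -> float:
    """Fraction of flagged texts that really are AI-written (Bayes' rule)."""
    flagged_ai = true_positive_rate * ai_share
    flagged_human = false_positive_rate * (1.0 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)


# Rounded figures from OpenAI's self-reported tests, as described above.
TRUE_POSITIVE_RATE = 0.25   # "about a quarter of the time"
FALSE_POSITIVE_RATE = 0.10  # "about one in ten times"

# ASSUMPTION: hypothetical shares of AI-written text among everything checked.
for ai_share in (0.05, 0.20, 0.50):
    correct = share_of_flags_correct(TRUE_POSITIVE_RATE, FALSE_POSITIVE_RATE, ai_share)
    print(f"If {ai_share:.0%} of texts are AI-written, {correct:.0%} of flags are correct")
```

The takeaway, under these assumptions: the rarer AI-written text is in the pile, the more of the classifier's flags land on human writers.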

According to OpenAI, even these meh results are an improvement on the company’s previous stab at AI-text detection. And the tech startup acknowledged that, thanks to its own invention, improvement is needed.

OpenAI admits that ChatGPT has thrown a complicating wrench into classrooms, newsrooms, and beyond—where the tool and others like it have stoked fears of rampant cheating, misleading info, and copyright violations. In response, the company now says it wants to help. “We recognize that identifying AI-written text has been an important point of discussion among educators, and equally important is recognizing the limits and impacts of AI generated text classifiers in the classroom,” the company said in its Tuesday blog. “While this resource is focused on educators, we expect our classifier and associated classifier tools to have an impact on journalists, mis/dis-information researchers, and other groups.” 

But in its current form, this new detection tool probably still isn’t accurate enough to meaningfully address growing concern over AI-enabled plagiarism, academic dishonesty, and the propagation of misinformation. “Our classifier is not fully reliable,” the company wrote. “It should not be used as a primary decision-making tool.”

In other words, if you suspect a news article or classroom assignment is AI-generated, whatever OpenAI’s classifier tells you may or may not be true.

In Gizmodo’s own tests, the classifier didn’t yield particularly impressive results. Across multiple samples of AI-generated text, the detector returned lukewarm verdicts. “Possibly AI-generated,” it said about a fake news article I generated in ChatGPT moments earlier.

Screenshot of OpenAI classifier

This text was generated by OpenAI’s own ChatGPT, but the company’s new classifier isn’t so sure.
Screenshot: OpenAI / Gizmodo

I got the same result using a chunk of AI-produced text from ChatGPT’s stab at writing an article about itself.

Screenshot of OpenAI's AI-classifier

Once again: AI-generated text, and an unsure AI-detector.
Screenshot: OpenAI / Gizmodo

In response to a clip from a CNET article produced via “assist[ance] by an AI engine,” OpenAI’s detector told me it was “unlikely” to have been AI-generated.

Screenshot of OpenAI's AI-detector

This clip, from a CNET article produced with the help of AI, fooled OpenAI’s detector most thoroughly.
Screenshot: OpenAI / Gizmodo

However, to the tool’s credit, in 10 or so tries, I didn’t get a false positive on any text from recently published Gizmodo articles. The only response the classifier yielded on the Gizmodo posts I tested was “very unlikely AI-generated.” OpenAI noted that it purposely adjusted the confidence threshold in the web version of its new AI tool to “keep the false positive rate very low.” So that adjustment may be working out well, though the 9% false-positive rate that OpenAI self-reported is still pretty high.

The tool has other limitations, according to the company: it works passably only in English and not in other languages, AI-written text can easily be edited to bypass the classifier, and only lengthy text samples yield reasonably reliable results.

Theoretically though, the AI-detector should get better with more use, because it itself is AI-based. The classifier is a language model trained on pairs of AI-generated/human-written text samples on the same topic. And, by opening up this stage of the classifier to the public, OpenAI is hoping to get feedback on it and “share improved methods in the future.”

Gizmodo contacted OpenAI with questions about its new tool, and was directed back to the blog post.

The company isn’t the first to try its hand at AI detection. A college student, Edward Tian, recently released his own program. And if you write about AI publicly like this Gizmodo author, then you’ll know that press releases touting the hottest new AI-detection software abound. But across the board, existing tools don’t seem to hold up so well against the forward march of AI-production capabilities. Like humans, automated AI detectors keep getting things wrong, as in one recent pre-print study where an automatic detector failed to clock AI-generated text more than one-third of the time. 

Ultimately, it’s hard to imagine how AI could learn to outsmart itself, especially as the results of AI-generation become increasingly convincing. In trying to develop reliable AI-detection, OpenAI has entered a race with itself. The better an AI text-generator, the harder it should be to suss out the resulting sentences’ AI origins. And since OpenAI is presumably trying to improve upon ChatGPT at the same time as it’s trying to improve its detection tool, it seems like an impossible race to win.
