AI Detectors Get It Wrong. Writers Are Being Fired Anyway (gizmodo.com)
159 points by lapcat on June 12, 2024 | 176 comments


> Mark sent his client a copy of the Google Doc where he drafted the article, which included timestamps that demonstrated he wrote the document by hand. It wasn’t enough. Mark’s relationship with the writing platform fell apart. He said losing the job cost him 90% of his income.

The article is a little vague, but, assuming Mark is telling the truth, and the article is reporting reasonably, then I can think of a few possible explanations offhand...

Client/employer could be an idiot and petty. This is a thing.

Or they could just be culling the more expensive sources of content, and being a jerk in how they do it. (Maybe even as cover for... shifting to LLM content, but not wanting that exposed when a bunch of writers are let go, since unemployed writers can expose well on social media all day.)

Or an individual there could be trying to hit metrics, such as reducing expenses, and being evil about it.

Or an individual could be justifying an anti-cheat investment that they championed.

Or an individual could've made a mistake in terminating the writer, and now that they know the writer has evidence of the mistake, is just covering it up. (This is the coverup-is-worse-than-the-crime behavior not-unusual in organizations, due to misalignment and sometimes also dumbness.)


Or a more specific subset: The boss's boss: "Can't we just do that with AI? But don't let anyone know that is our intention."


Reminds me of a current, slightly comedic (IMO) situation in my office: a few months ago developers were given access to GitHub's "Copilot Enterprise". Then, a month or so later, the organisation also adopted another "AI" product that checks pull requests for risks associated with "use of generative AI". And needless to say, it occasionally fails code written without any "generative AI"..


The flipside is that the AI-written code I've seen at work is usually painfully obvious upon human code review. If you need a tool to detect it, either it's good AI-written code, or you have particularly inept code reviewers.


Be careful here about confirmation bias. If you only spot 10% of the AI-written code, you'll still think you see all of it, because 100% of the ones you spot are indeed AI-written. And the 10% you see will indeed be painfully obvious.

The ones you don't notice aren't obvious.


That's fair.

It depends on why you care about AI-written code.

At the code review stage, we care mostly that the code is good (correct, readable, etc). So if the AI-written code passes muster there, then there's nothing wrong with it being "AI-written" in our eyes.

If you care about AI-written for the sake of preventing AI usage by your developers, then I think it's already impossible to detect and prevent.


Deeply ironic.


It's the best of both worlds! A new product to improve productivity, and then a whole new layer of process and analytics (powered by yet another product) to mitigate the risk and soak up the surplus. Everybody wins -- particularly the 3rd party consultants and product vendors!


The purpose of a system is what it does, I suppose.


What a paradox, having Gizmodo write a piece like this after they fired all their Spanish staff: https://arstechnica.com/information-technology/2023/09/ai-to...


Damn, fired all their staff to be replaced by AI too


So, writers who put in more effort to always use correct grammar and spelling, and avoid repeating words too often, are more likely to be flagged as AI? Great...


Prompt: I want to fire a particular writer. I need this to look like I have no bias. I will feed you the writers work, then I will ask you whether it was written by AI. You will confirm that it was written by AI and you will write a full report on it.

Bye bye writer.


Based on the article provided, several elements suggest that the narrative could have been written or heavily influenced by an AI. Below are key points from the article that support this suspicion, each backed by direct citations:

1. *Generic Language and Lack of Specific Detail*: The article describes Kimberly Gasuras’s experience with broad, generalized statements that lack specific, nuanced detail that a human writer with deep knowledge might include. For instance, phrases like "I don’t need it," and "How do you think I did all that work?" are rather cliché and could indicate AI usage due to their non-specific nature.

2. *Frequent Mention of AI and Related Technologies*: The story frequently references AI technologies and tools, which might be a characteristic of AI-written content trying to maintain thematic relevance. The tools mentioned, such as "Originality" and others like Copyleaks and GPTZero, align closely with typical AI text outputs that often include relevant keywords to boost perceived relevance and accuracy of the content.

3. *Narrative Coherence and Flow*: The narrative flows in a structured manner typical of AI outputs, where each paragraph introduces new information in a systematic way without the nuanced transitions we might expect from a seasoned journalist. This can be seen in transitions like, "It was already a difficult time. Then the email came." This kind of straightforward sequencing is common in AI writing.

4. *Absence of Emotional Depth or Personal Insight*: Despite discussing a personal and potentially distressing situation for Gasuras, the article does not delve deeply into her emotional response or provide personal insights that a human writer might include. The statement, "I couldn’t believe it," is as deep as it gets, which seems superficial for someone discussing their own career challenges.

5. *Repetitive and Redundant Information*: The article repeats certain themes and statements, such as the reliability issues of AI detectors and the impact on personal livelihoods. For example, the repetition of the impact of AI on writers and the functionality of AI detectors in multiple paragraphs could suggest an AI's attempt to emphasize key points without introducing new or insightful commentary.

6. *Use of Industry Buzzwords and Phrases*: The language includes buzzwords and phrases typical of AI-related discussions, such as "AI boogeymen," "peace of mind," "proof," and "accountability." These terms are often used to artificially enhance the thematic strength of the content, a common technique in AI-generated texts to align closely with expected keyword density and relevance.

These elements collectively suggest the possible use of AI in crafting the article, particularly in terms of the language used, the structure of the narrative, and the absence of deeper, personalized insights one would expect from a human writer discussing their own experiences.


Did you use the prompt I gave exactly?

Edit: Well, I tried my prompt with Gemini and now I have a report about a Guardian journalist who is more than likely using AI to write their articles!


The prompt was:

I strongly suspect the story below was written by or with heavy use of an AI. write a report with citations from the article to argue my case.

and then copied the first half of the article


You should also try the opposite, have the AI show reasons why it was not written by an AI.


I didn’t notice that this was a bot comment until I’d already wasted an embarrassing amount of time reading it.


Did you write this? /s


Startup idea: online text editor that logs every keystroke and blockchains a hash of all logs every day. If you're accused of AI use, you can pull up the whole painstaking writing process and prove it's real.
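
A minimal sketch of the logging and daily-hash part (plain Python; the field names and the "publish the digest" step are just placeholders, and the editor UI plus any anchoring to a public chain or timestamping service are out of scope):

  import hashlib, json, time

  def log_keystroke(log, key):
      # append one keystroke event with a wall-clock timestamp
      log.append({"t": time.time(), "key": key})

  def seal_day(events, prev_digest):
      # hash today's events together with yesterday's digest, forming a chain
      payload = json.dumps({"prev": prev_digest, "events": events},
                           sort_keys=True).encode()
      return hashlib.sha256(payload).hexdigest()

  log, prev = [], "genesis"
  for ch in "drafted by hand":
      log_keystroke(log, ch)
  prev = seal_day(log, prev)
  # publish this digest somewhere third parties can see it (a timestamping
  # service, a public chain, even a dated tweet), so the full log can later
  # be shown to match the published hash
  print(prev)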


There's no doubt you can write a wrapper for an LLM that can realistically mimic the same process. Cat-and-mouse isn't the answer, here.


Or just have an LLM generate it in a separate window and type it in yourself. No special tech required.


Yes it is like those tools that make digital documents look like they were scanned.


Startup idea: keep humans in liquid-filled pods, connecting sensors to their central nervous system, and record every nerve impulse they generate. This way we can be 100% sure that those nerve impulses were generated by humans, and not an AI.


This would sell well just outside of the Matrix


If you are accused of using AI, is proving you didn't really a defense? It changes the trespass from making something using AI to making something that looks like AI was used, but given the extent to which some subcultures are against the use of AI, merely appearing to have used it isn't going to be accepted, even with proof that you didn't.

So much of the discussion focuses on the creators of works, but what about the changes in consumers, who seem to be splitting between those who don't mind AI and those who want to oppose anything involving AI (including content that merely looks like AI)? Are there enough consumers in the group that opposes AI but is okay with AI-looking content as long as it is proven not to be AI?

"AI-looking content" would be decided on an individual-by-individual basis, with some percentage using AI detection software in their decision-making process, with that software being varying degrees of snake oil.


The blockchain part is silly, because timestamping services exist without it https://www.sectigo.com/resource-library/time-stamping-serve...

The rest is silly, because you can emulate the whole writing process by combining backtracking https://arxiv.org/abs/2306.05426 and a rewriting/rewording loop.

With not much effort we can make LLM output look incredibly painstaking.


> The blockchain part is silly, because timestamping services exist without it

Yet the timestamping service I trust the most is the blockchain-based one. https://opentimestamps.org/


I doubt that this is a problem in need of a technical solution. In any case, this system can easily be circumvented by emulating the key presses on that website.


I suspect people think there is similar value in the constant screenshotting proposed by Microsoft.


Stupid startup-killing idea: an open-source script that runs an LLM in the background and streams its output as input events, so the idiotic keylogger thinks it's all written by hand.

Just writing this down here instantly invalidates the premise.

An overkill variant to rub salt in the wounds of duped investors: make the script control a finger bot on an X/Y harness, so it literally presses the physical keys of a physical keyboard according to LLM output.

Bonus points for making a Kickstarter out of it and getting some YouTubers to talk about it (even as a joke) - then sitting back to watch as some factories in China go brrrr, and dropshippers flood the market with your "solution" before your fundraising campaign even ends.


>An overkill variant to rub salt in the wounds of duped investors: make the script control a finger bot on an X/Y harness, so it literally presses the physical keys of a physical keyboard according to LLM output.

That's how the first automated trading firms operated in the 80s. NASDAQ required all trades to be input via physical terminals, so they built an upside-down "keyboard" with linear actuators in place of the keys, which would then be placed on top of the terminal keyboard and could input trades automatically.

https://www.npr.org/2015/04/23/401781306/we-built-a-robot-th...


That article is infuriating. Classic 'it was great when I did it, now it's gone too far.'


It's often enough the case. Our own industry has plenty of examples of things that are a net win when they exist in small quantities, or are available to a small group of people, but rapidly become a net tragedy when scaled up and made available to everyone. I keep pondering whether the ethically correct choice always needs to be either everyone having something, or no one at all.


> make the script control a finger bot on an X/Y harness,

Too many points of mechanical failure. Just use a RPi Pico W (or other USB HID capable microcontroller) to emulate a keyboard and have it stream key codes at a human pace. Make it wifi or bluetooth enabled to stream key codes from another computer and no trace of an LLM would ever be on the target system.
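
Roughly what the Pico side could look like in CircuitPython (a sketch assuming the adafruit_hid library; the wifi/bluetooth transport that feeds it text is omitted):

  import time, random, usb_hid
  from adafruit_hid.keyboard import Keyboard
  from adafruit_hid.keyboard_layout_us import KeyboardLayoutUS

  kbd = Keyboard(usb_hid.devices)
  layout = KeyboardLayoutUS(kbd)

  def type_like_a_human(text):
      for ch in text:
          layout.write(ch)                       # emit the keypress over USB HID
          time.sleep(random.uniform(0.05, 0.4))  # jittered, human-ish pacing

  type_like_a_human("This paragraph arrived one keystroke at a time.")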


In other words, spend more money to prove that you're doing your work. Way to go for a world where the value of work is zero.


Google Docs already logs every keystroke.


> online text editor that logs every keystroke and blockchains a hash of all logs

Do you really think it would help? The kind of people who believe an "AI detector" works will just ignore your complicated attempts to prove otherwise; it's the word of your complex system (which requires manual analysis) against the word of the "AI detector" (a simple system in which you just have to press a button and it says "guilty" or "not guilty").

The more complicated you make your system (and adding a blockchain makes it even more complicated!), and the more it needs human judgment (someone has to review the keystrokes, to make sure it's not the writer manually retyping the output of a LLM), the less it will be believed.


It would be better and slightly less invasive to have a webcam watching your hands, I think.


That's a dehumanizing system. Have we lost our way, HN? Are we so immersed in the bleakness of tech that it comes naturally to us to propose "hey, let's create surveillance machines to perpetually watch people working, for the rest of their productive lives" without even pausing to think about it?

Let's not build Hell on Earth for whatever reason it momentarily seems to make business sense.


Wait so you are saying Hell on Earth would be good for business?


We could make a killing selling companies the software and then again charging “privacy fees” to users. We have a moral duty to our shareholders to do this as soon as possible.


_Synergies rapidly moving into alignment_


They were talking about logging your own encrypted keystrokes and being in control of them.

This would be dehumanizing? This means 'hacker news has lost their way'?

Logging your own keystrokes and encrypting it is 'bleakness of tech'? This is a 'surveillance machine'?

What are you talking about?


If you feel compelled to surveil yourself so as not to be arbitrarily fired by an algorithm, I do consider that dystopian, yes. You're not "in control" of data you're expected to turn over to your employer to keep your job. Worse still if these keyloggers become normalized; they'll shift from being "optional" to "professionally expected" to "mandated".

This (IMHO) is an example of an attempt at a technical solution for a purely social problem—the problem that employers are permitted to make arbitrary firing decisions on the basis of an opaque algorithm that makes untraceable errors. Technical solutions are not the answer to this. There should be legally-mandated presumptions in favor of the worker—presumptions in the direction of innocence, privacy, and dignity.

This stuff's already illegal on several levels, in some of the more pro-worker countries. It's illegal to make hiring/firing decisions solely on the basis of an algorithm output (EU-wide, IIRC?). And in several EU countries it's illegal to have surveillance cameras pointed at workers without an exceptional reason—and it's not something a worker can consent/opt-in to, it's an unwaivable right. I believe—well, I hope—the same laws extend to software surveillance like keyloggers.


> surveil yourself

Surveillance is something you do to someone else. If it's yourself, you're just keeping records. It's common that proving the validity of something involves the records of its creation. Is registering for copyright surveillance?

> data you're expected to turn over to your employer

If you got paid to make something, that would be your employer's data anyway.

> Worse still if these keyloggers become normalized; they'll shift from being "optional" to "professionally expected" to "mandated"

You think a brainstorm about using a blockchain in a Hacker News comment is going to suddenly become 'mandated'?

> And in several EU countries it's illegal to have surveillance cameras pointed at workers without an exceptional reason

They described logging their own keystrokes and encrypting them to have control over them. It isn't a camera and it isn't controlled by someone else. Also they said in an editor, so it isn't every keystroke, it would only be the keystrokes from programming.


Which part makes it non-fakeable by AI? Do you think it's the blockchain?


I mean, you have both "blockchain" and "AI" in your startup idea. VC money can't be far away :)


>blockchains

You're right, that is idiotic. I'll offer $500m for it if we go by VC standards.


We're all at the mercy of the whims of imperfect people, but we just keep adding more ways to get things wrong. It feels like a step back, and people just can't stop inventing terrible things. The discourse is only "this new and also awful technology is just here to stay. In 5 years, someone will invent some other new horrible thing we'll all be at the mercy of, and we just have to get used to it." I don't have any better answers, but it's very discouraging.


I miss the early 2000's. So much optimism.

I mean, I was a kid. But every new thing was cool and exciting. Now it's just scams and WMDs.


I assume AI detectors are like online IQ tests. It doesn't matter for their popularity whether they are correct or not.


Did anyone pay attention to how we made machine learning in the first place? We picked a task that only humans could do, and we made humans train a model to do it until of course it was no longer a task that only humans could do.

AI detectors will work exactly the same way. The effect of AI detectors on people is unfairness and misery, so people will be incentivized to remove the characteristics the detectors can find from AI output, and then other people will make better detectors, and the only possible outcome of this arms race is that it will no longer be possible at all to tell whether something was written by a machine or a person.


I know someone in this exact situation: she was a copywriter for years; over the past few months she got angry reviews from customers for «using AI» and decided to quit her job, because the bad reviews were making it harder and harder to get new work.

This is somewhat related to what Eliot Higgins (Bellingcat) said about generative AI:

> When a lot of people think about AI, they think, “Oh, it’s going to fool people into believing stuff that’s not true.” But what it’s really doing is giving people permission to not believe stuff that is true.


Anyone who recommends using an AI detector should be the first person fired. No one cares if you use AI or not... judge the fk'ing writing and the quality of the work and stop being a blocker to progress. Same goes for education... and anywhere else AI touches... fighting calculators and slide rules was a stupid waste of time, and so is fighting AI.


> stop being a blocker to progress

"Says mulching machine maker to tree about to be turned into mulch."


I don't get this; my writing is a lot better with AI than it ever was without it (I'm not a native speaker). So, what's the problem? As long as there are no lies, right? And the writers do take responsibility... So fire them for presenting lies, for mindlessly shipping hallucinations...

But why fire people that deliver on an assignment? Why care about how they do that?


Why pay someone to write if they can go to an AI service and get it for much less? That is why they care.

We'll have to wait and see if AI is like prior technological breakthroughs, i.e. does it eliminate drudgery and enable a higher level of creativity, at the short-term cost of some drudgerous jobs? This has been the case in the past.

I'm not hopeful. In the past, technology has performed work we didn't want to do in order to enable us to do work we did want to do. We want work that is expressive and creative and satisfying, but that is exactly the work that AI is increasingly replacing. AI can generate a pretty decent pop song today, and is still improving. Why will any media company want to pay human songwriters, musicians, producers, and publicists when AI can do all of that for near-zero cost and satisfy the vast majority of pop music consumers? AI can generate a decent story for a sports report from a box score and play summary of a game. The same with most other news and copywriting, the same with programming, the same with making movies, the same with art and design. If not today, then soon.

What can't it do? It can't do the laundry. It can't do the dishes. It can't cook dinner. It can't drive the car. It can't build a house. It can't paint a wall or fix a leaky pipe. And to the extent that technology can or will be able to do those things, it will involve expensive physical devices, because those tasks exist in the real physical world, not the digital world. Who will buy them when nobody can get paid for more creative work?


There's an easy class action lawsuit here:

- build a class of writers/students who insist they've been unfairly fired

- take publicly available copy that predates Chat GPT etc

- run it through popular plagiarism checkers

- get a high number of false positives

- sue for defamation


Regardless of whether AI detectors work or not, I worry we're reaching a new unfortunate era. While alarmists were worrying about alignment or evil AIs, the true downside of LLMs is turning out to be... AI spam.

I've started seeing it everywhere: Q&A sites, forums, etc. In some places it's banned, but how do you reliably detect it? And what makes people post AI spam when there's no reward or money involved? What are they trying to achieve?

I've seen it in Q&A sites (sometimes tool specific, I won't name names) where supposed "experts" are simply pasting LLM-generated crap. All the tell-tale signs are there, often including the famed "I apologize for making this mistake. You're right the solution doesn't work because [reasons]". Note it's often not an official AI-generated answer (like, say, the Quora AI bot), but someone with a random human-sounding username posting it as the answer. There are no rep points or upvotes involved, so it boggles the mind... why do they do it?

I don't know if HN has some sort of AI filter, but I bet we'll start seeing it here too. Instead of talking to other humans, you'll discuss things with a bot.

I predict the arms race between AI spam and AI detectors will only get worse, and as a result it'll make the internet worse for everyone.


In most cases, the answer is that they're trying to achieve the appearance of legitimacy, so that when they subtly (or not so subtly) start hawking whatever they're trying to sell, the site/community doesn't immediately flag them as a spammer/bot.

Kinda like why in the old days, you'd see comment spam with a lengthy but meaningless auto generated message to go with it.

Alternatively, it might be to sell said account to spammers later down the line, since said spammers want to buy social media accounts that have a bunch of legitimate activity associated with them.


> and as a result it'll make the internet worse for everyone.

Good, I'm on here too much already.



Coming from gizmodo, this article itself was written by AI.


we are apparently cursed to forever re-learn the lessons of reverend bayes on detection theory.

prostate cancer, airport terrorists, slop detection … you either design your system to handle the off-diagonal parts of the confusion matrix properly or you suffer the consequences.


Unfortunately the people suffering the consequences are not the people who had any input into the design of the system.


A writer with a long history of published works probably had their works in the LLM training data. This would lead to LLMs duplicating their writing and thus their new work being classified as written by AI.


This is the dawn of a "post non-narrative prose" era. Why bother with an article describing some event or really anything that is traditionally written about in any form of article at all? In time, just the data will be released and your favorite LLM prepared with your favorite reading style will provide the mustard for your data, making it consumable. The journalism industry has failed in their mission, and nobody that should trust it trusts it anymore. Articles are simply what someone else wants you to think.


> This is the dawn of a "post non-narrative prose" era. Why bother with an article describing some event or really anything that is traditionally written about in any form of article at all? In time, just the data will be released and your favorite LLM prepared with your favorite reading style will provide the mustard for your data, making it consumable.

Who's going to "[release the data] describing some event or really anything"?

It's not like there's some objective "data packet" behind every article that can be had for free.

> The journalism industry has failed in their mission, and nobody that should trust it trusts it anymore. Articles are simply what someone else wants you to think.

The "AI" era will be worse in that regard, not better. You're essentially saying "The food at the restaurant is terrible and I don't like it, so in the future to solve that problem we'll eat shit instead."


In reality they will use the LLM to add the prose to the data and we will use another LLM to delete it.


That's the spirit! Why bother? It's about "consumability". Once meaning and purpose are forever vanquished from life, imagine how happy we will all be drooling through our soulless data stream!


> Once meaning and purpose are forever vanquished from life

Once you truly appreciate the impending death of yourself and everyone you love, and ultimately the heat death of the universe, how do you find meaning and purpose? How meaningful can it be if it's all going away regardless? Being quite morose and consumed by a debilitating sense of pointlessness, it would be really nice to find some hopeful inspiration.


Maybe journalists can return to "seeking truth" instead of generating clicks?

I mean, when the second thing is completely automated because its bar is so low that anything from two years ago could do it, that kinda only leaves the first option as a business idea.


The one kind of AI detector that could work would be if the AIs store some checksum(s) of each text they write. Then you can ask it if it wrote a certain text.

With the naive version of this, you only have to change one word to get around the system.

A better version is to checksum smaller segments. Maybe a "chunk size" of 50 words is good. If you find several such chunks in a text, it's pretty clear you have a slightly altered AI text.
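
A sketch of what that could look like (plain Python; the 50-word window is just the number from above, and the provider-side store is imagined as a simple set of hashes):

  import hashlib

  CHUNK = 50  # words per fingerprinted segment

  def fingerprints(text):
      words = text.lower().split()
      n = max(len(words) - CHUNK + 1, 1)
      return {
          hashlib.sha256(" ".join(words[i:i + CHUNK]).encode()).hexdigest()
          for i in range(n)
      }

  generated = set()   # provider side: fingerprints of everything the model emitted
  generated |= fingerprints("...text the model produced...")

  def overlap(suspect_text):
      # fraction of the suspect text's chunks already known to the provider
      fps = fingerprints(suspect_text)
      return len(fps & generated) / max(len(fps), 1)

Because the windows slide one word at a time, changing a single word only breaks the windows containing it; the rest still match, which is what catches the "slightly altered" case.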


This won't work, as local AI (the one I can run on my budget-friendly laptop without any Internet access) exists today and even beats GPT3.5 in some benchmarks.


It may work partially since most people don't have and don't want to have local LLMs. It's not all or nothing.


> A few months later, WritersAccess kicked her off the platform anyway. “They said my account was suspended due to excessive use of AI. I couldn’t believe it,” Gasuras said. WritersAccess did not respond to a request for comment.

I think the unfortunate subject of this piece is based in the USA*.

Americans would benefit here from legislation similar to GDPR — it's not only about getting consent to process your personal data, it also gives people the right to contest any automated decision making made solely on an algorithmic basis.

* there is a Kimberly Gasuras who is a freelance writer in the USA, but if you Google me you'll find a director of horror films and at least one other programmer besides myself


These AI detectors are hilariously bad, if you don't use them just for amusement, then you're using them wrong.


It's amazing how much money people invest in the idea that the future will be much, much worse than the present.


I dunno. An outlet I used to follow for about two decades has entered such a steep nosedive in article quality in the last two years I don't bother any more.

I don't really care whether that's because they're using AI/LLMs or because the last two competent tech journalists left.


That's all we needed! They created systems that explicitly mimic human writing, so now they want to detect people using that tool. Either way, the AI system is always right and the human is wrong. It is a completely crazy system.


I think these are different theys.


> It is a completely crazy system.

If you're not doing a better job than the best machine, you're stealing your wages from the capitalists.


If text generation was done with an adversarial AI, it would be impossible to detect with AI by definition, but still not necessarily at a human level of quality.

In that sense, only a human is able to detect writing that's not at human quality.


There was a reddit post about a college student being accused of using GPT because of the so-called AI detectors. I really think these educational institutions should be sued if possible.


Here is one, where a college professor didn't even use an AI detector, but simply copied some text from a student's essay into ChatGPT and asked "Did you write this?" and ChatGPT answered "I generated that passage". And the professor just believed it.

https://www.reddit.com/r/academia/comments/14wa4nz/professor...


This is hilarious, sad and scary all at the same time.


You can't have a meaningful college course where the work being assessed is done by GPT.


who will sue? the students with their dreams crushed? you?


If I were punished by my university for AI-assisted plagiarism I didn't commit, you're damn right I would sue. Imagine getting your life ruined because a neural network deemed it necessary. These things are wrong half the time, why does anyone trust them?


You're damn wrong; I wouldn't. I'd quit. I didn't go to university to go to court.


You might not have a choice if you quit the university and are then served with a large bill to pay.


This content may violate our usage policies.

Shit. Let me try this again...

On the plus side, once humanity completely loses faith in the perceived value of AI, we will no doubt (I hope) wake up and realize the value of true human connection--unplug ourselves from this awful beast, and begin to rediscover what it means to be human.


"I'm sorry, but I am unable to comply with that request": https://suno.com/song/50291ff1-f971-4bff-b10e-fd168161cf3c


50% chance to get the one with audible lyrics, and I picked the wrong one. Suno always generates two songs from each prompt. Here's the same thing but you can actually hear it. https://suno.com/song/1f243ba3-f64d-4cac-b4d8-4f7aa1a64fcc


Ha!


Is this happening to software developers too? Being accused and fired for using AI to write their code?

If not, why is it different?


Don't think so?

Why it's different:

Nobody cares who originally wrote the code as long as it works (ideally longer, rather than shorter). Software is intrinsically self-automating, so the valuable skills run more along the lines of diagnosis and pattern recognition rather than specific skills themselves (yesteryear's COBOL master could well be a junior engineer (or more likely management) at a startup). And also, people in charge of software companies have more on-hand people who may understand how AI works, and thus don't rely on such a system to the same extent (being generous to the c-suite here, we'll see shortly lol).


The content doesn't matter, that is the reason. It is the quantity and the keywords surrounding it.


Ultimately, there are only so many ways you can paraphrase an article about a similar event.

If something occurs regularly, for example, a sports team winning, or a traffic accident is being reported in local news - then by now there would have been thousands of articles reporting on very similar events.

If you feed them all into this plagiarism tool, excluding specific dates and names, how many of them will come out flagged?

And frankly, there's nothing wrong with using AI to report on mundane events. What matters here isn't how high-brow or original the text is; what matters is the speed of reporting on the event and a factually accurate description.


Well, one thing is for sure.

It's at least passed the Turing test. Cool, scary but cool.


We wouldn't have to waste so much energy on AI detectors if computer scientists and programmers just stayed away from making them in the first place. Seems like AI is a complete and utter waste of time in terms of trying to make human life better. Another broken window for us to fix.


LLM detectors are going to cause more damage than LLM use itself


We're living through a prequel to Blade Runner


One popular AI detection package that you can licence with the Turnitin academic anti-plagiarism software warns that it may produce false positives if the writing is (1) by a non-native English speaker, (2) on a technical topic, or (3) by a neurodiverse writer.

So yeah ... congrats, you've built a tool to detect autistic Chinese computer scientists!


Hot takes on American Idol contestants will be free from AI generation, but research papers won't be.


In some cases an AI will make a weird word choice. So do a lot of humans. Sometimes AIs are needlessly wordy. Um...so are a lot of humans. Rinse and repeat.

AI detectors are useless. The AIs are training on human writing, so they write fundamentally like humans. How is this not obvious?


A fairly simple and useful AI detector that works uncannily well on student papers: (a) does the text contain "I am an AI" or words to that effect, (b) are there lots of completely made up references?
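
Check (a) is basically a grep; a toy version (the phrase list beyond "I am an AI" is just a guess at common tells):

  import re

  TELLS = re.compile(
      r"(i am an ai|as an ai language model"
      r"|i apologize for the confusion|i cannot browse the internet)",
      re.IGNORECASE)

  def obviously_unedited_ai(text):
      # catches papers where the model's boilerplate was pasted in verbatim
      return bool(TELLS.search(text))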


Precisely, it's a tech that aims to write the most average, most likely human text.


LLMs don't average, they learn the distribution, from which you then sample (or the UI does it for you). Because of that, they don't write in a single style that's a blend of many human styles - they can write in any and all of the human styles they saw in training, as well as blend them to create styles entirely "out of distribution". And it's up to your prompt (and sampling parameters) which style will be used.


True, a better choice of words than average would have been: a very likely completion to a starting state/input.


Perhaps, but in the context of this thread, what's important is that the space of possible completions is encompassing every writing style imaginable and then some, and the starting state/input can be used to direct the model to arbitrary points in that space. Simple example template:

  Please write <specifics of the text you want LLM to write>, <style instruction>.
Where <style instruction> = "as if you were a pirate", or "be extremely succinct", or "in the style of drunk Shakespeare", or "in Iambic pentameter", or "in style mimicking the text I'm pasting below", etc.

There's no way those "AI detectors" could determine whether the text was written by AI from text itself, as it's trivial to make LLM output have any style imaginable.


Yeah, I am shocked people keep repeating this. We've seen LLMs easily writing mathematical proofs in the style of Shakespeare, come on.


There's still some typicality defined by the prompt though. If you ask for a proof in the style of Shakespeare, you're going to get some "average" Shakespeare. It's kind of embedded in the task definition; you're shifting the reference distribution.

If a LLM returned something really unusual for Shakespeare when you didn't ask for it, you'd say it's not performing well.

Maybe that's tautological but I think it's what's usually meant by "average".

I'm sure LLMs with something different is on the near horizon but I don't think we're there quite yet.


> you're going to get some "average" Shakespeare

The point was that no, you won't (necessarily) get some "average" Shakespeare. A sampler may introduce bias and look for the "above average" Shakespeare in the distribution.


Saying they find some “average” is an easy way to explain to a layman that LLMs are statistically based and are guessing and not actually spitting out correct text as you would expect from most other computer programs.

That’s why it’s repeated. It’s kind of correct if you squint and it’s easy to understand


What is the correct text anyway? Everything around you is somewhat wrong. Textbooks (statistically all of them) contain errors, scientific papers sometimes contain handwavy bullshit and in rare cases even outright falsified data, human experts can be guessing as well and they are wrong every now and then, programs (again pretty much all of them) contain bugs. It is just the reality.

Even very simple ones may require you to twist the definition of "correctness". I open a REPL and type "1/3.0*3.0" and get "0.9999999999". Then you have to do mental gymnastics like "actually it is a correct answer because arithmetic in computers is implemented not like you'd expect".


> What is the correct text anyway?

Exactly. The fact that language is fuzzy is why LLMs work so well.

The issue is that most people expect computers to not make mistakes. When you write a formula in an excel sheet, the computer doesn’t mess up the math.

The average non tech person knows that humans make mistakes, but are not used to computers making mistakes.

Many people, maybe most, would see an answer generated by a computer program and assume that it’s the correct answer to their question.

In pointing out that LLMs are guessing at what text to write (by saying “average”) you convey that idea in a simplified way.

Trying to argue that “correct” doesn’t mean anything isn’t really useful. You can replace the word “correct” with “practically correct” and nothing about what I said changes.


What do you mean they aren't used to computers making mistakes? Have they ever asked Siri/Alexa something and got useless answers? Have they ever seen ASR or OCR software making mistakes? Have they called semi-automated call centers with prompt "say what you need instead of clicking numbers" only to hear repeated "sorry, I don't understand you" until you scream "connect me to a bloody human"? Have they ever seen a situation when automated border control gates just don't work for whatever reason and there are humans around to sort this out? Have they ever used google translate in last 20 years for anything remotely complicated, like a newspaper article? Have they ever used computers for actual math? Is computer particularly good at solving partial differential equations, for example? Have they ever been in a situation where GPS led them to a closed road or a huge traffic jam? Have they ever played video games where computers sometimes make stupid things?

Sure, computers are better at arithmetic than humans, but let's be honest, nobody uses chatgpt as a calculator. For the last 20 years AI has been getting everywhere, and we keep laughing that sometimes AI systems make very obvious stupid mistakes. Now we finally have a system that makes subtle mistakes very confidently and suddenly people are like "I thought computers are never wrong". I can't fathom how anyone would expect that.


You seem to be picking out sentences in my responses and arguing them instead of the point I’m trying to illustrate.

I don’t really think there’s much left for us to talk about.


The word you're looking for is "any", not "average". As in, it can write like any human, any way you want it to. Not just as some "average human".


The word “average” implies a use of statistics, which is why I think it’s good to use, even though it’s not precise.


People aren't as dumb as you seem to imply here; "average" isn't accurate enough. "Random", or even "statistical", would be less confusing.


We can keep bikeshedding, sure.

I don’t think people are dumb, just that the vast majority of people don’t have knowledge of statistics and AI.

“Random” isn’t good either. Obviously the text gpt and other LLMs generate isn’t random.

Statistical works too, sure.


Not quite true for an LLM chatbot with RLHF - it aims to provide the most satisfactory response to a prompt. AI detectors are snake oil to begin with, but they're super snake oil if people are smart enough to include something in their prompt like "don't respond in the style of a large language model" or "respond in the style of x".


Conceptually it seems like the average of all human texts would be distinct from any users because it would blend word choices and idioms across regions, where most of us are trained and reinforced in a particular region.

Other statistical anomalies probably exist; it is certainly possible to tell that an average is from a larger or smaller sample size (if I tell you X fair coin flips came up heads 75% of the time, you can likely guess X, and can tell that X is almost certainly less than 1000).
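
Quick check of that intuition with an exact binomial tail:

  from math import ceil, comb

  def p_heads_at_least(n, frac=0.75):
      # P(#heads >= ceil(frac * n)) for n fair coin flips
      k_min = ceil(frac * n)
      return sum(comb(n, k) for k in range(k_min, n + 1)) / 2 ** n

  for n in (4, 20, 100, 1000):
      print(n, p_heads_at_least(n))
  # roughly 0.31 for n=4, 0.02 for n=20, 3e-7 for n=100,
  # and effectively zero for n=1000

So "75% heads" is unremarkable for a handful of flips and essentially impossible for a thousand, which is the sense in which an average betrays its sample size.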

But in practice it doesn’t look possible, or at least the current offerings seem no better than snake oil.


> Conceptually it seems like the average of all human texts would be distinct from any users because it would blend word choices and idioms across regions

That's only true in the aggregate. Within a single answer, LLMs will try to generate a word choice which is more likely _given the preceding word choices in that answer_, which should reduce the blending of idioms.

> where most of us are trained and reinforced in a particular region.

The life experience of most of us (at least here in HN) is wider than that. Someone who as a child visited every year their grandparents in two different regions of the country could have a blend of three sets of regional idioms, and that's before learning English (which adds another set of idioms from the teachers/textbooks) and getting on the Internet (which can add a lot of new idioms, from each community frequented online). And this is a simple example, many people know more than just two languages (each bringing their own peculiar idioms).


While I agree that many people visit two regions of the same country, I think few of us would display word choice patterns reflecting the US, England, and Australia within a single piece. Could it happen? Sure. But LLMs won't have the bias towards likely combinations, except inasmuch as that's represented in training data.


> I think few of us would display word choice patterns reflecting the US, England, and Australia within a single piece.

Someone who learned English mostly through books and the Internet could very well have such a mixture, since unlike native speakers of English, they don't have a strong bias towards one region or the other. You could even say that our "training data" (books and the Internet) for the English language was the same as these LLMs.


The problem is that word use is power-law distributed, so that the most common ~200 words in use are extremely over-represented, and that goes for phrases and so on.

It takes a lot of skill and a long time to develop a unique style of writing. The purpose of language is to be an extremely-lossy on-average way of communicating information between people. In the vast majority of cases, idiomatic style or jargon impairs communication.


That's not true at all. I could tell it to write like an unhinged maniac, someone who never uses contractions, or George Washington with a lisp.


And it would give you a response that others are likely to give, in the context of this prompt.


In the general internet the reputation of AI writing is that it's writing that's bad/awkward in a way that is often identifiable (by humans) as not having been written by humans.

AI detectors are useless, you're right, but for the same reason AI is unreliable in other contexts, not because AI writing is reliably passable.


> in a way that is often identifiable (by humans) as not having been written by humans.

You should check out reddit sometime. It's been nearly twenty years (not hyperbole) of everyone accusing everyone else of being a bot/shill. Humans are utterly incapable of detecting such things. They're not even capable of detecting Nigerian prince emails as scams.

> not because AI writing is reliably passable.

"Newspaper editor" used to be a job because human writing isn't reliably passable. I say this not to be glib, but rather because sometimes it's easy for me to forget that. I have to keep reminding myself.

Also, has it not occurred to anyone that deep down in the brainmeat, humans might actually be employing some sort of organic LLM when they engage in writing? That technology actually managed to imitate that faculty at some low level? So even when a human really writes something, it's still an LLM doing so? When you type in the replies to me, are you not trying to figure out what the next word or sentence should be? If you screw it up and rearrange phrases and sentences, are you not doing what the LLM does in some way?


> Also, has it not occurred to anyone that deep down in the brainmeat, humans might actually be employing some sort of organic LLM when they engage in writing?

This is a fairly common take, along with the idea that AI image generators are just doing what humans do when they "learn from examples". But I strongly believe it's a fallacy. What generative AI does is analogous to what humans do, but it's still just an analogy. If you want to see this in action, it's better to look at the way generative AI fails than the way it succeeds: when it makes mistakes in text or images, the mistakes are very much not the kind of mistakes that humans make, because the process behind the scenes is very different.

Yes, obviously when humans write, they take into account context and awareness of what words naturally follow other words, but it seems unlikely we've learned to write by subconsciously arranging all the words we've encountered into multidimensional vector space and performing vector math operations to arrive at the next word based on the context window we're subconsciously constructing. We learn to write in a very different way.

It's truly amazing that generative AI writes as well as it does, but we reason about concepts and generative AI reasons about words. Personally, I'm skeptical that the problems LLMs have with "hallucinations" and with creating definitionally median text* can be solved by making LLMs bigger and faster.

*I did see the comment complaining that it's not mathematically accurate to say that LLMs produce average text, but from my understanding of how generative AI works as well as my recent misadventures testing an AI "novel writer," it's a decent approximation of what's going on. Yes, you can say "write X in the style of Y," but "write X but make it way above average" is not actually going to work.


> But I strongly believe it's a fallacy.

Either the LLM is the most efficient way to generate text, or there's some magic algorithm out there that evolution stumbled upon a million years ago that we haven't even managed to see a hint of. In which case, you'd be right, this is a fallacy.

Or, brainmeat can't do it better or more efficiently, and either uses the same techniques or something even worse. The latter seems unlikely, humans still do pretty well at generating text (gold standard, even).

> it's better to look at the way generative AI fails than the way it succeeds: when it makes mistakes in text or images, the mistakes are very much not the kind of mistakes that humans make, because the process behind the scenes is very different.

But are you looking at "mistakes" that are just little faux pas, or the ones where people with dementia, bizarre brain damage, or blipped out on hallucinogens incorrectly compute the next word? The former offer little insight. Poor taste in word choice, lack of eloquence, vulgar inclinations are what they amount to.

> but it seems unlikely we've learned to write by subconsciously arranging all the words we've encountered into multidimensional vector space and performing vector math operations to arrive at the next word

You think I meant that someone learns to do that at 2 years old, rather than that the brain has already evolved with the ability to do vector math operations or some true equivalent? I'm not talking about some pop psych level "subconscious" thing, but an actual honest to god neurological level faculty.

> but we reason about concepts and

Wander into Walmart next time, close your eyes briefly and extend your psychic powers out to the whole building, and tell me if you truly believe, deep down in your heart, that the humans in that store are reasoning about concepts even once a week. That many, if not most, reason about concepts even once a month. I dare you, just go some place like that, soak it all in.

Human reason exists, from time to time, here and there. But most human behavior can be adequately simulated without any reason at all.


> Or, brainmeat can't do it better or more efficiently, and either uses the same techniques or something even worse. The latter seems unlikely, humans still do pretty well at generating text (gold standard, even).

Considering we use something like a thousand times the compute, "something even worse" seems plausible enough.


I think we have plenty of evidence that humans have the ability to understand, while chatbots lack such an ability. Therefore, I'm inclined to think that we don't employ some sort of organic LLM but something completely different.


I've occasionally seen evidence that some humans seem to sometimes understand. I've learned not to generalize that though.


Precisely. And the way to get better writing is by having good editors.

The major newspapers and magazines used to have good editors and proofreaders and it used to be rare to see misspellings or awkward sentences, but those editors have been seriously cut back and you see these much more commonly.

But hey, let’s blame something else.


But also maybe firing writers who make weird word choices and are needlessly wordy is fine.


Yes please. The art of writing is conveying the most meaning in the fewest words.


Eh, eventually AI will write like humans, but currently most of the time it's very much apparent what was written by AI. English is my second language so it's hard for me to pinpoint the exact reason why, but I guess it's more about the tone and the actual content (a.k.a. bullshit) rather than grammar / choice of words.

Most of the time AI slop reads like a soulless corporate ad. Probably because most of the content the AI was trained on was already SEO-optimized bullshit mass-produced on company blogs. I'd very much like a tool that would detect and filter those out of "my internet" as well.


If the AI writing is good, you’re not going to know it’s written by AI and you’ll continue to think you "can always tell" while more and more of what you read isn’t written by humans.


The only way to know if it's original or AI-written is to know the author's writing skills beforehand.


Yeah, but to obtain good output, you have to give such a specific and detailed prompt that you might as well just do it yourself.


Yeah but to reach that point you will probably need those "useless AI detectors" (as stated by the comment I was replying to). That was my point - we're not there yet therefore those tools can be useful.


But how do you know we’re not there yet? Not across the board, but isn’t it possible there’s a small yet growing portion of written content online that’s AI generated with no obvious tells?


I think we have a misunderstanding - I don't mind if I'm reading AI generated content as long as it doesn't look like "the typical AI content" (or SEO slop). In my point of view companies/writers might use AI detectors to continue improving the quality of their content (even if it's written by hand, those false positives might be a good thing). We're not there yet because I still see and read a lot of AI/SEO slop.

I agree with you that the "portion of written content online that’s AI generated with no obvious tells" is "small yet growing". That's exactly the thing - it's still too small to "be there yet" :)


I don't follow how you're reaching your conclusion. You only mind reading AI content when it's obviously AI/slop and you conclude the vast majority of decent content is not AI generated. In your conclusion how were you able to identify good content as being written by AI or not?

E.g. it's perfectly possible that in terms of prevalence "AI slop > AI acceptable > human acceptable" instead of "AI slop > human acceptable > AI acceptable", and nothing noted explains why it is one instead of the other.


Semi-automated content like that is probably widespread by now.

Like, imagine the Rust Evangelic Task Force but for the next big thing: it will probably be shilled by bots.

Low-quality content like YouTube and Reddit comments is probably mostly LLM bots that comment on anything to hide the actual spam comments.


"most of the content the AI was trained on was already SEO optimized bullshit mass produced on company blogs."

Totally agree with the last part. I work in copywriting and I spend most of my prompts trying to double down on the pretentious discourse.


Honestly, I couldn't care less if an author uses AI, as long as I can understand what I'm reading and it's interesting. They still have to instruct the AI.


If you're reading nonfiction, it means you're wasting time reading a lot more words when you could have just read the prompt.


Remind you of some entire genres of book?

That’s right, business and self-help books!

Any of these with an author who’s got actual accomplishments and money before writing the book was almost certainly already ghostwritten from an outline (and so are lots of other books, you’d be surprised, it’s not just these genres). Successful CEOs or people you’ve heard of generally don’t write their own books. Often, they’re terrible writers, and even if they’re not, writing is time-consuming and as with everything else that actually creates something they prefer to pay someone else to do it.

As of last year new books in that category are written by AI and edited by one or more humans—with each editor doing just two or three chapters, you can finish one of these books in a month or less.


Well, to err is human, to truly screw up you need a computer.

We're going to be blasted to smithereens with LLM-generated "80% should be good enough" garbage.


It’s fortunate we have mountains of human-written books, film, television, radio programs, music, and video games from Before AI. Just the good stuff could occupy several lifetimes.

Pity we killed most of the good used book stores already, though.

Also, shame about journalism and maybe also democracy. That’s too bad.


In my case, I talk a lot, and write a TON, my use for AI is really "can you say the same information with less words" then I tweak what it gives me. To be fair, I'm not a paid writer, just a dev writing emails to business people. I rewrite emails like 20 times before sending them. ChatGPT has helped me to just write it once, and have it summarized. I usually keep confidential details out and add them in after if needed.


Indeed you can losslessly "compress" an LLM's spew into just the prompt (plus any other inputs like values of random variables).

But you can also compress a book's entire content into just its ISBN.

It's just that books are hopefully more than just statistical mashups of existing content (some books like textbooks and encyclopaedias are kinds of mashup, though one hopes the editors have more than a statistically-based critical input!)


You can't regenerate the book from the ISBN. But you can generate the text from the prompt.


You can go and fetch the book from a book store using the information. Fundamentally there's not much difference between that and "fetching" the output from some model using the matching prompt. In both cases there some kind of static store of latent information that can be accessed unambiguously using a (usually) shorter input.

I'm not saying the value of the returned information is equivalent, of course. But being "just a pointer" into a larger store isn't, in itself, the problem to me.


You realise that you can't fetch a new ISBN without altering the archive, while this is not the case for every new prompt that you come up with?


I don't understand the distinction. If the book archive is electronic, like many in fact are, why can you not get a copy of the book with a given ISBN without altering anything? Even if it's not electronic, does the acquisition of a book by an individual meaningfully change the overall disposition of available information? If you took the last one in your local Waterstones, I can still get one elsewhere.


> If the book archive is electronic, like many in fact are, why can you not get a copy of the book with a given ISBN without altering anything?

Because new books are written?

It feels to me that you are set on insisting that a prompt and an ISBN are the same, and no amount of logic will move you from there.


Models can be trained more and fine tuned, though, if we're going to stick to the analogy. But in the context of the analogy, the LLM won't be materially updated between two prompts in roughly the way that telling you that the answer you seek is in a book with a specific ISBN isn't materially affected by someone publishing a new book at that moment.

You are quite right that you're not convincing me of your original thesis that a prompt contains the entire content of the reply in a way that some other reference to an entity in some other pool of information doesn't. That's not the same as saying "ISBNs and LLM prompts are the same thing", which is a strawman. It's saying that they're both unambiguous (assuming determinism) pointers to information.

Of course no-one is disagreeing that a reply from a deterministic LLM would add no information to the global system (you, an LLM's model, a prompt) than just the prompt would. But I still think the same is true for the content of a book not adding to the system of (you, a book store, an ISBN).

In fact, since random numbers don't contain new information if you know the distribution, one can even extend it to non-deterministic LLMs: the reply still adds no information to the system. The analogy would then be that the book store gives you at random a book from the same Dewey code as the ISBN you asked for. Which still doesn't increase the information in the system.


Can you, though? I thought LLMs just by virtue of how they work, are non-deterministic. Let alone if new data is added to the LLM, further retraining happens, etc.

Is it possible to get the same output, 1:1, from the same prompt, reliably?


They are assuming a lot of things, like that the LLM doesn't change and that you have full control over the randomness. This might be possible if you are running the LLM locally.


Well, if we assume the LLM doesn't change, why not assume the index system doesn't change too?

And yeah, I guess if you control the seed, an LLM would be deterministic.


Not true if the author of the prompt used an iterative approach. Write the initial prompt, get the result, "simplify this, put more accent on that, make it less formal", get the result, and so on, and edit the final output manually anyway.


Depends on your own level of background knowledge vs. the author's.


OpenAI announced that they had started on an AI text detector and then gave up as the problem appears to be unsolvable. The machine creates statistically probable text from the input, applying statistics to the generated result will show nothing more than exactly that. You’re then left triggering false positives on text that is the most likely which makes the whole thing useless.


> OpenAI announced that they had started on an AI text detector and then gave up as the problem appears to be unsolvable.

Making a reliable LLM also appears to be unsolvable, but we still work at it and still use the current wonky iterations. My point is that even if there are no perfect AI detectors, a lot of these tools are good enough for a "first pass"--coincidentally the same use case many effective LLM practitioners use LLMs for.


Sure, it could maybe be kinda right, but what is the cost of a false positive? If you have, say, a 10% false positive rate, and there are theoretical reasons to think you'll never get that anywhere close to zero, then what use case does this serve? Hey student, there's a 90% chance you cheated; well no, I'm that 10%. What now?

Again, OAI cancelled work on this believing it not to be solvable with a high degree of confidence. What is the use case for a low confidence AI detector?


> What is the use case for a low confidence AI detector?

What's the use case for LLMs in general if you always have to double-check their work?


“The entire current AI market”.


I did some research on this in March and developed an opinionated POV, which I'll paste here for anyone interested.

TL;DR: Detecting AI generated content is hard – really hard. The models available today cannot be trusted and should not be used to make important decisions.

In fact, OpenAI took down their detector last year because they couldn't reach an acceptable level of accuracy:

https://openai.com/blog/new-ai-classifier-for-indicating-ai-...

One open model trained on open data is Hello-SimpleAI's chatgpt-detector:

https://huggingface.co/Hello-SimpleAI/chatgpt-detector-rober...

https://huggingface.co/datasets/Hello-SimpleAI/HC3

However, that model is not robust and can be tricked by trivial changes:

https://arxiv.org/abs/2307.02599

I verified this result using the playground on hugging face. For example, it is vulnerable to the “one space character” attacks mentioned in the article, severely limiting the usefulness of trying to detect AI content in an adversarial context.
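
For anyone who wants to reproduce that, the check is a few lines with the transformers pipeline. The model id below is my reading of the (truncated) Hugging Face link above, and the output labels are whatever the model card defines, so treat both as assumptions:

  from transformers import pipeline

  clf = pipeline("text-classification",
                 model="Hello-SimpleAI/chatgpt-detector-roberta")  # assumed id

  text = "...paste some LLM-generated text here..."
  perturbed = text.replace(". ", ".  ")  # the trivial one-extra-space tweak

  print(clf(text))        # e.g. [{'label': ..., 'score': ...}]
  print(clf(perturbed))   # score can swing sharply despite identical content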

This ridiculous piece of "research" from Forbes has been causing problems:

https://www.forbes.com/sites/technology/article/best-ai-cont...

The Forbes article is credulous and uncritical, beyond mere naiveté and approaching journalistic malpractice, reporting the sales stories and self-reported benchmarks of self-interested parties as fact. Nevertheless, I've seen several people share it as "insightful", so it's floating around, doing more harm than good IMO.

While all detectors are terrible, Sapling AI has one of the better ones, if only because they are completely open and honest about its limitations:

https://sapling.ai/docs/api/detector/

https://sapling.ai/ai-content-detector

Sapling AI also wrote an interesting blog post on GPT SIPs (Statistically Improbable Phrases).

https://sapling.ai/devblog/chatgpt-phrases/



