Receiving hundreds of AI-generated bug reports would be so demoralizing, and would probably turn me off from maintaining an open source project forever. I think developers are eventually going to need tools to filter out slop. If you didn’t take the time to write it, why should I take the time to read it?
All of these reports came with executable proof of the vulnerabilities – otherwise, as you say, you get flooded with hallucinated junk like the poor curl dev. This is one of the things that makes offensive security an actually good use case for AI – exploits serve as hard evidence that the LLM can't fake.
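For concreteness, here is a hedged sketch of what an "executable proof" can look like in practice. This is not XBOW's actual harness; the endpoint, parameter, and canary below are invented for illustration. The idea is simply that a report only ships if replaying the exploit produces a hard, checkable signal.

```python
# Hypothetical PoV harness: file the report only if the exploit demonstrably works.
# TARGET and the query parameter are made-up examples, not a real in-scope system.
import secrets
import requests

TARGET = "https://staging.example.com/search"

def prove_reflected_xss() -> bool:
    # A unique canary so a match can't be a coincidence or a hallucinated claim.
    canary = f"pov-{secrets.token_hex(8)}"
    payload = f"<script>/*{canary}*/</script>"
    resp = requests.get(TARGET, params={"q": payload}, timeout=10)
    # The "proof": the payload comes back unescaped in an HTML response.
    return (
        "text/html" in resp.headers.get("Content-Type", "")
        and payload in resp.text
    )

if __name__ == "__main__":
    print("vulnerable" if prove_reflected_xss() else "no proof, do not file")
```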
Is "proof of vulnerability" a marketing term, or do you actually claim that XBOW has a 0% false positive rate? (i.e. "all" reports come with a PoV, and this PoV "proves" there is a vulnerability?)
These aren't like GitHub Issues reports; they're bug bounty programs, specifically stood up to soak up incoming reports from anonymous strangers looking to make money on their submissions. The premise is that enough of those reports will drive specific security goals (for smart vendors, the scope of each program is tailored to internal engineering goals) to make it worthwhile.
Got it! The financial incentive will probably turn out to be a double-edged sword. Maybe in the pre-AI age it was By Design, to drive those goals, but I bet the ability to automate submissions will inevitably alter the rules of these programs.
I think within the next 5 years or so, we are going to see a societal pattern repeating: any program that rewards human ingenuity and input will be industrialized by AI, to the point where a cottage industry of companies floods every program with 99% AI submissions. What used to be lone wolves or small groups of humans working on bounties will become truckloads of AI generated “stuff” trying to maximize revenue.
> What used to be lone wolves or small groups of humans working on bounties will become truckloads of AI generated “stuff” trying to maximize revenue.
You're objecting to the wrong thing. The purpose of a bug bounty programme is not to provide a cottage industry for security artisans - it's to flush out security vulnerabilities.
There are reasonable objections to AI automation in this space, but this is not one of them.
I've been on HackerOne for almost 8 years, and I think the problem with this is that too many companies won't pay for legitimate bugs, even when you have a working exploit.
I had one critical bug take 3 years to get a payout. I had a full walkthrough with videos and a report. The company kept stalling, and at one point told me that because they had the app completely remade, they weren't going to pay me anything.
HackerOne doesn't really protect the researcher either. I was told multiple times that there was 'nothing they could do'.
I eventually got paid, but this is pretty normal behavior when it comes to bug bounties. Too many companies use them for free security work.
I do think HackerOne is problematic, in that it pushes companies that don't really understand bug bounties to stand up bounty programs without a clear reason. If you're doing a serious bounty, your incentive is to pay out. But a lot of companies do these bounties because they just think they're supposed to.
Open source maintainers have been complaining about this for a while: https://sethmlarson.dev/slop-security-reports. I'm assuming the proliferation of AI has already had, and will continue to have, a significant impact on open source projects.
Yes! I recently had to manually answer and close a GitHub issue telling me I might have pushed an API key to GitHub.
No, "API_KEY=put-your-key-here;" is a placeholder, and I should not have to waste time explaining that.
I'm still on the AI-skeptic side of the spectrum (though shifting more towards "it has some useful applications"), but I think the easy answer is to use different models/prompts for quality- and correctness-checking than were used for generation.
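To make that concrete, here is a minimal sketch of the generate-then-independently-check pattern, assuming you have some completion API to call. call_llm, the model names, and the prompts are hypothetical stand-ins, not a specific vendor's interface.

```python
# Sketch: draft with one model, verify with a different model and an adversarial
# prompt, and only surface the report if the checker independently signs off.
def call_llm(model: str, prompt: str) -> str:
    """Placeholder: route this to whatever LLM provider you actually use."""
    raise NotImplementedError

def draft_report(finding: str) -> str:
    return call_llm(
        model="generator-model",
        prompt=f"Write a vulnerability report for this finding:\n{finding}",
    )

def independent_check(report: str) -> bool:
    # Different model, different prompt, so the checker isn't grading its own work.
    verdict = call_llm(
        model="checker-model",
        prompt=(
            "You are triaging a bug bounty submission. Reply VALID only if the "
            "report contains reproducible, verifiable evidence; otherwise reply "
            "REJECT.\n\n" + report
        ),
    )
    return verdict.strip().upper().startswith("VALID")

def submit_if_verified(finding: str) -> str | None:
    report = draft_report(finding)
    return report if independent_check(report) else None
```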
You see, the dream is another AI that reads the report and writes the issue in the bug tracker. Then another AI implements the fix. A third AI then reviews the code and approves and merges it. All without human interaction! Once CI releases the fix, the first AI can then find the same vulnerability plus a few new and exciting ones.
This is completely absurd. If generating code is reliable, you can have one generator make the change and then merge and release it with traditional tooling.
If it's not reliable, how can you rely on the written issue being correct, or the review? And then how does it benefit you over just blindly merging whatever changes the model produces?