If Waymo has taught me anything, it’s that people will eventually accept robotic surgeons. It won’t happen overnight but once the data shows overwhelming superiority, it’ll be adopted.
I think Waymo, and driving in general, is a bit different, because it's an activity where most people already don't trust how other people perform it. That makes it easier to accept the robo driver.
For the medical world, I'd look to the Invisalign example as a more realistic path for how automation will become part of it.
The human will still be there; the scale of operations per doctor will go up and prices will go down.
LASIK is essentially an automated surgery and 1-2 million people get it done every year. Nobody even seems to care that it’s an almost entirely automated process.
Any intervention, heck even jogging around your house, has a risk. The only important question is: are there fewer automated errors than human errors? If yes, that's the progress we asked for.
Full anesthesia - yeah, not an option, you need to be awake. Something milder - it could be an option (depending on the state, maybe? not sure, mine was done in WA).
Neither I nor my friends (all of us got LASIK) asked for it, but my clinic gave me Valium, and my friends' clinic gave them Xanax shortly before the procedure.
Tangential sidenote: that was nearly 8 years ago, and I am absolutely glad I got it done.
My perception (and personal experience) is that medical malpractice is so common, I'd gladly pick a Waymo-level robot doctor over a human one. That's probably skewed since I'm a "techie", but then again that's why Waymo started at the techie epicenter and will slowly become accepted everywhere.
> My perception (and personal experience) is medical malpractice is so common [...]
I think it's interesting that we as humans think it's better to create some (mostly correct) robot to perform medical procedures instead of, together as a human race, starting to actually care about this stuff.
I don’t think the problem is “caring”. Waymo has proven the obvious - a machine with higher cognitive function that never gets distracted is better than most humans at an activity that requires constant attention and fast reflexes. I’m sure the same will eventually apply to other activities too.
It's a much better investment of time to make robots that can do delicate activities (e.g. Neuralink's implant robot) consistently and correctly than to train humans and pray that all of them are equally skilled, never get older or drink coffee or come to the operating table stressed out one day…
Uhmmm... I'm sorry, but when Waymo started, nearly everyone I talked to about it said "zero % I'm going in one of those things, they won't be allowed anyway, they'll never be better than a human, I wouldn't trust one, nope, no way," and now people can't wait to try them. I understand what you're saying about the trusted side of the house (surgeons are generally high trust), but I do think OP is right: once the data is in, people will want robot surgery.
Of course they will. I don’t argue that they won’t.
I'm just saying that the path to that, and the way it's going to be implemented, will be different; Invisalign is a better example of how it will happen in the medical industry than the automotive one.
By collecting data where you can and further generalizing models so they can perform surgeries they weren't specifically trained on.
Until then, the overseeing physician identifies when an edge case is happening and steps in for a manual surgery.
This isn't a mandate that every surgery must be done with an AI-powered robot, but that they are becoming more effective and cheaper than real doctors at the surgeries they can perform. So, naturally, they will become more frequently used.
I don't care whether human surgeons or robotic surgeons are better at what they do. I just want more money to go to whoever owns the equipment, and less to go to people in my community.
Still, the robots are not used outside of their designated use cases, and people still handle by hand the sort of edge cases that are the topic of concern in this context.
...Except that a surgeon can reason in real time even if he wasn't "trained" on a specific edge case. It's called intelligence. And unless they have been taking heavy drugs ahead of the procedure, or were sleep deprived, it's very unlikely a surgeon will have a hallucination of the kind that is practically a feature of GenAI.
AI "hallucination" is more like confabulation than hallucination in humans (the name chosen for the AI phenomenon was poor because the people choosing it didn't understand the domain it was taken from, which is somewhat amusing given the nominal goal of their field); the risk factors for that aren't so much heavy drugs and sleep deprivation as immediate pressure to speak/act, absence of the knowledge needed, and absence of the opportunity or social permission to seek third-party input. In principle, though, yes, the preparation of the people in the room should make that less likely, and less likely to go uncorrected, in a human-conducted surgery.
I guess my point was less about the nuances of how we define 'hallucination' for a GenAI system, and more about the important part: not having my liver accidentally removed because Surgery-ChatGPT had a hiccup, or the rate limit was reached, or whatever.
We’re already most of the way there. There’s the da Vinci Surgical System which has been around since the early 2000s, the Mako robot in orthopedics, ROSA for neurosurgery, and Mazor X in spinal surgery. They’re not yet “AI controlled” and require a lot of input from the surgical staff but they’ve been critical to enabling surgeries that are too precise for human hands.
> We're already most of the way there. [...] They're not yet "AI controlled" and require a lot of input from the surgical staff but they've been critical to enabling surgeries that are too precise for human hands.
That does not sound like “most of the way there”. At most maybe 20%?
If you consider "robotic surgeon" to mean fully automated, then sure, the percentage is lower, but at this point AI control is not the hard part. We're still no closer to the mechanical dexterity and force-feedback sensors necessary to make a robotic surgeon than we were when the internet was born. Let alone miniaturizing them enough to make a useful automaton.
>If Waymo has taught me anything, it’s that people will eventually accept robotic surgeons.
I do not think that example is applicable at all. What I think people will be very tolerant of is robot-assisted surgeries, which are happening right now and which will become better and more autonomous over time. What will have an extremely hard time gaining acceptance is robots performing unsupervised surgeries.
The future of surgery this research suggests is a robot devising a plan, which gets reviewed and modified by a surgeon; then the robot, under the surgeon's supervision, starts implementing that plan. If complications arise beyond the robot's ability to handle, the surgeon will intervene.
That calculus has a high dependency on the skill of the driver. With an unskilled driver or surgeon you would worry either way.
The frequencies are also highly dependent on the subject: some people ride in a taxi only once a year, while some people require many surgeries a year. But the frequency of use by the recipient is irrelevant.
The frequency of the procedure is the key, and it's based on the entity doing the procedure, not the recipient. Waymo in effect has a single entity learning from all the drives it does. Likewise, a reinforcement-trained AI surgeon would learn from all the surgeries it's trained with.
I think what you’re after here though is the consequence of any single mistake in the two procedures. Driving is actually fairly resilient. Waymo cars probably make lots of subtle errors. There are catastrophic errors of course but those can be classified and recovered from. If you’ve ridden in a Waymo you’ll notice it sometimes makes slightly jerky movements and hesitates and does things again etc. These are all errors and attempted recoveries.
In surgery small errors also happen (this is why you feel so much pain even from small procedures), but humans aren't that resilient to those errors, and it's hard to recover once one has been made. The consequences are high, margins of error are low, and the space of actions and events is really, really large. Driving has a few possible actions, all related to velocity in two dimensions. Surgery operates in three dimensions with a variety of actions and a complex space of events and eventualities. Even human anatomy is highly variable.
But I would also expect a robotic AI surgeon to undergo extreme QA, beyond what an autonomous vehicle gets. The regulatory barriers are extremely high. If one were made available commercially, I would absolutely trust it, because I would know it had been proven to outperform a surgeon alone. I would also expect it to be supervised at all times by a skilled surgeon until its solo error rates are better than those of a supervised machine (note that human supervision can add its own errors).
> an "oops" in a car is not immediately life threatening either
They definitely can be. One of the viral videos of a Tesla "oops" in just the last few months showed it going from "fine" to "upside-down in a field" in about 5 seconds.
And I had trouble finding that because of all the other news stories about Teslas crashing.
While I trust Waymo more than Tesla, the problem space is one with rapid fatalities.
Per the article, all current European data centers used 62 million cubic meters of water in 2024. That is about 3.4% of Spain's existing desalination capacity alone (≈1.8 billion m³/yr).
Seems like this is solvable:
1. Keep rolling out the closed-loop cooling improvements now appearing in new DC designs.
2. Add more desal capacity where it’s cheap (sunny coastlines + renewables) to cover the residual demand.
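For a rough sense of scale, here's a minimal arithmetic sketch using only the figures quoted above (the 5x demand multiplier is purely a hypothetical illustration, not a projection):

    # Sanity check of the figures cited in this thread (not authoritative data)
    dc_water_m3 = 62_000_000        # European data center water use, 2024
    spain_desal_m3 = 1_800_000_000  # ~Spain's existing desalination capacity per year

    share = dc_water_m3 / spain_desal_m3
    print(f"Share of Spain's desal capacity: {share:.1%}")  # ~3.4%

    # Hypothetical: even at 5x today's demand, it would still be a modest
    # fraction of that single country's capacity.
    print(f"At 5x demand: {5 * share:.1%}")                 # ~17.2%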
Upvoting this since presumably you're actually the CTO at Snyk and people should see your official response, but wow this feels wildly irresponsible. You could have proved the PoC without actually stealing innocent developer credentials. Furthermore, additional caution should have been taken given the conflict of interest with the competitor product to Cursor. Terrible decision making and terrible response.
I know of many, many LLM systems in production, since that's what I've been helping companies build since the start of the year. Mostly it's pretty rote automation work, but the cost savings are incredible.
Agentic workflows are a much higher bar and are just barely starting to work. I can't speak to their efficacy, but here are a few of the sort of starter-level agents I've started seeing some companies adopt:
Before farming technology like tractors, 97% of people worked the fields in some capacity. Now it's the inverse. Technology frees human potential from drudgery.
Presumably one of the PMs you’re referring to has posted this article for additional information. Feels like they’re doubling down on their initial position.
> Although the researcher did initially submit the vulnerability through our established process, they violated key ethical principles by directly contacting third parties about their report prior to remediation. This was in violation of bug bounty terms of service, which are industry standard and intended to protect the white hat community while also supporting responsible disclosure. This breach of trust resulted in the forfeiture of their reward, as we maintain strict standards for responsible disclosure.
Wow... there was no indication that they even intended to fix the issue, so what was Daniel (hackermondev) supposed to do? Disclosing this to the affected users was probably the most ethical thing to do. I don't think he posted the vulnerability publicly until after the fix. "Forfeiture of their reward" -- they said multiple times that it didn't qualify; they had no intention of ever giving a reward.
As someone who manages a bug bounty program, this kind of pisses me off.
For some of the bugs we receive on H1, we openly say, "Hey, we need to see a PoC in order to get this triaged." We do not provide test accounts for H1 users, so if a researcher exploits a customer's instance to demonstrate a bug, we'll not only take the amount the customer paid off of their renewal price, we'll also pay the bounty hunter.
FWIW, I wouldn't be surprised if the author of this article is a bit upset that Daniel (hackermondev) earned a significant percentage of the author's yearly income from this. If this had been "fixed" by Zendesk, they would have paid less than a few percent of the $50k he actually made.
Edit: to those downvoting, the fact of the matter is that Zendesk's maximum bounty is far lower than $50k, yet OP made $50k, meaning by definition the value of the vulnerability was at least $50k.
If anything, they are probably upset that they apparently lost some customers over this. That must (rightfully) hurt. But it's their own mistake - leaving a security bug unaddressed is asking for trouble.
He didn't even "go public" as that term is normally used in bug disclosure. He didn't write it up and release and exploit when Zendesk told him it was out of scope and didn't give him any indication they considered it a problem or were planning a fix. Instead he reached out to affected companies in at least a semi private way, and those companies considered the bug serious enough to pay him 50k collectively and in at least some cases drop Zendesk altogether.
I am 100% certain that every one of the companies that paid the researcher would consider the way this was handled by that researcher "the best alternative to HackerOne rules 'ethical disclosure' in the face of a vendor trying to cover up serious flaws".
In my opinion, in an ideal world HackerOne would publicly revoke Zendesk's account for abusing the rules and rejecting obviously valid bug payouts.
Aren't such disputes about scope relatively common? Not sure what Hackerone can do about it.
For example, most HackerOne customers exclude denial-of-service issues because they don't want to encourage people to bring down their services with various kinds of flooding attacks. That doesn't mean that the same HackerOne customers (or their customers) wouldn't care about a single HTTP request bringing the service down for everyone for a couple of minutes. Email authentication issues are similar, I think: obviously on-path attacks against unencrypted email have to be out of scope, but if things are so badly implemented that off-path attacks somehow work too, then that really has to be fixed.
Of course, what you really shouldn't do as a HackerOne customer is use it as a complete replacement for your incoming security contact point. There are always going to be scope issues like that, or people unable to use HackerOne at all.
Once they'd brushed him off and made it clear they were not interested in listening to him, resolving the bug, or living up to the usual expectations that researchers have in companies claiming to have bug bounties on HackerOne, I'd say they lost any reasonable expectation that he'd do that.
I'll note he did go to the effort of having the first stab at that sort of resolution, when he pushed back on HackerOne's inaccurate triage of the bug as an SPF/DKIM/DMARC email issue. He clearly understood the need for triage for programs like this, and that the HackerOne internal triage team didn't understand the actual problem, but again was rebuffed.
When in doubt, go with the side which has been forthcoming. Zendesk didn't publish details, and their post misleadingly described it as a supply-chain problem, sounding almost as if they were a victim rather than the supplier of the vulnerability. It's always possible that there are additional details which haven't come out yet, but that impression of a weasel-like PM is probably accurate.
That article claims to have “0 comments”, but currently sits at a score of -7 (negative 7) votes of helpful/not helpful. I think they have turned off comments on that article, but aren’t willing to admit it.
EDIT: It’s -11 (negative 11) now. Still “0 comments”.
In damage control mode, Zendesk can't pay a bounty out here? Come on. This is amateur hour. The reputational damage that comes from "the company that goes on the offensive and doesn't pay out legitimate bounties" impacts the overall results you get from a bug bounty program. "Pissing off the hackers" is not a way to keep people reporting credible bugs to your service.
I don't understand what this tries to accomplish. The problem is bad, botching the triage is bad, and the bounty is relatively cheap. I understand that this feels bad from an egg-on-face perspective, but I would much rather be told by a penetration tester about a bug in a third-party service provider than not be told at all just to respect a program's bug bounty policy.
> "Pissing off the hackers" is not a way to keep people reporting credible bugs to your service.
That doesn’t matter if your goal with a bug bounty program is not to have people reporting bugs, but instead to have the company appear to care about security. If your only aim is to appear serious about security, it doesn’t matter what you actually do with any bug reports. Until the bugs are made public, of course, which is why companies so often try to stop this by any means.
Sounds like a great way to get a bunch of black hats to target you after pissing off the white hats. The point of playing nice with people this smart is precisely to prevent the kind of damage to a company that results in losing clients.
But I guess corporations ignoring security for more immediately profitable ventures on the quarterly report is a tale as old as software.
"Hi, we are ZenDesk, a support ticket SaaS with a bug bounty program that we outsource to our effected customers, who pay out an order of magnitude more than our puny fake HackerOne program. Call now, to be ridiculously upsold on our Enterprise package!"
We, the company that doesn't understand security, can't tell whether this was exploited, therefore we confidently assert that everything is fine. It's self consistent I suppose but I wouldn't personally choose to scream "we are incompetent and do not care" into the internet.
As a former ZD engineer, shame on you Mr Cusick (yes, I know you personally) and shame on my fellow colleagues for not handling this in a more proactive and reasonable way.
Another example of impotent PMs, private equity firms meddling and modern software engineering taking a back seat to business interests. Truly pathetic. Truly truly pathetic.
I think LLM technology, though not necessarily all of deep learning, has plateaued. We've used up all the human discourse, so there's nothing left to train it on.
It's like fossil fuels. They took billions of years to create and centuries to consume. We can't just create more.
Another problem is that the data sets are becoming contaminated, creating a reinforcement cycle that makes LLMs trained on more recent data worse.
My thoughts are that it won't get any better with this method of just brute-forcing data into a model like everyone's been doing. There need to be some significant scientific innovations. But all anybody is doing is throwing money at copying the major players and applying some distinguishing flavor.
Benchmark performance continues to improve (see OpenAI's o1).
The claim that there is nothing left to train on is objectively false. The big guys are building synthetic training sets, moving to multimodal, and are not worried about running out of data.
o1 shows that you can also throw more inference compute at problems to improve performance, so it gives another dimension to scale models on.
Actually, the sources we had (everything scraped from the internet) turn out to be pretty bad.
Imagine not going to school and instead learning everything from random blog posts or reddit comments. You could do it if you read a lot, but it's clearly suboptimal.
That's why OpenAI, and probably every other serious AI company, is investing huge amounts in generating (proprietary) datasets.
To avoid disappointment, just think of the mass news media as a (shitty) LLM. It may occasionally produce an article that on the surface seems to be decently thought out, but it's only because the author accidentally picked a particularly good source to regurgitate. Ultimately, they just type some plausible sentences without knowing or caring about the quality.
Good. In the US, we should all wake up to the fact that we enjoy the lives we lead in large part due to the fact that we're able to project strength. AI will be another piece in the puzzle of national defense.
One can support a strong military and still not want analysis or decision making to be overly automated. Certainly not by an LLM that could easily hallucinate that World War 3 has started.
One can also want to contribute to the military via taxes or service but not personal data. Just as one can be pro-police while also pro-fourth amendment. Respect people’s privacy or you won’t have a country worth protecting.
What exactly do you think they are going to use it for? To discuss the best way to make a sandwich?
At best the AI will only assist with tactical analysis. Eventually it will be more directly involved in decision making. Come on, we already saw this movie on July 2nd, 2003.
Fuck if I ever work on anything that intentionally kills people. I see it as little different from shooting someone in the head, except you get to pretend that you’re squeaky clean at the end of the workday.
What lives exactly do you mean? Because there's nothing unique about America that plenty of other countries don't have anymore, and most of us manage to spend considerably less on defence than America.
Those countries manage to spend less on their militaries than America because of Pax Americana. This is a uniquely peaceful and prosperous time in history; there has been a liberal, unipolar political climate worldwide because of the USA, and dozens of countries have taken advantage of the fruits of America's defense labor while chipping in little for themselves, or others, in return. America has had the strength and posture to carry the defense of the entire Western world on its behalf, and it did so.
Europe is re-arming, and one of the two leading candidates this election cycle is anti-NATO. Expect this generations-old luxury for Western countries beyond America to change soon.
Seems reasonable? If you’re planning on staying free forever and are already gated to 90 days of history, why would you care if they delete 1+ year old content? From a privacy perspective, I’d even say it’s a positive update.