> I'd bet there's some calculation that people who try to sign up for a plan over the phone end up using the phone more down the line, which would mean more costly operator time. So the math works out where the overall savings of making enough people give up before reaching a human outweighs the cost of potentially lost new subscriptions by phone call.
That's an example of a weird heuristic I frequently see applied to corporations: assume some awful decision is the result of some scarily hyper-competent design process, and construct a speculative explanation along those lines.
But most of us have worked in corporations, and know how stupid and incompetent they can be.
> Or, they just didn't study that. Or, the decision-makers don't contact customer support for themselves and so don't know how infuriatingly unhelpful AI ones are.
>> One important reason people like to write code is that it has well-defined semantics, allowing one to reason about it and predict its outcome with high precision. Likewise for changes that one makes to code. LLM prompting is the diametrical opposite of that.
> You’re still allowed to reason about the generated output. If it’s not what you want you can even reject it and write it yourself!
You missed the key point. You can't predict an LLM's "outcome with high precision."
Looking at the output and evaluating it after the fact (like you describe) is an entirely different thing.
For many things you can, though. If I ask an LLM to create an alert in terraform that triggers when 10% of requests fail over a 5 minute period and sends an email to some address, with the HTML in the email looking a certain way, it will do exactly the same thing as if I had looked at the documentation and figured out all of the fields one by one. It’s just how it works when there’s one obvious way to do things. I know software devs love to romanticize our jobs, but I don’t know a single dev who writes 90% meaningful code. There’s always boilerplate. There’s always fussing with syntax you’re not quite familiar with. And I’m happy to have an AI do it.
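For concreteness, here is a rough sketch of the kind of alert being described. The comment mentions Terraform; this sketch expresses the same idea with Python and boto3 against CloudWatch and SNS instead, and every name in it (topic, email address, metric, threshold) is a made-up placeholder rather than anything from the original comment. An exact "10% of requests" rule would also need metric math over a total-request metric rather than the flat count used here.

    import boto3

    cloudwatch = boto3.client("cloudwatch")
    sns = boto3.client("sns")

    # Hypothetical SNS topic that delivers the alert by email.
    topic = sns.create_topic(Name="request-failure-alerts")
    sns.subscribe(
        TopicArn=topic["TopicArn"],
        Protocol="email",
        Endpoint="oncall@example.com",  # placeholder address
    )

    # Alarm over a 5-minute window. The threshold is a stand-in: a real
    # "10% of requests" rule would use metric math against total requests.
    cloudwatch.put_metric_alarm(
        AlarmName="high-request-failure-rate",
        Namespace="AWS/ApplicationELB",           # assumes an ALB; any error metric works
        MetricName="HTTPCode_Target_5XX_Count",
        Statistic="Sum",
        Period=300,                               # 5 minutes
        EvaluationPeriods=1,
        Threshold=100.0,                          # placeholder for the 10% cutoff
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[topic["TopicArn"]],
    )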
I don’t think I am. To me, it doesn’t have to be precise. The code is precise and I am precise. If it gets me what I want most of the time, I’m ok with having to catch it.
> Re: timing - They were triggered to explode en masse, which implies that there was zero consideration to minimizing civilian harm.
Zero? The whole nature of the attack shows consideration towards "minimizing civilian harm." Tricking an enemy agent into carrying a small explosive device on his person, then detonating it, will have far less civilian harm than the standard procedure of dropping a bomb on whatever building they happen to be in.
Your thinking appears unreasonably binary here, as shown by your use of phrases like "zero consideration" and "definitely no consideration," in reaction to Israel not meeting an unrealistically high standard for "minimizing civilian harm." Could Israel have done more to minimize civilian harm with that attack? Perhaps, but that doesn't mean they did nothing.
@Cyph0n, if you think Israel's approach led to too much collateral damage, why don't you propose a solution that would have led to less collateral damage while still taking the Hezbollah leaders out of action?
I bet you won't do this, because I think we can ultimately agree it wasn't possible for Israel to take all these men out of action simultaneously and minimize collateral damage much beyond what it did.
I think where we disagree is that you think Israel should not have taken these men out of action.
Nice deflection. All I need to care about as a lowly SWE is that this attack injured thousands of Lebanese civilians. This is the real world, not a movie or simulated war game.
And I would wager that you would immediately condemn such a barbaric attack if the sides were reversed.
So you weren't able to propose a solution that would have led to less collateral damage because no such solution exists. You know it. I know it. Everyone reading this knows it.
Instead of answering directly you make a comment about deflection, and insist an obvious falsehood (the attack injured thousands of Lebanese civilians) is all you care to believe. On this, we agree. It's all you care to believe, the evidence be damned!
> FWIW, our CEO has declared us to be AI-first, so we are to leverage AI in everything we do, which I think is misguided. But you can bet they will be reviewing AI usage metrics, and lower won't be better at $WORK.
I've taken some pleasure in having GitHub copilot review whitespace normalization PRs. It says it can't do it, but I hope I get my points anyway.
> As I see it, the purpose of AI is the same as the purpose of every technology ever since the hand axe - to reduce labor. Humans have a strong drive to figure out ways to achieve more with less effort.
Yes.
> Obviously there are higher order effects, but same as we wouldn't expect the Homo Erectus to stop playing with stone tools because they'd disrupt their society (which of course they did), I don't understand why we should decide to halt technological progress now.
The difference is the relationship of that technology to the individual/masses. When a Homo Erectus invented a tool, he and every member of his species (who learned of it) directly benefited from the technology, but with capitalism that link has been broken. Now Homo Sapiens can invent technologies that may greatly benefit a few, but will be broadly harmful to individuals. AI is likely one of those technologies, as it's on the direct path to eliminating broad classes of jobs with no replacement.
This situation would be very different if we either had some kind of socialism or a far more egalitarian form of capitalism (e.g. with extremely diffuse and widespread ownership).
I think you might have an overly noble view of Homo Erectus. I believe that a fellow member of the species is at least as likely to get that hand axe smashed into their skull as they're likely to benefit from it.
> This strategy doesn't make sense. What was the end goal? To have the other person keep buying new computers.
I would assume it was to interfere with the other student's research. That other person almost certainly had data on the destroyed computers that he either lost completely, or had to do extra work to recover when they failed.
> I do think there’s a risk of societal stagnation if we all stick around forever. But, maybe we can make a deal—if we all end up immortal, we can make a threshold, maybe even as young as 80 or something, and have people retire and stop voting at that point. Let society stay vivacious, sure. Give us an end point for our toils, definitely, and a deadline for our projects.
> Long term consequences: China outperforms Nvidia, by producing cheaper, faster chips at a large scale, by getting inspired by the IP but using their own production lines.
Unlike your typical free market fanboy, the Chinese leadership isn't stupid. They were always planning to do that, sanctions or no.
Realistically, all sanctions can do is mess with their timelines for some temporary strategic advantage, slowing some things down and forcing reallocation of investment away from other areas into the sanctioned areas.
The US refraining from sanctions is likely the stupid move, because that lever of control will expire at some point. To not use it is to squander it.
But if there's one thing the US government and its business elite is good at, it's squandering things.
"Planning to do it" is one thing, but thanks to Trump's erratic and corrupt trade policy, they now have a Manhattan Project-level incentive to make it happen.
It's ridiculous to think they won't succeed, just by dint of sheer numbers alone.
> "Planning to do it" is one thing, but thanks to Trump's erratic and corrupt trade policy, they now have a Manhattan Project-level incentive to make it happen.
The plans weren't wishes, they were things they were actively working on to make happen. The point is they didn't need "Trump's erratic and corrupt trade policy" to motivate it, they were already motivated to do it anyway.
The US's problem is that its actions are uncoordinated. Sanctions and tariffs need to be coupled with massive investments to build new capabilities, and the latter is usually lacking. For instance, tariff revenue (and then some) should be poured directly into subsidies for building new facilities that support critical industries (like rare earths and electronics manufacturing). And things would probably, counterintuitively, be more effective if there were more tolerance of waste. For instance, China has subsidized hundreds of solar panel manufacturers; none of them make money and a lot have probably failed, but the vicious domestic competition has helped them dominate that technology globally. The US freaked out in a massive scandal when one subsidized solar panel maker went out of business.
> The plans weren't wishes, they were things they were actively working on to make happen. The point is they didn't need "Trump's erratic and corrupt trade policy" to motivate it, they were already motivated to do it anyway.
Yes, they were "actively working on it"; no, they had made little significant progress despite throwing tons of money at the initiative.
There were lots of stories along the lines of https://www.nytimes.com/2021/07/19/technology/china-microchi... from the early 2020s, not so many lately. Their internal posture will now be the same as Russia's post-1945 push for the Bomb. Continued failure will (possibly literally) place heads at stake.
> The US's problem is that its actions are uncoordinated.
They are coordinated well enough, but with the goal of magnifying Cheeto Benito's personal influence and cultivating his in-group's fortunes.
No. That problem is bigger than the Trump administrations, focusing on him is lazy.
It's absurd to say that without elaborating on how anyone else was "just as bad," which I expect will be a key part of your next reply.
Trump is fucking bad, and if you disagree after all we've seen, you're either arguing in bad faith, or you're not such a great person yourself. He is costing us every jot and tittle of soft power we ever wielded as a nation.
Trump is the living embodiment of the old cliché about how the Chinese word for "crisis" combines "danger" and "opportunity." His actions have comforted Russia, alienated Europe, and galvanized China.
> It's absurd to say that without elaborating on how anyone else was "just as bad," which I expect will be a key part of your next reply.
> Trump is fucking bad, and if you disagree after all we've seen, you're either arguing in bad faith, or you're not such a great person yourself. He is costing us every jot and tittle of soft power we ever wielded as a nation.
Sorry dude, all of that is coming from inside your own head. You're so blinded by Trump that you're incapable of having this conversation.
I don't want to put in the effort to try to fix that. Have a nice day.
> Right, this result seems meaningless without a human clinician control.
> I'd very much like to see clinicians randomly selected from BetterHelp and paid to interact the same way with the LLM patient and judged by the LLM, as the current methodology uses. And see what score they get.
Does it really matter? Per the OP:
>>> Across all models, average clinical performance stayed below 4 on a 1–6 scale. Performance degraded further in severe symptom scenarios and in longer conversations (40 turns vs 20).
I'd assume a real therapy session has far more "turns" than 20-40, and if model performance starts low and gets lower with longer length, it's reasonable to expect it would be worse than a human (who typically doesn't become increasingly unhinged the longer you talk to them).
> Betterhelp is a nightmare for clients and therapists alike. Their only mission seems to be in making as much money as possible for their shareholders. Otherwise they don't seem at all interested in actually helping anyone. Stay away from Betterhelp.
So taking it as a baseline would bias any experiment against human therapists.
Yes, it absolutely does matter. Look at what you write:
> I'd assume
> it's reasonable to expect
The whole reason to do a study is to actually study as opposed to assume and expect.
And for many of the kinds of people engaging in therapy with an LLM, BetterHelp is precisely where they are most likely to go due to its marketing, convenience, and price. It's where a ton of real therapy is happening today. Most people do not have a $300/hr. high-quality therapist nearby that is available and that they can afford. LLM's need to be compared, first, to the alternatives that are readily available.
And remember that all therapists on BetterHelp are licensed, with a master's or doctorate, and meet state board requirements. So I don't understand why that wouldn't be a perfectly reasonable baseline.
> I love how the top comment on that Reddit post is an affiliate link to an online therapy provider.
Posted 6 months after the post and all the rest of the comments. It's some kind of SEO manipulation. That reddit thread ranked highly in my Google search about Betterhelp being bad, so they're probably trying to piggyback on it.
I’m not against affiliate links. I’m just pro-disclosure, especially for something as important as therapy, and it seems like maybe you should mention that you make $150 for each person who signs up.
> And the article is not written in any kind of cautionary humanitarian approach, but rather from perspective of some kind of economic determinism? Have you ever thought that you would be compared to a gasoline engine and everyone would discuss this juxtaposition from purely economic perspective?
One of the many terrible things about software engineers is their tendency to think and speak as if they were some kind of aloof galaxy-brain, passively observing humanity from afar. I think that's at least partially the result of 1) identifying as an "intelligent person" and 2) computers and the internet allowing them to in large part become disconnected from the rest of humanity. I think they see that aloofness as being a "more intelligent" way to engage with the world, so they do it to act out their "intelligence."
I always thought intentionally applying an emotional distance was a strategy to help us see what's really happening, since allowing emotions to creep in causes us to reach conclusions we want (motivated reasoning) instead of conclusions that reflect reality. I find it a valuable way to think. Then there's always the fact that the people who control the world have no emotional attachment to you either. They see you as something closer to a horse than their kin. I imagine a healthy dose of self-dehumanization actually helps us understand the current trajectory of our future. And people tend to vastly overvalue our "humanity" anyway. I'm guessing the ones that displaced horses didn't give much of a fuck about what happened to horses.
I wish I knew what you were so I could say "one of the many terrible things about __" about you. Anyway, I think you have an unhealthy emotional attachment to your emotions.
> I wish I knew what you were so I could say "one of the many terrible things about __" about you.
I'm a software engineer, so I beat you to it.
> I always thought intentionally applying an emotional distance was a strategy to help us see what's really happening, since allowing emotions to creep in causes us to reach conclusions we want (motivated reasoning) instead of conclusions that reflect reality. I find it a valuable way to think.
And the problem is taking that too far, and doing it too much. It's a tactic "to help us see what's really happening," but it's wrong to stop there and forget things like values, interests, and morality.
> And people tend to vastly overvalue our "humanity" anyway.
WTF, man.
> I'm guessing the ones that displaced horses didn't give much of a fuck about what happened to horses.
Who cares what "the ones that displaced horses" thought? You're the horse in that scenario, and the horse cares. Another obnoxious software engineer problem is taking the wrong, often self-negating, perspective.
Yes, the robber who killed you to steal your stuff probably didn't mind you died. So I guess everything's good, then? No.
> Anyway, I think you have an unhealthy emotional attachment to your emotions.
Emotions aren't bad, they're healthy. But a rejection of them is probably a core screwed-up belief that leads to "aloof galaxy-brain, passively observing humanity from afar" syndrome.
There's probably a parallel to the kind of obliviousness that gets you the behavior in the Torment Nexus meme (Tech Company: "At long last, we have created the Torment Nexus from the classic sci-fi novel Don't Create The Torment Nexus."), i.e. Software Engineer: "At long last, I've purged myself of emotion and become perfectly logical, like Lt. Cmdr. Data from the classic sci-fi Logical Robot Data Wants to Be Human and Feel Emotions."
This reads more in the tone of Orwell, who used a muted emotional register to elicit a powerful emotional response from the reader as they realize the horror of what’s happening.
> Or, they just didn't study that. Or, the decision-makers don't contact customer support for themselves and so don't know how infuriatingly unhelpful AI ones are.
Occam's razor points to this as the reason.