spoaceman7777's comments

"Most of these students are claiming mental health conditions and learning disabilities, like anxiety, depression, and ADHD."

This is clickbait. There are diseases and disorders, and we have medicine to treat them so that people can be functional in society (particularly work and school).

Nothingburger.


Yeah, this is pure clickbait. And it's barely even new "news" anyway. Apple has been actively testing Intel's 18A process, and has been in talks with Samsung as well.

There are only three advanced chip manufacturers in the world (TSMC, Samsung, and Intel), so of course Apple has been in talks with all of them.

The whole story is based on a tweet from an analyst anyway. "Company that designs its own chips might use one of the only three advanced chip manufacturers in the world, based on my hunch." Hmmmm... Sounds more like someone just loaded up on Intel calls to me. (Not that the underlying situation isn't real and obvious.)


So, basically, this is just SIM card functionality for the age of eSIMs?

A lot of people in this thread seem unaware of what SIM cards actually are and do.


Hmm. I think you may be confusing sycophancy with simply following directions.

Sycophancy is a behavior. Your complaint seems more about social dynamics and whether LLMs have some kind of internal world.


Even "simply following directions" is something the chatbot will do, that a real human would not -- and that interaction with that real human is important for human development.


A solution I haven't yet seen in this thread is to buy multiple drives and sacrifice the capacity of one of them to maintain single parity via a raidz1 configuration with zfs. (raidz2 or raidz3 are likely better, as you can then also guard against full drive failures on top of scattered corruption, but you'd be giving up two or three drives' worth of capacity for parity.)

zfs's parity-raid implementations also auto-repair corrupted data whenever it's read, and the scrub utility provides an additional tool for finding and correcting such issues proactively.

This applies to both HDDs and SSDs. So, a good option for just about any archival use case.
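
For intuition, single parity is conceptually just XOR across the drives. A toy sketch in Python (made-up block values; raidz's actual on-disk layout is more involved, but the recovery math is the same idea):

    # One same-offset block from each of 3 drives.
    blocks = [0b10110010, 0b01101100, 0b11100001]

    parity = 0
    for b in blocks:
        parity ^= b          # parity = XOR of all the data blocks

    lost = 1                 # say drive 1's copy of this block rots away
    survivors = [b for i, b in enumerate(blocks) if i != lost]

    rebuilt = parity
    for b in survivors:
        rebuilt ^= b         # parity XOR survivors = the missing block

    assert rebuilt == blocks[lost]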


This is about drives that are not plugged in. Are you saying parity would let you simply detect that the data had gone bad? Increasing the number of drives would increase the decay rate: more possibilities for a first one to expire. And if your parity drive expired first, you would think you had errors when you didn't yet.


No, I'm talking about parity raid (raidz1/z2/z3, or, more familiarly, raid 5 and 6).

In a raidz1, you give up one of the n drives' worth of space to store parity data. As long as you don't lose the same piece of data on more than one drive, you can reconstruct it when the pool is brought back online.

And since the odds of losing the same piece of data on more than one drive are much lower than the odds of losing any piece of data at all, it's safer. Up it to two drives' worth of parity, and you can even suffer a complete drive failure in addition to sporadic data loss.
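
Back-of-the-envelope, with a made-up per-block loss rate: if each drive independently loses a given block with probability p, losing it somewhere is roughly n*p, while losing that same block on two or more drives (which is what defeats single parity) is on the order of C(n,2)*p^2, ignoring higher-order terms:

    from math import comb

    p = 1e-6  # assumed chance a given block rots on one drive while cold
    n = 4     # drives in the pool

    any_loss = 1 - (1 - p) ** n      # block lost on at least one drive: ~4e-6
    double_loss = comb(n, 2) * p**2  # same block lost on 2+ drives: ~6e-12

    print(f"any loss: {any_loss:.1e}  double loss: {double_loss:.1e}")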


But this article is about unpowered SSDs -- those that are not in service but are being kept for the data on them, as a type of backup copy. They could represent an entire raid set.

How would this work? Wouldn't all these drives start losing data at roughly the same time?


Yes, but different pieces of data. The stored parity allows you to reconstruct any piece of data as long as it is only lost on one of the drives (in the single parity scenario).

The odds of losing the same piece of data on multiple drives are much lower than the odds of losing any piece of data at all.


But the data is not disappearing, it's corrupted - so how do you know which bits are good and which are not?


checksums

Where are you storing your checksums? If the answer is 'on the corrupted drive' then what makes you think that the checksums are correct?
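
For what it's worth, zfs's answer is to store each block's checksum in the block's parent (the block pointer), Merkle-tree style, rather than next to the data, so a rotted block can't take its own checksum down with it. A toy sketch of how a separately stored checksum plus parity identifies and repairs the bad copy (illustrative layout, not zfs's real format, with crc32 standing in for fletcher4/sha256):

    import zlib

    def checksum(block: bytes) -> int:
        return zlib.crc32(block)

    # Same-offset block from 3 data drives, plus their XOR parity:
    drives = [b"alpha---", b"bravo---", b"charlie-"]
    parity = bytes(a ^ b ^ c for a, b, c in zip(*drives))

    # Checksums are kept apart from the data they cover:
    sums = [checksum(d) for d in drives]

    drives[1] = b"bravo--X"  # bit rot silently corrupts drive 1's copy

    for i, d in enumerate(drives):
        if checksum(d) != sums[i]:   # the checksum says *which* copy is bad
            good = [x for j, x in enumerate(drives) if j != i]
            drives[i] = bytes(p ^ a ^ b for p, a, b in zip(parity, *good))
            assert checksum(drives[i]) == sums[i]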

?? It's free, and it protects you from all sorts of nasty things.

I can't think of any reason not to use Cloudflare. It's _dead easy_ to set up too.

I can't help but think that the author doesn't understand what Cloudflare actually does, or just has a poor understanding of what goes on on the internet. Probably a bit of being in a bad mood about Cloudflare being down, too.


The biggest argument against using it is that if everyone uses it, there is no Internet but Cloudflare; and so Cloudflare is the decider and arbiter of Internet access for all.


I get these arguments and I see the appeal. But if this becomes the primary reason to use them, the web ends up massively centralized. Everything running through them doesn't seem that smart to me.

But of course I understand that for most users this isn't really a concern, and the benefits that cf provides matter much more to them than the centralization problem.


Yeah, for me this is the main reason. I don't need it (even though I self-host many websites, some getting 100k requests/day, which is reasonable for a homelab). But most importantly, I don't want all the traffic to my websites being MITMed by a company, even more so when it's a foreign one.


Many also put their personal stuff behind CloudFlare because it's a good way to learn a tool that they might need professionally later.

I'm all for decentralizing and I don't feel the need for CloudFlare personally, but yes, arguing that people really shouldn't be doing it, period, requires some good technical reason or a more convincing political stance.


If you use Cloudflare, your website will be inaccessible by well over half of German connections in the evening.


I instantly knew you were talking about Deutsche Telekom and their shit-tier transits.


But your site will be down for 3 hours once every 3 years!!1


Wow. They must have had some major breakthrough. Those scores are truly insane. O_O

Models have begun to fairly thoroughly saturate "knowledge" benchmarks and the like, but there are still considerable bumps there.

But the _big news_, and the demonstration of their achievement here, is the incredible scores they've racked up on what's necessary for agentic AI to become widely deployable: t2-bench, visual comprehension, computer use, Vending-Bench. The sorts of things that are necessary for AI to move beyond an auto-researching tool and into the realm where it can actually handle complex tasks in the way businesses need in order to reap rewards from deploying AI tech.

Will be very interesting to see what papers are published as a result of this, as they have _clearly_ tapped into some new avenues for training models.

And here I was, all wowed, after playing with Grok 4.1 for the past few hours! xD


The problem is that we know the benchmarks in advance -- Humanity's Last Exam, for example. It's way easier to optimize your model when you have seen the questions before.


From https://lastexam.ai/: "The dataset consists of 2,500 challenging questions across over a hundred subjects. We publicly release these questions, while maintaining a private test set of held out questions to assess model overfitting." [emphasis mine]

While the private questions don't seem to be included in the published results, HLE can presumably flag any LLM that appears to have gamed its scores, based on differential performance on the private questions. Since they haven't flagged anyone yet, I think the scores are relatively trustworthy.
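
The check is essentially differential: a model that memorized the released questions should score noticeably better on them than on the held-out ones. A hypothetical sketch of the kind of flagging this enables (threshold and scores made up):

    def looks_overfit(public_acc: float, private_acc: float,
                      gap_threshold: float = 0.05) -> bool:
        # A big public-over-private gap suggests the model was trained on
        # (or tuned against) the released questions.
        return (public_acc - private_acc) > gap_threshold

    print(looks_overfit(0.38, 0.35))  # False: gap within noise
    print(looks_overfit(0.38, 0.22))  # True: suspiciously large gap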


The jump in ARC-AGI and MathArena suggests Google has solved the data scarcity problem for reasoning, maybe with synthetic data self-play??

This was the primary bottleneck preventing models from tackling novel scientific problems they haven't seen before.

If Gemini 3 Pro has transcended "reading the internet" (knowledge saturation), and made huge progress in "thinking about the internet" (reasoning scaling), then this is a really big deal.


How do they hold back questions in practice though? These are hosted models. To ask the question is to reveal it to the model team.


They pinky swear not to store and use the prompts and data lol


A legally binding pinky swear LOL


With fine print somewhere on page 67 saying that there are exceptions.


Who needs fine print when there is an SRE with access to the servers who is friends with a research director who gets paid more if the score goes up?


You have to trust that the LLM provider isn't copying the questions when Humanity's Last Exam runs the test.


There are only eleventy trillion dollars shifting around based on the results, so nobody has any reason to lie.


Seems difficult to believe, considering the number of people who prepared this dataset who also work(ed) at, or hold shares in, Google or OpenAI, etc.


So everybody is cheating, in your mind? We can't trust anything? How about a more balanced take: there's certainly some progress, and while the benchmark results most likely don't reflect real-world performance, the progress is continuous.


This. A lot of boosters point to benchmarks as justification for their claims, but any gamer who has spent time in the benchmark trenches knows full well that vendors game known tests for better scores, and that those scores aren't necessarily indicative of superior performance. There's not a doubt in my mind that AI companies are doing the same.


I don't think any of these companies are so reductive and short-sighted as to try to game the system. However, Goodhart's Law comes into play. I am sure they have their own metrics that are much more detailed than these benchmarks, but the fact remains that LLMs will be tuned according to whatever is deterministically measurable.


Shouldn't we expect that all of the companies are doing this optimization, though? So we're back to a level playing field.


It's the other way around too: HLE questions were selected adversarially to reduce the scores. I'd guess that even if the questions were never released and new training data was introduced, the scores would improve.


not possible on ARC-AGI, AFAIK


SWE-Bench Verified | 76.2% | 59.6% | 77.2% | 76.3% is actually insane.


Anthropomorphic found their corner and are standing strong there.


Wezterm was great, but I had to stop using it recently because it keeps crashing immediately (across two different computers) on CachyOS/Arch :/

It's just broken on KDE permanently, I guess :/ There have been tickets about it, and there is an AUR repo with a patch that used to fix it... but :/

Was already worried about the project given that it hasn't seen a new release in quite a long time. Got the feeling that the maintainer has mostly moved on.


It's been almost a year for me now, but I had also stopped using it due to crashes. And since it shares one process across all your windows, a crash would close all my windows, which just drove me nuts after a while.

Though for me, I only wanted the absolute bare minimum, which Alacritty covers. I was sad to lose ligatures, but Alacritty is stable and very fast.


Ghostty replaced wezterm for me, plus it supports ligatures and is just as fast as alacritty.


I was really excited for it -- I didn't know about it before, and it seems to scratch every terminal itch I've got.

Except it didn't run on my Windows VDI because "The OpenGL implementation is too old to work with glium". There's a config workaround here, though: https://github.com/wezterm/wezterm/issues/1813. I don't know if it's this setting or the implementation per se that makes rendering slow, but it's unusable for me. I can't wait a couple of seconds for every keystroke inside mc; rendering text is supposed to be lightning fast.


> Was already worried about the project given that it hasn't seen a new release in quite a long time

IIRC the maintainer was moving countries. Not saying that's the main or only reason, but it is likely a factor.


I'm also suffering from this on macOS. I can get through an entire work day, but when I close the laptop for the night and come back the next day, wez has crashed and I don't know why.


Why do people care? It's nice; I've been using it since Nightly. And you have to actually hook it up to a service for it to do anything.

The toggle for the popup is in (as you might expect) the settings hamburger menu in the AI Panel. There's even a remove button! Lots of new things have been added to browsers over the years, and these AI features are becoming incredibly popular as more users recognize the utility (and what they actually are).

This just seems like some more anti-AI hysteria.


> Why do people care?

It's clutter for a feature that I'm not going to use. I'm not upset it's there for those who want it, but it's also nice to be able to get rid of it.


I don't store passwords in Firefox, nor do I use "Save page as". I have never used the "Report broken site" feature, and never activated "Troubleshooting Mode". I have never needed to configure network settings in my browser, and so on... As far as this discussion is concerned, all of these are bloat because they are not used. Seems like a strange yardstick to keep when it cannot be properly applied, no?


I think it'd be cool to be able to remove those from the UI if you're not using them, yeah. For me personally, I find the Firefox UI is pretty streamlined, so suddenly seeing new right-click menu elements that I'm not going to use was a bit jarring and I'm glad there's a setting to remove them.


What in the world is this doing at the top of hackernews...

