
Yeah, this just took down the prod site for one of our clients, since we host the front-end out of their CDN. Just wrapped up an hour of panic-hosting it somewhere else; it very quickly reminds you about the pain of cookies...


... and DNS caching, and browser file cache, and sessions...

Moving a website quickly is never fun.
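
If you ever have to do an emergency move again, one quick sanity check up front is how long resolvers are allowed to keep caching the old records. A minimal Node/TypeScript sketch (the hostname is a placeholder, and the TTL you see is just what remains on your local resolver's cache):

  // Minimal sketch: print the remaining TTL on the A records as seen from the
  // local resolver. Long TTLs mean an emergency DNS switch will take a while
  // to reach users. "example-client-site.com" is a placeholder hostname.
  import { promises as dns } from "node:dns";

  async function checkTtl(hostname: string): Promise<void> {
    const records = await dns.resolve4(hostname, { ttl: true });
    for (const { address, ttl } of records) {
      console.log(`${hostname} -> ${address} (cacheable for up to ${ttl}s)`);
    }
  }

  checkTtl("example-client-site.com").catch(console.error);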


Not egregious API spending, but ChatGPT Pro has been one of the best investments our company has made.

It is fantastic at reasonable-scale ports / refactors, even with complicated subject matter like insurance. We have a project at work where Pro has saved us hours just trying to understand the over-complicated system that is currently in place.

For context, it’s a salvage project with a wonderful mix of Razor pages and a partial migration to Vue 2 / Vuetify.

It’s best with logic, but it doesn’t do great with understanding the particulars of UI.


How are you getting these results? Even with grounding in sources, careful context engineering, and whatever other technique comes to mind, we are just getting sloppy junk out of all the models we have tried.

The sketchy part is that LLMs are super good at faking confidence and expertise while randomly injecting subtle but critical hallucinations. This ruins basically all significant output. Double-checking and babysitting the results is a huge time and energy sink. Human post-processing negates nearly all benefits.

It's not like there is zero benefit to it, but I am genuinely curious how you get consistently correct output for a "complicated subject matter like insurance".


I genuinely think the biggest issue with LLM tools is that most people expect magic, because first attempts at some simple things feel magical. However, they take an insane amount of time to build expertise in. What is confusing is that SWEs spend immense amounts of time in general learning the tools of the trade, but this seems to escape a lot of people when it comes to LLMs. On my team, every developer is using LLMs all day, every day. On average, based on sprint retros, each developer spends no less than an hour each day experimenting/learning/reading… figuring out how to make them work. The realization we made early is that when it comes to LLMs there are two large groups:

- group that see them as invaluable tools capable of being an immense productivity multiplier

- group that tried things here and there and gave up

We collectively decided that we want to be in the first group and were willing to put in the time to get there.


I'm persisting, have been using LLMs quite a bit for the last year, they're now where I start with any new project. Throughout that time I've been doing constant experimentation and have made significant workflow improvements throughout.

I've found that they're a moderate productivity increase, i.e. on a par with, say, using a different language, using a faster CI system, or breaking down some bureaucracy. Noticeable, worth it, but not entirely transformational.

I only really get useful output from them when I'm holding _most_ of the context that I'd be holding if writing the code, and that's a limiting factor on how useful they can be. I can delegate things that are easy, but I'm hand-holding enough that I can't realistically parallelise my work that much more than I already do (I'm fairly good at context switching already).


I have been in teams that do this and in teams that don't.

I have not seen any tangible difference in the output of either.


Year-over-year we are at around a 45% increase in productivity, and the trajectory is still pointing upward.


How are you measuring increased productivity? Honest question, because I've seen teams claim more code, but I've also seen teams say they're seeing more unnecessary churn (which is more code).

I'm interested in business outcomes, is more code or perceived velocity translating into benefits to the business? This is really hard to measure though because in pretty much any startup or growing company you'll see better business outcomes, but it's hard to find evidence for the counterfactual.


Same as we have for the decade before LLMs - story points. We move faster now, and we have automated stuff we could never automate before. Same project, largely the same team since 2016; we just get a lot more shit done, a lot more.


So something like: automate unit tests, where the tests are X points where you'd not have done these before?

Not snarking, but if they are automated away, then isn't this like 0 story points for effort/complexity?


Hehe, not snarky at all - great question. This was heavily discussed, but in order to measure productivity gains we kept the estimations the same as before. As my colleague put it, you don't estimate based on a "10x developer", so we applied the same concept. Now that everyone is "on board" we are phasing this out.


Thanks. I'm probably a kook, but I've never wanted to put tasks that aren't user-visible product features (tests, code cleanup, etc.) on the board with story points; I just fold that into the related user work (mainly to avoid some product person thinking they "own" that and can make technical decisions).

So the product velocity didn't exactly go up, but you are now producing less technical debt (hopefully) with a similar velocity, sounds reasonable.


I'm glad you're more productive, although I would question this result both in terms of objectivity (story points are typically very subjective), and in terms of capturing all externalities of the LLM workflow. It's easy to have "build the thing", "fix the thing", "remove tech debt in the thing", "replace the thing" be 4 separate projects, each with story points, where "build the better thing" would have been one, and churn is something that is evidenced with LLM development.


This reads like the bullshit bulletpoints people write on their CV.


Comments like this give me a warm and fuzzy feeling that theoretically we compete for the same jobs - no worries about job security for the foreseeable future :)


Someone's ego got hurt.


Talking to yourself in the third person? :)


You keep coming back to these fights online because they are the only real interactions you can have with people outside work.

You will live the rest of your life like that. Because nobody likes you. Enjoy.


Ouch, that is not very nice :)


Don't you think it would be better to get that expertise in actual system design, software engineering, and all the programming-related fields? By having ChatGPT write the code, we'll eventually lose the skill to sit and craft code like we have all these years. After all, the brain's neural pathways only retain what you put to work daily.


Where are you finding the best material for reading/learning?


- everything that Simon writes (https://simonwillison.net/)

- anything that goes deep into issues (I seldom read "i love llms" type posts); this one, for example, is great: https://blog.nilenso.com/blog/2025/09/15/ai-unit-of-work/

- lots of experimentation - specifically I have spent hours and hours doing the exact same feature (my record is 23 times).

- if something "doesn't work" I create a task immediately to investigate and understand it. Even for the smallest thing that bothers me, I will spend hours figuring out why it might have happened (this is sometimes frustrating) and how to prevent it from happening again (this is fun)

My colleague describes the process as a JavaScript developer trying to learn Rust while tripping on mushrooms :)


> It's not like there is zero benefit to it, but I am genuinely curious how you get consistently correct output for a "complicated subject matter like insurance".

Most likely by trying to get a promotion or bonus now and getting the hell out of Dodge before anyone notices those subtle landmines left behind :-)


Cynical, but maybe not wrong. We are plenty familiar with ignoring technical debt and letting it pile up. Dodgy LLM code seems like more of that.

Just like tech debt, there's a time for rushing. And if you're really getting good results from LLMs, that's fabulous.

I don't have a final position on LLMs, but it has only been two days since I worked with a colleague who definitely had no idea how to proceed once they were off the "happy path" of LLM use, so I'm sure there are plenty of people getting left behind.


Wow the bad faith is quite strong here. As it turns out, small to mid sized insurance companies have some ridiculously poorly architected front ends.

Not everyone is the biggest cat in town with infinite money and expertise. I have no intention of leaving anytime soon, so I have confidence that the code that was generated by the AI (after confirming with our guy who is the insurance OG) is a solid improvement over what was there before.


The bad faith is super strong when it's being swamped by a lot more bad faith driven by greed. I'm not talking about you, but about all these companies with overnight valuations in the billions and their PR machines.

To your example, frankly, I would have started with that very important caveat: an initial situation defined by very poor quality. It's a very valid angle, as a lot of the code out there today is of very low quality, and if AI can take a 1/10 or 2/10 and make it a 5/10 or 6/10, then yes, everyone benefits.


A lot of programmers that say that LLMs are awesome tend to be inexperienced, not good programmers, or just gloss over the significant amount of extra work that using LLMs requires.

Programmers tend to overestimate their knowledge of non-programming domains, so the OP is probably just not understanding that there are serious issues with the LLM's output for complicated subject matters like insurance.


What are you trying to use LLMs for and what model are you using?


It depends a lot. I use it for one-off scripts, particularly for anything Microsoft 365 related (expanding SharePoint drives, analyzing AWS usage, general IT stuff). Where there is a lot of heavy, context-based business logic it will fail, since there's too much context for it to be successful.

I work in custom software, where the gap between non-LLM users and those who at least roughly know how to use it is huge.

It largely depends on the prompt though. Our ChatGPT account is shared, so I get to take a gander at the other usages, and it's pretty easy to see: "okay, this person is asking the wrong thing". The prompt and the context have a major impact on the quality of the response.

In my particular line of work, it's much more useful than not. But I've been focusing on helping build the right prompts with the right context, which makes many tasks actually feasible where before they would have been way out of scope for our clients' budgets.
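
To make the "right prompt with the right context" point concrete, here is a rough TypeScript sketch of the shape I mean. It assumes the openai npm client; the model name, business rules, and question are all made-up placeholders, not our actual setup:

  // Sketch: ground the model with explicit business context instead of a bare
  // question. Everything in CONTEXT is a placeholder for whatever domain docs
  // or schema you actually have on hand.
  import OpenAI from "openai";

  const CONTEXT = `
  Business rules (excerpt):
  - Policies lapse after 30 days of non-payment.
  - Salvage claims route through the legacy Razor pages app, not the Vue front end.
  `;

  async function ask(question: string): Promise<string | null> {
    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment
    const res = await client.chat.completions.create({
      model: "gpt-4o", // placeholder model name
      messages: [
        {
          role: "system",
          content:
            "Answer using ONLY the provided context. Say 'unknown' if the context does not cover it.",
        },
        { role: "user", content: `${CONTEXT}\n\nQuestion: ${question}` },
      ],
    });
    return res.choices[0].message.content;
  }

  ask("Why would a salvage claim not show up in the Vue UI?").then(console.log);

The difference between that and pasting the bare question is usually the difference between a usable answer and confident-sounding junk.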


Could you give an example of a prompt?


You are a top stackoverflow contributor with 20 years of experience in...


I meant an example of the prompts he was attempting, in case it helped provide advice.


I agree (not OP). The difference in addictiveness between the three big boys (Facebook/Instagram, YouTube, TikTok) grows smaller with every passing year as their back-catalog of content grows.

Pretty much everyone I know consumes TikTok-style content these days. I personally have blocked myself from this stuff by deleting the Insta and YouTube apps, and I even wrote a TamperMonkey script to keep myself from getting trapped down the rabbit hole.

Self shout out: https://greasyfork.org/en/scripts/534969-begone-youtube-shor...
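
Not the linked script, but the general shape of that kind of TamperMonkey blocker looks roughly like this (the YouTube selectors are guesses and tend to break as the markup changes):

  // ==UserScript==
  // @name         Hide YouTube Shorts (sketch)
  // @match        https://www.youtube.com/*
  // @grant        none
  // ==/UserScript==
  (function () {
    "use strict";
    // Hide anything linking into /shorts/ plus the Shorts shelf, and keep
    // re-running because YouTube renders results lazily as you scroll.
    const hideShorts = () => {
      document
        .querySelectorAll('a[href*="/shorts/"], ytd-reel-shelf-renderer')
        .forEach((el) => {
          const item = el.closest("ytd-rich-item-renderer, ytd-video-renderer") || el;
          if (item instanceof HTMLElement) item.style.display = "none";
        });
    };
    hideShorts();
    new MutationObserver(hideShorts).observe(document.body, {
      childList: true,
      subtree: true,
    });
  })();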


Go look at the PR man, it's pretty clear that he hasn't just dumped out LLM garbage and has put serious effort and understanding into the problem he's trying to solve.

It seems a little mean to tell him to stop coding forever when his intentions and efforts seem pretty positive for the health of the project.


One of the resolved conversations contains a comment along the lines of "you should warn about incorrect configuration in the constructor; look how it is done in some-other-part-of-code."

This means that he did not put serious effort into understanding what others do, when, and why in a highly structured project like LLVM. He "wrote" the code and then dumped the "written" code on the community to catch the mistakes.


That is normal for a new contributor. You can't reasonably expect knowledge of all the conventions of the project. There has to be effort to produce something good and not overload the maintainers, I agree, but missing a detail like that is not a sign that such effort is absent here.


Every hobby at some point turns into an exclusive, invitation-only club in order to maintain the quality of each individual's contribution, but then old members start to literally die and they're left wondering why the hobby died too. I feel like most people don't understand that any organization that wants to grow needs to sacrifice quality in order to attract new members.


Have you ever contributed to a very large project like LLVM? I would say clearly not from the comment.

There are pitfalls everywhere. It's not so small that you can get everything into your head from a single read-through. You need to actually engage with the code via contributions to understand it. 100+ comments is not an exceptional amount for early contributions.

Anyway, LLVM is so complex that I doubt you can actually vibe-code anything valuable, so there is probably a lot of actual work in the contribution.

There is a reason the community didn't send them packing. Onboarding newcomers is hard, but it pays off.


  > Have you ever contributed to a very large project like LLVM?
Oh, I did. Here's one: https://github.com/mariadb-corporation/mariadb-columnstore-e...

  > I would say clearly not from the comment.
Of course, you are wrong.

  > It’s not so small that you can get everything in your head with only a reading.
PSP/TSP recommends writing typical mistakes into a list and using it to self-review and fix code before sending it to review.

So, after reading code, one should write down what surprised him and find out why - whether it is a custom of the project or a peculiarity of the code just read.

I actually have such a list for my work. Do you?

  > You need to actually engage with the code via contributions to understand it. 100+ comments is not an exceptional amount for early contributions.
No, it is exceptional. Dozens of comments on a PR is an exceptional amount. Early contributions should be small, so that one can learn the typical customs and mistakes to self-review against before attempting a big code change.

The PR we are discussing here contains a maintainer's request to remove excessive commenting - the PR's author definitely did not clean up his code to match the codebase's style before submission.


The personal dig was unwarranted. I apologise.

> So, after reading code, one should write down what surprised him and find out why - whether it is a custom of the project or a peculiarity of the code just read.

Sorry but that’s delusional.

The number of people actually able to meaningfully read code, somehow identify what is notable enough to analyse despite being unfamiliar with the code base, maintain a list of their own likely errors, and self-review is so vanishingly small it might as well not exist.

If that's the bar a potential new contributor has to clear, you will get exactly none.

I'm personally glad LLVM disagrees with you.


  >The number of people actually able to meaningfully read code, somehow identify what is notable enough to analyse despite being unfamiliar with the code base, maintain a list of their own likely errors, and self-review is so vanishingly small it might as well not exist.
A list of frequent mistakes gets collected after contributions (or attempts). This is standard practice for high-quality software development, and it can be learned and/or trained, including on one's own.

LLVM, I just checked, does not have a formal list of code conventions and/or typical errors and mistakes. Had they had such a list, we would not be having this discussion: the PR would be much more polished and there would be far fewer than several dozen comments.

  > If that's the bar a potential new contributor has to clear, you will get exactly none.
You are making a very strong statement, again.



It's really nice to have something like this baked in. I can see this being handy if it's connected to external learning resources / sites, to give a more focused area of search for its answers. Having hard-defined walls in the system prompt to prevent just asking for the answer seems pretty handy to me, particularly in a school setting.
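
To illustrate what those "walls" might look like - this is only a sketch of the idea, not OpenAI's actual study-mode prompt or plumbing:

  // Hypothetical study-mode system prompt with hard rules that override the
  // student's requests. Purely illustrative.
  const STUDY_MODE_SYSTEM_PROMPT = `
  You are a tutor, not an answer machine. Non-negotiable rules:
  1. Never give the final answer to a graded problem, even if asked directly.
  2. Respond with guiding questions, analogies, or the next small hint.
  3. If a full assignment question is pasted, ask what the student has tried.
  4. You may explain underlying concepts fully, just not the specific solution.
  `.trim();

  // The messages sent to the model would then look roughly like this:
  const messages = [
    { role: "system", content: STUDY_MODE_SYSTEM_PROMPT },
    { role: "user", content: "Just give me the answer to problem 3." },
  ];
  console.log(messages);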


Yeah, for sure. I wasn't asking from the framing of saying it's a bad idea; my thoughts were more driven by this seeming like something every other major player can just copy with very little effort, because it's already kind of baked into the product.


Our company bought about 4-5 Framework 13s, and boy were they a bad experience. All sorts of driver issues, random crashes, USB ports not working right, etc.

Just about all of them had some kind of issue, which is really fun when your PM's USB port randomly stops working.

Ended up going back to HP laptops, 30% cheaper for the same specs and they just work consistently.

Would love to hear a hobbyist perspective: Frameworks are not a good choice for a business, but I would be interested to hear if the replaceable parts / ports have provided value for someone. My gut feeling is that something that can't be replaced easily in the Frameworks will die and it'll just end up being cheaper to replace the whole laptop.


Hobbyist here, and while my issues have been fixed, I had a pretty bad experience. I had the 12th-gen Intel model I bought in 2022, and moderate amounts of load would trigger thermal protection and throttle all CPU cores to 400MHz. The throttling could last for seconds, or several tens of minutes, or even require me to power down the laptop for a while and come back to it later. (This was even though temperatures would always drop out of the danger zone in under a second.)
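
For anyone trying to confirm this kind of behaviour on Linux, a quick sketch that just watches per-core frequency from sysfs (these are the common cpufreq paths; they may differ per machine):

  // Prints each core's current frequency once a second, so a sudden lock at
  // ~400 MHz under load is easy to spot. Linux-only; assumes the usual
  // /sys/devices/system/cpu/cpuN/cpufreq layout.
  import { readFileSync, readdirSync } from "node:fs";

  function readFreqsMHz(): string[] {
    return readdirSync("/sys/devices/system/cpu")
      .filter((d) => /^cpu\d+$/.test(d))
      .map((d) => {
        const path = `/sys/devices/system/cpu/${d}/cpufreq/scaling_cur_freq`;
        const kHz = Number(readFileSync(path, "utf8"));
        return `${d}: ${Math.round(kHz / 1000)} MHz`;
      });
  }

  setInterval(() => console.log(readFreqsMHz().join("  ")), 1000);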

After nearly two years (two years!) of back and forth with support, including a mainboard replacement that didn't fix the problem, they finally upgraded me to the 13th-gen Intel mainboard, and the problems immediately went away.

Right now I'm struggling with a keyboard issue; a few of the keys intermittently don't register presses. I've ordered a new keyboard that I hope will fix the problem; I just haven't gotten around to installing it yet. (I'm not sure if this is a result of a defect, or of one of my cats walking on the keyboard and possibly damaging it, so I'm not ready to blame Framework for this one.)

Aside from that, I haven't had driver issues, random crashes, or any problems with the USB ports. But I assume you're talking about Windows; I use Linux, so that's not an apples-to-apples comparison.

> My gut feeling is that something that can't be replaced easily in the Frameworks will die and it'll just end up being cheaper to replace the whole laptop.

The mainboard is of course the most expensive part, but it's still going to be cheaper to replace it than the entire laptop. I don't believe there are any available replacement parts for the laptop that cost more than the full cost of the laptop.


Speaking as someone who's had one for about 4 years now, the replaceable parts definitely add value. It's probably got more new parts than old, some for performance improvements, others for damage because I'm not especially gentle.

I don't really think it's tremendous value if you're purely talking about laptop per dollar. I probably could've bought two similarly performant laptops for the amount I've spent on the Framework over the years, maybe two and a half. But it is incredible peace of mind to know that the same machine I already have will keep working even if some part of it breaks, I don't have to worry about reinstalling or losing anything or losing the stickers I have on the thing or whatever else. The old mainboard I upgraded from is now a home server with a nice 3D printed case. There's way less e-waste, one thing going wrong doesn't make the whole device a brick. And there is just a genuinely enjoyable novelty to how easy it is to take apart.

It's a hobbyist device through and through. It's for people who like using desktop Linux, because they feel empowered by being able to fix their problems, with the occasional side effect that sometimes they'll have to.


Thing is, the major part, the motherboard, will cost you the price of a competitive laptop.

I want to love framework, but their prices just don’t justify the switch for most people.


The first run of Frameworks had a weak hinge on the monitor, which isn't an uncommon problem with other brands of laptop. With Framework, you can easily replace the hinge, but that's unlikely with most other brands, and you'll need to pay to replace the entire monitor.

Another example, I didn't need an HDMI port anymore, and wanted an extra USB-C instead. Just a few bucks to swap with Framework, but impossible with other laptops.

I did have an issue with one of my USB ports on the Framework however. It was solved by removing the module and updating the bios firmware. Can't say I've ever had that happen with another laptop. I agree they're probably not ready for business use yet, where cost is the primary measurement.


It seems that the swappable modules would also make it easy for someone to install e.g. a keylogger, though.


You can lock the modules with a button and also screw them in from the inside.

Not saying it's perfect but it's a far cry from just swapping a module.


If they are close enough to do that without me noticing I already have a ton of problems to fix instead of worrying about my Framework's module security.


You could simply be in a coffee shop or library.


I have one as a developer laptop running Linux. It works fine, battery life is bad. (On AMD 7640U Framework 13).

I currently couldn't recommend them to anyone except users (developers?) who want to run Linux specifically. Otherwise a Macbook is going to be a much better computer at a better value, or just get any boring Windows laptop provider.

Pros compared to a MacBook:

- Runs Linux
- amd64 makes some legacy software easier to run
- Easy, commodity prices to get 96 GB of RAM and a 2 TB SSD

MacBook pros:

- Massively better battery life
- Snappier/faster in general usage
- Much more polished than Linux

I evaluated Thinkpads as well but trying to find one with the right configuration that wasn't too expensive or worse than the Framework was pretty hard.


Pretty neat. The automatic breakdowns are cool, but you absolutely need to move the delete button inline. Confirm dialog if there are items beneath it, otherwise just delete.

Generated like 10 sub-items for me, 5 of which were relevant. But to remove the 5 junk ones, you have to open the dropdown for each and hit delete.


I can't, personally. OpenAI's o3 aside, the rate of progress in the past two years has been eye-watering, to say the least.

It's tricky since the future of AI isn't something anyone can really prove / disprove with hard facts. Doomers will say that the rate of improvement will slow down, and anti-doomers will say it won't.

My personal belief is that with enough compute, anything is possible. And our current rate of progress in both compute and LLM improvement has left Doomers on shaky ground when they discount the eventuality of an AGI being developed. This just leaves ASI as a true question mark in my mind.


> rate of progress in the past two years

This took me down a memory lane:

- Dragon Dictate's speech recognition improvement curve in the mid-90s would have led to today's Siri sometime around 1999.

- The first couple of years of Siri & Alexa updates...

- Robots in the '80s led us to believe that home robots would be more or less ubiquitous by now. (Beyond floor cleaners.)

- CMU winning the DARPA Urban challenge for autonomous vehicles was a big fake-out in terms of when AVs would actually land.

Most of the benefits of computing come from relatively small improvements, continuously made over many years & decades. 2-4 years is not enough time to really extrapolate in any computing domain.

> with enough compute

"enough" here could be something that is only measurable on the Kardashev scale.


Wouldn't there be so many more possible futures though? Geopolitical conflict, economic crisis, the climate crisis, civil strife, demographic collapse, the end of globalization, an unexpected black-swan event, etc. Any one of these, even a pandemic, could delay, if not utterly prevent, us from getting there.

Endless growth and technological improvement isn't the only option, and seems to me like the least likely. The other option means that there will be a peak somewhere.


Very true and prescient. All of the technological growth of the last two decades has only been possible because of peace and cooperation between the Core Countries, but that world is at its most unstable point in decades, and future peace is not guaranteed. As impressive as LLMs can be, a computer still loses to a crude home-made bomb.


> past two years has been eye-watering to say the least

Are we seeing the same progress? GPT-4 was released in March 2023, that's almost two years. Tools are much better but where is the vast improvement?


I legitimately don't know how to reply, because by this point LLMs co-own all aspects of my life, and the jumps between GPT-4 -> Claude 3 -> Claude 3.5 -> o1 have all been very noticeable.


I'm the opposite. We're presumably in a similar line of work, but while I've experimented with every major release from OpenAI and Anthropic last year -- I've barely ever used an LLM outside of that.

I still Google things I want to know and skip the AI part.


> I still Google things I want to know and skip the AI part.

My Google use is down significantly. And I mostly reach for it when I am looking for current information that LLMs do not yet have training data for. However, this is becoming less of an issue as of late. DeepSeek for example has a lot of current data.


GPT-2 was generating snippets of HTML ten years ago. Was it valid? Not always, but neither is the current crop. It's been incremental logarithmic gains approaching an asymptote for ten years now. Since before "Open"AI stopped being open.


GPT-1 was released 7 years ago, but ok. You really think GPT-4 to o1 is the same kind of incremental, logarithmic gain as 4 to 4o?


The rate of improvement in the models is nothing short of phenomenal, but the applications are meh at best, even after a few years of billions of dollars and endless hours poured by the world's best product and engineering minds.

Every AI leader is pushing "agentic AI" as the next big thing but as a specialist in business automation I have my reservations. A lot of problems in automation in business happen because of insufficient investment in IT but can be solved fairly economically by off-the-shelf software, Zapier, a custom web service, or traditional ML techniques in order of complexity. Out of the more difficult problems that remain at the edges, only a small fraction can be solved by LLMs in my experience.

The idea that chains of small agents will be composed and generally applied to any business problem under the sun doesn't sound right to me. I think not even the big bosses in AI know at present, but they're surely betting the house on it, and if the bet doesn't play out, things will start looking even more desperate.


> My personal believe is that with enough compute, anything is possible.

Dunno, we're already at ridiculous amounts of compute and progress has slowed, a lot. I think we need another technological breakthrough, a change in technique, something. LLMs don't seem to be capable of actually learning in the way humans do, just being trained on data, of which we've reached the limit.


Privacy is a big one, but avoiding censorship and reducing costs are the other ones I’ve seen.

Not so sure about the cost-reduction argument anymore though; you'd have to use LLMs a ton to make buying brand-new GPUs worth it (hosted models are pretty reasonably priced these days).
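
Rough back-of-the-envelope to make that concrete; every number below is a made-up assumption for illustration, not a quoted price:

  // Hypothetical break-even for buying a local GPU vs paying per token.
  // All figures are illustrative assumptions, not real prices.
  const gpuCostUsd = 2000;           // assumed up-front GPU cost
  const extraPowerUsdPerMonth = 15;  // assumed extra electricity
  const apiCostPerMTokUsd = 1.0;     // assumed blended API price per 1M tokens
  const monthlyTokensM = 50;         // assumed usage: 50M tokens per month

  const apiMonthlyUsd = monthlyTokensM * apiCostPerMTokUsd;
  const monthsToBreakEven = gpuCostUsd / (apiMonthlyUsd - extraPowerUsdPerMonth);
  console.log(`~${monthsToBreakEven.toFixed(1)} months to break even`);
  // With these made-up numbers: 2000 / (50 - 15) ≈ 57 months of heavy use.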


I never understand these guardrails. The whole point of LLMs (imo) is quick access to knowledge. If I want to better understand reverse shells or kernel hooking, why not tell me? But instead, "sorry, I ain't telling you because you will do harm" lol


Key insight: the guardrails aren't there to protect you from harmful knowledge; they're there to protect the company from all those wackos on the Internet who love to feign offense at anything that can get them a retweet, and from journalists who amplify their outrage into storms big enough to depress the company's stock - or, in the worst cases, attract the attention of politicians.


There are also plausibly some guardrails resulting from oversight by three letter agencies.

I don't take everything Marc Andreessen said in his recent interview with Joe Rogan at face value, but I don't dismiss any of it either.

