sneilan1's comments | Hacker News

I love it! I'm at level 6 and brave enough to try. I'm in. Giving this a shot!

I'm at 5 at home and 3 or so at work (the work setup isn't great). I don't like wasting money, though.

This wins the internet today. Amazing work!!!


Published an edit today (the post is dated in November, but I've rewritten it 5x now) to my tutorial on using llama3.2:3b to generate fine-tuning data for training tinyllama1.1b: https://seanneilan.com/posts/fine-tuning-local-llm/ It took a while to figure out that when I made llama3.2 generate JSON, it didn't have enough horsepower to produce training data varied enough to successfully fine-tune tinyllama1.1b. Figured that out :) That's something you'd never learn with the bigger models, where every token costs something, even if it's only a little bit.
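
Roughly the shape of the data-generation step, in case anyone wants to try it (a sketch, not the exact script from the post; the topics, file name, and prompt are made up, and it assumes a local ollama server on the default port):

    # Sketch: ask a local llama3.2:3b (via ollama's HTTP API) for free-form
    # training examples. Forcing the 3B model to emit JSON directly tended to
    # collapse the variety of the data, so keep the prompt open-ended and
    # structure the output yourself afterwards.
    import json
    import requests

    OLLAMA_URL = "http://localhost:11434/api/generate"  # default ollama endpoint

    def generate_example(topic: str) -> str:
        prompt = f"Write one short question and a detailed answer about {topic}."
        resp = requests.post(
            OLLAMA_URL,
            json={"model": "llama3.2:3b", "prompt": prompt, "stream": False},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["response"]

    if __name__ == "__main__":
        topics = ["sorting algorithms", "HTTP caching", "SQL joins"]  # made up
        with open("training_data.jsonl", "w") as f:
            for topic in topics:
                row = {"topic": topic, "text": generate_example(topic)}
                f.write(json.dumps(row) + "\n")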



I'm so confused. Why would the United States care about people's health? It feels out of character for this administration given the times.


It's a marketing campaign and that's all it is. Zero substance on the website about what they're doing to make sure more people actually eat like this.


Have you found any models that work better for your use case?


To answer your question: no, but we haven't looked, because SAM is state of the art. We trained our own model with limited success (I'm no expert). We're now pursuing a classical computer vision approach: at some level, segmenting a monochrome image resembles, or actually is, an old-fashioned flood fill, very generally. This fantastic SAM model is maybe not the right fit for our application.
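
For context, the classical route we're exploring looks very roughly like this (a sketch, not our actual pipeline; the file name and area cutoff are placeholders):

    # Sketch: threshold a monochrome image and label connected regions with
    # OpenCV. Labelling connected components is essentially the flood fill
    # mentioned above, done for every region at once.
    import cv2
    import numpy as np

    img = cv2.imread("board.png", cv2.IMREAD_GRAYSCALE)  # placeholder input
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    num_labels, labels = cv2.connectedComponents(binary)

    # Each label is one filled region; drop tiny specks by area.
    for label in range(1, num_labels):
        area = int(np.sum(labels == label))
        if area > 50:  # arbitrary cutoff
            print(f"region {label}: {area} px")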

Edit: answered the question


This is a "classic" machine vision task that has traditionally been solved with non-learning algorithms. (That, in part, enabled the large-volume, zero-defect electronics production we have today.) There are several off-the-shelf commercial MV tools for it.

Deep-learning-based methods will absolutely have a place here in the future, but today's machines mostly use classic methods. The advantages are that the hardware is much cheaper and requires less electrical and thermal management. That is changing now with cheaper NPUs, but with machine lifetimes measured in decades, it will take a while.
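
To make "classic method" concrete, many of these systems boil down to comparing each unit against a registered golden image and thresholding the difference. A minimal sketch (illustrative only, not any particular commercial tool; file names and tolerances are made up):

    # Sketch: golden-template comparison. Assumes the two images are already
    # aligned/registered, which real systems spend a lot of effort on.
    import cv2

    golden = cv2.imread("golden.png", cv2.IMREAD_GRAYSCALE)   # known-good unit
    sample = cv2.imread("sample.png", cv2.IMREAD_GRAYSCALE)   # unit under test

    diff = cv2.absdiff(golden, sample)
    _, mask = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)  # arbitrary tolerance
    defect_pixels = cv2.countNonZero(mask)

    print("PASS" if defect_pixels < 100 else f"FAIL ({defect_pixels} differing px)")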


Way late response: the off-the-shelf stuff is very, very expensive, as one would expect for industrial solutions. I was tasked with building something from scratch (our own solution). It was quite the journey and was not successful. If anyone has pointers or tips in this department, I would truly love to hear them!


My initial thought on hearing about this was it being used for learning. It would be cool to be able to talk to an LLM about how a circuit works, what the different components are, etc.


Does anyone have any information on how much this will cost? Or is it one of those products where, if you have to ask, you can't afford it?


Lots of existing posts in this discussion talking about prices in various regions and configurations.


Yes, performance can be a big issue with Postgres, and vertical scaling can really put a damper on things when you take a major traffic hit. Using it in place of Kafka misunderstands one of the great uses of Kafka, which is helping to absorb traffic bursts: all of a sudden your Postgres server is overwhelmed, while a Kafka server would be fine.
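
For reference, the "Kafka on Postgres" setups people usually mean are some variant of a jobs table drained with SKIP LOCKED, along these lines (a sketch; table and column names are made up, and it assumes psycopg2):

    # Sketch: Postgres-as-queue. Workers pull jobs without blocking each other,
    # but every message is still a row write and delete on the same server
    # that handles your normal traffic, which is the problem under a burst.
    import psycopg2

    conn = psycopg2.connect("dbname=app")  # placeholder DSN

    def consume_one():
        with conn, conn.cursor() as cur:
            cur.execute(
                """
                DELETE FROM jobs
                WHERE id = (
                    SELECT id FROM jobs
                    ORDER BY created_at
                    FOR UPDATE SKIP LOCKED
                    LIMIT 1
                )
                RETURNING id, payload
                """
            )
            return cur.fetchone()  # None when the queue is empty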


It's worth noting that Oracle has solved this problem. It has horizontal multi-master scalability (not sharded) and a queue subsystem called TxEQ, which scales like Kafka does but also has the features of a normal MQ broker. You can dequeue a message into a transaction, update tables in that same transaction, then commit to remove the message from the queue permanently. You can dequeue by predicate, delay messages, use producer/consumer patterns, etc. It's quite flexible. The queues can be accessed via SQL stored procs or client driver APIs, and I think it now implements a Kafka-compatible API too.

If you rent a cloud DB, then it can scale elastically, which can make this cheaper than Postgres, believe it or not. Cloud databases are sold at the price the market will bear, not at the cost of inputs plus margin, so you can end up paying as much for Postgres as you would for an Oracle DB whilst getting far fewer features and less scalability.

Source: recently joined the DB team at Oracle, was surprised to learn how much it can do.


Agreed, but we really have to put a number on baseline traffic and max traffic burst in order to have a productive discussion. I would argue that the majority of use cases never need to be designed for a max traffic number that PG can't handle.


>And vertical scaling can really put a damper on things when you have a major traffic hit.

Wouldn't OrioleDB solve that issue though?


Not familiar with OrioleDB. I’ll look it up. May I ask how this helps? Just curious.


I'm starting to like MongoDB a lot more given the Python library mongomock. I find it wonderful to write tests that run my queries against Mongo in code before I deploy them. Yes, Mongo has a lot of quirks: you have to know AWS networking to set it up with your VPC so you don't get nailed with egress costs, it's not the same query patterns, some queries are harder, and you have to maintain your own schemas. But the ability to test Mongo code with mongomock without having to run your own Mongo server is SO VALUABLE. And yes, there are edge cases where mongomock doesn't support something, but the library is open source and pretty easy to modify. And it fails loudly, which is super helpful, so if something is not supported you'll know. Maybe you'll find a really nasty feature that's hard to implement, but then just use a repository pattern, like you would for testing Postgres code in your application.

https://github.com/mongomock/mongomock Extrapolating from my personal usage of this library to others, I'm starting to think that MongoDB's $25 billion valuation is partially based on this open source package :)
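
To make it concrete, this is the kind of test it enables with no server running (collection and field names are made up):

    # Sketch: a query test against mongomock's in-process fake of pymongo.
    import mongomock

    def test_find_unpaid_invoices():
        client = mongomock.MongoClient()   # no real mongod needed
        invoices = client.billing.invoices
        invoices.insert_many([
            {"customer": "a", "paid": False, "amount": 10},
            {"customer": "b", "paid": True, "amount": 25},
        ])
        unpaid = list(invoices.find({"paid": False}))
        assert len(unpaid) == 1
        assert unpaid[0]["customer"] == "a"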


Curious why you think the risk of edge cases from mocking is a worthwhile trade-off vs. the relatively low complexity of setting up a container to test against?


Because I can read the mongomock library and understand exactly what it's doing. And Mongo's aggregation pipelines are easier to model in code than SQL queries. Sure, it's possible to run into an edge case, but for a lot of general filtering and aggregation queries it's just fine.
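
For example, an aggregation pipeline test looks something like this (a sketch with made-up data; mongomock covers the common stages, though not every operator):

    # Sketch: testing a $match + $group pipeline in-process with mongomock.
    # If a stage isn't implemented, mongomock raises rather than silently
    # returning wrong results.
    import mongomock

    def test_spend_by_customer():
        orders = mongomock.MongoClient().shop.orders
        orders.insert_many([
            {"customer": "a", "total": 10},
            {"customer": "a", "total": 5},
            {"customer": "b", "total": 7},
        ])
        pipeline = [
            {"$match": {"total": {"$gt": 0}}},
            {"$group": {"_id": "$customer", "spend": {"$sum": "$total"}}},
        ]
        result = {doc["_id"]: doc["spend"] for doc in orders.aggregate(pipeline)}
        assert result == {"a": 15, "b": 7}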


The other unspoken aspect of this is that with agentic coding, the ability to have the AI also test queries quickly is very valuable. In a non-agentic coding setup, mongomock would not be as useful.


That might work for some.

I prefer not to start with a NoSQL database and then undertake an odyssey to make it into a relational database.


This is the way.


You can also do this with SQLite: running an in-memory SQLite database is lightning fast, and I don't think there are any edge cases. It obviously doesn't work for everything, but when SQLite is possible, it's great!
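
Something like this, for anyone who hasn't tried it (a sketch; the table and data are made up):

    # Sketch: an in-memory SQLite database that exists only for the test.
    import sqlite3

    def test_user_lookup():
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO users (name) VALUES (?)", ("ada",))
        row = conn.execute("SELECT name FROM users WHERE id = 1").fetchone()
        assert row == ("ada",)
        conn.close()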


True, but if you wind up using parts of Postgres that aren't supported by SQLite, then it's harder to use SQLite. I agree, however: if I were able to just use SQLite, I would do that instead. But I'm using a lot of Postgres extensions and fields that don't have direct mappings to SQLite.

Otherwise, SQLite :)


Or just use devcontainers and have an actual Postgres DB to test against? I've even done this on a Chromebook. This is a solved problem.


True, but then my tests take longer to run. I really like having very fast tests. And then my tests have to make local network calls to a Postgres server. I like my tests isolated.


They are isolated; your devcontainer config can live in your source repo. And you're not going to see significant latency from your loopback interface... If your test suite includes billions of queries, you may want to reassess.


You know what, you have a very good point. I'll give this another shot. Maybe it can be fast enough, and I can just isolate the ORM queries behind some kind of repository pattern so I'm not testing SQL queries over and over.
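
Roughly what I mean by the repository pattern (a sketch; names are made up): most tests talk to an in-memory implementation, and only the thin Postgres-backed one gets exercised against the devcontainer database.

    # Sketch: the app depends on the interface, not on Postgres directly.
    from typing import Optional, Protocol

    class UserRepository(Protocol):
        def get_name(self, user_id: int) -> Optional[str]: ...

    class InMemoryUserRepository:
        """Fast, dependency-free implementation for most unit tests."""
        def __init__(self) -> None:
            self._users = {}  # user_id -> name

        def add(self, user_id: int, name: str) -> None:
            self._users[user_id] = name

        def get_name(self, user_id: int) -> Optional[str]:
            return self._users.get(user_id)

    class PostgresUserRepository:
        """Thin SQL implementation, tested against the devcontainer DB."""
        def __init__(self, conn) -> None:
            self.conn = conn  # e.g. a psycopg2 connection

        def get_name(self, user_id: int) -> Optional[str]:
            with self.conn.cursor() as cur:
                cur.execute("SELECT name FROM users WHERE id = %s", (user_id,))
                row = cur.fetchone()
            return row[0] if row else None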


This is awesome! I'm already integrated into TillerHQ. Apart from price, do you have any differentiators that make auto-categorizing take less time? Tiller doesn't have any built-in AI tools for auto-categorization, so I had to roll my own.

Also, I'm extremely skeptical of your pricing. $5 one time seems too good to be true.


>$5 one time seems too good to be true.

OP isn't hosting anything and has 0 costs apart from domain registration and website hosting.


You can check how it works in the demo. Idk, charging more than $5 for a Google Sheet like this felt wrong.


Why is it too good to be true? It's just a premade spreadsheet.


It's just a premade spreadsheet indeed.

