michaelmior · 2026-02-11T08:40:12 1770799212

I have a side project of a new type of JSON database. Schema discovery is performed on the fly and that schema is then used to compress the stored data. This eliminates the need to use short key names to save space in addition to reducing overall storage requirements.
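To make the idea concrete, here is a minimal sketch of schema-based compression for flat JSON documents. It is not the implementation from the project mentioned above. The function names (`discover_schema`, `encode`, `decode`) and the bitmask row layout are assumptions made for illustration. Keys are collected once into a shared schema, and each document is stored as a presence bitmask followed by its values, so long key names cost nothing per record.

```python
import json


def discover_schema(docs):
    """Collect every top-level key seen across the documents, in first-seen order."""
    schema, seen = [], set()
    for doc in docs:
        for key in doc:
            if key not in seen:
                seen.add(key)
                schema.append(key)
    return schema


def encode(doc, schema):
    """Store a document as [presence_bitmask, value, value, ...] in schema order."""
    mask, values = 0, []
    for i, key in enumerate(schema):
        if key in doc:
            mask |= 1 << i  # bit i set means schema[i] is present
            values.append(doc[key])
    return [mask] + values


def decode(row, schema):
    """Rebuild the original document from a row and the shared schema."""
    mask, values = row[0], iter(row[1:])
    return {key: next(values) for i, key in enumerate(schema) if mask & (1 << i)}


# Hypothetical sample data with deliberately long key names.
docs = [
    {"customer_name": "Ada", "customer_email": "ada@example.com"},
    {"customer_name": "Bob", "signup_year": 2024},
]
schema = discover_schema(docs)
rows = [encode(doc, schema) for doc in docs]
restored = [decode(row, schema) for row in rows]
raw_size = len(json.dumps(docs))
packed_size = len(json.dumps(rows))
```

The schema is stored once, so the saving grows with the number of documents that share keys. A real system would also need to handle nested objects and schema changes over time, which this sketch ignores.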

michaelmior · 2026-02-01T11:50:35 1769946635

> First, it is hard, especially in at least somewhat portable manner.

I'm curious what portability concerns you've run into with JSON serialization. Unless you need to deal with binary data for some reason, I don't immediately see an issue.

> Such representation, which is by the way specified to be executable bidirectionally (roll back capabilities), is a full blown program

Of course this depends on the complexity of your problem, but I'd imagine this could be as simple as a few configuration flags for some problems. You have a function to execute the process that takes the configuration and a function to roll back that takes the same configuration. This does tie the representation very closely to the program itself so it doesn't work if you want to be able to change the program and have previously generated "plans" continue to work.
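A minimal sketch of that simple case, assuming every step can be inverted from the configuration alone. The `Plan` fields, the `system` dictionary, and the `save`/`load` helpers are hypothetical names chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical plan: a couple of flags that serialize trivially to JSON."""
    add_workers: int
    create_queue: str


def execute(plan, system):
    """Apply the plan to the (in-memory, illustrative) system state."""
    system["workers"] += plan.add_workers
    system["queues"].add(plan.create_queue)


def roll_back(plan, system):
    """Undo the plan using only the same configuration, in reverse order."""
    system["queues"].discard(plan.create_queue)
    system["workers"] -= plan.add_workers


def save(plan):
    return json.dumps(asdict(plan))


def load(text):
    return Plan(**json.loads(text))


# Round-trip the plan through JSON, then execute it and roll it back.
system = {"workers": 2, "queues": {"default"}}
plan = load(save(Plan(add_workers=3, create_queue="reports")))
execute(plan, system)
after_execute = {"workers": system["workers"], "queues": set(system["queues"])}
roll_back(plan, system)
```

This only works because both steps happen to be invertible from the plan itself. If the queue already existed before `execute` ran, `roll_back` would delete something it did not create, and the plan's field names are also bound to this exact version of the program.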

friendzis · 2026-02-02T07:04:25 1770015865

> I'm curious what portability concerns you've run into with JSON serialization.

The hard part concerns the instructions. The technical implementation of serializing in-memory data structures into a serialization format (be it JSON or something bespoke) is not the root of the complexity.

> You have a function to execute the process that takes the configuration and a function to roll back that takes the same configuration.

Don't forget granularity and state tracking. The opposite of a seemingly simple operation like "set config option foo to bar" is not a straightforward inverse: you need to track the previous value. Does the dry run stop at computing the final value for foo and leave possible access control issues to surface during the real run, or does it perform a "write nothing" operation to catch those?
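A minimal sketch of what that state tracking could look like, under assumed names. `ConfigStore`, `apply_set`, `undo`, and `PermissionDenied` are hypothetical, and the "write nothing" probe is modelled here as a permission check that runs even in dry-run mode.

```python
class PermissionDenied(Exception):
    pass


_ABSENT = object()  # marks "key did not exist before", distinct from a stored None


class ConfigStore:
    """Illustrative in-memory store with a per-key write permission set."""

    def __init__(self, values, writable):
        self.values = dict(values)
        self.writable = set(writable)

    def check_write(self, key):
        """The "write nothing" probe: fails exactly where a real write would."""
        if key not in self.writable:
            raise PermissionDenied(key)


def apply_set(store, key, value, dry_run=False):
    """Set key to value and return an undo record holding the previous value."""
    store.check_write(key)  # runs in dry-run mode too, so access issues surface early
    previous = store.values.get(key, _ABSENT)
    if not dry_run:
        store.values[key] = value
    return (key, previous)


def undo(store, record):
    """Restore the state captured in the undo record."""
    key, previous = record
    store.check_write(key)
    if previous is _ABSENT:
        store.values.pop(key, None)
    else:
        store.values[key] = previous


store = ConfigStore({"foo": "baz"}, writable={"foo", "new_option"})
record = apply_set(store, "foo", "bar")   # foo is now "bar"
undo(store, record)                       # foo is back to "baz"
```

The undo record, not the original instruction, is what makes the rollback possible, so it has to be stored alongside the plan once the real run has happened.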

> This does tie the representation very closely to the program itself so it doesn't work if you want to be able to change the program and have previously generated "plans" continue to work.

Why serialize then? Dump everything into one process space and call the native functions. Serialization implies strictly out-of-band controlled interfaces, which amounts to a fragile implementation of codegen+interpreter machinery.

michaelmior · 2026-01-30T20:18:31 1769804311

For Android, you can hold down the power button and press the Lockdown option that appears. (I think this may need to be enabled in settings.)

ranger_danger · 2026-01-30T20:26:55 1769804815

Probably a much better idea to just go ahead and hit shutdown if you're on that screen anyway, since many phones are more susceptible to gear like GrayKey or Cellebrite if they have ever been unlocked since the last power-on.

michaelmior · 2026-01-23T15:28:55 1769182135

> We are getting rid of gRPC

I'm curious why and what challenges you had with gRPC. s2-lite looks cool!

shikhar · 2026-01-23T15:45:28 1769183128

We wanted S2 to be one API. Started out with gRPC, added REST - then realized REST is what is absolutely essential and what most folks care about. gRPC did give us bi-directional streaming for append/read sessions, so we added that as an optional enhancement to the corresponding POST/GET data plane endpoints (the S2S "S2-Session" spec I linked to above). A nice side win is that the stream resource is known from the requested URL rather than having to wait for the first gRPC message.

The gRPC ecosystem is also not very uniform despite its popularity, comes with bloat, and is a bit of a mess in Python. I'm hoping QUIC enables a viable gRPC alternative to emerge.

michaelmior · 2026-01-23T10:47:21 1769165241

I found at least one example[0] of authors claiming the reason for the hallucination was exactly this. That said, I do think for this kind of use, authors should go to the effort of verifying the correctness of the output. I also tend to agree with others who have commented that while a hallucinated citation or two may not be particularly egregious, it does raise concerns about what other errors may have been missed.

[0] https://openreview.net/forum?id=IiEtQPGVyV&noteId=W66rrM5XPk

michaelmior · 2026-01-19T14:49:58 1768834198

The fingerprint reader is not embedded in the screen, but in the power button on the side of the device.

michaelmior · 2026-01-17T12:16:58 1768652218

Note that the headline is from Langfuse, not ClickHouse. Reading the announcement from ClickHouse[0], the headline is "ClickHouse welcomes Langfuse: The future of open-source LLM observability". I think the Langfuse team is suggesting that they will be continuing to do the same work within ClickHouse, not that the entire ClickHouse organization has a goal of building the best LLM engineering platform.

[0] https://clickhouse.com/blog/clickhouse-acquires-langfuse-ope...

michaelmior · 2026-01-17T11:26:14 1768649174

> What you are saying is that we don't need 'install.md'

I think the point was that install.md is a good way to generate an install.sh.

> validate that, and put it into the repo

The problem being discussed is that the user of the script needs to validate it. It's great if it's validated by the author, but that's already the situation we're in.

chme · 2026-01-17T12:04:41 1768651481

> The problem being discussed is that the user of the script needs to validate it. It's great if it's validated by the author, but that's already the situation we're in.

The user is free to use an LLM to 'validate' the `install.sh` file, just by asking it whether the script does anything 'bad'. That should be about as successful as having the LLM generate the script from a description. Maybe even more successful.

falloutx · 2026-01-17T12:36:42 1768653402

I still don't understand why we need any of them. If I am installing something, it would take me more time to write this install.md or install.sh than to just go to the correct website, copy the command, check its contents, run it, and open the help.

michaelmior · 2026-01-15T11:21:46 1768476106

I think there's a lot of room to push this further. Of course there are LLMs being used for this case and I guess it's nice to be able to ask your house who the candidates were in the Venezuelan presidential election of 1936, but I'd be happy if I could just consistently control devices locally and a small language model definitely makes that easier.

michaelmior · 2026-01-12T15:31:13 1768231873

> I typically do something mindless with my hands (weave chainmail, cross stitch, sew)

For me, that's exactly the sort of "something else" I interpreted the previous comment to refer to.