kennethallen's comments | Hacker News

The fundamental reason JSON won over XML is that JSON maps exactly to universal data structures (lists and string-keyed maps) and XML does not.


This article/blog post [1] has been on HN several times before, but it is well worth a reminder.

[1] https://seriot.ch/software/parsing_json.html


There are many different supersampling patterns you can use: https://en.wikipedia.org/wiki/Supersampling#Supersampling_pa...

A grid can have unwanted aliasing effects. It all depends on the kinds of images you're working with.


Two big advantages:

You avoid an unnecessary copy. A normal read system call gets the data from the disk hardware into the kernel page cache and then copies it into the buffer you provide in your process memory. With mmap, the page cache is mapped directly into your process memory, with no copy.

All running processes share the mapped copy of the file.

There are a lot of downsides to mmap: you lose explicit error handling and fine-grained control of when exactly I/O happens. Consult the classic article on why sophisticated systems like DBMSs do not use mmap: https://db.cs.cmu.edu/mmap-cidr2022/
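To make the copy-versus-mapping distinction concrete, here is a rough C sketch of both paths (the file name is a placeholder and error handling is abbreviated; this is illustrative, not taken from any particular project):

  /* Illustrative comparison of read() and mmap() on the same file. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/mman.h>
  #include <sys/stat.h>
  #include <unistd.h>

  int main(void) {
      int fd = open("input.dat", O_RDONLY);          /* placeholder file name */
      if (fd < 0) { perror("open"); return 1; }

      struct stat st;
      if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

      /* read(): the kernel copies from its page cache into our private
         buffer, and every failure is reported explicitly via errno. */
      char *buf = malloc(st.st_size);
      if (read(fd, buf, st.st_size) < 0) { perror("read"); return 1; }

      /* mmap(): the page cache itself is mapped into our address space,
         so there is no second copy -- but a later I/O failure surfaces
         as SIGBUS when the faulting access actually hits the disk. */
      char *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
      if (map == MAP_FAILED) { perror("mmap"); return 1; }

      printf("first byte via read: %d, via mmap: %d\n", buf[0], map[0]);

      munmap(map, st.st_size);
      free(buf);
      close(fd);
      return 0;
  }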


> you lose explicit error handling

I've never had to use mmap, but this has always been the issue in my head. If you're treating I/O as memory pages, what happens when you read a page and it needs to "fault" by reading the backing storage, but the storage fails to deliver? What can be said at that point, or does the program crash?


If you fail to load an mmapped page because of an I/O error, Unix-like OSes interrupt your program with SIGBUS/SIGSEGV. It might be technically possible to write a program that would handle those signals and recover, but it seems like a lot more work and complexity than just checking errno after a read system call.
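For the curious, recovery along those lines usually means a sigsetjmp/siglongjmp dance. A rough sketch, assuming you only care about one guarded access at a time (real code also has to worry about async-signal-safety and identifying which access faulted):

  /* Illustrative only: turn a SIGBUS from a faulting mapped access into an
     error return instead of letting the process die. */
  #include <setjmp.h>
  #include <signal.h>
  #include <string.h>

  static sigjmp_buf io_error_jump;

  static void on_sigbus(int sig) {
      (void)sig;
      siglongjmp(io_error_jump, 1);       /* jump back out of the faulting access */
  }

  /* Returns 0 on success, -1 if the page could not be brought in from disk. */
  int read_mapped_byte(const volatile char *p, char *out) {
      struct sigaction sa;
      memset(&sa, 0, sizeof sa);
      sa.sa_handler = on_sigbus;
      sigemptyset(&sa.sa_mask);
      sigaction(SIGBUS, &sa, NULL);

      if (sigsetjmp(io_error_jump, 1) != 0)
          return -1;                      /* arrived here via siglongjmp */

      *out = *p;                          /* this access may fault and raise SIGBUS */
      return 0;
  }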


> Consult the classic article on why sophisticated systems like DBMSs do not use mmap: https://db.cs.cmu.edu/mmap-cidr2022/

SQLite does (or at least can optionally) use mmap. How come?

Is sqlite with mmap less reliable or anything?


If an I/O error happens with read()/write(), you get back an error code, which SQLite can deal with and pass back up to the application, perhaps accompanied by a reasonable error message. But if you get an I/O error with mmap, you get a signal. SQLite itself ought not be setting signal handlers, as that is the domain of the application and SQLite is just a lowly library. And even if SQLite could set signal handlers, it would be difficult to associate a signal with a particular I/O operation. So there isn't a good way to deal with I/O errors when using mmap(). With mmap(), you just have to assume that the filesystem/mass-storage works flawlessly and never runs out of space.

SQLite can use mmap(). That is a tested and supported capability. But we don't advocate it because of the inability to precisely identify I/O errors and report them back up into the application.
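For reference, that opt-in is a per-connection pragma. A minimal sketch, with an example (not a recommended) upper limit of 256 MiB:

  /* Sketch: opt a SQLite connection into memory-mapped I/O.
     268435456 (256 MiB) is only an example upper limit. */
  #include <sqlite3.h>
  #include <stdio.h>

  int enable_mmap(sqlite3 *db) {
      char *errmsg = NULL;
      int rc = sqlite3_exec(db, "PRAGMA mmap_size=268435456;", NULL, NULL, &errmsg);
      if (rc != SQLITE_OK) {
          fprintf(stderr, "mmap_size pragma failed: %s\n", errmsg);
          sqlite3_free(errmsg);
      }
      return rc;
  }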


Thanks for the response. I am more worried about losing already committed data due to an error.

https://www.sqlite.org/mmap.html

> The operating system must have a unified buffer cache in order for the memory-mapped I/O extension to work correctly, especially in situations where two processes are accessing the same database file and one process is using memory-mapped I/O while the other is not. Not all operating systems have a unified buffer cache. In some operating systems that claim to have a unified buffer cache, the implementation is buggy and can lead to corrupt databases.

What are those OSes with buggy unified buffer caches? More importantly, is there a list of platforms where the use of mmap in sqlite can lead to data loss?


I know that the spirit of HN will strike me down for this, but sqlite is not a "sophisticated system". It assumes the hardware is lawful neutral. Real hardware is chaotic. Sqlite has a good reputation because it is very easy to use. In fact this is the same reason programmers like mmap: it is a hell of a shortcut.


I think the main thing is whether mmap will make SQLite lose data or otherwise corrupt already committed data.

... it will if two programs open the same SQLite database, one with mmap and another without (https://www.sqlite.org/mmap.html) - at least "in some operating systems" (no mention of which ones)

https://www.sqlite.org/mmap.html

> The operating system must have a unified buffer cache in order for the memory-mapped I/O extension to work correctly, especially in situations where two processes are accessing the same database file and one process is using memory-mapped I/O while the other is not. Not all operating systems have a unified buffer cache. In some operating systems that claim to have a unified buffer cache, the implementation is buggy and can lead to corrupt databases.

SQLite is otherwise rock solid and won't lose data as easily.


This is a very interesting link. I didn't expect mmap to be less performant than read() calls.

I now wonder which use cases mmap would suit better - if any...

> All running processes share the mapped copy of the file.

So something like building linkers that deal with read-only shared libraries, plugins, etc.?


mmap is better when:

  * You want your program to crash on any I/O error because you wouldn't handle them anyway
  * You value the programming convenience of being able to treat a file on disk as if the entire thing exists in memory
  * The performance is good enough for your use. As the article showed, sequential scan performance is as good as direct I/O until the page cache fills up *from a single SSD*, and random access performance is as good as direct I/O until the page cache fills up *if you use MADV_RANDOM*. If your data doesn't fit in memory, or is across multiple storage devices, or you don't correctly advise the OS about your access patterns, mmap will probably be much slower
To be clear, normal I/O still benefits from the OS's shared page cache: files that other processes have loaded will probably still be in memory, avoiding a wait on the storage device. But each process doing normal I/O incurs the space and time cost of a copy into its private memory, unlike mmap.
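The MADV_RANDOM hint from the list above is a single madvise() call on the mapping; roughly (illustrative only):

  /* Sketch: tell the kernel not to read ahead aggressively, because accesses
     to this mapping will be random. 'map' and 'len' come from a prior mmap(). */
  #include <stddef.h>
  #include <stdio.h>
  #include <sys/mman.h>

  void advise_random(void *map, size_t len) {
      if (madvise(map, len, MADV_RANDOM) != 0)
          perror("madvise");              /* advisory only; safe to continue */
  }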


One reason to use shared memory mmap is to ensure that even if your process crashes, the memory stays intact. Another is to communicate between different processes.
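A minimal sketch of that pattern, assuming a file-backed MAP_SHARED mapping (the file name and size handling are made up for illustration):

  /* Sketch: a file-backed MAP_SHARED region. Writes land in the shared page
     cache, so they survive this process crashing and are visible to any other
     process that maps the same file. "state.bin" is a placeholder name. */
  #include <fcntl.h>
  #include <stddef.h>
  #include <sys/mman.h>
  #include <unistd.h>

  char *open_shared_state(size_t len) {
      int fd = open("state.bin", O_RDWR | O_CREAT, 0600);
      if (fd < 0) return NULL;
      if (ftruncate(fd, (off_t)len) < 0) { close(fd); return NULL; }

      char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      close(fd);                          /* the mapping stays valid after close */
      return p == MAP_FAILED ? NULL : p;
  }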


It's 2025 and you're telling people to `pip install` into system packages instead of running via `uvx`.


I don't understand the use case here. Is this supposed to be for enterprise to control access to internal applications via network access policies?


Yes. The acronym is “ZTNA” (Zero Trust Network Access).

It is an alternative to a traditional corporate VPN that addresses a few architectural issues; namely:

- L3 connectivity (which allows for lateral movement) to the corporate network.

- Inbound exposure to the VPN gateway (scaling can become a challenge, not to mention continuous vulnerabilities from… certain vendors).

- Policy management can get convoluted if you want to do micro-segmentation properly.

ZTNA is essentially an “inside-out” architecture and acts (kind of) like an L4 proxy. I’m going to butcher this explanation, but:

1. Company installs apps/VMs/containers throughout their network. These must have network reachability to the internal apps/services the company wants to make available to its users.

2. These apps/VMs/containers establish TLS tunnels back to the company’s tenant in the vendor’s cloud.

3. Company rolls out the vendor’s ZTNA client to user devices. This also establishes a TLS tunnel to the vendor’s cloud. Hence the vendor’s cloud is like a MitM gatekeeper.

4. Company creates policies in the vendor’s cloud that says “User A can access App X via app/VM/container Z”

5. Even if App X is on the same LAN segment as App Y, App Y is invisible to User A because connectivity to the internal apps happens at L4.

It is an interesting architecture. That being said, ZTNA solutions have their own issues as well (you can probably already spot some based on my explanation above!)

(Note: I worked for a security vendor that sold a ZTNA solution as part of its portfolio ~4-5 years ago. Things could be different now.)


Yes, this is exactly what this does.


Running LLMs will be slow and training them is basically out of the question. You can get a Framework Desktop with similar bandwidth for less than a third of the price of this thing (though that isn't NVIDIA).


> Running LLMs will be slow and training them is basically out of the question

I think it's the reverse: the use case for these boxes is basically training and fine-tuning, not inference.


The use case for these boxes is a local NVIDIA development platform before you do your actual training run on your A100 cluster.


Author on Twitter a few years ago: https://x.com/tqbf/status/851466178535055362


Oh, joy


I have a few questions after reading the README.

First, if it uses a PRNG with a fixed-size state, it isn't accurate to say it never repeats, correct? It will be periodic eventually, even if that takes 2^256 operations or more.

Second, can you go more into the potential practical or theoretical advantages? Your scheme is certainly more complicated, but I don't see how it offers better tamper protection or secrecy than a block cipher operating in an authenticated mode (AES+GCM, for instance). Those have a number of practical advantages, like parallel encryption/decryption and ubiquitous hardware support.


You are correct. The probability of a state collision is cryptographically negligible, on the order of breaking a 256-bit hash function.

You're also right that AES-GCM is faster and has hardware support. Ariadne explores a different trade-off. Its primary advantage is its architectural agility.

Instead of a fixed algorithm, the sequence of operations in Ariadne is dynamic and secret, derived from the key and data history. An attacker doesn't just need to break a key; they have to contend with an unknown, ephemeral algorithm.

This same flexible structure allows the core CVM to be reconfigured into other primitives. We've built concepts for programmable proofs-of-work, verifiable delay functions, and even ring signatures.


FYI your comments seem to be showing up as dead (dead comments don't show up by default, only when people logged into HN have them enabled). I think something may have triggered a shadowban on your account. Might want to send a message to the moderators.

I hit 'vouch' for the comment I'm responding to so it should be visible, but the other response you gave (https://news.ycombinator.com/item?id=44353277) is still listed as dead.


It does not show up as dead for me, but your comment was made 7 hours ago.


They cannot just relicense the work of all of their public contributors without them agreeing in writing. This is completely illegitimate. (They don't seem to require signing any contributor agreement.)


Have you come by your certainty that they have not asked because you were a contributor?

