Hacker News | goombacloud's comments

If you mean using Incus on Flatcar, there is a PR for adding Incus as a systemd-sysext extension.

Flatcar inside Incus is a bit more difficult: to run Flatcar as a container one can import https://stable.release.flatcar-linux.net/amd64-usr/current/f... , and for running it as a VM I don't know whether the regular image works. A major hurdle is that one has to tweak the way VMs/containers are configured: normally Ubuntu's cloud-init is used, but Flatcar only supports coreos-cloudinit or Ignition, and there are differences both in how the user-data has to be set up and in its contents. But in the end Incus would just be one more "cloud" platform to support, and one could make the Incus integration as nice as on the other platforms Flatcar runs on (OpenStack, VMware, etc.).


Incus on Flatcar is what I mean. This seems to be the PR you're talking about[1]. I'll keep an eye on it, thanks!

[1]: https://github.com/flatcar/scripts/pull/1655


With static binaries that is not needed (and you can set ID=_any in the extension-release file to mark them compatible with any OS).

If you want to repackage distro binaries without recompilation, you can have a look here: https://github.com/flatcar/sysext-bakery/pull/74 There are two tools: one bundles the needed libs in a separate folder, and the other works more like Flatpak and uses a full chroot. Since you already know which files are needed at runtime, I think you could try the first approach; otherwise the second might be easier.
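For the static-binary case the sysext layout is small enough to sketch. All names here are made up (`mytool` is a placeholder for your binary), and the final squashfs/activation steps are left as comments since they need squashfs-tools and a Flatcar host:

```shell
# Minimal sysext tree for a hypothetical static binary "mytool";
# ID=_any in the extension-release file marks it OS-independent.
set -eu
mkdir -p mytool/usr/bin mytool/usr/lib/extension-release.d
printf '#!/bin/sh\necho mytool\n' > mytool/usr/bin/mytool  # stand-in for your static binary
chmod +x mytool/usr/bin/mytool
cat > mytool/usr/lib/extension-release.d/extension-release.mytool <<'EOF'
ID=_any
EOF
# Then build and activate on the host:
#   mksquashfs mytool mytool.raw -all-root
#   mv mytool.raw /var/lib/extensions/ && systemd-sysext refresh
```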


This might not be complete, because the statement "More patches that seem (even in retrospect) to be fine follow." lacks backing facts. There were more patches before the SSH backdoor, e.g.: "Lasse Collin has already landed four of Jia Tan’s patches, marked by “Thanks to Jia Tan”", plus the other changes before and after the 5.4 release. So far I haven't seen anyone make a list of all patches and gather opinions on whether each change could be maliciously leveraged.


I get that there is a reason not to trust those patches, but I would guess they don't contain anything malicious. This early part of the attack seems to focus only on establishing Jia Tan as the maintainer, and they probably didn't want anything there that could tip Lasse Collin off that this "Jia" might be up to something.


Yes, exactly. I did look at many of them, and they are innocuous. This is all aimed at setting up Jia as a trusted contributor.


In https://archive.softwareheritage.org/browse/revision/e446ab7... one can open the patches and then click the "Changes" sub-tab. Stuff like https://archive.softwareheritage.org/browse/revision/e446ab7... looks like a perf improvement, but who knows whether a tricky bug was introduced that was meant to be exploited later. There are more patches to be vetted, unless one gives up and says that 5.2 should be used as the last known-good release.


When socat is around, a simple server can also be constructed with it:

        tee /tmp/server > /dev/null <<'EOF'
        #!/bin/bash
        set -euo pipefail
        SERVE="$1"
        TYPE="$2"
        read -r -a WORDS
        if [ "${#WORDS[@]}" != 3 ] || [ "${WORDS[0]}" != "GET" ]; then
          echo -ne "HTTP/1.1 400 Bad request\r\n\r\n"; exit 0
        fi
        # Subfolders are not supported for security reasons as this avoids having to deal with ../../ attacks
        FILE="${SERVE}/$(basename -- "${WORDS[1]}")"
        if [ -d "${FILE}" ] || [ ! -e "${FILE}" ]; then
          echo -ne "HTTP/1.1 404 Not found\r\n\r\n" ; exit 0
        fi
        echo -ne "HTTP/1.1 200 OK\r\n"
        echo -ne "Content-Type: ${TYPE};\r\n"
        LEN=$(stat -L --printf='%s\n' "${FILE}")
        echo -ne "Content-Length: ${LEN}\r\n"
        echo -ne "\r\n"
        cat "${FILE}"
        EOF
        chmod +x /tmp/server
        # switch from "text/plain" to "application/octet-stream" for file downloads
        socat TCP-LISTEN:8000,reuseaddr,fork SYSTEM:'/tmp/server /tmp/ text/plain'
        # test: curl -v http://localhost:8000/server


There are other such tiny web server tricks out there too, but his GitHub README says:

  A purely bash web server, no socat, netcat, etc...


I really think we should have a means of spawning wasm components from wasm components. How the runtime runs them should be up to the runtime: it could be directly backed by kernel primitives, but it could also be in a browser. Leaking POSIX things into wasm is something I'd rather never see. Let's come up with something better, as wasm aimed to do from the start.


To spot more common problems I recommend:

  alias shellcheck='shellcheck -o all -e SC2292 -e SC2250'


SC2292: Prefer [[ ]] over [ ] for tests in Bash/Ksh.

* https://www.shellcheck.net/wiki/SC2292

SC2250: Prefer putting braces around variable references (${my_var}) even when not strictly required.

* https://www.shellcheck.net/wiki/SC2250
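For illustration, here is what those two (deliberately excluded) checks complain about; `greeting` is just a made-up variable:

```shell
#!/bin/bash
greeting=hello
# plain shellcheck -o all would flag this line with SC2292 and SC2250;
# the alias above excludes both, so it passes:
[ "$greeting" = hello ] && echo "plain: $greeting"
# this form satisfies both optional checks:
[[ "${greeting}" == hello ]] && echo "braced: ${greeting}"
```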


Opportunistic interjection that unnecessary ${} is the most bothersome style choice in any language I know of:

- It obscures actual uses of modifiers, particularly ${foo-} when set -u is in effect,

- It's obvious when a name runs into subsequent text, even if one has somehow avoided syntax highlighting,

- And expansions followed by identifier chars don't actually occur in practice. Cases where the quotes cannot be moved to surround the variable are often interpolation of an argument to echo, whose behaviour is such a mess (not even portable between bash and dash) that shellcheck ought to demand printf at all times instead!


Related pet peeve: always writing variables as $UPPER_CASE in shell scripts.

Useful: $UPPER_CASE for exported variables ("super globals"), $lower_case for anything else. Can also use $lower_case for function locals and $UPPER_CASE for exported and script global variables (stylistic preference; both are reasonable).

Not useful or reasonable: $ALWAYS_UPPER_CASE_NO_MATTER_WHAT.

I suppose people started doing it because they saw $EXPORTED_VARIABLE and thought "oh, I need to always upper-case it", not realizing what that meant. After that, the "style" spread through copy-paste.
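A small sketch of the convention (all names here are made up):

```shell
#!/bin/bash
export CACHE_DIR="/tmp/mytool-cache"  # exported, child processes see it: UPPER_CASE
max_retries=3                         # script-global, not exported: lower_case
fetch_all() {
  local attempt                       # function-local: lower_case
  for attempt in $(seq 1 "${max_retries}"); do
    echo "attempt ${attempt}, caching in ${CACHE_DIR}"
  done
}
fetch_all
```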


As a regular Linux user you can do:

  sudo touch /etc/udisks2/tcrypt.conf
  sudo systemctl restart udisks2
and then any VeraCrypt volumes can be used in Nautilus or GNOME Disks, similar to LUKS volumes.


Why would one want to use VeraCrypt volumes over regular LUKS?


Cross platform support is a good reason.


Does this finally have UEFI by default?


The next logical step would be to only support wasm programs ;)


Especially compression algos that use arithmetic coding, with interval weights adjusted based on a prediction of what is likely coming next, are very similar. They adjust the arithmetic coding (https://en.wikipedia.org/wiki/Arithmetic_coding) based on the context of the byte/bit to predict, so the more accurate the predicted continuation is, the more efficient the encoding becomes. The task is very similar to that of transformers like GPT. A perfect prediction has almost no additional storage cost, because the arithmetic interval barely shrinks and thus almost no bits get stored. But in any case you have to count the size of the decompressor to get a fair benchmark.
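The cost claim can be made concrete: an ideal arithmetic coder spends about -log2(p) bits on a symbol the model assigned probability p. A quick awk sketch of the two extremes:

```shell
# -log2(p) bits per symbol: a coin-flip guess costs a full bit,
# a near-certain prediction costs almost nothing.
bits() { awk -v p="$1" 'BEGIN { printf "%.3f", -log(p) / log(2) }'; }
echo "p=0.5  -> $(bits 0.5) bits"   # no better than chance: 1 bit
echo "p=0.99 -> $(bits 0.99) bits"  # near-certain: ~0.015 bits
```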


How is it similar?

There’s been a lot of study going the other direction - using neural networks to aid the entropy prediction in classical compression algorithms - but I’m not seeing the conceptual link between how transformer/attention models work internally and how gzip works internally beyond “similar words are easy to compress”

I’m not seeing it because GPT representations are just vectors of fixed, not varying, size


An LLM or, well, any statistical model, is about prediction. As in, given some preceding input, what comes next?

One way to measure the accuracy of the model, as in its "intelligence", is to use the predictions to turn the input into just the differences from the prediction; if the model is good at predicting, there will be fewer differences and the input will compress better.

So seeing how well your model can compress some really big chunk of text is a very good objective measure of its strength, and a way to compare it against others.

So a competition is born! :)
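That measurement is easy to approximate with a classical compressor standing in for the model (illustrative only; real contests also count the decompressor's size in the score):

```shell
# Bits per byte achieved by gzip on a repetitive vs. a random input;
# the more predictable stream compresses to far fewer bits per byte.
tmp=$(mktemp -d)
yes 'all work and no play' | head -c 100000 > "${tmp}/repetitive"
head -c 100000 /dev/urandom > "${tmp}/random"
for f in repetitive random; do
  orig=$(wc -c < "${tmp}/${f}")
  comp=$(gzip -9 -c "${tmp}/${f}" | wc -c)
  awk -v o="${orig}" -v c="${comp}" -v n="${f}" \
    'BEGIN { printf "%s: %.2f bits/byte\n", n, 8 * c / o }'
done
rm -r "${tmp}"
```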


Good summary.

The LLM-vs-static-tree comparison has some interesting oppositions. A static tree as emitted by a compression algorithm will probably beat an LLM most of the time, because it has full knowledge of the whole stream (or, in gzip's case, the window). So it can look back and say 'hm, the tree I emitted was not that good, let me build a better one', whereas an LLM does not have that beforehand knowledge. Using a pre-cooked LZW tree for all inputs would be more akin to using an LLM.


I would envisage the LLM is allowed to train on each and every input token. So, to begin with, it knows nothing; but to predict the very last token, it has internalised the whole preceding stream.

Now I wouldn't expect it to be particularly competitive in enwik8 or enwik9, but the question would be: is there any max-model-size and input-length for which it would right now pull ahead and become the best known or at least competitive predictor?


I would expect it to be 'ok'. Basically as if you used only a pre-trained LZW table and shipped it along with each stream, but the results would be mixed. A compressor has the advantage of foresight and hindsight, whereas an LLM would only have hindsight: any input stream would be at the mercy of the previous streams fed into it, and those may or may not be optimal.

It is an interesting hypothesis, but my gut feeling is that an LLM would perform worse on average. Competitive? Yes, but still worse. It is something I am sure someone will test.

From the hundreds of different compressor models I have made for myself over the years: believe it or not, the compressed data is usually the best part. It is the decode tree/table/key/whatever that usually ends up crowding out the savings on the compressed data. In this case it would be the LLM weights, or whatever the LLM spits out for the tree/decode.


gzip uses LZ and Huffman coding and not arithmetic coding with a predictor, so yes, these are not similar.

