Fun fact: Feynman didn't work directly on the bomb. His role was managing high school students who performed computations, fixing adding machines, and doing some safety work at the uranium isotope plant in Tennessee.
More fun facts: almost nobody worked directly on the bomb. Thousands of people worked on manufacturing equipment and components and had no idea what they were supposed to be building. They split the project up into very granular pieces so that nobody really had full insight into what was going on except for the core group of scientists. (Source for this, I believe, is Feynman himself in Surely You're Joking.)
Gladys Owens, the woman seated in the foreground, did not know what she had been involved with until she saw this photo on a public tour of the facility fifty years later.
Feynman is one of the coolest guys in my book, but do note that "didn't work directly on the holocaust...was just counting beans to facilitate" was not a defense that saved necks; they were hanged anyway. No moral judgement implied here, just the fact that history is written by the winners.
Cool, email me if you need any help with it. In fact, email me regardless of whether you need help, just to tell me what your project is and how syntect might help.
The number of collisions depends only loosely on the hash function; it's much more closely tied to the fill factor.
If you keep your hash table 1% full, you're unlikely to run into a collision no matter how bad your hash function is. If you keep it 99% full, it doesn't matter how good your hash function is: you still have to reduce the hash modulo the number of buckets, and with that few empty buckets a large share of inserts will collide.
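A quick way to convince yourself is to count colliding inserts at different fill factors using a perfectly reasonable hash; here's a throwaway sketch (the mix function, bucket count, and fill levels are arbitrary choices):

    /* Rough sketch: count how many of N keys land in an already-occupied
       bucket at a few different fill factors.  The hash (a splitmix64-style
       finalizer) and the sizes are arbitrary choices. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    static uint64_t mix64(uint64_t x) {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }

    int main(void) {
        const size_t nbuckets = 1 << 20;
        const double fills[] = { 0.01, 0.5, 0.9, 0.99 };

        for (size_t f = 0; f < sizeof fills / sizeof fills[0]; f++) {
            bool *occupied = calloc(nbuckets, sizeof *occupied);
            size_t nkeys = (size_t)(fills[f] * nbuckets);
            size_t collisions = 0;

            for (size_t i = 0; i < nkeys; i++) {
                size_t bucket = mix64(i) % nbuckets;   /* the modulo step */
                if (occupied[bucket]) collisions++;
                else occupied[bucket] = true;
            }
            printf("fill %.2f: %zu/%zu inserts collided\n",
                   fills[f], collisions, nkeys);
            free(occupied);
        }
        return 0;
    }

Even with a good mix, the 99% run should collide on something like a third of its inserts, while the 1% run almost never does.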
Yes, there are implementations of hash tables that solve this problem. Hash collisions still ruin your worst-case performance, but as long as nobody asks how long you spend on inserts you can solve that too. As a trivial example: on each insert, make the hash table larger and change the hash function randomly until you find a layout that gives O(1) lookup for all elements.
But now we're far outside the realm of "properties of hash tables" and instead inside "properties of my hash table", and any decent interviewer should recognize that.
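For the curious, here's a toy sketch of that grow-and-reseed idea; all names are made up, and the table balloons to roughly n^2 slots before a collision-free seed becomes likely, which is part of the joke:

    /* Toy sketch of "grow and re-seed until collision-free".  When an
       insert collides, keep doubling the table and picking a new random
       seed until every key has its own bucket, so lookups are O(1) by
       construction.  Inserts and memory usage are terrible on purpose. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define EMPTY UINT64_MAX

    struct table { uint64_t *slots; size_t nslots; uint64_t seed; };

    static size_t bucket_of(uint64_t key, uint64_t seed, size_t nslots) {
        uint64_t x = key ^ seed;                       /* seeded 64-bit mix */
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return (x ^ (x >> 31)) % nslots;
    }

    /* Try to place all keys with the current size/seed; 0 means a collision. */
    static int try_build(struct table *t, const uint64_t *keys, size_t n) {
        for (size_t i = 0; i < t->nslots; i++) t->slots[i] = EMPTY;
        for (size_t i = 0; i < n; i++) {
            size_t b = bucket_of(keys[i], t->seed, t->nslots);
            if (t->slots[b] != EMPTY) return 0;
            t->slots[b] = keys[i];
        }
        return 1;
    }

    /* Grow and re-seed until the whole key set fits with no collisions. */
    static void rebuild(struct table *t, const uint64_t *keys, size_t n) {
        do {
            free(t->slots);
            t->nslots = t->nslots ? t->nslots * 2 : 16;
            t->slots  = malloc(t->nslots * sizeof *t->slots);
            t->seed   = ((uint64_t)rand() << 32) ^ (uint64_t)rand();
        } while (!try_build(t, keys, n));
    }

    static void insert(struct table *t, uint64_t *keys, size_t n) {
        /* keys[n-1] is the new key; keys[0..n-2] are already in the table */
        if (t->nslots == 0 ||
            t->slots[bucket_of(keys[n - 1], t->seed, t->nslots)] != EMPTY)
            rebuild(t, keys, n);                       /* collision: start over */
        else
            t->slots[bucket_of(keys[n - 1], t->seed, t->nslots)] = keys[n - 1];
    }

    int main(void) {
        enum { N = 1000 };
        uint64_t keys[N];
        struct table t = { 0 };
        for (size_t i = 0; i < N; i++) {
            keys[i] = i * 2654435761u;                 /* arbitrary distinct keys */
            insert(&t, keys, i + 1);
        }
        printf("%d keys placed in %zu slots: O(1) lookups, absurd space\n",
               (int)N, t.nslots);
        free(t.slots);
        return 0;
    }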
People here seem to be implicitly assuming linked buckets; those are bad on modern architectures for several reasons, chiefly the pointer chasing and cache misses.
Look at Hopscotch, Robin Hood, or Cuckoo hashing for open-addressing schemes with a high fill factor (~0.9) and _amortized_ O(1) operations. I've seen a paper somewhere that proved a worst-case O(log log n) probe length for Robin Hood, as far as I recall.
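For anyone unfamiliar, here's a rough sketch of the Robin Hood insertion rule: while probing linearly, if the resident entry is closer to its home bucket than the entry being inserted, swap them and carry on with the evicted one. The layout, hash, and sizes below are simplified assumptions, not any particular library; there's no resizing or deletion:

    /* Minimal sketch of Robin Hood insertion into a linear-probing table. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define NSLOTS 1024                    /* power of two, for the & mask */

    struct slot { uint64_t key, val; bool used; };
    static struct slot table[NSLOTS];

    static size_t home_bucket(uint64_t key) {
        key *= 0x9e3779b97f4a7c15ULL;      /* any decent mixing step */
        return (size_t)(key >> 32) & (NSLOTS - 1);
    }

    /* How far a key stored at `slot` has wandered from its home bucket. */
    static size_t probe_dist(uint64_t key, size_t slot) {
        return (slot + NSLOTS - home_bucket(key)) & (NSLOTS - 1);
    }

    static void rh_insert(uint64_t key, uint64_t val) {
        size_t i = home_bucket(key), dist = 0;
        for (;;) {
            if (!table[i].used) {                     /* empty slot: done */
                table[i] = (struct slot){ key, val, true };
                return;
            }
            if (probe_dist(table[i].key, i) < dist) { /* resident is "richer" */
                struct slot evicted = table[i];       /* rob it, carry on with it */
                table[i] = (struct slot){ key, val, true };
                key = evicted.key;
                val = evicted.val;
                dist = probe_dist(key, i);
            }
            i = (i + 1) & (NSLOTS - 1);               /* linear probe */
            dist++;
        }
    }

    static bool rh_lookup(uint64_t key, uint64_t *val_out) {
        size_t i = home_bucket(key);
        for (size_t dist = 0; dist < NSLOTS; dist++, i = (i + 1) & (NSLOTS - 1)) {
            if (!table[i].used)
                return false;                         /* hole: key is absent */
            if (table[i].key == key) {
                *val_out = table[i].val;
                return true;
            }
            if (probe_dist(table[i].key, i) < dist)   /* invariant lets us stop early */
                return false;
        }
        return false;
    }

    int main(void) {
        for (uint64_t k = 0; k < 700; k++)            /* ~0.68 fill, fits without resizing */
            rh_insert(k, k * 10);
        uint64_t v;
        if (rh_lookup(123, &v))
            printf("123 -> %llu\n", (unsigned long long)v);
        return 0;
    }

The point of the swap is that probe lengths stay short and uniform, which is what makes the high fill factors tolerable.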
My first thought about dynamic resizing is that you can have amortized O(1) but still O(n) worst case: both when someone exploits the hash function to force all elements into the same bucket, and when an insert triggers a resize.
You can make an expanding array with O(1) worst-case operations, but I'm not sure whether the same trick applies to a hash table. It may, but it's starting to get very complex.
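As far as I know, the usual trick is to keep the old and new storage around and migrate a bounded number of elements per operation, so no single insert pays for the whole copy. Here's a hand-wavy sketch of that idea for a plain append-only array (all names are made up; incremental hash table rehashing uses the same two-buffer structure, just with rehashing instead of straight copying):

    /* Sketch of an append-only dynamic array with O(1) worst-case append:
       when the buffer fills up, allocate one twice the size but don't copy
       everything at once; every subsequent append also moves a couple of
       old elements over, so the old buffer is drained long before the new
       one can fill up. */
    #include <stdio.h>
    #include <stdlib.h>

    struct incr_vec {
        int   *cur, *old;        /* current buffer and the one being drained */
        size_t cur_cap, old_len; /* capacity of cur, element count of old */
        size_t len, moved;       /* logical length, old elements already moved */
    };

    static void iv_append(struct incr_vec *v, int x) {
        if (v->len == v->cur_cap) {          /* cur is full: start a new buffer */
            free(v->old);                    /* old is fully drained by now */
            v->old = v->cur;
            v->old_len = v->len;
            v->cur_cap = v->cur_cap ? v->cur_cap * 2 : 4;
            v->cur = malloc(v->cur_cap * sizeof *v->cur);
            v->moved = 0;
        }
        /* Move at most 2 old elements per append, so no append copies O(n). */
        for (int i = 0; i < 2 && v->moved < v->old_len; i++, v->moved++)
            v->cur[v->moved] = v->old[v->moved];
        v->cur[v->len++] = x;
    }

    static int iv_get(const struct incr_vec *v, size_t i) {
        if (i >= v->moved && i < v->old_len)
            return v->old[i];                /* not migrated yet */
        return v->cur[i];
    }

    int main(void) {
        struct incr_vec v = { 0 };
        for (int i = 0; i < 100; i++)
            iv_append(&v, i * i);
        printf("v[3]=%d v[50]=%d v[99]=%d\n",
               iv_get(&v, 3), iv_get(&v, 50), iv_get(&v, 99));
        free(v.old);
        free(v.cur);
        return 0;
    }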
"High quality" is an strange concept. I would look at code you actually use and rely on - that's the best indication of quality. A lot of critical code deals with inelegant, complex problems correctly and efficiently - I'd consider anything that can be relied on to manage that "high quality", even if it is unclean, inelegant, poorly formatted and algorithmically mundane.
That said, if you want to read elegant code, I'd recommend the stb parser libraries (written in C). They are small self-contained decoders for many common media formats, with excellent documentation:
These libraries are likely insecure, handle many edge cases incorrectly, implement fewer features, and perform worse than other options. However, they meet your criteria better.
It is not nearly enough for code to just work and be useful. Code quality is what determines how maintainable it is, how long it will stay relevant, and how long it will survive changing requirements and environments. And that is much harder to achieve than something that merely (sometimes) works.
Absolutely, by all means look at old code - code that has survived and been useful for a long time. It's either adaptable (Linux) or doesn't need to change or adapt much (TeX).
Do you currently use and rely on software that you expect won't be useful to you in ten years' time? I can't think of much, personally.
(I do use IDA Pro, which has clearly adapted poorly to changing requirements - it still has scars of the 32-bit to 64-bit transition that get in the way of day-to-day usage. I hope there'll be something better in ten years. Of course, I could buy a cheaper, "higher quality" tool instead, but none of them are as powerful or as useful.)
This is an interesting article, but the "Portability" comments could be a lot more useful: strndup and open_memstream are both "POSIX 2008", but strndup can be used on OS X while open_memstream cannot.
I deliberately didn't write that, to avoid the page going stale when OS X adds it. But OS X is behind the times, and that's harmful. POSIX 2008 has been out for years and most of the missing features are trivial to add; by not providing modern interfaces, they're being actively harmful to Unix software, forcing portable programs to be worse. The purpose of the article is to highlight the interfaces and when they're appropriate, rather than to be a replacement for your system manual pages or a portability guide. Since OS X isn't a free software Unix (though its libc is), I don't really consider it among the relevant modern Unix systems. Linux, the BSDs, and so on all have the POSIX 2008 features mentioned here.
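For anyone who hasn't used it, open_memstream hands you a FILE* that writes into a growing malloc'd buffer, which is exactly the kind of interface you miss once it's gone. A minimal example (error handling omitted):

    /* Minimal open_memstream example: build a string with ordinary stdio
       calls, then use the resulting malloc'd buffer.  POSIX 2008. */
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        char  *buf = NULL;
        size_t len = 0;
        FILE  *f = open_memstream(&buf, &len);

        fprintf(f, "pid=%d", 1234);
        fprintf(f, " name=%s", "example");
        fclose(f);                 /* flushes and NUL-terminates buf */

        printf("built %zu bytes: %s\n", len, buf);
        free(buf);
        return 0;
    }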
Just use setlocale(LC_ALL, "") in main, and use mbrtowc to translate from whatever the system encoding is into the wchar_t type. There's no need to bake assumptions about the system encoding into most programs.
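Something along these lines; the sample string and the code-point printing assume a Unicode-based wchar_t and a matching source encoding (the usual case on Linux and the BSDs), so treat it as a sketch rather than a portability guarantee:

    /* Sketch of locale-aware decoding: setlocale(LC_ALL, "") picks up the
       user's encoding, and mbrtowc converts the multibyte input into
       wchar_t without the program hard-coding UTF-8 (or anything else).
       Printing wc as a code point assumes wchar_t holds Unicode values. */
    #include <locale.h>
    #include <stdio.h>
    #include <string.h>
    #include <wchar.h>

    int main(void) {
        setlocale(LC_ALL, "");             /* use the environment's encoding */

        const char *s = "héllo wörld";     /* assumes source encoding matches locale */
        const char *p = s;
        size_t left = strlen(s);
        mbstate_t st;
        memset(&st, 0, sizeof st);

        while (left > 0) {
            wchar_t wc;
            size_t n = mbrtowc(&wc, p, left, &st);
            if (n == (size_t)-1 || n == (size_t)-2)
                break;                     /* invalid or truncated sequence */
            if (n == 0)
                n = 1;                     /* embedded NUL */
            printf("U+%04lX\n", (unsigned long)wc);
            p += n;
            left -= n;
        }
        return 0;
    }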
I'm guessing the speed difference comes from this linker taking a different approach to symbol resolution, one that visits each input file fewer times, but I'm not sure.