I like the idea, though I use nanoid with the safe letter dictionary (it exclude...

lxgr · on Nov 25, 2023

> it excludes letters used for profanity

That doesn't seem possible. How would that work?

> I looked at the implementation and it’s hardcoded to look for “bad” words.

If you mean https://github.com/y-gagar1n/nanoid-good, that seems to be doing the same thing.

In general, I'm a bit wary of solutions that "guarantee no bad words". This is usually highly language-specific: one language's perfectly acceptable name is another language's swear word.

no_wizard · on Nov 25, 2023

This is the implementation: https://github.com/CyberAP/nanoid-dictionary

We use it in a highly internationalized product spanning multiple languages and haven’t yet run into a complaint, or a value on audit, that would constitute something offensive in any language, per our intl content teams anyway.

That isn’t to say it’s 100% (and, simply enough, we don’t audit every single URL), but I suspect we would have gotten at least a heads-up from a user by now.

Nevertheless, we are moving to UUIDs that get base32 encoded for some of our use cases. They’re easier for us to work with in many scenarios.
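
For readers wondering what "UUIDs that get base32 encoded" might look like, here is a minimal sketch using only Python's standard library. The helper names `uuid_to_base32` and `base32_to_uuid` are made up for illustration, and the choice of lowercase, unpadded RFC 4648 base32 is an assumption; the comment above does not say which variant is used. Note that the standard base32 alphabet still contains vowels, so this encoding alone does not avoid accidental words.

```python
import base64
import uuid


def uuid_to_base32(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes of a UUID as 26 lowercase base32 characters."""
    # b32encode pads 16 bytes out to 32 characters with "="; strip the padding.
    return base64.b32encode(u.bytes).decode("ascii").rstrip("=").lower()


def base32_to_uuid(s: str) -> uuid.UUID:
    """Reverse uuid_to_base32 by restoring case and padding before decoding."""
    padded = s.upper() + "=" * (-len(s) % 8)
    return uuid.UUID(bytes=base64.b32decode(padded))


if __name__ == "__main__":
    original = uuid.uuid4()
    short = uuid_to_base32(original)
    print(original, "->", short)
    assert base32_to_uuid(short) == original
```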

Silasdev · on Nov 25, 2023

It's particularly funny because their example docs for .NET output "B4aajs", which, to any Swedish l33t-speaking individual, would read "Bajs", which means "shit".

owyn · on Nov 26, 2023

Somewhere there's a database for every bad word and every bad typo in every language and that one just got added.

Sharlin · on Nov 25, 2023

Omit vowels and you're 90% of the way there; omit the vowel-looking digits 0,1,3,4 and you're probably >99% of the way there.
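
A rough sketch of that rule, in Python: build an alphabet from ASCII letters and digits minus the vowels and the digits 0, 1, 3, 4, then draw random characters from it. This follows the rule of thumb in the comment above only; it is not the alphabet nanoid-dictionary actually ships, and `random_id` is a hypothetical helper name.

```python
import secrets
import string

# Characters dropped by the rule of thumb: vowels plus vowel-looking digits.
EXCLUDED = set("aeiouAEIOU0134")

# 62 alphanumerics minus 10 vowels minus 4 digits leaves 48 symbols.
ALPHABET = "".join(
    c for c in string.ascii_letters + string.digits if c not in EXCLUDED
)


def random_id(length: int = 12) -> str:
    """Return a random ID drawn uniformly from the reduced alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(len(ALPHABET), random_id())
```

The trade-off is fewer bits per character (log2 of 48 instead of log2 of 62), so IDs need to be slightly longer for the same collision resistance.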

gberger · on Nov 25, 2023

Sharlin · on Nov 25, 2023

Which is, evidently, why nanoid also excludes x and X, as well as v and V (fvck).

cdelsolar · on Nov 27, 2023

njharman · on Nov 25, 2023

> That doesn't seem possible. How would that work?

agree; b00b, DlCK, cntfcker

But I suppose, if the user doesn't get to craft the input, the collision space between converted numerical IDs and words like the above is small enough to ignore.

Sharlin · on Nov 25, 2023

Besides vowels, nanoid excludes 0, 1, 3, 4, 5, I, l, x, X, v, V, and other lookalikes, so the chances of generating something naughty in any language are close to zero.

jl6 · on Nov 26, 2023

Humans have a high capacity for spotting rudeness. Nanoid’s nolookalikesSafe alphabet would allow blwjb69FKmyD7CK.

(Sorry)

Two4 · on Nov 26, 2023

Buy me drink first, jeez

livrem · on Nov 25, 2023

Looks like the dictionaries used are from this file?

https://registry.npmjs.org/naughty-words/-/naughty-words-1.2...

From a quick look, the lists are pretty short, except for the English one, which at least has some 404 words. But I can imagine there are far more bad words that you'd want to avoid than just those?

ape4 · on Nov 25, 2023

Here's the C++ of the sqid blocked words https://github.com/sqids/sqids-cpp/blob/main/include/sqids/b...

seanhunter · on Nov 27, 2023

Grepping out naughty words in randomly generated text definitely strictly weakens the information content if you're using it for a secure application but is often necessary.

In the early dotcom era the company I worked for was about to go live, and the final step was demoing the end-to-end flow to the CEO. I had done the back-end stuff and hadn't paid much attention to the front end. The person who did the account creation process wanted to nudge people toward memorable yet strongish passwords, so when it created your account it would generate a random password, which he did by choosing two four-letter words at random from the Unix dictionary and putting a two-digit number between them. He ran that past me as an idea and I thought "yeah, good idea" and didn't think more of it.

However, he forgot to first grep out all the naughty words, so when we demoed it to the CEO/non-technical founder, both of the words in his randomly generated password were swearwords.
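
The scheme in the story, with the missing filtering step added, could be sketched roughly like this in Python. The word list and blocklist are passed in as arguments and are entirely hypothetical; `make_password` and `entropy_bits` are illustrative names, not anything from the original system.

```python
import math
import secrets


def make_password(words, blocked):
    """Two random four-letter words with a two-digit number between them."""
    # Filter the word list once, before any random choice is made.
    candidates = [w for w in words if len(w) == 4 and w.lower() not in blocked]
    if not candidates:
        raise ValueError("no usable four-letter words left after filtering")
    first = secrets.choice(candidates)
    second = secrets.choice(candidates)
    number = secrets.randbelow(100)  # 00..99
    return f"{first}{number:02d}{second}"


def entropy_bits(candidate_count):
    """Entropy of the scheme above for a given number of usable words."""
    return 2 * math.log2(candidate_count) + math.log2(100)
```

This also puts a number on the "weakens the information content" point: removing k words from a list of N lowers the entropy by 2 * log2(N / (N - k)) bits, which is small when the blocklist is short relative to the dictionary.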

tttp · on Nov 25, 2023

I tried something similar with a fixed alphabet that guarantees no profanity and a checksum (Luhn):

https://github.com/tttp/dxid
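
As an illustration of the general idea (not dxid's actual code, which is not reproduced here), a Luhn-style check character can be computed over a restricted alphabet with the "Luhn mod N" variant. The alphabet below is a made-up example with 22 symbols; the algorithm needs an even alphabet size for the doubling step to stay a one-to-one mapping, which is what lets it catch any single-character substitution.

```python
# Hypothetical alphabet: no vowels, no vowel-looking digits, 22 symbols.
ALPHABET = "256789bcdfghjkmnpqrstz"
N = len(ALPHABET)
assert N % 2 == 0, "Luhn mod N needs an even alphabet size"


def _luhn_sum(text, factor):
    """Walk right to left, alternating the factor between 1 and 2."""
    total = 0
    for ch in reversed(text):
        addend = factor * ALPHABET.index(ch)
        # Sum the two base-N "digits" of the (possibly doubled) value.
        total += addend // N + addend % N
        factor = 1 if factor == 2 else 2
    return total


def check_char(payload):
    """Character that makes the full string's Luhn sum divisible by N."""
    return ALPHABET[(N - _luhn_sum(payload, 2) % N) % N]


def with_check(payload):
    return payload + check_char(payload)


def is_valid(text):
    """True if text uses only ALPHABET and ends in a correct check character."""
    if len(text) < 2 or any(ch not in ALPHABET for ch in text):
        return False
    return _luhn_sum(text, 1) % N == 0
```

Since the check character comes from the same alphabet, the ID with its checksum stays within whatever profanity-avoiding character set was chosen.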