
> Output sanitization makes sense, though.

Part of my job is to see how tech will behave in the hands of real users.

For fun, I needed to randomly assign 27 people to 12 teams. I asked a few different chat models to do it instead of doing it myself in a spreadsheet, just to see, because this is exactly the kind of thing I'm certain people are doing with various chatbots. I had a comma-separated list of names and needed it broken up into teams.

Model 1: Took the list I gave and assigned "randomly..." by simply taking the names in the order I gave them (which happened to be alphabetical by first name). It got the names right, though. And this is technically correct, but... not.

Model 2: Randomly assigned names, but made up 2 people along the way. I did get 27 names, though, and scarily, if I hadn't reviewed the output it would've assigned two fake people to some of the teams. Imagine that in a much larger data set.

Model 3: Gave me valid responses, but a hate/abuse detector that's part of the output flow flagged my name and several others as potentially harmful content.

That the models behaved the way they did is interesting. A "purple team" sort of approach might find stuff like this. I'm particularly interested in learning why one of them flagged my name as potentially harmful content.

Incidentally I just did it in a spreadsheet and moved on. ;-)
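
For what it's worth, the whole task is a few lines with a real PRNG. A minimal sketch in Python, with a made-up placeholder roster standing in for the real comma-separated list:

    import random

    # Placeholder roster; in practice this would be the real comma-separated list.
    names_csv = ",".join(f"Person{i}" for i in range(1, 28))  # 27 stand-in names

    names = [n.strip() for n in names_csv.split(",")]
    random.shuffle(names)  # uniform random permutation (Fisher-Yates)

    # Deal the shuffled names round-robin into 12 teams (sizes differ by at most 1).
    teams = [names[i::12] for i in range(12)]
    for t, members in enumerate(teams, 1):
        print(f"Team {t}: {', '.join(members)}")

No hallucinated teammates, and the shuffle is as uniform as the underlying PRNG.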



Current LLMs can’t do “random”.

There are 2 sources of randomness:

1) the random seed during inference

2) the non-determinism of GPU execution (due to performance optimizations)

This is one of those things that humans do trivially but LLMs struggle with.

If you want randomization, ask it the same question multiple times with a different random seed each time.
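
To make that concrete, here's a tiny self-contained sketch (plain numpy, toy logits I made up, not any real model) of why the seed is the knob: sampling the "next token" from a fixed distribution gives different picks with different seeds, and identical picks when a seed is reused:

    import numpy as np

    # Toy next-token distribution over a few candidate tokens (made-up logits).
    tokens = ["Alice", "Bob", "Carol", "Dave"]
    logits = np.array([2.0, 1.5, 1.0, 0.5])
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax

    for seed in (0, 1, 2):
        rng = np.random.default_rng(seed)
        picks = [tokens[rng.choice(len(tokens), p=probs)] for _ in range(5)]
        print(f"seed={seed}: {picks}")

    # Same seed -> same picks on every run; a different seed -> a different sequence.
    # That sampling step is essentially the only randomness the model gets to use;
    # the logits themselves are (near-)deterministic given the prompt, modulo the
    # GPU effects in (2) above.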



