Here a common example: Given the following tables: CREATE TABLE person ( id INT ...

brabel · on April 14, 2023

I run into the equivalent problem in Kotlin sometimes. I want to group things by a key, but that results in type `Map<K, List<V>>` because the compiler can't be sure that there won't be more than one `V` for the field being used for grouping... which is why there's also `associateBy` which does return `Map<K, V>`, but will throw if there happens to be more than one value for each K.

Not relevant, but I thought it may help other readers understand the problem.

mtts · on April 14, 2023

The common solution in this case is simply to also GROUP BY owner.name.

That said, if owner.name has different values and you didn’t actually care which one you’d get ANY_VALUE could maybe be useful.

Though in that case you could also just use MIN or MAX, which already exist.

orthoxerox · on April 15, 2023

Yes, but all these solutions require the optimizer to be smart enough to be smart enough to optimize them away: "oh, owner.id is the primary key, so any value from owner will appear at most once, I don't have to group by them or track the maximum value, I can pick and store any value".

I think all optimizers are smart enough already, as this is a super common pattern. I honestly would have preferred this logic to be made more explicit: if you group by the primary key (or by the table name, as someone suggested), you can use all columns from the table without aggregate functions.

This kinda breaks down when you introduce subqueries and CTEs, which is where myopic optimizers can screw the pooch as well, so any_value can be of use there.