Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I think that this topic gets tripped up with people wanting to assume boolean means two possible states... NULL adds an extra state to every type much like the "bottom" type/value in some functional programming languages or type systems.

It's not the `=` operator that is special or significant here. It is the `WHERE` clause semantics, which are defined as returning rows when the boolean expression is true. This treats both false and null cases as the other possible outcome, when rows are not returned. Not understanding this distinction is also where people write buggy queries using WHERE NOT ..., because like equality the NOT operator will return NULL for a NULL input (so NULL is not like false and does not negate to true).

SELECT 'hidden by null result' WHERE CAST(NULL AS boolean);

SELECT 'also hidden by null result' WHERE NOT CAST(NULL AS boolean);

SELECT 'shown by true result' WHERE CAST(NULL AS boolean) IS NULL;

SELECT 'hidden by false result' WHERE NOT (CAST(NULL AS boolean) IS NULL);

The existing GROUP BY behavior is different than WHERE because it does not have a special rule for handling true values. It just returns a group for each distinct value.

The prior (new default) UNIQUE constraint behavior made sense when you think of NULL as this bottom value and meaning "unknown", while also thinking that a uniqueness constraint is only applied to complete keys. The check is effectively deferred, and the partial or incomplete tuple might become valid and unique once its missing part(s) are filled in with real values. The new optional behavior makes more sense if you think that the uniqueness constraint enforces distinct records even with partial keys, just as the GROUP BY clause will determine distinct groups.

There is also a similar issue with FOREIGN KEY constraints. The default MATCH SIMPLE behavior is consistent with the default UNIQUE constraint behavior. An incomplete foreign key is essentially ignored and not subject to integrity checks. The MATCH FULL behavior is almost a style checker to say that a compound foreign key should be all null or all non-null, while still ignoring incomplete foreign keys. As far as I know, the MATCH PARTIAL case is not implemented, but I think it might complement this new UNIQUE NULLS NOT DISTINCT behavior?

One can imagine a further branching point in behaviors for foreign keys and I'm not sure how MATCH PARTIAL will be defined. One mode might be easily implementable with the new index method, if we want partial foreign keys to be able to reference partial keys in the referred table, i.e. the partial key with NULLs is considered an actual row key. But we might also imagine another partial matching mode where the the non-null components of a compound foreign key have to all exist in at least one key in the referred table, while ignoring the null fields as incompletely formed references. This would be an improvement on MATCH SIMPLE which allows the non-null columns in a foreign key to hold values which could not possibly match the referring table no matter which values are filled in for the remaining null fields of the foreign key. However, this enforcement would not be easily implemented without a powerset of indexes for all partial combinations allowed in the referencing tables!



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: