Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

My immediate thought was to store the information in a doubly-linked list type structure in SQL. Obviously the space efficiency is not optimal, but I am surprised that the article didn't even point this out.

The table could be designed with a unique PK, the todo item, and some binary or bool field to determine first (head) and last (tail). You would then add 'previous' and 'next' fields that would be updated for insert/delete/updates.



The problem with that is that it kind of breaks SQL to traverse the list.

Let's say you have a 1M element list, and you want to get the first 100 elements. Normally you would write something like:

  > select top(100) * ... order by itemPosition asc
To walk a linked list, you'd either have to:

- walk it on the client, one query per element, resulting in way too many roundtrips to the server.

- get all the data in one go and then walk it on the client. Here we're selecting and returning 1M records and throwing most of those away.

- walk the list in the database with a recursive common table expression. CTE's aren't appropriate for this; it'll be slow, and we'll run up against a recursion limit for large lists.

- walk the list in the database with cursors/loops/etc. Very icky, and breaks composability.


You are right and make many succinct points. I imagined this in the context of the todo list example given in the article, and assumed there would be a sane limit on items in a list, and that each record in the database would be tied to a list ID, or user ID, or some other method to allow only pulling the items that are actually in the list.

That being said, this would not alleviate the problems you mentioned when walking the list. It would need to be arranged once retrieved either by the client, or by the backend before passing to the client. I imagined that I would traverse the list recursively to order it before passing it off to the client. This should take O(n) time, since we only need to traverse the list once, and we haven't retrieved from the database any unrelated rows (list ID, user ID, etc - there has to be some boundary in place).


This is probably decently efficient if you use recursive CTE's to traverse it, and use keyset pagination rather than limit/offset.

    with recursive cte (udo_id, prv, nxt, label) as (
        select * 
        from udo
        where udo_id = :first_item_id
        union all 
        select u.*
        from udo u
        join cte
        on u.prv = cte.udo_id
    ) select * from cte limit 100


How would you query such structure to retrieve elements in-order?


Your best solution would probably be to use a recursive CTE if your DB supported it, but as a sibling post mentioned, it's probably not a good approach.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: