Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Personally the implementation of nth_element is a much more interesting thing to see than just knowing it exists.


It's STL-implementation specific. look in your c++ kit.

in mine (osx 10.6.7, g++ 4.2.1), it's in

/usr/include/c++/4.2.1/bits/stl_algo.h

and it uses an algorithm very similar to the original article


    template<typename _RandomAccessIterator>
    inline void
    nth_element(_RandomAccessIterator __first, _RandomAccessIterator __nth,
                _RandomAccessIterator __last)
    {
      typedef typename iterator_traits<_RandomAccessIterator>::value_type _ValueType;
 
      // concept requirements
      __glibcxx_function_requires(_Mutable_RandomAccessIteratorConcept<_RandomAccessIterator>)
      __glibcxx_function_requires(_LessThanComparableConcept<_ValueType>)
      __glibcxx_requires_valid_range(__first, __nth);
      __glibcxx_requires_valid_range(__nth, __last);

      if (__first == __last || __nth == __last)
        return;

      std::__introselect(__first, __nth, __last,
                         std::__lg(__last - __first) * 2);
    }


you should look at how std::__introselect is defined (hint: its in /usr/include/c++/4.2.1/bits/stl_algo.h )


The original article uses heap to get top K numbers. It's runtime complexity is O(N*log(K)) and nth_element is O(N), so the algorithms are not similar at all.


scorchin posted the implementation of nth_element, except he forgot to look at how the gnu implementation of __introselect behaves.

If you look there, you will see it internally uses a heap.


It uses heap only as a fallback if maximum recursion depth is reached. The key difference between __introselect and the posted article is that __introselect uses pivoting to (ideally) "throw" away half of the array during each step.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: