
I think Spark was the best tool out there when data engineering started taking off, and it just works (provided you don't have to deal with JAR dependency hell), so there's not a huge incentive to move away from it.


This is so true! Even a few years ago, these benchmarks would have been against pandas (instead of polars and duckdb) and would likely have looked very different.



