
One of the things that greatly improved my csv workflow is duckdb. It’s a small binary that allows querying csv with sql.


You can do the same with SQLite, which is usually already installed in most environments

    sqlite> .import test.csv foo --csv


A poor man's version of csvlens is something like:

    sqlite3 -column :memory: '.import --csv file.csv tmp' 'select * from tmp;' | bat

which imports the CSV into SQLite and pipes the result to bat, my favorite pager. Use `less` or whatever else you prefer.
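If the sqlite3 shell isn't handy, roughly the same import-and-print flow can be sketched with Python's standard library. This is only an illustration: `load_csv`, `format_columns`, and the sample data are made-up names, not part of SQLite, and the column layout only approximates what `-column` prints.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Create `table` from CSV text, using the header row as column names."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows)
    # Quote identifiers so headers with spaces or SQL keywords still work.
    cols = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    # Every value is inserted as text; CAST in the query if you need numbers.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)


def format_columns(cursor):
    """Render a query result as left-aligned, space-padded columns."""
    names = [d[0] for d in cursor.description]
    table = [names] + [[str(v) for v in row] for row in cursor.fetchall()]
    widths = [max(len(r[i]) for r in table) for i in range(len(names))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in table
    )


# Hypothetical sample data standing in for file.csv.
SAMPLE = "name,qty\napple,3\npear,12\n"

conn = sqlite3.connect(":memory:")
load_csv(conn, "tmp", SAMPLE)
print(format_columns(conn.execute("SELECT * FROM tmp")))
```

The sketch assumes every row has the same number of fields as the header; a ragged CSV would make the insert fail.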


In addition to using a pager with sqlite3's fantastic text-only output .modes, if the CSV contains hyperlinks I use a custom UNIX filter I wrote that outputs simple, minimalist HTML. Then I view it with a text-only browser.

For example, this is how I use YouTube. I never use the YouTube website, with its gigantic pages and its "Javascript player", not to mention all of the telemetry. All the search results and information about videos are stored in SQL or CSV, viewed with a text-only sqlite3 output .mode, and optionally converted to simple HTML.

For me, this is better than a "modern" web browser that's too large for me to compile.
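For anyone curious what a CSV-to-HTML filter like that might look like, here is a minimal sketch. It is not the filter described above, just a guess at the idea: `csv_to_html` is a hypothetical name, and treating any cell that starts with `http://` or `https://` as a link is my own assumption.

```python
import csv
import html


def csv_to_html(lines):
    """Render CSV rows as a bare HTML table, linking cells that look like URLs."""
    out = ["<table>"]
    for row in csv.reader(lines):
        cells = []
        for value in row:
            # Escape first so stray <, >, & and quotes cannot break the markup.
            text = html.escape(value)
            if value.startswith(("http://", "https://")):
                text = f'<a href="{text}">{text}</a>'
            cells.append(f"<td>{text}</td>")
        out.append("<tr>" + "".join(cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


# Hypothetical sample rows; any iterable of CSV lines works.
SAMPLE = ["title,url", "Example,https://example.com/?a=1&b=2"]
print(csv_to_html(SAMPLE))
```

To use it as an actual filter, pass `sys.stdin` instead of the sample list and pipe the output into a text-only browser.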


duckdb is a single file with no dependencies and it's fast. Still blows my mind how quickly it can query GB sized gzipped CSVs.


Wow, this is awesome. Thanks a lot.


I use DuckDB for queries and Visidata for quick inspections.

Between those two, I can work with not only CSVs, but also JSON and Parquet files (which are blazing fast -- CSVs are good for human readability and editability, but they're horrendous for queries).

CLI CSV tools pop up every now and then, but there's too many of them and I feel that my use cases are sufficiently addressed with only 2 tools.


If you use Jupyter, check out what I'm building with Buckaroo. The aim is to have a table viewer that does the right/obvious thing for most data (formatting, sorting, downsampling large datasets, summary statistics, histograms), all of it customizable. It supports pandas and polars.

https://github.com/paddymul/buckaroo


If you're using Python, I'd also recommend the polars SQL context manager for running queries on CSV.

https://docs.pola.rs/py-polars/html/reference/api/polars.SQL...


Long live duckDB! Big fan here.



