One of the things that greatly improved my csv workflow is duckdb. It’s a small ...

lyjackal · on Jan 6, 2024

You can do the same with SQLite, which is already installed in most environments

    sqlite> .import test.csv foo --csv

llimllib · on Jan 6, 2024

A poor man's version of csvlens is something like:

    sqlite3 -column :memory: '.import --csv file.csv tmp' 'select * from tmp;' | bat

which imports the csv into sqlite and outputs it to bat, my favorite pager - use `less` or whatever else you desire.
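If the sqlite3 shell isn't available, roughly the same import-and-print flow can be sketched with Python's standard library. This is only an illustration, not what the one-liner does internally: `load_csv`, `format_columns`, and the sample data are made-up names, and every value is stored as text.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Create `table` from CSV text, using the header row as column names."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # Quote identifiers so headers containing spaces or quotes still work.
    cols = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


def format_columns(cursor):
    """Render a query result as left-aligned, space-padded columns."""
    names = [d[0] for d in cursor.description]
    rows = [[str(v) for v in row] for row in cursor.fetchall()]
    table = [names] + rows
    widths = [max(len(r[i]) for r in table) for i in range(len(names))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
        for r in table
    )


# Hypothetical sample data standing in for file.csv.
SAMPLE = "name,qty\napple,3\nkiwi,12\n"

conn = sqlite3.connect(":memory:")
load_csv(conn, "tmp", SAMPLE)
table_text = format_columns(conn.execute("select * from tmp"))
print(table_text)
```

Piping the printed text into `less` or `bat` gives the same kind of paged view.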

1vuio0pswjnm7 · on Jan 7, 2024

In addition to using a pager with sqlite3's fantastic text-only output .modes, if the CSV contains hyperlinks I use a custom UNIX filter I wrote that outputs simple, minimalist HTML. Then I view it with a text-only browser.

For example, this is how I use YouTube. I never use the YouTube website, with its gigantic pages and its "Javascript player", not to mention all of the telemetry. All the search results and information about videos are stored in SQL or CSV, viewed with a text-only sqlite3 output .mode, and optionally converted to simple HTML.

For me, this is better than a "modern" web browser that's too large for me to compile.
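For anyone curious what a CSV-to-HTML filter of that kind might look like, here is a minimal sketch in Python. It is not the filter described above, which isn't shown; the function name `csv_to_html`, the sample row, and the rule that any cell starting with `http://` or `https://` becomes a link are all assumptions made for illustration.

```python
import csv
import html
import io


def csv_to_html(src, out):
    """Read CSV from `src` and write a bare-bones HTML table to `out`."""
    out.write("<table>\n")
    for row in csv.reader(src):
        cells = []
        for value in row:
            text = html.escape(value)
            # Assumption: a cell that starts with http(s):// is a hyperlink.
            if value.startswith(("http://", "https://")):
                text = f'<a href="{text}">{text}</a>'
            cells.append(f"<td>{text}</td>")
        out.write("<tr>" + "".join(cells) + "</tr>\n")
    out.write("</table>\n")


# Hypothetical sample row; real input would come from a file or a pipe.
SAMPLE_CSV = "title,url\nIntro to DuckDB,https://example.com/watch?v=1&t=2\n"

buffer = io.StringIO()
csv_to_html(io.StringIO(SAMPLE_CSV), buffer)
page = buffer.getvalue()
print(page)
```

Calling `csv_to_html(sys.stdin, sys.stdout)` instead would turn it into a pipe-friendly filter whose output a text-only browser can render.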

dbreunig · on Jan 6, 2024

duckdb is a single file with no dependencies and it's fast. Still blows my mind how quickly it can query GB-sized gzipped CSVs.

xcdzvyn · on Jan 7, 2024

Wow, this is awesome. Thanks a lot.

wenc · on Jan 6, 2024

I use DuckDB for queries and Visidata for quick inspections.

Between those two, I can work with not only CSVs, but also JSON and Parquet files (which are blazing fast -- CSVs are good for human readability and editability, but they're horrendous for queries).

CLI CSV tools pop up every now and then, but there are too many of them and I feel that my use cases are sufficiently addressed with only 2 tools.

paddy_m · on Jan 7, 2024

If you use Jupyter, check out what I'm building with Buckaroo. The aim is to have a table viewer that does the right/obvious thing for most data (formatting, sorting, downsampling large datasets, summary statistics, histograms). All customizable. Supports pandas and polars.

https://github.com/paddymul/buckaroo

ok_computer · on Jan 7, 2024

If you're using Python, I'd also recommend the Polars SQL context manager for running queries on CSV files.

https://docs.pola.rs/py-polars/html/reference/api/polars.SQL...

dkga · on Jan 6, 2024

Long live duckDB! Big fan here.