Show HN: Commits search and more for GitHub
2 points by sdesol on Dec 21, 2015
GitSense is a free Chrome extension that I've developed, which makes it possible to search for commits, diffs and more, all from within GitHub. You can learn more about it here:

http://gitsense.github.io

GitSense relies on my indexing technology, which was designed specifically to index Git repositories at the branch level. So far I'm focused on indexing GitHub repositories that have 300 or more stars and have been pushed to within the last 6 months.
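
For illustration only, the selection rule described above (300 or more stars, pushed to within the last 6 months) could be sketched as follows. This is not GitSense's code: the `Repo` type, its field names, and the 183-day approximation of "6 months" are all assumptions made for the example. The `stars:` and `pushed:` qualifiers in the query string are GitHub search qualifiers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_STARS = 300
# Assumption: "the last 6 months" is approximated as 183 days.
MAX_AGE = timedelta(days=183)


@dataclass
class Repo:
    # Hypothetical record; field names are illustrative only.
    full_name: str
    stars: int
    pushed_at: datetime


def is_index_candidate(repo: Repo, now: datetime) -> bool:
    """True if the repo has enough stars and was pushed to recently enough."""
    return repo.stars >= MIN_STARS and (now - repo.pushed_at) <= MAX_AGE


def github_search_query(now: datetime) -> str:
    """Express the same rule as a GitHub search query string."""
    cutoff = (now - MAX_AGE).date().isoformat()
    return f"stars:>={MIN_STARS} pushed:>={cutoff}"
```

A repository with 299 stars, or one last pushed to a year ago, would be skipped under this rule regardless of its size.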

Right now, I've indexed 36,375 repositories, 128,357 branches, 3,682,9480 commits, 14,155,250 diffs and 19,016,254 blobs.

From a technology point of view, I use Java, JavaScript, Postgres, Lucene and RocksDB. RocksDB is more of an experiment right now, as I try to find the right persistent key/value store.

The indexer is designed to scale horizontally, and I'm finding that a single machine with 32GB of RAM and a 1TB SSD can easily process 10,000 to 20,000 unique repositories. All of this depends on repo size, of course. Something like the Linux kernel is a beast: the initial index consumes 8GB of RAM and takes about 8 hours to complete.
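
To make the per-machine figures concrete, here is a rough sketch of the arithmetic, plus one common way to spread repositories across machines. The post does not say how repositories are actually assigned to indexers, so the hash-based assignment below is an assumption, and both function names are hypothetical.

```python
import hashlib
import math


def machines_needed(repo_count: int, repos_per_machine: int) -> int:
    """Machines required if each one handles repos_per_machine repositories."""
    return math.ceil(repo_count / repos_per_machine)


def assign_machine(repo_full_name: str, machine_count: int) -> int:
    """Map a repository to a machine index using a stable hash.

    Assumption: simple hash-modulo sharding, not the author's stated method.
    """
    digest = hashlib.sha1(repo_full_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % machine_count


# Using the 36,375 repositories indexed so far and the quoted per-machine range:
low_estimate = machines_needed(36_375, 20_000)   # best case per machine
high_estimate = machines_needed(36_375, 10_000)  # worst case per machine
```

With those numbers the estimate works out to between 2 and 4 machines, though very large repositories such as the Linux kernel would skew any even split.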

If you have any technical questions, ask away.


