"0x" is another Almquist shell script that can search about 63 different servers that return search results, e.g., Google, DDG, and so on.^1
yy___ are UNIX filters written in C.
0x uses some yy programs as well, e.g., yy025, along with sed and a TCP client, e.g., tcpclient, netcat, socat, bssl, openssl, etc.
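For anyone wondering what "sed and a TCP client" can look like in practice, here is a rough sketch. It is not the actual 0x code, and http_get is a made-up helper name. It only shows that a plain HTTP request is a few lines of text that any of the listed TCP clients can carry.

```sh
#!/bin/sh
# Sketch only: build a minimal HTTP/1.1 GET request on stdout.
# http_get is a hypothetical helper, not part of 0x or the yy programs.
http_get() {
  # $1 = host, $2 = path (defaults to /)
  printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "${2:-/}" "$1"
}

# Usage (needs network access, so left commented out):
#   http_get example.com / | openssl s_client -quiet -connect example.com:443
#   http_get example.com / | nc example.com 80
```

The response comes back on stdout, headers included, so it can be piped straight into sed or any other filter.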
yy084 outputs SERPs as SQL.
This makes it easy to create simplified "mixed" SERPs with results from different servers.
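A rough illustration of why SQL output helps with mixing. The real yy084 output format is not shown in this thread, so the table name "serp", its columns, and the helper name mix_serps are all assumptions made up for the example. The idea is only that dumps from several servers can be loaded into one table and duplicate URLs dropped.

```sh
#!/bin/sh
# Sketch only. Assumes each input file holds lines of the (hypothetical) form
#   INSERT INTO serp VALUES('url','title');
# The actual yy084 schema may differ.
mix_serps() {
  # One table for all servers; the primary key rejects repeated URLs.
  printf '%s\n' 'CREATE TABLE IF NOT EXISTS serp(url TEXT PRIMARY KEY, title TEXT);'
  printf '%s\n' 'BEGIN;'
  # Rewrite plain INSERTs so duplicates are skipped instead of raising errors.
  cat "$@" | sed 's/^INSERT INTO serp/INSERT OR IGNORE INTO serp/'
  printf '%s\n' 'COMMIT;'
  printf '%s\n' 'SELECT url, title FROM serp;'
}

# Usage (file names are placeholders; requires the sqlite3 command-line tool):
#   mix_serps server1.sql server2.sql | sqlite3 mixed.db
```

INSERT OR IGNORE is SQLite syntax; another database would need its own way of skipping duplicates.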
Where possible 0x allows for "continuation search". Going past page 1 of SERPs is discouraged or even prevented in recent times; all focus is on "the top result",^2 and some www search engines actively try to block exhaustive research and discovery. By continuing searches over time, e.g., page 1 of results on day 1, page 2 on day 2, page 3 on day 5, etc., one can sometimes avoid being blocked when doing exhaustive searches.
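The bookkeeping behind "continuation search" can be sketched in a few lines of sh. This is not how 0x does it; STATE_DIR, state_file, next_page and mark_done are hypothetical names. The sketch just remembers, per query, which page to ask for on the next run, however many days later that is.

```sh
#!/bin/sh
# Sketch only: remember which results page to fetch next for each query.
# All names here are hypothetical and not part of 0x.
STATE_DIR=${STATE_DIR:-${HOME:-.}/.serp-state}

# Map a query to a state file name (non-alphanumerics become underscores).
state_file() {
  printf '%s/%s\n' "$STATE_DIR" "$(printf '%s' "$1" | tr -c 'A-Za-z0-9' '_')"
}

# Print the page number to fetch next (1 if the query has not been seen).
next_page() {
  f=$(state_file "$1")
  if [ -f "$f" ]; then cat "$f"; else echo 1; fi
}

# Record that the current page was fetched, so the next run moves on.
mark_done() {
  mkdir -p "$STATE_DIR"
  f=$(state_file "$1")
  p=$(next_page "$1")
  echo $((p + 1)) > "$f"
}
```

A wrapper would call next_page before a search and mark_done after a successful one. How the page number is actually passed to each server is site-specific and not covered here.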
1. This is an ongoing experiment. Sometimes a site will "break" if the site operator changes something but this does not happen too often. Majority have remained stable over time.
2. This coincidentally benefits an advertising services racket.
> Currently, the script looks like this
>
> #!/bin/sh
> 0x $1|yy084|yy030|yy073
>
> "0x" is another Almquist shell script that can search about 63 different servers that return search results, e.g., Google, DDG, and so on.^1
>
> yy___ are UNIX filters written in C.
>
> 0x uses some yy programs as well, e.g., yy025, along with sed and a TCP client, e.g., tcpclient, netcat, socat, bssl, openssl, etc.
>
> yy084 outputs SERPs as SQL.
>
> This makes it easy to create simplified "mixed" SERPs with results from different servers.
>
> Where possible 0x allows for "continuation search". Going past page 1 of SERPs is discouraged or even prevented in recent times; all focus is on "the top result",^2 and some www search engines actively try to block exhaustive research and discovery. By continuing searches over time, e.g., page 1 of results on day 1, page 2 on day 2, page 3 on day 5, etc., one can sometimes avoid being blocked when doing exhaustive searches.
>
> 1. This is an ongoing experiment. Sometimes a site will "break" if the site operator changes something but this does not happen too often. Majority have remained stable over time.
>
> 2. This coincidentally benefits an advertising services racket.
How can I get more details about these filters? Are there existing implementations somewhere online which I can test? Seems pretty interesting.
The details are just another computer user, not a "developer", writing their own utilities. Continually editing and updating them to suit personal needs. No prerequisites except flex and a C89 compiler. Generally, static binaries under 50k.
A search on one of the search engines that indexes HN may provide some pointers. Something like site:ycombinator.com plus [name of filter].
Not sure about "interesting" but these are portable across OS, low resource requirements. Faster than sed. Boring stuff that works for me year after year.
Note there are some web search engines that only block exhaustive searching when the computer user is not "signed in". This could be a coincidence, not an attempt to force computer users into submitting to increased surveillance and data collection for commercial purposes. 0x is written for a computer user who did not "sign in" to websites in the 90s, 00s, 10s and as a matter of habit does not do so today.