
For some reason, I think this will only add to our existing problem of having too much data and no reasonable way to digest it. In other words, we need a way to extract valuable data from millions of articles, not generate millions more.


There are still many cases where there is too much data to digest. What Automated Insights does is take data and turn it into interesting and hopefully insightful bits of digestible content.


I don't know that this helps or hurts anything, really. What they're doing (for this example at least) is translating stat lines to text.

It's pretty easy to say something like:

This week's top quarterback was %qb% from the %team1% who had %qbyards% and %qbtds% while playing the %team2%.

The total amount of information is exactly the same. They just use a little NLP to make it sound 'hand written' while displaying a box score in an easier-to-read manner.
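
As a rough sketch of the kind of substitution I mean (not how Automated Insights actually does it), something like the following would fill in that sentence. The `fill_template` helper and the sample stat line are made up for illustration; only the template text comes from the example above.

```python
import re

# The template from the comment above, with %name% placeholders.
TEMPLATE = (
    "This week's top quarterback was %qb% from the %team1% who had "
    "%qbyards% and %qbtds% while playing the %team2%."
)


def fill_template(template, values):
    """Replace each %name% placeholder with str(values[name]).

    Hypothetical helper: raises KeyError if a placeholder has no value.
    """
    def lookup(match):
        return str(values[match.group(1)])

    return re.sub(r"%(\w+)%", lookup, template)


# Hypothetical stat line; the names and numbers are placeholders, not real data.
stat_line = {
    "qb": "Player A",
    "team1": "Team X",
    "qbyards": "300 yards",
    "qbtds": "3 touchdowns",
    "team2": "Team Y",
}

sentence = fill_template(TEMPLATE, stat_line)
print(sentence)
```

Every story produced this way carries exactly the fields in the stat line and nothing more, which is the point: the box score and the sentence hold the same information.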


The technology is significantly more complicated than a simple search-and-replace on a bunch of variables. Trust me, if we did millions of stories by swapping out the same sentence every time, it wouldn't work.

Also, part of the value of what we do is describing "insights", not just spitting back raw numbers (which, again, wouldn't be very valuable).


I didn't mean to demean the work; there is clearly a whole lot going on behind the scenes here. It was more of a rebuttal of the "more data is bad" idea. My point above was unclear: this idea doesn't necessarily represent more data, just the same data presented in a more human-friendly manner.

I really like what the team has done; this could have a long-lasting impact on anything that makes heavy use of stats and figures. I honestly have a few enterprise applications that would benefit from a similar treatment, e.g. reports that aggregate certain operational stats and must be hand-written every week.



