I had an interesting experience a while back where it came to light that I was working on the same problem as another team in the org (this was a huge multinational), so a meeting was arranged so we could compare notes. The other team was slightly shocked to see that I could train a model in a minute or two, where it took them an hour or two using essentially the same algorithm.
They insisted that shouldn't be, because I was doing it on my laptop and they were using a high performance computing cluster. They of course wanted to know how my implementation could be so much faster despite running on only a single machine. I didn't have the heart to suggest that maybe it was because, not despite.
Ironically, I also got the implementation done in a lot fewer person-hours. I just did a straight code-up of the algorithm in the paper, where they had to do a bunch of extra work to figure out how to adapt it to scale-out.
This isn't to say that big data doesn't happen. Just that it's a bit like sex in high school: People talk about it a lot more than they actually have it, perhaps because everyone's afraid their friends will find out they don't have it.
They insisted that shouldn't be, because I was doing it on my laptop and they were using a high performance computing cluster. They of course wanted to know how my implementation could be so much faster despite running on only a single machine. I didn't have the heart to suggest that maybe it was because, not despite.
Ironically, I also got the implementation done in a lot fewer person-hours. I just did a straight code-up of the algorithm in the paper, where they had to do a bunch of extra work to figure out how to adapt it to scale-out.
This isn't to say that big data doesn't happen. Just that it's a bit like sex in high school: People talk about it a lot more than they actually have it, perhaps because everyone's afraid their friends will find out they don't have it.