I have worked with AI and NLP guys. Here is how I have seen this work out: there is a problem X. They get the best, most recent, respected research on problem X. They implement it; most of the time it's on GitHub.
If it doesn't solve the problem at hand, they shrug their shoulders and say something like "it's the Stanford NLP parser, it can't do better than that!"
The concept of "getting into AI" confuses me. Do we need more people to git clone AI repos? Or are these people truly interested in AI research? At that point they should be looking at a PhD.
Then people pile on: "learn how an NN works!" Uh, why? Anyone can git clone and set up nodes. I am missing something. Please help.
So I have been doing what I shall call applied machine learning since I was in college, when I built an ad classifier for a web crawler I was building at the time. I made the real transition while working on the search team of a web company, almost 10 years ago now.
Let me first say that I am unlikely to ever design a novel new algorithm like an SVM kernel. I have, however, studied ML theory extensively and have a good grasp of the underlying math. I also had the advantage of working in medical research starting in high school, so even before college I had learned a lot about statistics, was comfortable using a tool like SPSS to perform ROC analysis, and had gained a solid understanding of what real statistical rigor looks like.
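For anyone who hasn't done ROC analysis: it boils down to something like this toy sketch, where you sweep a threshold over classifier scores, trace out true/false positive rates, and integrate for the AUC. The labels and scores below are made up; tools like SPSS or scikit-learn do this for you.

```python
# Toy ROC analysis: sweep thresholds over classifier scores,
# collect (FPR, TPR) points, and integrate the curve for AUC.

def roc_points(labels, scores):
    """Return (fpr, tpr) points as the decision threshold descends."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _score, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal integration of the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

labels = [1, 1, 0, 1, 0, 0]          # made-up ground truth
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # made-up classifier scores
print(auc(roc_points(labels, scores)))   # about 0.889 for this data
```

A perfect ranker gets an AUC of 1.0 and a coin flip gets 0.5; understanding why is exactly the kind of statistical grounding I mean.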
I, and those I know and work with, do a lot more than clone some repos from GitHub and see if they work. Typically there is some sort of business problem that needs solving. Sometimes we know of an approach that will work, but often a literature survey needs to be conducted to see if anyone has solved a similar enough problem and written about it. I am comfortable reading ML/NLP literature and evaluating the methodologies described. Often there is some open source stuff to get us started, but rarely (I can't think of a single case, but it's early in the morning) have I been able to put together a complete solution without solving some difficult problems on my own.
If I were to give someone advice, it would probably not be the advice they want, but here goes. I assume the person already has a solid mathematical foundation, like engineering calculus.
1. Start by getting a solid foundation in statistics and probability.
2. You will need a foundation in linear algebra.
3. Find a mentor(s) that can help you with both the theoretical side of ML and the applied side. In my case they were different people.
4. Implement some learning algorithms from scratch. I built an NN library a long time ago. I never used it in a production application, but the lessons it taught me are still invaluable.
5. Read the research. You need to feel comfortable picking up a paper, understanding it, and evaluating whether you should believe the authors or not.
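To make point 4 concrete: even something as small as a perceptron, written from scratch, teaches you what "learning" mechanically is. A toy sketch (this is not the NN library I mentioned, just an illustration) training a perceptron on the AND function:

```python
# From-scratch perceptron: adjust weights toward the target whenever
# the current prediction is wrong. AND is linearly separable, so the
# perceptron rule is guaranteed to converge on it.

def train_perceptron(data, epochs=20, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred          # -1, 0, or +1
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in AND])  # [0, 0, 0, 1]
```

Once you have built this, the question of why a single layer cannot learn XOR (and why hidden layers fix that) stops being trivia and becomes something you have felt.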
Maybe there are shorter roads; personally I don't believe so. I was lucky to be paid to learn these skills through my career. I am sure there are people who are smarter than me, or who can learn just by reading; I learn by doing. But this path has led to success for me, and I think it gave me the ability to succeed in different environments, using different technologies, long before the entire world was so enamored with deep learning.
Do you have advice on whether it is worth going back to school if your goal is to build novel and useful software/AI tools using deep learning, not necessarily improving the algorithms themselves? Would you expect to still need those 5 things you listed as advice?
Yes, to actually improve the state of the art, you start a PhD, I agree.
But there's still also a lot of work that people can do applying the "GitHub repositories" to new problems. And to do that effectively, you also have to know stuff (e.g. you need to be able to read the most recent research, know when tool X is appropriate over tool Y, know what preprocessing makes sense in a given situation, etc.). There's money to be made there, and people want to do that work.
The Stanford parser is very good for preprocessing data: things like part-of-speech tagging, named entity recognition, and dependency parsing. If you want to do something fun and interesting with your data, you will probably need to implement it yourself. Note, there are lots of other go-to tools nowadays besides the Stanford parser: things like GloVe embeddings, open source translation systems (Harvard's seq2seq), and open source sentence encoders (Facebook's fastText) are probably necessary in many NLP pipelines.
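If it isn't clear what a preprocessing step like POS tagging actually hands you downstream, here is a toy stand-in (the tiny lexicon is made up; a real pipeline would call the Stanford parser, spaCy, or similar and get far better coverage). The shape of the output, token/tag pairs, is the point:

```python
# Toy POS tagger: look each token up in a tiny hand-written lexicon.
# This is only a stand-in to show the shape of real tagger output.

LEXICON = {
    "the": "DT", "cat": "NN", "dog": "NN",
    "chased": "VBD", "quickly": "RB",
}

def toy_pos_tag(sentence):
    """Whitespace-tokenize and tag; unknown words default to 'NN'."""
    return [(tok, LEXICON.get(tok.lower(), "NN"))
            for tok in sentence.split()]

print(toy_pos_tag("The dog chased the cat"))
# [('The', 'DT'), ('dog', 'NN'), ('chased', 'VBD'), ('the', 'DT'), ('cat', 'NN')]
```

Real taggers are learned statistical models, not lookup tables, but your downstream code consumes exactly this kind of annotation.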
When things "just work" with off the shelf tools then you probably don't need the researcher (although sometimes you will need them to just find the right solution/tool). When things don't work, you will need them. I guess this can be said about many fields though? (Databases, front end development, etc)
The big gains now are taking these pre-processing tools in speech, vision, NLP etc and using that as input to a NN for some problem domain. Every one of these is a startup.
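The "pretrained tool as NN input" pattern usually looks like this sketch: pool pretrained word vectors into a fixed-length feature vector that a downstream network consumes. The 3-d embedding table below is made up; a real pipeline would load GloVe or fastText vectors (typically 100-300 dimensions).

```python
# Mean-pool pretrained word embeddings into one fixed-length vector.
# EMBEDDINGS is a made-up toy table standing in for GloVe/fastText.

EMBEDDINGS = {
    "good":  [1.0, 0.0, 0.5],
    "bad":   [-1.0, 0.25, 0.0],
    "movie": [0.0, 0.5, 0.5],
}
DIM = 3

def sentence_vector(tokens):
    """Average the embeddings of known tokens (zeros if none are known)."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return [0.0] * DIM
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]

print(sentence_vector(["good", "movie"]))  # [0.5, 0.25, 0.5]
```

That fixed-length vector is what gets fed into whatever network solves the actual problem domain; the startup's value is in everything around this step, not the step itself.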