
As a computer vision/ML applications engineer, I disagree with this. What you describe is someone who is actively implementing cutting-edge tech.

That is VERY different from what 99% of people should be doing with ML, which is: spinning up some K80s on Azure, installing TF/CUDA/OpenCL, pulling existing pre-trained models off the shelf, and running inference on a novel data set.
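For what it's worth, that off-the-shelf workflow is only a few dozen lines these days. A minimal sketch with PyTorch/torchvision (the model choice and image path are placeholders; swap in whatever framework you actually installed):

    import torch
    from PIL import Image
    from torchvision import models, transforms

    model = models.resnet50(pretrained=True)   # pre-trained ImageNet classifier
    model.eval()

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    img = Image.open("some_image_from_your_dataset.jpg")   # placeholder path
    batch = preprocess(img).unsqueeze(0)                    # shape (1, 3, 224, 224)

    with torch.no_grad():
        probs = model(batch).softmax(dim=1)
    top5 = probs.topk(5)
    print(top5.indices, top5.values)   # ImageNet class ids and their probabilities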

That's how you get into it as a garden variety dev.

Otherwise, go for the PhD if you want to actually make new stuff.



You are missing a lot of things that you don't know about. If you want to do machine learning, at some point you have to train a model. You need to know how to clean the data, how to create the train/validation/test split, how to measure how good your model is, and how to compare it to other models you trained previously. If the model is not performing correctly, you need to know why. You need to know the trade-offs between precision and recall. This is like 95% of your work; the other 5% is running the training on Amazon or whatever you want to use.
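As a rough sketch of what that 95% looks like in code, here's the split-and-measure skeleton with scikit-learn; the synthetic dataset is only a stand-in for whatever your own cleaning step produces:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report, precision_score, recall_score
    from sklearn.model_selection import train_test_split

    # Stand-in for your own cleaned, labeled data.
    X, y = make_classification(n_samples=2000, n_features=20,
                               weights=[0.8, 0.2], random_state=0)

    # Hold out a test set once and don't touch it until the very end.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)
    X_train, X_val, y_train, y_val = train_test_split(
        X_train, y_train, test_size=0.25, stratify=y_train, random_state=42)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Compare candidate models on the validation set, not the test set,
    # and look at precision/recall rather than accuracy alone.
    val_pred = model.predict(X_val)
    print("precision:", precision_score(y_val, val_pred))
    print("recall:   ", recall_score(y_val, val_pred))
    print(classification_report(y_val, val_pred))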

I have worked with people who took example training code and applied it to a dataset. A few weeks later they were still pulling their hair out because the model wasn't working in production, even though they had such great results in their tests. I took a look at how they were doing the training and could point to so many errors that explained why the model would never work in production.

That is not cutting edge, but at some point there will be a new model that works better, and you should understand why in order to improve your current model. So you will probably have to read the paper and understand it.


If you're trying to build or train new models then you probably need to go to school for ML or at least math.

The garden variety dev shouldn't be trying to implement a research paper or train new models - that's the point. There are enough proven tools out there to do good work and more are being put out there every day.


If you're not trying to build or train new models you are not doing ML.


This thread is starting to sound absurd in a very Reggie Watts sort of way:

https://genius.com/Reggie-watts-if-youre-fcking-youre-fcking...


I think there are good points here!

> Spinning up some K80s on Azure, installing TF/CUDA/OpenCL

I think a single K80 instance is roughly $1/hr. If you had an experiment running 24 hrs a day for a year, you'd spend a little over $8.5k. You can build an equivalent desktop machine for less than $2k [1], which might be slightly more convenient (once it's built), although I haven't really factored in energy costs.
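Back-of-the-envelope, using those (rough, possibly outdated) numbers rather than current prices:

    cloud_rate_usd_per_hr = 1.0        # rough price of a single K80 instance
    hours_per_year = 24 * 365
    print(cloud_rate_usd_per_hr * hours_per_year)    # 8760.0 -> "a little over $8.5k"

    desktop_cost_usd = 2000            # rough build cost from [1]
    print(desktop_cost_usd / cloud_rate_usd_per_hr)  # ~2000 hrs (~3 months) to break even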

> That's how you get into it as a garden variety dev.

Btw, you don't really need a GPU to start learning about deep learning. You can train a SotA model on MNIST using Caffe in, I think, roughly 10 minutes on CPU (maybe 1 minute on GPU). You can also train a reasonable sentiment classifier or natural language inference classifier in less than an hour on CPU. My perception is that these types of tasks are a really solid starting point for someone who is beginning to learn about machine learning or deep learning, as they provide a playground to mess around with different optimization techniques (SGD vs. SGD+momentum vs. Adam, etc.), regularization (L1, L2, dropout, batch norm, etc.), data augmentation, error analysis, and so on. If you do an ML interview for an entry-level position, chances are these are the types of things they will ask about.
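Not Caffe, and not SotA, but as one possible flavor of that playground, here's a tiny MNIST classifier in PyTorch that trains in a few minutes on CPU; the optimizer line and the dropout/weight-decay knobs are the parts worth swapping out and comparing:

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    train_ds = datasets.MNIST("data", train=True, download=True,
                              transform=transforms.ToTensor())
    train_loader = DataLoader(train_ds, batch_size=128, shuffle=True)

    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 256), nn.ReLU(),
        nn.Dropout(0.5),                        # regularization knob to play with
        nn.Linear(256, 10),
    )

    # Swap this line to compare optimizers: plain SGD, SGD+momentum, Adam, ...
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(2):                      # a couple of epochs is plenty on CPU
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch}: last batch loss {loss.item():.3f}")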

I guess deploying ML solutions for a company you are working at is a different story.

> Otherwise, go for the PhD if you want to actually make new stuff.

There's some truth to this! A PhD (like a Master's) probably doesn't make sense most of the time as a dollar-efficient career move. Rather, it's something you should pursue if you find being in an academic environment personally satisfying. You definitely don't need to be in a PhD program to work on new stuff (although it might make things easier, because you will hopefully be surrounded by lots of fresh ideas). I've heard about people in bootcamps working on novel research. Now that so many powerful tools are open source and easy to use (PyTorch, TensorFlow, etc.), it's pretty easy for anyone to put together a novel model.

[1] pjreddie.com/darknet/hardware-guide/


I would definitely extend this to running training as well, but I agree with the concept: for most people, it should be either transfer learning to adapt existing models to their data, or training from scratch with currently known best-practice methods, NN architectures and hyperparameters, but on their particular datasets. Possibly by using mostly existing code and modifying mainly the data input/output routines.
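A minimal sketch of the transfer-learning half of that, assuming a torchvision backbone and a made-up number of target classes; everything except the data loading is boilerplate you can lift from existing code:

    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_CLASSES = 5                             # placeholder for your own label set

    model = models.resnet18(pretrained=True)
    for param in model.parameters():            # freeze the pre-trained backbone
        param.requires_grad = False

    # Replace the classification head; only this layer gets trained on your data.
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

    # The training loop is the standard one, but the optimizer only needs
    # the new head's parameters.
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)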


Cleaning the data, comparing models, and understanding why the results are what they are: those are huge things in machine learning. Actually, it's like 90% of my job. Training the model is nothing compared to it. As I said in another comment, I have seen people make so many mistakes before training or comparing models. They spent weeks looking at models with good results in their tests but performing like a random classifier in production, just because their training setup was wrong, they didn't know how to compare models, etc. Machine learning is not like learning a new framework. You can learn the framework and use it, but you are going to make a lot of mistakes because of all the other machine learning knowledge you need.
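One way to keep the comparison honest, as a sketch: score every candidate with cross-validation on the training data only, and keep the held-out test set out of every decision (the synthetic dataset here is just a placeholder for your real, already-cleaned data):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Placeholder for your real training data.
    X_train, y_train = make_classification(n_samples=1000, random_state=0)

    candidates = {
        "logreg": LogisticRegression(max_iter=1000),
        "forest": RandomForestClassifier(n_estimators=200, random_state=0),
    }
    for name, clf in candidates.items():
        scores = cross_val_score(clf, X_train, y_train, cv=5, scoring="f1")
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")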


I think you need a bit more competence to get into the training realm though, because it's a bigger step to create a new model - especially the hard step of data labeling.

Unless you have a novel data set and a way to train quickly, you're probably better off using existing trained models in most cases.

I agree with the transfer learning piece wholeheartedly though.


Data labeling isn't hard, it's labor intensive, which is an entirely different resource. If the business goal is valuable enough, then a non-tech manager without any special expertise can organize twenty man-months of grunts to do the labeling, three man-months of cookie-cutter junior dev work for labeling and data-management tools, and a single man-month of an external consultant with proper expertise to write sensible guidelines on how the labeling should be done and supervise the process. All of which will cost something comparable to the annual cost of a single ML developer.

Training models is often tricky, but it's not that hard; my experience shows that decent undergrads learn to train standard models on their own datasets after a single one-semester course, and quite difficult models after two semesters. So teaching/learning basic ML takes time and effort comparable to, e.g., teaching/learning basic JS frontend development.

So if some company's IT department has some minimum ML skills, lack of expertise shouldn't be preventing them from training models. And even more so, using your own data (IMHO) is the whole point of adopting ML; if the problem is so generic that you don't need to adapt it to your data, then you shouldn't be learning to use ML but rather buying and integrating a SaaS API run by someone else.


> it's labor intensive

Which is a form of hard... For example, if you need 60,000 semantically labeled images, you need to train people to do that specific kind of labeling and then have them do it, then QC the data, break it up into training and validation sets, etc.

Don't forget that this advice is for a front end dev who hasn't ever touched caffe or torch or whatever. In many cases it takes new people a week to set up drivers and an environment on a GPU.



