Hacker News

I have no clue about Watson and machine learning; I mean, I've talked to people and quickly glanced at the platforms. But in terms of the speech processing, does that not require any back-end at all? I'm assuming you're using the browser's API to capture sound and then passing it off to Watson in some back-end method to analyze? Or can this seriously be done all front-end? Pretty amazing if so.


The actual ML computation isn't done client side -- that would be way too slow. With Watson and many other ML platforms, you're leveraging someone else's (IBM, Google, Microsoft, Amazon, etc.) computing power.

Some people have made JavaScript-based ML models that run in the browser (I think some were made for this course[1]), but these are for educational purposes rather than production use.

[1]: https://cs231n.github.io/


Well, I am currently developing a full-blown back-end service using AWS containers, but for a prototype I got it going as a simple Python script, based on an outline I found on GitHub. It took a few hours.

Basically you get your source speech as an uncompressed WAV file, create an IBM Bluemix account (free trial), create a Watson "app" on the site (basically gives you some credentials for calling the API), and then write a script to upload your WAV file to the API and decode the JSON response.
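A minimal sketch of that script, assuming the classic Bluemix-era Watson Speech to Text REST endpoint with basic-auth credentials (the URL, the environment-variable names, and the exact shape of the response JSON are assumptions here -- substitute whatever your own Watson "app" gives you):

```python
import base64
import json
import os
import urllib.request

# Assumed endpoint from the Bluemix-era Watson Speech to Text service;
# use the URL and username/password from your own Watson "app" credentials.
WATSON_URL = "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize"


def transcribe(wav_path, username, password):
    """Upload an uncompressed WAV file and return the decoded JSON response."""
    creds = base64.b64encode(f"{username}:{password}".encode()).decode()
    with open(wav_path, "rb") as f:
        req = urllib.request.Request(
            WATSON_URL,
            data=f.read(),
            headers={
                "Content-Type": "audio/wav",
                "Authorization": f"Basic {creds}",
            },
        )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def extract_transcript(response):
    """Join the top alternative from each result block into one string."""
    return " ".join(
        r["alternatives"][0]["transcript"].strip()
        for r in response.get("results", [])
    )


if __name__ == "__main__":
    # Hypothetical env-var names; only fires when credentials are present.
    user = os.environ.get("WATSON_USER")
    pw = os.environ.get("WATSON_PASS")
    if user and pw:
        print(extract_transcript(transcribe("speech.wav", user, pw)))
```

The JSON parsing is the easy part: each result block carries a list of alternatives, and you usually just want the top transcript from each.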

It gets more complex when you want to start parallelizing the process to make it faster, and dealing with the results in an intelligent manner, but the initial proof of concept is remarkably easy.
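The parallelizing part is mostly just fanning independent uploads across a thread pool; here's a sketch with a placeholder `transcribe` function standing in for the real per-chunk API call:

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe(chunk_path):
    # Placeholder for the real per-chunk Watson upload; the point is only
    # that independent HTTP requests parallelize well across threads.
    return f"transcript of {chunk_path}"


def transcribe_all(chunk_paths, workers=4):
    # executor.map preserves input order, so the transcripts can be
    # re-joined in the order the audio was originally split.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe, chunk_paths))
```

Because the work is I/O-bound (waiting on the remote API), threads are enough; the GIL isn't a bottleneck here.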

If I recall, the Google one was even easier - no script at all, did it all with curl I think.
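For the curl-only route, something along these lines works against the Google Cloud Speech REST API -- the endpoint, field names, and API-key auth style are assumptions from that API's v1 surface, so check the current Speech-to-Text docs before relying on them:

```shell
#!/bin/sh
set -eu

# Demo only: drop in a placeholder if no real recording is present.
[ -f speech.wav ] || printf 'RIFF' > speech.wav

# Build the request body: config plus base64-encoded WAV audio.
cat > request.json <<EOF
{
  "config": {
    "encoding": "LINEAR16",
    "sampleRateHertz": 16000,
    "languageCode": "en-US"
  },
  "audio": {
    "content": "$(base64 < speech.wav | tr -d '\n')"
  }
}
EOF

# Only call out if an API key is actually set (hypothetical env var).
if [ -n "${GOOGLE_API_KEY:-}" ]; then
  curl -s -X POST \
    -H "Content-Type: application/json" \
    -d @request.json \
    "https://speech.googleapis.com/v1/speech:recognize?key=${GOOGLE_API_KEY}"
fi
```

No script, no SDK -- just a JSON body and one POST.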



