Tailgunneri's comments

Tailgunneri · on March 22, 2017

Hi HN!

We’re Eero, Otso, Aarni and Ruksi from Finland! We are the founders of Valohai, a machine learning infrastructure as a service startup. We support existing frameworks like TensorFlow, Keras, Torch and Caffe – actually, anything you can package into a Docker image. Our platform helps with the process of training machine learning models at scale with a focus on collaboration, realtime results, record keeping and repeatability. We are doing to machine learning what continuous integration and version control have done to programming. We just went to open beta. It’s still early so all feedback is super welcome!

jtraffic · on March 22, 2017

Your pricing is close to 1/2 of AWS. Assuming you scale, would you stay profitable? How?

I think the front page lacks clarity. "GitHub for Machine Learning" suggests a framework for hosting and cloning machine learning models, datasets, and training scripts. I may be just stupid, but it didn't come through as clearly what you actually offer. I see the term "experiments" up front and I wonder what you mean exactly. Then in collaboration I see "projects." What does a typical project consist of?

Don't get me wrong, the idea seems to have huge potential. My advice is to clean up the front page and develop a very clear explanation of what a typical person would use this for. You might want to take all of the features that say "Coming Soon" and move them into a separate page to avoid overwhelming users.

Perhaps try to differentiate this against Floyd in some way.

Tailgunneri · on March 22, 2017

Hi jtraffic!

And thanks for the feedback! Here are some answers :)

> Your pricing is close to 1/2 of AWS. Assuming you scale, would you stay profitable? How?

Exact pricing is still a work in progress, as it is for most startups, but we feel confident that we can make this model work. First of all, current Amazon pricing is very high. When we scale, we can cut costs well below Amazon's normal pricing by reserving instances for longer periods. Then it becomes a problem of keeping demand stable versus optimizing how many instances we reserve. Of course there are other providers too, and prices are dropping as we speak. We just don't think that keeping this level of pricing will be an issue.
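
The reservation trade-off described above can be sketched in a few lines of Python. This is only an illustration, assuming that reserved instances are billed every hour whether or not they are used and that any overflow runs at the on-demand rate. The demand profile and both rates (in cents per instance-hour) are made-up numbers, not real AWS or Valohai prices, and the function names are hypothetical.

```python
from typing import Sequence


def total_cost(hourly_demand: Sequence[int], reserved: int,
               reserved_rate: int, on_demand_rate: int) -> int:
    """Cost in cents when `reserved` instances are paid for every hour
    and demand above that level is served by on-demand instances."""
    cost = 0
    for needed in hourly_demand:
        overflow = max(0, needed - reserved)
        cost += reserved * reserved_rate + overflow * on_demand_rate
    return cost


def best_reservation(hourly_demand: Sequence[int],
                     reserved_rate: int, on_demand_rate: int) -> int:
    """Try every reservation level from 0 up to peak demand and
    return the cheapest one (the lowest level wins on a tie)."""
    peak = max(hourly_demand, default=0)
    return min(
        range(peak + 1),
        key=lambda r: total_cost(hourly_demand, r, reserved_rate, on_demand_rate),
    )


# Hypothetical demand: instances needed in each of eight hours.
demand = [2, 2, 3, 8, 8, 3, 2, 2]
# Hypothetical rates in cents per instance-hour.
best = best_reservation(demand, reserved_rate=40, on_demand_rate=100)
print(best, total_cost(demand, best, 40, 100))
```

The spikier the demand, the fewer instances are worth reserving, which is why keeping demand stable matters as much as the reserved price itself.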

Computational resources are also not the only revenue stream. In fact, our first pilot customers were enterprises with their own hardware, where we plan on having a per-seat licensing fee, much like GitHub Enterprise.

> I think the front page lacks clarity. "GitHub for Machine Learning" suggests a framework for hosting and cloning machine learning models, datasets, and training scripts. I may be just stupid, but it didn't come through as clearly what you actually offer. I see the term "experiments" up front and I wonder what you mean exactly. Then in collaboration I see "projects." What does a typical project consist of?

Thanks for the feedback. Getting feedback on the website was one of the major reasons for posting here :) But yup, you got our vision! So in that sense our website copy communicates our intention. Then again, we don’t yet have as many of those collaborative features as we’d like, so this critique is well deserved. I’ll try to explain how it works currently and what is on the roadmap going forward.

Currently you can fork any Valohai-enabled ML project on GitHub and start from there, but it doesn’t yet copy previous executions or outputs (such as model weights and biases). It only “forks” how the environment is set up, the training scripts and how they are run, so you don’t need to worry about any of that.

Features like making a project public, more full-fledged forking of a project (executions, models, datasets), commenting on entities, watching an aspect of a project, starring a project, pull requests (using VCS integrations) and other social features are on the roadmap.

A project is basically an entity like a “repository” in GitHub:

- A single project is meant to be a “namespace” for solving and collaborating on a specific ML problem. For example, you might be on a machine learning team in an organization where you have multiple problems to solve but don’t want to mix the executions between the projects.

- Projects have executions, which are like “runs” in some other systems I’ve used.

- A project has a link to one git repository and branch that contains your training scripts and a YAML file that defines how different kinds of “experiments” are run. A single project may get multiple linked branches in the future, even multiple git repositories if we find a use case for it.

- The term “experiment” itself has no meaning in our system at the moment. It's an umbrella term for "anything that you might need computing for", such as training or feature extraction.

- A task is a collection of executions that are meant to tackle a sub-problem, such as applying grid search hyperparameter optimization to a specific training run to find the best network.
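
The entities above can be pictured with a small Python sketch. It is an illustration only, not Valohai's actual data model or API: the class names mirror the terms in the list, while the fields, the `grid_search_task` helper, the step name and the hyperparameter names are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Any, Dict, List


@dataclass
class Execution:
    """One run of a step with a fixed set of parameters."""
    step: str
    parameters: Dict[str, Any]


@dataclass
class Task:
    """A group of executions that tackle one sub-problem."""
    name: str
    executions: List[Execution] = field(default_factory=list)


@dataclass
class Project:
    """Namespace for one ML problem, linked to one repository and branch."""
    name: str
    repository: str
    branch: str = "master"
    executions: List[Execution] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)


def grid_search_task(name: str, step: str, grid: Dict[str, List[Any]]) -> Task:
    """Build a task with one execution per combination of grid values."""
    keys = sorted(grid)  # fixed key order keeps the result deterministic
    combos = product(*(grid[key] for key in keys))
    executions = [Execution(step, dict(zip(keys, values))) for values in combos]
    return Task(name, executions)


# Hypothetical project linked to one of the example repositories.
project = Project("digit-classifier", "https://github.com/valohai/tensorflow-example")
task = grid_search_task(
    "tune-training",
    "train",
    {"learning_rate": [0.01, 0.001], "batch_size": [32, 64]},
)
project.tasks.append(task)
project.executions.extend(task.executions)
print(len(task.executions))
```

A grid of two learning rates and two batch sizes yields four executions, all grouped under a single task inside the project.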

Here are some examples of how a project is defined on the GitHub side; the integration with Valohai lives in the valohai.yaml file:

- https://github.com/valohai/tensorflow-example

- https://github.com/valohai/darknet-example

- https://github.com/valohai/keras-example

If you use any of those repositories as a source, you can run those pre-defined experiments with a click or two.

We also have other stuff right around the corner, such as a command-line client, which should be released this week or early next week.

> Don't get me wrong, the idea seems to have huge potential. My advice is to clean up the front page and develop a very clear explanation of what a typical person would use this for. You might want to take all of the features that say "Coming Soon" and move them into a separate page to avoid overwhelming users.

Haha, don’t worry; I’ve been lurking on HN for long enough to know that there is a difference between constructive and outright negative feedback.

Good advice on the clarity. We have been working on this for some months now, and not everything that is clear to us is clear to an average visitor to our website. More detailed use cases or user stories do make a lot of sense. Maybe we could even add one to the pricing page, showing what a typical machine learning training project might cost per month on AWS compared to our infrastructure.

Having a separate coming soon page might be an option; we’ll have to try it out.

> Perhaps try to differentiate this against Floyd in some way.

So far Floyd seems to concentrate on being a computational platform for individual users by “eliminating engineering bottlenecks in deep learning”, while we are focusing on creating collaborative workflows and supporting private hardware use. But we do have a lot of similarities. We should focus on these differences, thanks for the tip! If FloydHub keeps focusing on the engineering side of machine learning, I wouldn’t be surprised to see FloydHub as one of our future backends after AWS and Google Compute Cloud.

jtraffic · on March 22, 2017

Cool. Good answer. Just FYI, the first link you provided, about TensorFlow, is broken.

Tailgunneri · on March 22, 2017

Thanks, fixed!