
I have to ask: where do you run your models, and how do you keep them... deterministic? We have run models in a variety of environments, from cloud to edge, and the only way we've found to do that is with ONNX.

TensorFlow has breaking changes in model behaviour across patch versions!

PyTorch and TensorFlow require a huge Python environment to run, and how on earth do we get that onto a client system that doesn't have virtualisation?

Worst of all, both produce significantly different predictions depending on the CPU (or, god forbid, GPU) in the system.

Once we export to ONNX, we get reliable output and performance across runtimes, which seems mandatory for shipping any kind of product.
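
For anyone unfamiliar with the handoff, here's a minimal sketch of the export side, assuming PyTorch; the toy model, shapes, and file name are all placeholders:

    import torch

    # Toy stand-in for a real network; names and shapes are placeholders.
    model = torch.nn.Sequential(
        torch.nn.Linear(4, 8),
        torch.nn.ReLU(),
        torch.nn.Linear(8, 2),
    ).eval()

    torch.onnx.export(
        model,
        torch.randn(1, 4),  # example input used to trace the graph
        "model.onnx",
        input_names=["input"],
        output_names=["output"],
        dynamic_axes={"input": {0: "batch"}},  # allow variable batch size
    )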



Your experience matches mine exactly. ONNX seems to be relatively unknown for some reason. I feel like I'm on crazy pills -- how is everyone else delivering ML models? Are they really shipping multi-GB PyTorch environments? Are they using some sort of Rube Goldberg machine to run things off a janky Jupyter notebook?


In my experience, yes, all of the above. Azure ML Studio is a Jupyter-notebook Rube Goldberg machine builder. It's not cheap and it's not pretty, but it gets stuff done quickly. You're lucky if some developer took the time to semi-productionize the Jupyter notebook as a Flask app or something (with, yeah, a multi-gigabyte container). (Even luckier, in an Azure shop, if they skipped Flask and gave you a Functions app. Not that it has fewer dependencies or a smaller footprint, but Functions gives you free Application Insights telemetry, and that is an operational beauty.)

Every time I've convinced data scientists to just hand off an ONNX file to me for production, everyone comes out pleased: building an ONNX file is easier than productionizing a notebook any other way; the speed and performance of ONNX Runtime are great; and it integrates more easily into C#-based pipelines (potentially avoiding expensive data transfers to/from Python VMs).
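
To make that concrete, a sketch of the production side, assuming the model was exported with an input named "input" (the path and shapes are placeholders). Note the total dependency surface is onnxruntime plus numpy, not the whole training stack:

    import numpy as np
    import onnxruntime as ort

    # CPU execution provider for repeatable results across machines.
    sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

    batch = np.random.rand(1, 4).astype(np.float32)  # placeholder input
    (output,) = sess.run(None, {"input": batch})
    print(output)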

The biggest, ugliest hurdle I've seen is how many pre/post-processing steps data scientists accidentally convince themselves "can only be done in Python", either because they don't have time to research alternatives, believe Python to be inherently magical, or built against some massive, obscure Hugging Face-style corpus with gigabytes of data that would make for the most bloated ONNX files, while burying it in a Python or VM install step conveniently hides how big the corpus really is.
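
One pattern that defuses a lot of those "Python-only" steps is folding the preprocessing into the exported graph itself, so its constants ship inside the .onnx file. A hedged sketch, again assuming PyTorch; the wrapper class and the mean/std values are invented for illustration:

    import torch

    class WithPreprocessing(torch.nn.Module):
        # Hypothetical wrapper that folds normalization into the graph.
        def __init__(self, model, mean, std):
            super().__init__()
            self.model = model
            self.register_buffer("mean", mean)
            self.register_buffer("std", std)

        def forward(self, x):
            return self.model((x - self.mean) / self.std)

    base = torch.nn.Sequential(torch.nn.Linear(4, 2)).eval()  # placeholder model
    wrapped = WithPreprocessing(base, mean=torch.zeros(4), std=torch.ones(4))

    # mean/std are baked into the file as constants, so the consumer
    # never has to reimplement the "Python-only" preprocessing step.
    torch.onnx.export(wrapped.eval(), torch.randn(1, 4), "model_with_preproc.onnx",
                      input_names=["raw_input"], output_names=["output"])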



