rao-v 6 months ago \| on: Gemma 3 270M: Compact model for hyper-efficient AI

Fabulous stuff! Oh, my request: the vision head on the Gemma models is super slow for CPU inference (and via Vulkan), even with llama.cpp. Any chance your team can figure out a fix? Other ViTs don’t have the same problem.