Mithril [https://mithril.ai/] | ONSITE in Palo Alto -or- San Francisco, CA (SF Bay Area) | Full Time
We’re building Mithril to be the cloud compute platform AI developers actually want—no more battling procurement, limited quotas, or clunky tooling. Our platform gives ML engineers frictionless access to high-performance GPUs, clean APIs, and modern infra primitives to train, fine-tune, and serve state-of-the-art models. Backed by Sequoia, Lightspeed, and founders of Databricks, Google Brain, and Scale.
>> SRE, Supply (Site Reliability Engineer) << Manage GPU provisioning, spot bidding, and node pool health across clouds and on-prem. Work on the systems behind our global GPU fleet. Apply here: https://job-boards.greenhouse.io/mithril/jobs/4651793007
>> General Application - Exceptional Talent << If you are an experienced, high-impact individual who believes you can help us achieve our most ambitious goals, please use this general posting to introduce yourself: https://job-boards.greenhouse.io/mithril/jobs/4965590007
Looking for compute? We support distributed deep learning workloads, including long-running batch jobs, streaming inference, and GPU autoscaling for LLM training and fine-tuning. https://mithril.ai/contact-sales
Foundry [https://mlfoundry.com/] | ONSITE in Palo Alto -or- San Francisco, CA (SF Bay Area) | Full Time
We’re building Foundry to be the cloud compute platform AI developers actually want—no more battling procurement, limited quotas, or clunky tooling. Our platform gives ML engineers frictionless access to high-performance GPUs, clean APIs, and modern infra primitives to train, fine-tune, and serve state-of-the-art models. Backed by Sequoia, Lightspeed, and founders of Databricks, Google Brain, and Scale.
>> SRE, Supply (Site Reliability Engineer) << Manage GPU provisioning, spot bidding, and node pool health across clouds and on-prem. Work on the systems behind our global GPU fleet. Apply here: https://job-boards.greenhouse.io/foundrytechnologiesinc/jobs...
Looking for compute? We support distributed deep learning workloads, including long-running batch jobs, streaming inference, and GPU autoscaling for LLM training and fine-tuning. https://mlfoundry.com/contact-sales
Foundry (www.mlfoundry.com) | Hybrid w. in-office requirement (San Francisco, CA & Palo Alto, CA)
Foundry is building the future of AI infrastructure with our Cloud Platform, providing self-serve access to high-performance GPU compute for training, fine-tuning, and serving AI models. We’re simplifying infrastructure for dynamic AI workflows, enabling AI practitioners to focus on innovation, not infrastructure.
We’re well-funded ($80M, Series A), growing quickly, and looking for talented people to join our team.
Here are some of the roles we’re hiring for:
* Senior Software Engineer, Full Stack
Design and build our compute marketplace and products. Focus on both backend and frontend technologies, REST APIs, and microservice architecture. [6+ years experience with TypeScript, Python, etc.]
* Site Reliability Engineer (SRE), Supply
Work with our supply partners to build & maintain reliable systems for distributed compute. Work across Kubernetes, Linux, and cloud services to ensure reliability, scalability and performance. [Focus on customer interactions and reliability.]
* Software Engineer, Security Engineer
Design and implement security strategies for our AI/ML infrastructure. Build systems that keep our platform secure at scale.
Learn more about us and apply: www.mlfoundry.com/company
Or email us directly: careers@mlfoundry.com
Foundry (www.mlfoundry.com) | Hybrid w. in-office requirement (San Francisco, CA & Palo Alto, CA)
Foundry is building the future of AI infrastructure with our Cloud Platform, providing self-serve access to high-performance GPU compute for training, fine-tuning, and serving AI models. We’re simplifying infrastructure for dynamic AI workflows, enabling AI practitioners to focus on innovation, not infrastructure.
We’re well-funded ($80M, Series A), growing quickly, and looking for talented people to join our team.
Here are some of the roles we’re hiring for:
* Senior Software Engineer, Full Stack
Design and build our compute marketplace and products. Focus on both backend and frontend technologies, REST APIs, and microservice architecture. [6+ years experience with TypeScript, Python, etc.]
* Site Reliability Engineer (SRE), Cloud
Build reliable systems for AI workflows. Work across Kubernetes, Linux, and cloud services to ensure platform scalability and performance. [Focus on system design and reliability.]
* Site Reliability Engineer (SRE), Supply
Work with our supply partners to build & maintain reliable systems for distributed compute. Work across Kubernetes, Linux, and cloud services to ensure reliability, scalability and performance. [Focus on customer interactions and reliability.]
* Software Engineer, Security Engineer
Design and implement security strategies for our AI/ML infrastructure. Build systems that keep our platform secure at scale.
Learn more about us and apply: www.mlfoundry.com/company
Or email us directly: careers@mlfoundry.com
Foundry (www.mlfoundry.com) | Hybrid w. in-office requirement (San Francisco, CA & Palo Alto, CA)
Foundry is building the future of AI infrastructure with our Cloud Platform, providing self-serve access to high-performance GPU compute for training, fine-tuning, and serving AI models. We’re simplifying infrastructure for dynamic AI workflows, enabling AI practitioners to focus on innovation, not infrastructure.
We’re well-funded ($80M, Series A), growing quickly, and looking for talented people to join our team.
Here are some of the roles we’re hiring for:
- Senior Software Engineer, Full Stack
Design and build our compute marketplace and products. Focus on both backend and frontend technologies, REST APIs, and microservice architecture. [6+ years experience with TypeScript, Python, etc.]
- Infrastructure Engineer
Architect and deploy infrastructure solutions. Work with Kubernetes, Terraform, and AWS services to optimize cloud performance. [6+ years experience in infrastructure and automation. Experience with Python, Kubernetes, and Terraform.]
- Site Reliability Engineer (SRE)
Build reliable systems for AI workflows. Work across Kubernetes, Linux, and cloud services to ensure platform scalability and performance. [Focus on system design and reliability.]
- Software Engineer, Security Engineer
Design and implement security strategies for our AI/ML infrastructure. Build systems that keep our platform secure at scale.
Learn more about us and apply: www.mlfoundry.com/company
Or email us directly: careers@mlfoundry.com
We’re building Mithril to be the cloud compute platform AI developers actually want—no more battling procurement, limited quotas, or clunky tooling. Our platform gives ML engineers frictionless access to high-performance GPUs, clean APIs, and modern infra primitives to train, fine-tune, and serve state-of-the-art models. Backed by Sequoia, Lightspeed, and founders of Databricks, Google Brain, and Scale.
We’re hiring:
>> General Software (SWE) / Infrastructure Engineers << Build our batch + streaming workload engine for ML. Think: GPU scheduling, fault-tolerant execution, rich job DAGs. 0→1 ownership. Apply here: https://job-boards.greenhouse.io/mithril/jobs/4199826007
>> SRE, Supply (Site Reliability Engineer) << Manage GPU provisioning, spot bidding, and node pool health across clouds and on-prem. Work on the systems behind our global GPU fleet. Apply here: https://job-boards.greenhouse.io/mithril/jobs/4651793007
>> Founding Product Manager << Define the roadmap for the most advanced ML infra users. Apply here: https://job-boards.greenhouse.io/mithril/jobs/4199830007
>> General Application - Exceptional Talent << If you are an experienced, high-impact individual who believes you can help us achieve our most ambitious goals, please use this general posting to introduce yourself: https://job-boards.greenhouse.io/mithril/jobs/4965590007
Looking for compute? We support distributed deep learning workloads, including long-running batch jobs, streaming inference, and GPU autoscaling for LLM training and fine-tuning. https://mithril.ai/contact-sales
discovery tags: gpu, kubernetes, hpc, terraform, helm, distributed, ml, training, inference, llm, deep learning, fine tuning, customer, product, api, on-demand, infrastructure, elastic, scalable, cloud, platform