I'm biased (work at Cognition) but I think it's worth giving the Windsurf JetBrains plugin a try. We're working harder on polish these days, so happy to hear any feedback.
We're working on AI tools for developers (autocomplete, chat, and more unannounced things). We train our own LLMs from scratch and have over 1M downloads across our surfaces. We have many paying enterprise customers and have raised a total of $93M from Kleiner Perkins, Greenoaks, and Founders Fund.
We're hiring for many roles, but in particular are looking for software generalists and Deployed Engineers, a role heavier on customer interaction than on code (https://jobs.ashbyhq.com/codeium/fd2ca49f-ae99-487c-8a52-75d...). No ML or systems experience required. We also have fall software engineering internships available.
You can see all the open roles and apply here (we pretty much look at every single application): https://codeium.com/careers
Hey swyx :) Great question! We've got a blog post coming soon covering these and other technical details. We've employed a lot of tricks here, debouncing included.
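For anyone unfamiliar: debouncing for autocomplete means waiting until the user pauses typing before firing a request, so a burst of keystrokes costs one inference call instead of many. A minimal sketch in Python (all names hypothetical, not our actual implementation):

```python
import threading
import time

class Debouncer:
    """Delay an action until calls stop arriving for `wait` seconds.

    Each new call cancels the pending timer, so only the final
    keystroke in a burst triggers the (hypothetical) completion request.
    """
    def __init__(self, wait, fn):
        self.wait = wait
        self.fn = fn
        self._timer = None
        self._lock = threading.Lock()

    def call(self, *args):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.wait, self.fn, args)
            self._timer.start()

# Simulate a burst of keystrokes; only the last prefix fires a request.
results = []
d = Debouncer(0.05, lambda text: results.append(text))
for prefix in ["d", "de", "def"]:
    d.call(prefix)
time.sleep(0.2)
print(results)  # ['def']
```

The same idea applies whatever the editor surface is; the interesting tuning question is how long to wait before the request feels laggy.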
Let's just take the topic of measuring GPU usage. This alone is quite tricky -- tools like nvidia-smi will show full GPU utilization even if not all SMs are running. The workload may also change behavior over time, for instance if inputs to the transformer get longer. And it gets even more complicated to measure once you consider optimizations like dynamic batching. If you peek into some ML Ops communities you can get a flavor of these nuances, but I'm not sure there are good exhaustive guides around right now.
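To make the nvidia-smi point concrete: its GPU utilization metric counts an interval as "busy" whenever any kernel is resident, regardless of how many SMs that kernel occupies. A toy simulation (pure Python, illustrative numbers only) of how far that can diverge from real SM usage:

```python
# Toy model: a GPU with 108 SMs runs a stream of kernels over a 100 ms
# window. "GPU util" (roughly what nvidia-smi reports) counts any time
# where a kernel is running; "SM usage" weights that time by how many
# SMs each kernel actually occupies. Numbers are illustrative.
NUM_SMS = 108
window_ms = 100

# (duration_ms, sms_used) per kernel: 90 ms busy, 10 ms fully idle
kernels = [(40, 8), (30, 16), (20, 108)]

busy_ms = sum(d for d, _ in kernels)
gpu_util = busy_ms / window_ms  # time-based "busy" fraction

sm_usage = sum(d * s for d, s in kernels) / (window_ms * NUM_SMS)

print(f"reported GPU util: {gpu_util:.0%}")  # 90%
print(f"actual SM usage:   {sm_usage:.0%}")  # 27%
```

So a fleet that looks "90% utilized" on the dashboard can be doing a fraction of the work the hardware could deliver, which is exactly where the optimization headroom hides.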
I empathize a bit with the cloud providers as they have to upgrade their data centers every few years with new GPU instances and it's hard for them to anticipate demand.
But if you can easily use every trick in the book (CPU version of the model, autoscaling to zero, model compilation, keeping inference in your own VPC, using spot instances, etc.) then it's usually still worth it.
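Of those tricks, autoscaling to zero is the one with the most mechanical flavor: stop paying for replicas after an idle window, cold-start one when traffic returns. A toy sketch of the decision logic (purely illustrative, not any particular autoscaler's policy):

```python
class ScaleToZero:
    """Toy autoscaler: keep 0 replicas when idle, spin one up on demand.

    `idle_timeout` is how many seconds without traffic before scaling
    down. Purely illustrative; a real system also has to weigh the
    cold-start latency of loading model weights against the savings.
    """
    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.replicas = 0
        self.last_request = None

    def handle_request(self, now):
        if self.replicas == 0:
            self.replicas = 1  # cold start a replica
        self.last_request = now

    def tick(self, now):
        # Periodic check: scale to zero after the idle window expires.
        if (self.replicas > 0 and self.last_request is not None
                and now - self.last_request > self.idle_timeout):
            self.replicas = 0  # stop paying for the GPU

s = ScaleToZero(idle_timeout=60)
s.handle_request(now=0)
s.tick(now=30)    # still warm: replicas stays at 1
s.tick(now=120)   # past the idle window: replicas drops to 0
print(s.replicas)  # 0
```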
SWE at Exafunction here! We're not ready to be self-serve, so the contact process is the most straightforward for us to work with companies right now. But we respond fast :)
As to the tech, we have APIs closely resembling common deep learning frameworks, so once you add our Python/C++ client locally, you can change a small amount of code to start remotely using GPUs. We also have the ability to handle arbitrary stateful CUDA code for more complex use cases. On the server side, you can deploy our work scheduler inside your own VPC, so we take over orchestration for you as well.
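The general shape of that client-side pattern is a proxy object that looks like a local model but forwards calls to a remote GPU worker. Here's a minimal sketch with the transport stubbed in-process; all names are hypothetical and this is not Exafunction's actual API, just the pattern:

```python
# Sketch of the remote-execution pattern: a thin client proxy forwards
# model calls to a worker. The "transport" here is an in-process stub;
# a real system would serialize tensors over the network and the
# scheduler would place the call on an actual GPU.

class RemoteWorkerStub:
    """Stands in for a GPU server; owns the 'model' state."""
    def __init__(self):
        self.weight = 2.0  # pretend model parameter

    def run(self, method, payload):
        if method == "forward":
            return [x * self.weight for x in payload]
        raise ValueError(f"unknown method: {method}")

class RemoteModel:
    """Client-side proxy: same call shape as a local model."""
    def __init__(self, transport):
        self.transport = transport

    def forward(self, inputs):
        # Caller code changes minimally: same forward() signature,
        # but execution happens wherever the transport points.
        return self.transport.run("forward", inputs)

model = RemoteModel(RemoteWorkerStub())
out = model.forward([1.0, 2.0, 3.0])
print(out)  # [2.0, 4.0, 6.0]
```

The appeal of the pattern is that swapping the stub for a network transport doesn't change the calling code, which is what lets existing framework-style code move to remote GPUs with small diffs.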
Our customers are currently confidential, but safe to say we've seen a 5-10x decrease in cloud costs (or equivalently, the ability to fit 5-10x larger workloads given a GPU quota). It really depends on the utilization of your current workload.
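The arithmetic behind that utilization point is simple: if a fleet mostly idles, packing the same useful GPU-hours onto fewer, busier instances cuts cost roughly in proportion. A back-of-envelope sketch with made-up numbers (not customer data):

```python
import math

# Illustrative numbers only, not real customer data.
num_gpus = 40
avg_utilization = 0.10     # fleet mostly idle between requests
target_utilization = 0.70  # achievable after consolidation/batching

# Useful work currently delivered per hour, in GPU-hours:
useful_gpu_hours = num_gpus * avg_utilization          # 4.0

# GPUs needed to deliver the same work at the higher utilization:
consolidated = math.ceil(useful_gpu_hours / target_utilization)

savings = num_gpus / consolidated
print(f"{num_gpus} GPUs -> {consolidated} GPUs, ~{savings:.1f}x cheaper")
```

With these numbers you land at 6 GPUs, about a 6.7x cost reduction, which is why the realized multiple depends so heavily on how idle the starting workload is.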
Makes sense. I think it's less about speed of response and more that for many SWEs, we don't necessarily have the authority to reach out and create relationships between our companies and another vendor. I certainly am not going to bother trying to get that process started unless I can do a little legwork before involving other parties at my company.
That's a great point. We've been mostly outbound so far, but this will be a bigger issue for us moving forward. We're thinking about how to lower the barrier to entry for this -- for instance, we can try to publicly release some docs and let anyone try out the system in a local container.