mathiasn's comments | Hacker News

Is that something you want to open source?


https://ssotax.org gets more updates. There is also a Friends of SSO page :-)


What are other potential platforms?


This is a good long-list of projects, although it is not narrowly scoped to tracing/evals/prompt-management: https://github.com/tensorchord/Awesome-LLMOps?tab=readme-ov-...


One missing in the list below is Agenta (https://github.com/agenta-ai/agenta).

We're OSS and OTel-compliant, with a stronger focus on evals and on enabling collaboration between subject matter experts and devs.


A bunch of them: Langsmith, Lunary, Arize Phoenix, Portkey, Datadog, and Helicone.

We also picked Langfuse - more details here: https://www.nonbios.ai/post/the-nonbios-llm-observability-pi...


Thanks, this post was insightful. I laughed at the reason you rejected Arize Phoenix; I had similar thoughts while going through their site! =)

> "Another notable feature of Langfuse is the use of a model as a judge ... this is not enabled in the free version/self-hosted version"

I think you can add LLM-as-judge to the self-hosted version of Langfuse by defining your own evaluation pipeline: https://langfuse.com/docs/scores/external-evaluation-pipelin...
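
For reference, here's a rough sketch of what such an external pipeline can look like with the Langfuse Python SDK. This assumes the v2-style fetch_traces/score methods (the API changed in later SDK versions), and judge_output is a made-up helper:

    # Hedged sketch: assumes Langfuse Python SDK v2-style fetch_traces/score
    # methods and LANGFUSE_* env vars; judge_output is a hypothetical helper.
    from langfuse import Langfuse
    from openai import OpenAI

    langfuse = Langfuse()
    judge = OpenAI()

    def judge_output(question: str, answer: str) -> float:
        # Ask an LLM to rate the answer 0-10, normalize to 0..1.
        resp = judge.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content":
                f"Rate 0-10 how well this answer addresses the question. "
                f"Reply with a number only.\nQ: {question}\nA: {answer}"}],
        )
        return float(resp.choices[0].message.content.strip()) / 10

    # Pull recent traces and write a judge score back to each one.
    for t in langfuse.fetch_traces(limit=50).data:
        langfuse.score(trace_id=t.id, name="llm-judge",
                       value=judge_output(str(t.input), str(t.output)))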


Thanks for the pointer!

We are actually toying with building out a prompt evaluation platform and were considering extending Langfuse. Maybe we'll just use this instead.


Thanks for sharing your blog post. We had a similar journey: I installed and tried both Langfuse and Phoenix and ended up choosing Langfuse due to some versioning conflicts in the Python dependency. I'm curious whether your thoughts change after V3? I also liked that Langfuse only depended on Postgres, but the scalable version requires other dependencies.

The thing I liked about Phoenix is that it uses OpenTelemetry. In the end we're building our Agents SDK so that the observability platform can be swapped (https://github.com/zetaalphavector/platform/tree/master/agen...) and the abstraction is OpenTelemetry-inspired.
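
(To illustrate what I mean by an OpenTelemetry-inspired, swappable abstraction; this is a made-up sketch, not the actual code from the repo:)

    # Hypothetical sketch of a swappable, OTel-inspired tracer interface.
    from contextlib import AbstractContextManager, contextmanager
    from typing import Any, Iterator, Protocol

    class Span(Protocol):
        def set_attribute(self, key: str, value: Any) -> None: ...

    class Tracer(Protocol):
        def start_span(self, name: str) -> AbstractContextManager[Span]: ...

    class NoopSpan:
        def set_attribute(self, key: str, value: Any) -> None:
            pass

    class NoopTracer:
        @contextmanager
        def start_span(self, name: str) -> Iterator[Span]:
            yield NoopSpan()

    def run_agent(query: str, tracer: Tracer = NoopTracer()) -> str:
        # Agent code only talks to the Tracer interface; a Langfuse-,
        # Phoenix- or OTLP-backed implementation can be injected instead.
        with tracer.start_span("agent.run") as span:
            span.set_attribute("agent.query", query)
            answer = "..."  # call tools / LLMs here
            span.set_attribute("agent.answer", answer)
            return answer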


As you mentioned, this was a significant trade-off. We faced two choices:

(1) Stick with a single Docker container and Postgres. This option is simple to self-host, operate, and iterate on, but it suffers from poor performance at scale, especially for analytical queries that become crucial as the project grows. Additionally, as more features emerged, we needed a queue and benefited from caching and asynchronous processing, which required splitting into a second container and adding Redis. These features would have been blocked had we stuck with this setup.

(2) Switch to a scalable setup with a robust infrastructure that enables us to develop features that interest the majority of our community. We have chosen this path and prioritized templates and Helm charts to simplify self-hosting. Please let us know if you have any questions or feedback as we transition to v3. We aim to make this process as easy as possible.

Regarding OTel, we are considering adding a collector to Langfuse, as the OTel semantic conventions are now developing well. The needs of the Langfuse community are evolving rapidly, and starting with our own instrumentation allowed us to move quickly while the semantic conventions were still immature. We are tracking this here and would greatly appreciate your feedback, upvotes, or any comments you have on this thread: https://github.com/orgs/langfuse/discussions/2509
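
(For anyone following the semconv discussion, this is roughly what an LLM span looks like with the plain OTel SDK and the still-experimental GenAI attribute names; the exact names may change before they stabilize:)

    # Sketch: an LLM span with the (experimental) OTel GenAI semantic
    # conventions; attribute names follow the draft spec and may change.
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

    provider = TracerProvider()
    provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    tracer = trace.get_tracer("my-app")

    with tracer.start_as_current_span("chat gpt-4o-mini") as span:
        span.set_attribute("gen_ai.system", "openai")
        span.set_attribute("gen_ai.request.model", "gpt-4o-mini")
        # ... call the model here ...
        span.set_attribute("gen_ai.usage.input_tokens", 42)
        span.set_attribute("gen_ai.usage.output_tokens", 128)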


So we are still on V2.7 - it works pretty well for us. Haven't tried V3 yet, and we're not looking to upgrade. I think the next big feature set we are looking for is a prompt evaluation system.

But we are coming around to the view that it is a big enough problem to warrant a dedicated SaaS, rather than piggybacking on an observability SaaS. At NonBioS, we have very complex requirements, so we might just end up building it from the ground up.


"Langsmith appeared popular, but we had encountered challenges with Langchain from the same company, finding it overly complex for previous NonBioS tooling. We rewrote our systems to remove dependencies on Langchain and chose not to proceed with Langsmith as it seemed strongly coupled with Langchain."

I've never really used Langchain, but I set up Langsmith with my own project quite quickly. It's very similar to setting up Langfuse: both are activated with a wrapper around the OpenAI library. (Though I haven't looked into the metadata and tracing yet.)

Functionally the two seem very similar. I'm looking at both and am having a hard time figuring out the differences.
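
The setup really is near-identical for both; roughly this (writing from memory, so double-check the current docs and env var names):

    # Langsmith: wrap the OpenAI client (reads LANGCHAIN_API_KEY / LANGCHAIN_TRACING_V2).
    from langsmith.wrappers import wrap_openai
    from openai import OpenAI

    ls_client = wrap_openai(OpenAI())

    # Langfuse: drop-in OpenAI module (reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY).
    from langfuse.openai import OpenAI as LangfuseOpenAI

    lf_client = LangfuseOpenAI()

    # Both are then used exactly like the plain OpenAI client:
    resp = lf_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "hello"}],
    )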


We launched Laminar a couple of months ago: https://www.lmnr.ai. Extremely fast, great DX, and written in Rust. Definitely worth a look.


Congrats on the Launch!


apologies for hijacking your launch (congrats btw!)


thanks Marc :)


I'm a maintainer of Opik, an open source LLM evaluation and observability platform. We only launched a few months ago, but we're growing rapidly: https://github.com/comet-ml/opik


Or better, ssotax.org, which is already way ahead.


Yeah, that's the old page that isn't kept very up to date. https://ssotax.org has more data and even has a Friends of SSO page.


Slack is required for AccessOwl. It's used for things like approval workflows, task management and notifications in general.

What do you use instead?


This severely limits the usefulness of a product like this.

Core aspects of the product like workflows and task management should not be tied to a chat vendor, in my opinion, and the complete dependence on what Salesforce does with Slack would make me extremely nervous as a potential buyer.

I’ve also worked places that strongly dislike Slack and won’t touch it since it was acquired by Salesforce. Ironically, your product would cause Shadow IT deployments (of Slack) in such environments.

Sharing these concerns because I think the product is a really useful concept, but your roadmap for these core functions would mean the difference between considering and completely passing over AccessOwl, i.e. for some subset of potential customers, the hard dependency on Slack is a complete blocker.


Depending on your point of view, it can also be a strength. Actually, many of our customers like that we're in Slack because their people are already there:

- no login required to request access

- no need to "learn a new application"

So for end users that's great. There is still a web app for admins with more details.

But I can see where you're coming from. We plan to offer an alternative to Slack, to be independent of it if customers want that.


It sounds like you may have found your niche with existing Slack customers and if that works for you that’s great.

I don’t agree that this is a “strength”, because it limits the growth potential of the product while coupling critical functions to the whims of a 3rd-party vendor. I absolutely do see how it’s beneficial for you at this early stage, because it allows you to deliver a straightforward experience for this particular user base (Slack customers) without building your own UIs. But that position of strength is fundamentally limited to that specific group. Move beyond it, and not only would using the product require the adoption of a non-standard chat tool, but the core function of your product is completely orthogonal to chat, making the Slack requirement appear really odd. That group won’t have muscle memory for Slack or know all of its key features, and it will not benefit from any of the familiarity your current customers find compelling.

And back when I was a Slack customer (I actually like Slack and prefer it to the alternatives) I’d still be raising concerns because of the tight coupling with Slack features.

Not trying to just criticize your decisions here, but trying to elaborate on an outsider’s perspective as someone who has been in the position to bring this kind of product on board at large companies, and as someone who has dealt with the pitfalls of building products that have 3rd party integrations.

Best of luck to you on all of this and it’s good to hear there’s an alternative on the roadmap.


A lot of companies are MS O365/Teams shops.


From a legal point of view that might be true, but I believe people are not aware that this is a problem. They just register, check the "Agree to terms of service" box, and do whatever they want to do. I've seen that often, especially in Marketing.


Then either the mandatory corporate training they signed off on their first day of employment was deficient, or they need to be fired for cause.


It's not just the legal but also the practical point of view. They committed fraud when they clicked that checkbox. It's exactly the same as signing a contract with someone else's name.


Also when you do npm i. That's fraud.

Did you just agree to opt the company into that smorgasbord of licenses?


Not really; most (larger) companies have internal policies about that, listing the acceptable licenses. Which is exactly what I said: the employees are given the power to accept the terms. Some employees can sign SaaS contracts, though that's usually far fewer people.


Which makes it even worse, because then you cannot detect it :/

Shouldn't people just be able to try out new things? How can a company be innovative otherwise? And at a specific point (e.g. putting customer data into it), they need to start a proper vendor assessment process.


People can absolutely try new things, but time and time again it turns out you cannot trust people not to put sensitive data into those platforms; they continually do.

It's always a balance of information security awareness, culture and technological solutions within an organisation.


Have you seen Livebook? Best Jupyter Notebook ever!! https://livebook.dev/


Hey Mathias! Was fun chatting about Livebook the other day, and yes, I'm definitely looking at it for inspiration! Alas, it's an Elixir-only notebook and, as far as I know, there are very few data folks using Elixir, so it might be a hard sell.


Or even better: https://ssotax.org

