
Be aware that if you run it locally with the open weights there is less censoring than if you use DeepSeek's hosted model interface. I confirmed this with the 7B model via ollama.

The censoring is a legal requirement of the state, per:

“Respect for China’s “social morality and ethics” and upholding of “Core Socialist Values” (Art. 4(1))”

https://www.fasken.com/en/knowledge/2023/08/chinas-new-rules...



Models other than the full ~670B one are not R1. It's crazy how many people conflate the distilled Qwen and Llama models (1.5B to 70B) with R1 when they say they're hosting it locally.

The point does stand if you're talking about using DeepSeek R1 Zero instead, which afaik you can try on Hyperbolic, and it apparently even answers the Tiananmen Square question.


What is Ollama offering here in the smaller sizes?

https://ollama.com/library/deepseek-r1
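One way to check what you're actually running is to ask the local Ollama server which model tags are installed. This is a minimal sketch using Ollama's REST API (it assumes the default server address `http://localhost:11434` and the documented `/api/tags` endpoint; if no server is running, it just reports that):

```python
# Sketch: list locally installed Ollama models via the local REST API.
# Assumes the default Ollama server at http://localhost:11434; the
# GET /api/tags endpoint returns the installed model tags.
import json
import urllib.request
import urllib.error


def list_local_models(base_url="http://localhost:11434"):
    """Return installed model tag names, or None if no server is reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None


models = list_local_models()
if models is None:
    print("Ollama server not reachable")
else:
    for name in models:
        # Tags like "deepseek-r1:7b" are the distilled Qwen/Llama variants,
        # not the full-size R1.
        print(name)
```

Note that the smaller `deepseek-r1:*` tags in the Ollama library are the distilled models discussed above, despite sharing the `deepseek-r1` name.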


That legal requirement is also finding its way into private agreements. ByteDance required US-based TikTok employees to sign agreements upholding the exact same things, effectively making TikTok a company subject to CCP policy. See details from this lawsuit:

https://dailycaller.com/2025/01/14/tiktok-forced-staff-oaths...


Is this true with Groq too?


Groq doesn't have R1, only a Llama 70B distilled from R1 outputs. Kinda crazy how they just advertise it as actual R1.


I don't quite understand what the difference between the Groq version and the actual R1 version is. Do you have a link or source that explains this?


Actual R1 is DeepSeek's own base model with CoT added by them via RL.

The distillations are other people's base models trained to add CoT, using R1's outputs. The 70B one is Llama that DeepSeek has modified.

As I understand it.
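The distillation idea described above can be sketched in toy form: the big reasoning model's chain-of-thought outputs become supervised training pairs for a smaller base model. Everything here is invented for illustration (the `teacher_generate` stand-in, the `<think>` formatting); DeepSeek's real pipeline differs in scale and detail:

```python
# Toy sketch of output distillation: collect a teacher's chain-of-thought
# answers and use them as supervised (prompt, completion) training pairs
# for a smaller model. All names are hypothetical; real pipelines add
# data filtering, decontamination, and much more.

def teacher_generate(prompt: str) -> str:
    # Stand-in for sampling a reasoning trace from the large model.
    return f"<think>reasoning about: {prompt}</think> answer({prompt})"


def build_sft_dataset(prompts):
    # Each (prompt, completion) pair becomes one supervised example.
    return [(p, teacher_generate(p)) for p in prompts]


dataset = build_sft_dataset(["2+2?", "capital of France?"])
assert all(completion.startswith("<think>") for _, completion in dataset)
# A student base model (e.g. a Qwen or Llama checkpoint) would then be
# fine-tuned on `dataset` with ordinary next-token SFT, inheriting the
# CoT style without going through the teacher's RL training itself.
```

The key point for the thread: the student keeps its original architecture and weights as a starting point, which is why the 70B "R1" on some hosts is really a modified Llama, not R1.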



