
Be aware that if you run it locally with the open weights there is less censoring than if you use DeepSeek's hosted model interface. I confirmed this with the 7B model via ollama.

The censoring is a legal requirement of the state, per:

“Respect for China’s “social morality and ethics” and upholding of “Core Socialist Values” (Art. 4(1))”

https://www.fasken.com/en/knowledge/2023/08/chinas-new-rules...



Models other than the full ~670B one are not R1. It's crazy how many people conflate the distilled Qwen and Llama models (1.5B to 70B) with R1 when they say they're hosting it locally.

The point does stand if you're talking about using DeepSeek R1 Zero instead, which afaik you can try on Hyperbolic, and it apparently even answers the Tiananmen Square question.


What is Ollama offering here in the smaller sizes?

https://ollama.com/library/deepseek-r1
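One way to check what you're actually running is to ask the local Ollama server which model tags are installed. This is a minimal sketch using Ollama's REST API (it assumes the default server address `http://localhost:11434` and the documented `/api/tags` endpoint; if no server is running, it just reports that):

```python
# Sketch: list locally installed Ollama models via the local REST API.
# Assumes the default Ollama server at http://localhost:11434; the
# GET /api/tags endpoint returns the installed model tags.
import json
import urllib.request
import urllib.error


def list_local_models(base_url="http://localhost:11434"):
    """Return installed model tag names, or None if no server is reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None


models = list_local_models()
if models is None:
    print("Ollama server not reachable")
else:
    for name in models:
        # Tags like "deepseek-r1:7b" are the distilled Qwen/Llama variants,
        # not the full-size R1.
        print(name)
```

Note that the smaller `deepseek-r1:*` tags in the Ollama library are the distilled models discussed above, despite sharing the `deepseek-r1` name.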


That legal requirement is also finding its way into private agreements. ByteDance required US-based TikTok employees to sign agreements upholding the exact same things, effectively making TikTok a company subject to CCP policy. See details from this lawsuit:

https://dailycaller.com/2025/01/14/tiktok-forced-staff-oaths...


Is this true with Groq too?


Groq doesn't have R1, only a Llama 70B distilled from R1 outputs. Kinda crazy how they just advertise it as actual R1.


I don't quite understand what the difference between the Groq version and the actual R1 version is. Do you have a link or source that explains this?


Actual R1 is DeepSeek's own base model with CoT added by them via RL.

The distillations are other people's base models trained to add CoT, using R1's outputs. The 70B one is Llama that DeepSeek has modified.

As I understand it.
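The distillation idea described above can be sketched in toy form: the big reasoning model's chain-of-thought outputs become supervised training pairs for a smaller base model. Everything here is invented for illustration (the `teacher_generate` stand-in, the `<think>` formatting); DeepSeek's real pipeline differs in scale and detail:

```python
# Toy sketch of output distillation: collect a teacher's chain-of-thought
# answers and use them as supervised (prompt, completion) training pairs
# for a smaller model. All names are hypothetical; real pipelines add
# data filtering, decontamination, and much more.

def teacher_generate(prompt: str) -> str:
    # Stand-in for sampling a reasoning trace from the large model.
    return f"<think>reasoning about: {prompt}</think> answer({prompt})"


def build_sft_dataset(prompts):
    # Each (prompt, completion) pair becomes one supervised example.
    return [(p, teacher_generate(p)) for p in prompts]


dataset = build_sft_dataset(["2+2?", "capital of France?"])
assert all(completion.startswith("<think>") for _, completion in dataset)
# A student base model (e.g. a Qwen or Llama checkpoint) would then be
# fine-tuned on `dataset` with ordinary next-token SFT, inheriting the
# CoT style without going through the teacher's RL training itself.
```

The key point for the thread: the student keeps its original architecture and weights as a starting point, which is why the 70B "R1" on some hosts is really a modified Llama, not R1.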



