We put a lot of sweat into making it all easily transferable from one robot to another. Our goal is for all of us to have robots that feel alive and that we can develop together.
This kind of architecture is very similar to what Physical Intelligence used for Pi-0.5 (a VLM triggering a VLA in different areas), albeit at a smaller scale for now.
The agentic OS lets you abstract away from ROS2 by introducing the concept of "skills" that can be triggered by the agent; you no longer have to care about defining nodes or race conditions between them, so our framework is a little simpler in that way. Triggering a skill is (almost) the equivalent of triggering an action in ROS2, btw.
But what makes it interesting is how it's packaged and that it can be triggered by this VLM-based agent we created, called BASIC.
This agent has a special kind of architecture that gives it spatial memory, the ability to react in time, and the ability to navigate. This means skills can be triggered in the right place at the right time, like a true entity. BASIC can interrupt skills mid-execution depending on how you configure it. So if it's chilling, navigating around, and it sees you, it will interrupt navigation, come to you, shake hands, and ask if you need help instead of finishing the task without reacting.
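To make the skill idea concrete, here's a minimal sketch of what a registry of interruptible skills could look like. All names here (`Skill`, `SkillRegistry`, `trigger`) are invented for illustration; this is not the actual MARS/BASIC API.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Skill:
    """A named, self-contained behavior the agent can trigger."""
    name: str
    run: Callable[[], str]      # the behavior itself
    interruptible: bool = True  # may the agent pre-empt it mid-execution?

class SkillRegistry:
    """Replaces hand-wired ROS2 nodes with a simple name -> behavior map."""
    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def trigger(self, name: str) -> str:
        # The agent calls this instead of managing nodes directly.
        return self._skills[name].run()

registry = SkillRegistry()
registry.register(Skill("shake_hands", run=lambda: "hand shaken"))
print(registry.trigger("shake_hands"))  # -> hand shaken
```

The point is only the shape of the abstraction: skills are plain callables with metadata (like `interruptible`), so the agent can decide at runtime whether to pre-empt one, without the user thinking about nodes or race conditions.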
The example you gave is exactly this. The driving around in the example use-cases shows going from room to room and executing policies in context:
I take from your comment that the full capabilities of the robot are not properly represented, and I've made a note to film longer demos. It can definitely do what you just asked, and multiple times in a row. I will note that it depends on the training quality, of course.
On speed of movement, I now realize we didn't mention it anywhere, so I added it in the overview, but it's pretty fast: 0.7 m/s for the base, and the arm can take quite a beating. Just took this video for you:
One of our objectives here was to fix everything that we don't like about the SO-101 and the Kiwi, which have several hardware and software flaws in our view. Including, yes, the constant need for a computer to simply run your robot.
Training does require external GPUs (but we provide that infra for free, straight from the app!), but the onboard Jetson can run the trained models, as you can see in the examples. Everything you see in the vids is running onboard when it comes to manipulation, because we use a special version of ACT that we made specifically for this robot, which also includes a reward model (like DYNA does).
We built this system so it can also run the other components smoothly: it does SLAM and still has room for more processing even while running our ACT.
Now, indeed, this cannot run Pi-0, but in our experience - and that of the community in general - VLAs are not particularly better than ACT in the low-data regime, and they need a lot more compute.
As for community-level datasets, yes, this is the plan. Anything you train can already be shared with others - just share the files. We haven't built a centralized place for sharing datasets and behaviors yet, but it's on the roadmap.
That would not make it a complete product and would always require a complex setup whenever and wherever you want to use it.
This one is really, really convenient and intuitive. Turn it on anywhere, even outside, and it just works. Even when I want to dev on it, it's super convenient.
On some level I truly believe robotics has to become more "complete"; we can't always just piece things together, because that makes it very hard to build a beautiful product.
I realize this is more of a philosophical answer, but I also think it's the right one to take this field to the next level.
If we could sell it without one, we would, but this is a current technological limitation. And we still make it extra easy to connect to it from anywhere, from your phone. Several components of the robot do not need this cloud service, and because the OS is accessible to you, you could even replace it with your own way of doing things.
For this one, it's just the only feasible way we found to bring the kind of experience we created to folks.
Hi, I am currently considering a Lekiwi build but I am intrigued by Mars. Outside of the need for external compute, what issues did you find with SO101 and Kiwi?
Also I am curious about a couple of the parts, if you don't mind sharing - are those wheels the direct drive wheels from waveshare? And what is the RGBD camera? (Fwiw, even if it's hefty the MARS price tag seems fair to me).
There are several things, but for example, there is no LiDAR on it, nor even a good place to put one. If you're going to navigate around without a LiDAR or good compute for VSLAM (which is very hard to set up and VERY demanding in compute), you will very quickly get lost. As it stands, the Kiwi is only good for very local navigation (and you will still have IMU drift).
The base can also tip over if the arm is fully extended. And the SO-101 has quite poor repeatability.
The base is also slow to move, and depending on the surface, the omniwheels can pick up dirt quickly.
Finally, external compute means in particular that you have to teleoperate from your computer, so you're far from the robot and not necessarily in the same orientation as it, which is very, very uncomfortable. The app system we made is one of the things people love most about MARS.
Ah, and RGBD really does matter for navigation AND for learning (augmenting ACT with depth yields better results).
The wheels are indeed those ones, and the camera in the video is a Luxonis OAK-D Wide - pretty expensive, but comfortable to work with. However, the version we're shipping includes a much cheaper stereo-depth camera that we calibrate ourselves. I can't get you the reference right now because it's late at night, but feel free to reach out on Discord.
Ah, so that's why the camera seemed familiar, I have a couple of the luxonis cameras around the office :). Re: kiwi, those are good points. Thank you for the answer!
As a side note, the previous generation of research platforms of that size made in Asia were the TurtleBots, which go for that same price, but without a GPU, an arm...
I would say the problem is that most manufacturers, including Chinese ones, sell you platforms that are not reliable enough for AI manipulation, and there's a race to the bottom here, in which we try not to participate.
> I would say the problem is that most manufacturers, including Chinese ones, sell you platforms that are not reliable enough for AI manipulation, and there's a race to the bottom here, in which we try not to participate
Pretty lofty claims though, really think you're so above everyone on quality at this price point? I know what dynamixels are capable of, and I see the jitter in the demo videos.
Why aren't the manipulator specs easily accessible on the website? Have you run a real repeatability test? Payload even?
It's a neat high-fidelity garage build platform, but I don't see any reason to assume this price premium is due to hardware quality.
That's fine, but for future reference, robotic arms should have their specs listed and quantified - stuff like reach, payload, repeatability. If I'm a researcher, how do I know if this arm can do what I need? I can only infer so much from a few demo videos.
Final comment I'll make: it's a weird and tough price point. Actual research labs would rather spend $20,000 on a very high-quality, likely larger, high-fidelity platform. A random hacker or grad student will need some real convincing to shell out $2,000; sub-$1K might serve them better. So what's the target customer profile, exactly?
I encountered similar issues developing a $3K plug-and-play robot research arm in the past. The economics are awkward. You can actually just spend $5K and get a really good second-hand industrial robot (maybe even first-hand now, from China). Or you could spend $500 on a 6-DOF platform at least as good as your current platform's arm, buy the sensor separately, and bolt it to your workspace - bam, done. And no, the software isn't that important; servos are easy to work with...
So my 'in-between' platform was stuck in a hard place. I made some one-off sales but never really scaled the business, which is what would be needed for any fancy "we're the platform where people do AI" vision to be credible to investors. Hardware is tough - they'll see your numbers and easily pass. They'll realize you need sales in quantity to get anywhere meaningful.
So I wanted to share criticisms and my experience so you can look ahead to likely challenges and hopefully get further. Best of luck.
Absolutely, the link I sent you lists the specs you mentioned.
And yeah, I agree this mid-market is indeed tough, but this is the upper price I was looking for when I started out with my AI background: I bought a similarly priced TurtleBot and then struggled to put a cheap arm on it. Anything under this is really bad for the algorithms, though you can reduce the cost by using just the arm and clamping it to your workspace as you suggested - but then you don't have mobility.
I will keep your comment in mind, and thank you for the thoughtfulness. You might be interested to know that we intend to show something bigger not long from now. But this is, as you said, more for investors.
For now I'm content if there are enough people who want this one.
Actually, the BOM cost required to make something stable that can execute manipulation tasks well enough is around $1k+, hence our price. You will find very cheap robots that can pretend to do what this one can, but in practice they won't work well enough.
As for the Unitree robot, it is not unlocked for development, does not have an onboard GPU, and does not have an arm. If you want those, check the price they quote; it's very prohibitive.
You could attach a cheap arm to it, but it would not be stable enough for AI algorithms to run it. We're researchers ourselves; we would have made it cheaper if we could, but then you just can't do anything with it.
Our platform will deliver the experience of a real AI robot, anything cheaper than that is kind of a lie - or forces you to assemble and calibrate, which we do for you here. It is just the nature of trying to deliver a really complete product that works, and we want to stand for that.
I don't want to be too negative, but all your demos seem extremely silly.
Sure, the package is really interesting and definitely got me interested. But not one of the demos seems like a good use of the hardware. If you want to position yourselves mainly as an educational tool I don't think that is a problem. But if you want to target the 'maker' community I think you should put some thought into that.
For example, you could change the 'security guard' demo into a 'housekeeper' demo. You make it roam your house during the day and keep an up-to-date list of things you need to buy. I think this should work reasonably well for laundry and cleaning products. And after you have some historical data you could even do some forecasts about when you need to buy things again.
Another example would be to have it integrated with weather data and when it starts to rain it goes around the house to check if all windows are properly closed. On this same note it could keep track of the window state during the day and send you a reminder to open/close some windows if the temperature/humidity is above/below some threshold.
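Roughly, the rain check could be wired up like this hypothetical sketch. All function names are invented, and the real thing would call the robot's actual skill/navigation APIs and a real weather service:

```python
# Toy sketch: poll a forecast, and when rain is expected, queue a
# "check the windows" patrol through the house.
def rain_expected(forecast: dict) -> bool:
    """Pretend forecast payload; any precipitation triggers the patrol."""
    return forecast.get("precipitation_mm", 0) > 0

def window_patrol(rooms):
    # On a real robot this would chain navigation + a vision skill per room.
    return [f"checked windows in {room}" for room in rooms]

forecast = {"precipitation_mm": 2.5}  # stand-in for a weather API response
if rain_expected(forecast):
    for line in window_patrol(["kitchen", "bedroom", "office"]):
        print(line)
```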
I think that by having some more 'useful' examples you should be able to get more attention from the 'maker' community. My guess is that a lot of folks that are heavily into home automation would love to have a device like that help with random things around the house.
Best of luck with your product, and I hope you succeed because this idea looks really exciting.
This doesn't sound negative to me at all; it's actually great to read, because this is definitely on the list of potential use-cases, so we could make a video for it!
That's just one example that came to mind. I guarantee I could dig for 30 mins and find a mobile manipulator platform from China that kills it on hardware-to-price ratio that is either 'open enough' or could be made so.
As someone who's dabbled in this before, I guess I'd rather just sit down, plan a BOM, and do it myself if that's your markup anyway. Not that it's totally unreasonable for people who just want something super simple that works out of the box.
My general commentary is just that it's sad how much basic servos and what not cost in North America. We've completely ceded this industry to Asia.
Our servos come from Asia. If you can find a platform with everything we have for around a $1k BOM, I'm happy to review it, but we've been pretty deep into picking our components.
Also, fair to say that if indeed you're the kind of person who likes to assemble all of this yourself, you're not directly in our target :)
This is more for AI/software folks who don't want to assemble and calibrate everything and risk ending up with an arm that isn't repeatable and thus can't actually learn properly. We have seen many folks spend a weekend or more trying to put these together, end up with a barely working platform, and come away disgusted with AI robotics.
On BASIC: yes, it does require an internet connection, and until we figure out how this works for you, it will remain free to use!
It is required to run them, not to create them. And it's not about running "pick_up_socks"; that one can already run on your robot. BASIC is required to chain it with other tasks, such as navigating to another part of your house and then running another skill to drop the sock somewhere, for example.
Thank you for the remark, we will make it clearer in the docs
As a consequence: the robot does not necessarily require internet to run, but if you want it to chain tasks while talking and using memory, then yes, it does.
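To illustrate the distinction, here's a toy sketch of what chaining looks like conceptually: a single skill like "pick_up_socks" runs standalone onboard, while the agent-level chaining interleaves navigation and skills. The function names are hypothetical, not the real API:

```python
# Toy sketch of agent-level chaining: a plan alternates navigation
# targets with skill invocations. Each skill alone needs no agent.
def navigate_to(room: str) -> str:
    return f"arrived at {room}"

def run_skill(name: str) -> str:
    return f"ran {name}"

def chain(plan):
    """plan: list of ("goto", room) or ("skill", name) steps."""
    return [navigate_to(arg) if kind == "goto" else run_skill(arg)
            for kind, arg in plan]

plan = [("goto", "bedroom"), ("skill", "pick_up_socks"),
        ("goto", "laundry_room"), ("skill", "drop_object")]
for step in chain(plan):
    print(step)
```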
Happy to have you onboard!