
Author here. Thanks for the interest!

Multiple people moving: The system detects "motion presence" rather than counting individuals. With multiple people, CSI variance increases, so detection works well. Distinguishing how many people would require ML (on the roadmap).

Detection latency: The algorithm processes each packet in real-time (~10ms @ 100 pkt/s), so detection is essentially instant. The default Home Assistant publish rate is 1 second to avoid flooding the event bus, but it's fully configurable via publish_interval.

Accuracy vs traditional sensors: See PERFORMANCE.md in the repo for detailed benchmarks.

Main advantages over PIR: no line-of-sight needed (Wi-Fi passes through walls) and no dead zones.
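
To make the detection/publish split concrete, here's a rough Python sketch of the idea (names, thresholds, and structure are illustrative, not the actual firmware code):

    from collections import deque
    import time

    WINDOW = 100            # ~1 s of CSI packets at 100 pkt/s
    THRESHOLD = 0.5         # illustrative variance threshold
    PUBLISH_INTERVAL = 1.0  # seconds, like publish_interval

    window = deque(maxlen=WINDOW)
    last_publish = 0.0

    def on_csi_packet(amplitude):
        """Runs once per packet, so the state update itself is near-instant."""
        global last_publish
        window.append(amplitude)
        mean = sum(window) / len(window)
        variance = sum((x - mean) ** 2 for x in window) / len(window)
        state = "MOTION" if variance > THRESHOLD else "IDLE"
        now = time.monotonic()
        if now - last_publish >= PUBLISH_INTERVAL:  # throttle the event bus
            last_publish = now
            print(state)  # stand-in for the MQTT publish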

Happy to answer more questions!


Thanks for sharing the paper; I'll certainly be taking a look at that research!


Currently, ESPectre performs only binary motion detection (IDLE/MOTION) based on simple statistical thresholding.

It cannot ignore cats or prioritize size over speed directly on the device, but ESPectre's architecture is designed to enable this kind of advanced classification externally.

It collects a rich set of pre-processed features (spatial turbulence, entropy, etc.) and transmits them via MQTT.

Any external server (like a Home Assistant add-on or a dedicated Python script) can use these features as the input for a trained ML model to perform classification (e.g., Cat vs. Human vs. Fall detection vs. Gesture detection).
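
As a rough illustration of such a consumer (the topic name and payload fields below are assumptions, not ESPectre's actual schema):

    import json
    import joblib
    import paho.mqtt.client as mqtt

    model = joblib.load("motion_classifier.joblib")  # hypothetical model trained offline

    def on_message(client, userdata, msg):
        features = json.loads(msg.payload)
        x = [[features["spatial_turbulence"], features["entropy"]]]
        print("class:", model.predict(x)[0])  # e.g. cat / human / fall

    client = mqtt.Client()  # paho-mqtt 1.x style; 2.x also takes a CallbackAPIVersion
    client.on_message = on_message
    client.connect("homeassistant.local")
    client.subscribe("espectre/features")  # assumed topic
    client.loop_forever()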

Regarding the Ruckus router / beamforming: for CSI sensing, stability is generally more important than raw power. I recommend starting by disabling beamforming or reducing the power output if you experience poor motion sensitivity, as the stability of the ESP32 receiver is often the bottleneck.


Thank you, it looks fascinating. Putting it in my hobby project queue at position 1, right after my current one.


Mike, thank you so much! Coming from the founder of TOMMY, that means a lot.

I completely respect the way you've managed the beta community and licensing; it’s a smart way to reward early supporters and foster user engagement.

I wish you and the TOMMY project continued success as well!


The project’s open-source nature acts as an ethical safeguard, and I am explicitly not pursuing any identity recognition features, just movement detection.

But you are absolutely right that, in theory, misuse of this technology could reveal certain behavioral patterns that might lead to identification.

However, it can also be extremely useful for safety purposes, for example, detecting people during a house fire or an earthquake.


This particular project isn't what terrifies me. It's the technology itself, which I'm certain you won't be the last to develop, and quite likely not even the first. There are plenty of state actors that likely already have their hands on a technology like this or are working on it.


Are you using two separate S3s as a dedicated Transmitter/Receiver pair, or are both transmitting data simultaneously?


Transmitter/Receiver pair.


Hahaha, I love your brain!


That's a fair point, and as a math graduate, I absolutely agree that ML is fundamentally applied math.

When I say 'No ML,' I mean there is no training phase, no labeled data needed, and no neural network model used to infer the rules.

The distinction here is that all the logic is based purely on signal processing algorithms.

Thanks for raising the point!


Fun fact: I’m working on turning ESPectre into a Wi‑Fi Theremin (the musical instrument you play by moving your hands near an antenna).

The idea of “playing” by simply moving around a room sounds a bit ridiculous… but also kind of fun.

The key is the Moving Variance of the spatial turbulence: this value is continuous and stable, making it perfect for mapping directly to pitch/frequency, just like the original Theremin. Other features can be mapped to volume and timbre.

It’s pure signal processing, running entirely on the ESP32. Has anyone here experimented with audio synthesis or sonification using real-time signal processing?
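
For anyone curious, the mapping I have in mind looks roughly like this (feature names and ranges are illustrative):

    def scale(x, in_lo, in_hi, out_lo, out_hi):
        """Clamp x into [in_lo, in_hi], then map linearly to [out_lo, out_hi]."""
        t = min(max((x - in_lo) / (in_hi - in_lo), 0.0), 1.0)
        return out_lo + t * (out_hi - out_lo)

    def synth_params(moving_variance, entropy):
        pitch_hz = scale(moving_variance, 0.0, 1.0, 200.0, 2000.0)  # one "hand": pitch
        volume = scale(entropy, 0.0, 8.0, 0.0, 1.0)                 # the other: volume
        return pitch_hz, volume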


I've worked on some sonification projects that used signals from Xbox Kinect lidar, piezos, and other sensors. A co-author on a paper I wrote developed a "strummable" theremin that divided physical space with invisible "strings" of various tunings. We preferred running synthesis on a PC when possible and just outputting MIDI and OSC, as DSP on the ESP32 has limits for what can be achieved in under 5-10 ms. If the goal is hardware audio output, you may need to look into dedicated DSP chips and an audio shield for a better DAC, but for prototyping you can easily bang a square wave through any of the ESP32's pins.


Thanks for the insights, Quinnjh! Would love to hear more about your invisible strings tuning system!

The ESP32-S3 extracts a moving variance signal from spatial turbulence (updates at 20-50 Hz), and I want to map this directly to audio frequency using a passive buzzer + PWM (square wave, 200-2000 Hz range).

Two quick questions:

1. Do you see any pitfalls with updating PWM frequency at 20-50 Hz for responsive theremin-like behavior?

2. Any recommendations on mapping strategies - linear, logarithmic (musical scale), or quantized to specific notes?


you may be interested in some tech details on that project's prototypes here: https://www.quinnjh.net/projects/adaptive-instruments-projec...

As for the tuning system, we didn't get great demo recordings of it, but the invisible strings were linearly mapped as a range onto degrees of a given scale. In our use case (letting people with disabilities jam without too much dissonance), the key/scale and the master tempo were broadcast to each instrument.

Would have been interesting to play more with custom tunings, but the users we were designing for would have had a harder time using it consonantly. FWIW, fully-abled folks like myself sound pretty bad on the theremin, and seeing people play them in orchestras displays an impressive level of "virtuosity" in placing the hands properly. Quantizing the range of possible positions helps, but the tradeoff is sacrificing expressivity.

As for 1): yes, there will definitely be some pitfalls with the relatively slow updates, which may show up as "zipper noise" artifacts in the output.

For 2): logarithmic mapping between position and pitch is traditionally theremin-like, but since the theremin avoids zippering by being analog, you'll have to get creative with some smoothing/lerping and potentially further quantization. That's the fun and creative bit, though!
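
Something like this, as a sketch (MicroPython-flavored; the pin and constants are assumed, and note that quantizing after the lerp trades expressivity for consonance, as above):

    import math
    from machine import Pin, PWM

    buzzer = PWM(Pin(4), freq=440, duty=512)  # passive buzzer on GPIO4 (assumed)

    F_MIN, F_MAX = 200.0, 2000.0
    SMOOTH = 0.2          # lerp factor; lower = smoother but laggier
    current_hz = 440.0

    def quantize_semitone(hz):
        # Snap to the nearest equal-tempered semitone (A440 reference).
        n = round(12 * math.log(hz / 440.0) / math.log(2))
        return 440.0 * 2 ** (n / 12)

    def update_pitch(variance_norm):
        # variance_norm in [0, 1]; called at the 20-50 Hz feature rate.
        global current_hz
        target = F_MIN * (F_MAX / F_MIN) ** variance_norm  # logarithmic mapping
        current_hz += SMOOTH * (target - current_hz)       # lerp against zipper noise
        buzzer.freq(int(quantize_semitone(current_hz)))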

Would love to hear about your project again and what approaches you take, and happy to answer other q's so feel free to drop me a line.


Great project and great idea, thank you for sharing!

I don't know if it's useful but one technique I have used in sonification during the experimentation phase is to skip the real time aspect, capture all the available "channels" and generate all the possible permutations of what is mapped where.

Then you can listen to the outputs, see what sounds good, and then test it in real time to check if the musicality is actually a result of the physical interaction and not an artifact or a product of noise.
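
In code terms, the offline pass can be as simple as this sketch (channel and parameter names hypothetical, rendering stubbed out):

    from itertools import permutations

    # Hypothetical recorded feature channels (one list of samples each).
    recorded = {
        "variance": [0.1, 0.4, 0.9, 0.3],
        "entropy": [2.0, 2.1, 3.5, 2.2],
        "turbulence": [0.5, 0.6, 0.4, 0.8],
    }
    params = ["pitch", "volume", "timbre"]

    def render_audio(assignment, data, out):
        # Stub: a real version would synthesize audio and write a .wav file.
        print(out, assignment)

    # One rendering per possible channel -> parameter assignment.
    for mapping in permutations(recorded, len(params)):
        assignment = dict(zip(params, mapping))
        render_audio(assignment, recorded, out="take_" + "_".join(mapping) + ".wav")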


Thank you, 4goturnamesagain.

My first step is to 'listen' to the raw channels and features to quickly find which mapping produces the most musically coherent (i.e., clean and physically predictable) output.

If it sounds like white noise, the mapping is bad or the signal is an artifact.

If it sounds like a sine wave moving predictably, the physics are sound.


By "GPLv3" do you mean "GPL-3.0-or-later"? https://www.gnu.org/licenses/gpl-faq.html#VersionThreeOrLate...


I wonder if somebody could make an open hardware version of the Leap Motion with this technique (though I'm not sure how accurate/repeatable the sensing is - Leap Motion could detect with an accuracy of < 0.7mm)


That's a great thought. The challenge is that Wi-Fi sensing (CSI) is measuring multipath changes across a few meters, not direct motion in a small volume like Leap Motion. I think its accuracy is measured in centimeters, not sub-millimeters.


I'm sure the kids will love this! Wi-Fi Theremin sounds great.


You hit the nail on the head! That's precisely the motivation.

Having two kids myself, I've thought of turning it into a game: blindfolded hide-and-seek where the pitch of the Wi-Fi Theremin tells the seeker how close they are to the 'signal disruption' of the other person. It's essentially a real-time sonar game!


Exactly! That's the kind of fun I had in mind.


Sure, the ESP32 will connect to whichever mesh node provides the best 2.4 GHz signal:

- It monitors CSI from that specific node (the one it's associated with)

- If the ESP32 roams to a different mesh node, it will start monitoring CSI from the new node

The system doesn't care about the router's internal mesh topology; it just needs a stable connection to receive CSI data from the associated access point.


In terms of layout of rooms and useful monitoring, you have to be able to configure which node it attaches to, right? Because it's going to monitor the physical space between itself and that node.

So you might have an ESP32 placed across the room from one mesh node to monitor that particular room. But if that ESP32 roams to, say, the mesh node on the floor above, it's going to monitor a much less useful space: just the vertical space between itself and the mesh node on the floor above.

Am I envisioning this correctly? I'm thinking it's a problem for systems like eero, where you can't lock a device to a particular mesh node.


On the critical topic of mesh routers and roaming, a possible solution is to force the ESP32 to lock onto the MAC address of a single access point, as discussed here:

https://github.com/francescopace/espectre/discussions/6
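
In MicroPython terms (shown just to illustrate the idea; SSID, password, and MAC below are placeholders), pinning to one AP looks like this:

    import network

    AP_BSSID = b"\xaa\xbb\xcc\xdd\xee\xff"  # MAC of the mesh node to sense against

    wlan = network.WLAN(network.STA_IF)
    wlan.active(True)
    # Passing bssid restricts association to that one access point,
    # so roaming can't silently change the sensed path.
    wlan.connect("MySSID", "password", bssid=AP_BSSID)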

