Something really cool about this approach is that marching squares is highly parallelizable: each cell can be processed independently of its neighbors. That makes it a good candidate for GPU acceleration, provided there is enough memory.
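To make the independence concrete, here is a minimal Python sketch of the per-cell classification step only (no contour geometry). The names `cell_case`, `case_row`, and `all_cases`, the corner-to-bit ordering, and the `>= iso` inside test are my own assumptions for illustration, not taken from any particular library. The thread pool is there only to show that rows can be handed out in any order; in CPython it will not actually speed up this pure-Python loop.

```python
from concurrent.futures import ThreadPoolExecutor

def cell_case(grid, row, col, iso):
    """Return the 4-bit case index of the cell whose top-left sample is (row, col)."""
    corners = (
        grid[row][col],          # bit 0: top-left
        grid[row][col + 1],      # bit 1: top-right
        grid[row + 1][col + 1],  # bit 2: bottom-right
        grid[row + 1][col],      # bit 3: bottom-left
    )
    # A corner contributes its bit when it is at or above the iso value.
    return sum(1 << bit for bit, value in enumerate(corners) if value >= iso)

def case_row(grid, row, iso):
    """Classify every cell in one row; reads the grid, writes nothing shared."""
    return [cell_case(grid, row, col, iso) for col in range(len(grid[0]) - 1)]

def all_cases(grid, iso, workers=4):
    """Classify all cells, one task per row of cells."""
    rows = range(len(grid) - 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: case_row(grid, r, iso), rows))

# A 3x3 grid of samples with a single peak in the middle gives 2x2 cells.
grid = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
cases = all_cases(grid, iso=0.5)
print(cases)  # [[4, 8], [2, 1]]
```

No cell reads another cell's result, which is the property a GPU kernel would exploit by assigning one thread per cell.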
I implemented marching cubes in the 2000s to extract isosurface meshes from PET scans. As far as I can remember, I had it running in real time on ordinary CPUs. The volumes were on the order of 512^3, if memory serves. I guess GPUs could help with very large volumes, though.
You wouldn't need marching cubes to run in real time: it runs once and generates an iso-surface mesh, which can then be rendered efficiently in hardware.
Edit: my thesis advisor was the first to prove that there are exactly 15 distinct voxel configurations in MC, which lets you do a constant-time lookup of the (configuration, rotation) pair for the vertices on the edges of any particular voxel. http://graphics.stanford.edu/courses/cs164-10-spring/Handout...
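For anyone who wants to check the count, here is a rough Python sketch that enumerates all 2^8 = 256 corner configurations and groups them under the 24 cube rotations plus inside/outside complement. The corner numbering (k = x + 2y + 4z), the helper names, and the choice of symmetries are my assumptions, not the handout's conventions. It only classifies configurations; a real implementation would also store the triangles for each case in the table.

```python
# Corner k of the unit cube sits at (x, y, z) with k = x + 2*y + 4*z.
CORNERS = [(k & 1, (k >> 1) & 1, (k >> 2) & 1) for k in range(8)]

def as_permutation(move):
    """Turn a map on corner coordinates into perm, where perm[k] is the image of corner k."""
    perm = []
    for corner in CORNERS:
        x, y, z = move(*corner)
        perm.append(x + 2 * y + 4 * z)
    return tuple(perm)

# Quarter turns about the z and x axes; together they generate every cube rotation.
ROT_Z = as_permutation(lambda x, y, z: (1 - y, x, z))
ROT_X = as_permutation(lambda x, y, z: (x, 1 - z, y))

def rotation_group():
    """Close the two generators under composition to get all rotations."""
    identity = tuple(range(8))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for gen in (ROT_Z, ROT_X):
            h = tuple(gen[g[k]] for k in range(8))  # apply g, then gen
            if h not in group:
                group.add(h)
                frontier.append(h)
    return sorted(group)

ROTATIONS = rotation_group()

def apply(perm, config):
    """Move the 'inside' bit of every corner k to corner perm[k]."""
    out = 0
    for k in range(8):
        if (config >> k) & 1:
            out |= 1 << perm[k]
    return out

def canonical(config):
    """Smallest equivalent configuration under rotation and complement."""
    candidates = []
    for perm in ROTATIONS:
        rotated = apply(perm, config)
        candidates.append(rotated)
        candidates.append(rotated ^ 0xFF)  # complement: swap inside and outside
    return min(candidates)

# Built once; after this, classifying a voxel is a single list index.
CLASS_OF = [canonical(config) for config in range(256)]

def cube_index(corner_values, iso):
    """8-bit index: bit k is set when corner k is at or above the iso value."""
    return sum(1 << k for k, v in enumerate(corner_values) if v >= iso)

print(len(ROTATIONS))      # 24
print(len(set(CLASS_OF)))  # 15
```

At run time you compute `cube_index` from the eight corner samples and read `CLASS_OF[index]`, so the per-voxel cost does not depend on the configuration.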