Realtime Dense Mesh Capture
If we wanted the most detailed, most accurate per-frame reconstruction of a subject’s mouth, how might we go about capturing that?
My dream scenario for capturing and translating accurate human speech looks something like a (nearly) per-frame detailed point cloud. From there, we could annotate and solve tracking issues in 3D, as well as have access to the detailed curvature of the lips and mouth region - all helping to inform our rig on how to drive the render mesh. In some cases, these points could directly drive the render mesh.
Assuming that this type of detailed information is useful, how might one go about capturing such a massive amount of data?
Lasers!
Well, probably not. Even state-of-the-art laser scanners aren’t really meant to capture multiple frames per second. In some cases, a single scan could take several seconds. But the resulting data is beautiful, and nearly perfect - especially for a close-up subject like a human face. But those eyes - ouch.
How about photogrammetry?
Is it even possible to capture a dense mesh in realtime? Well, not really realtime, but there is another solution that I can think of. It’s been demonstrated all over the place, generally in re-creating faces/objects and environments.
It also rests on the same idea behind how we use SolvePnP to get a head pose estimate: relating 2D image points to 3D points through a camera model.
What if we had a rig with multiple cameras (crudely pictured to the right), all capturing at a reasonably high resolution and a reasonably high frame rate? Each camera would be fixed to the rig at a specific location and aimed at a common focal point.
We could then record sessions as we normally do, then post-process the data from each of the cameras on the rig. We could sync the frames from all cameras and solve for the subject’s face on a per-frame basis.
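For the curious, here is a rough sketch of what "solving" a single point across synced cameras could look like. It uses linear (DLT) triangulation in Python with NumPy, which is one common approach and not necessarily what any particular photogrammetry package does internally. The camera layout, the focal length, and the helper names (look_at_projection, project, triangulate) are all made up for illustration, and the cameras are assumed to be calibrated already.

```python
import numpy as np

def look_at_projection(cam_pos, target, focal_px, cx, cy):
    """Build a 3x4 pinhole projection matrix for a camera at cam_pos aimed at target.

    Assumes world "up" is +Y and that the camera is not looking straight up or down.
    """
    cam_pos = np.asarray(cam_pos, dtype=float)
    forward = np.asarray(target, dtype=float) - cam_pos
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, [0.0, 1.0, 0.0])
    right /= np.linalg.norm(right)
    down = np.cross(forward, right)
    R = np.stack([right, down, forward])      # rows map world axes into camera axes
    t = -R @ cam_pos
    K = np.array([[focal_px, 0.0, cx],
                  [0.0, focal_px, cy],
                  [0.0, 0.0, 1.0]])
    return K @ np.hstack([R, t[:, None]])

def project(P, X):
    """Project a 3D point X into pixel coordinates with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(projections, pixels):
    """Linear (DLT) triangulation of one point seen in two or more views."""
    rows = []
    for P, (u, v) in zip(projections, pixels):
        rows.append(u * P[2] - P[0])          # each view contributes two equations
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]                                # null-space vector = homogeneous point
    return X[:3] / X[3]

# Hypothetical rig: four cameras on an arc, 0.6 m from the subject, all aimed at the origin.
angles = np.radians([-30.0, -10.0, 10.0, 30.0])
cameras = [
    look_at_projection((0.6 * np.sin(a), 0.0, -0.6 * np.cos(a)), (0.0, 0.0, 0.0),
                       focal_px=2000.0, cx=960.0, cy=540.0)
    for a in angles
]

point = np.array([0.01, -0.02, 0.03])         # a made-up point near the lips
pixels = [project(P, point) for P in cameras] # what each camera would "see" in one synced frame
recovered = triangulate(cameras, pixels)
print(recovered)
```

In a real session the pixel positions would come from matching features across the synced frames rather than from projecting a known point, and they would be noisy. That is where the extra cameras start to pay off, since every additional view adds two more equations to the same least-squares solve.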
This would result in a dense per-frame mesh with a high level of detail and accuracy. In general, the more cameras, the better the coverage and accuracy. Is this too much data? I have no idea.
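A quick back-of-envelope calculation can at least put a number on the "too much data" question. The sketch below is in Python, and every input is an assumption picked for illustration: six cameras, 1920x1080, 60 fps, uncompressed 8-bit RGB. The function name raw_data_rate_bytes_per_s is hypothetical too.

```python
def raw_data_rate_bytes_per_s(num_cameras, width, height, fps, bytes_per_pixel=3):
    """Uncompressed video data rate for the whole rig, in bytes per second."""
    return num_cameras * width * height * bytes_per_pixel * fps

# Hypothetical rig settings, not measurements from any real setup.
rate = raw_data_rate_bytes_per_s(num_cameras=6, width=1920, height=1080, fps=60)
session_bytes = rate * 10 * 60                # a 10 minute recording session

print(f"{rate / 1e9:.2f} GB/s raw")
print(f"{session_bytes / 1e12:.2f} TB for a 10 minute session")
```

Under those assumptions the rig produces roughly 2.2 GB/s, or about 1.3 TB for ten minutes of raw footage. Cameras that record compressed video would bring that down a lot, so treat this as an upper bound rather than a prediction.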
Check out the video below. That’s a tremendous amount of mesh data from a few DSLRs…
Wait, I thought David Fincher invented Photogrammetry??