Camera- and AI-Based Mocap

AI- and webcam-based motion capture is a form of markerless motion capture: instead of using suits with sensors or reflective markers, the system analyzes video images from one or more cameras and estimates the position of the human body using computer vision and machine learning. It can often run using ordinary webcams, smartphones, or consumer cameras.

How AI-Based Mocap Works

1. Video Capture

A webcam, smartphone, DSLR, or multiple cameras record a performer moving through space.
Depending on the system:

2. Pose Estimation

AI models analyze each video frame and detect key body points such as:

These points are often called landmarks or keypoints. The AI has been trained on massive datasets of human movement, allowing it to recognize body posture even under imperfect lighting or partial occlusion.
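To make the idea of keypoints concrete, here is a minimal sketch of how one frame of pose-estimation output could be represented and filtered. The keypoint names, the `Keypoint` type, and the 0.5 threshold are illustrative assumptions, not the output format of any particular tool.

```python
from typing import NamedTuple


class Keypoint(NamedTuple):
    x: float           # horizontal position, normalized to 0..1
    y: float           # vertical position, normalized to 0..1
    confidence: float  # how sure the model is about this point, 0..1


def reliable_keypoints(frame, min_confidence=0.5):
    """Keep only the keypoints whose confidence reaches the threshold."""
    return {name: kp for name, kp in frame.items() if kp.confidence >= min_confidence}


# Hypothetical result for a single frame. The left wrist is assumed to be
# occluded, so the model reports it with low confidence.
frame = {
    "left_shoulder": Keypoint(0.42, 0.30, 0.97),
    "left_elbow": Keypoint(0.40, 0.45, 0.91),
    "left_wrist": Keypoint(0.39, 0.58, 0.12),
}

print(sorted(reliable_keypoints(frame)))
```

Dropping low-confidence points is only one possible strategy; a system could also keep them and interpolate from neighboring frames.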

3. Skeleton Reconstruction

The detected points are connected into a digital skeleton or “rig.”
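As a rough illustration of this step, the sketch below connects three keypoints into two bones and derives a bone length and a joint angle from them. The keypoint names, the pixel coordinates, and the helper functions are assumptions made for the example; real systems work with many more joints and usually in 3D.

```python
import math

# Hypothetical 2D keypoints in pixel coordinates (x, y).
keypoints = {
    "shoulder": (100.0, 100.0),
    "elbow": (100.0, 200.0),
    "wrist": (200.0, 200.0),
}

# A skeleton is described here as a list of bones: (parent, child) keypoint names.
BONES = [("shoulder", "elbow"), ("elbow", "wrist")]


def bone_length(points, bone):
    """Euclidean distance between the two keypoints of a bone."""
    return math.dist(points[bone[0]], points[bone[1]])


def joint_angle(points, a, joint, b):
    """Angle in degrees at `joint` between the segments joint->a and joint->b."""
    ax, ay = points[a][0] - points[joint][0], points[a][1] - points[joint][1]
    bx, by = points[b][0] - points[joint][0], points[b][1] - points[joint][1]
    cos_angle = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    # Clamp to guard against tiny floating-point overshoots outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


for bone in BONES:
    print(bone, bone_length(keypoints, bone))
print("elbow angle:", joint_angle(keypoints, "shoulder", "elbow", "wrist"))
```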
The software estimates:

Advanced systems may also track:

4. Retargeting

The motion data is transferred (“retargeted”) onto:

This allows a performer's movement to drive digital media, either from recordings or in real time.
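The core idea of retargeting can be sketched with a simple 2D arm: the captured bone angles are reused, while the bone lengths come from the target character. The function name, the angles, and the bone lengths below are assumptions for illustration; production tools do this in 3D with full joint rotations.

```python
import math


def forward_kinematics(root, bone_lengths, bone_angles_deg):
    """Place the joints of a 2D chain from bone lengths and absolute bone angles."""
    x, y = root
    positions = [root]
    for length, angle in zip(bone_lengths, bone_angles_deg):
        x += length * math.cos(math.radians(angle))
        y += length * math.sin(math.radians(angle))
        positions.append((x, y))
    return positions


# Captured motion for one frame: absolute angles of the upper arm and forearm.
captured_angles = [0.0, 90.0]

# Same angles, different proportions: the character's arm is twice as long.
performer = forward_kinematics((0.0, 0.0), [30.0, 25.0], captured_angles)
character = forward_kinematics((0.0, 0.0), [60.0, 50.0], captured_angles)

print("performer wrist:", performer[-1])
print("character wrist:", character[-1])
```

Transferring angles rather than positions is what lets the same performance drive characters with different body proportions.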


Example Tools:

MediaPipe

OpenPose

FreeMoCap

MediaPipe

MediaPipe is an open-source framework developed by Google for real-time AI perception.

It includes models for:

  • body tracking
  • hand tracking
  • face tracking
  • gesture recognition

MediaPipe is widely used because it:

  • works in browsers, Python, mobile apps, and game engines
  • runs efficiently on consumer hardware
  • supports real-time interaction
  • is relatively easy to integrate into creative coding environments

MediaPipe is often connected to:

  • TouchDesigner
  • Unity
  • Unreal Engine
  • Resolume
  • Blender
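For orientation, a webcam body-tracking loop with MediaPipe in Python might look roughly like the sketch below. It assumes the `opencv-python` and `mediapipe` packages are installed and that the installed MediaPipe version still ships the legacy `mp.solutions.pose` interface. The function names `landmark_to_pixels` and `run_webcam` are made up for this example.

```python
def landmark_to_pixels(landmark, width, height):
    """Convert a landmark with normalized x/y (0..1) to integer pixel coordinates."""
    return int(landmark.x * width), int(landmark.y * height)


def run_webcam(camera_index=0, max_frames=100):
    """Print the pixel position of the nose for up to `max_frames` webcam frames."""
    # Third-party imports are kept local so the helper above works without them.
    import cv2
    import mediapipe as mp

    mp_pose = mp.solutions.pose
    cap = cv2.VideoCapture(camera_index)
    try:
        with mp_pose.Pose(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as pose:
            for _ in range(max_frames):
                ok, frame = cap.read()
                if not ok:
                    break  # no camera or no more frames
                height, width = frame.shape[:2]
                # MediaPipe expects RGB images, while OpenCV delivers BGR.
                results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                if results.pose_landmarks:
                    nose = results.pose_landmarks.landmark[mp_pose.PoseLandmark.NOSE]
                    print(landmark_to_pixels(nose, width, height))
    finally:
        cap.release()
```

Calling `run_webcam()` requires a connected camera. The printed coordinates could then be forwarded to another tool, for example over OSC, which is outside the scope of this sketch.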

OpenPose

OpenPose is one of the foundational open-source AI pose estimation systems, developed at Carnegie Mellon University.

Widely used in:

  • research
  • interactive installations
  • experimental audiovisual systems
  • dance technology
  • real-time applications

It can track:

  • full body
  • hands
  • fingers
  • facial landmarks
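OpenPose can write its results as JSON files (via its `--write_json` option), in which each detected person carries a flat `pose_keypoints_2d` list of x, y, confidence values. The sketch below shows one way such a file could be parsed. The sample string is shortened to two keypoints and its values are invented for illustration.

```python
import json


def parse_openpose_people(json_text):
    """Return, per person, a list of (x, y, confidence) tuples."""
    data = json.loads(json_text)
    people = []
    for person in data.get("people", []):
        flat = person.get("pose_keypoints_2d", [])
        # Group the flat list [x1, y1, c1, x2, y2, c2, ...] into triples.
        triples = [tuple(flat[i:i + 3]) for i in range(0, len(flat) - 2, 3)]
        people.append(triples)
    return people


# Shortened, made-up sample; a real file contains many more keypoints per person.
sample = '{"people": [{"pose_keypoints_2d": [320.5, 110.2, 0.93, 318.0, 180.7, 0.88]}]}'

for index, keypoints in enumerate(parse_openpose_people(sample)):
    print("person", index, keypoints)
```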

FreeMoCap

FreeMoCap is an open-source motion capture system focused on accessible, research-based full-body capture.

Unlike simple webcam pose estimation, FreeMoCap can use:

  • multiple webcams
  • synchronized cameras
  • AI pose estimation
  • biomechanical reconstruction

It combines tools such as:

  • MediaPipe
  • OpenCV
  • scientific motion analysis workflows

The system triangulates body positions from multiple camera angles to reconstruct motion in 3D space more accurately than a single webcam setup.
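The principle behind triangulation can be sketched with two cameras: each camera sees a keypoint along a ray, and the 3D position is estimated where the two rays come closest. This is a simplified midpoint method with made-up camera positions, not FreeMoCap's actual implementation, which relies on camera calibration and more than two views.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(origin1, dir1, origin2, dir2):
    """Estimate a 3D point as the midpoint of the closest points on two rays."""
    w = tuple(p - q for p, q in zip(origin1, origin2))
    a, b, c = _dot(dir1, dir1), _dot(dir1, dir2), _dot(dir2, dir2)
    d, e = _dot(dir1, w), _dot(dir2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    t1 = (b * e - c * d) / denom  # distance parameter along ray 1
    t2 = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = tuple(o + t1 * v for o, v in zip(origin1, dir1))
    p2 = tuple(o + t2 * v for o, v in zip(origin2, dir2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Two hypothetical cameras, 2 m apart, both looking at a joint 5 m in front of them.
camera_left, camera_right = (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)
ray_left, ray_right = (1.0, 0.0, 5.0), (-1.0, 0.0, 5.0)

print(triangulate_midpoint(camera_left, ray_left, camera_right, ray_right))
```

With a single camera there is only one ray per keypoint, so depth along that ray has to be guessed by the model. A second viewpoint removes that ambiguity.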

FreeMoCap is mostly used for recordings rather than real-time capture.

Move.ai

DeepMotion

Rokoko Vision

Move.ai

Move.ai uses AI and multiple cameras or phones for high-quality markerless mocap without suits. It is often used in:

  • virtual production
  • indie filmmaking
  • game animation
  • previs workflows

It is mostly used for processed (offline) capture, with some live workflows emerging.

DeepMotion

https://www.deepmotion.com/

DeepMotion lets users upload ordinary video footage and automatically generates motion capture animation from it using AI.

Useful for:

  • rapid prototyping
  • avatar animation
  • virtual influencers
  • metaverse applications

The free version tracks up to 2 people and does not run in real time.

Rokoko Vision

Rokoko was originally known for inertial mocap suits, but now also supports AI webcam-based tracking through Rokoko Vision.

Popular in:

  • indie animation
  • live performance
  • virtual characters
  • VTubing
  • interactive installations

Free for recordings of up to 15 seconds.

 


Revision #1
Created 2026-05-12 10:43:32 UTC by Astrid
Updated 2026-05-12 11:03:41 UTC by Astrid