Camera Tracking
CONTACT
Mr. Pete Hughes: +44 (0)1865 811 060
Mr. Danny Proko: +1 949 540 0740
2d3 has been developing and refining technology that exploits structure-from-motion principles from vision science since 1999. Using only the information contained in a moving image sequence, 2d3's camera tracking capabilities make it possible to calculate the path of the originating camera in 3D space and to describe the 3D position of 2D features within the source image sequence.
The process involves calibrating the camera's internal parameters, such as focal length and film back dimensions, as well as its motion in 3D space. 3D geometry points are created from features identified in the image sequence. Using these points, their 2D feature positions and the calibrated camera parameters, the camera path and point positions can then be optimized to produce the most accurate recreation of the source camera.
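The quantity being optimized can be illustrated with a small sketch. It assumes an idealised pinhole camera with no rotation and no lens distortion, and measures how far projected 3D points land from their observed 2D feature positions. The function names, the 36 x 24 mm film back and the sample points are hypothetical choices for illustration, not 2d3's implementation.

```python
import math

def project(point, cam_pos, focal_mm, film_back_mm, image_size_px):
    """Project a 3D point with an unrotated pinhole camera looking down +Z."""
    x, y, z = (p - c for p, c in zip(point, cam_pos))
    if z <= 0:
        raise ValueError("point is behind the camera")
    # Position on the film plane in millimetres.
    u_mm = focal_mm * x / z
    v_mm = focal_mm * y / z
    # Convert millimetres to pixels using the film back dimensions.
    sx = image_size_px[0] / film_back_mm[0]
    sy = image_size_px[1] / film_back_mm[1]
    return (image_size_px[0] / 2 + u_mm * sx, image_size_px[1] / 2 + v_mm * sy)

def rms_reprojection_error(points3d, observations, cam_pos,
                           focal_mm, film_back_mm, image_size_px):
    """Root-mean-square pixel distance between projections and observations."""
    total = 0.0
    for p, (u_obs, v_obs) in zip(points3d, observations):
        u, v = project(p, cam_pos, focal_mm, film_back_mm, image_size_px)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(total / len(points3d))

# Hypothetical setup: 35 mm lens, 36 x 24 mm film back, 1920 x 1280 px image.
FILM_BACK = (36.0, 24.0)
IMAGE_SIZE = (1920, 1280)
ORIGIN = (0.0, 0.0, 0.0)
points = [(1.0, 0.5, 10.0), (-2.0, 1.0, 8.0), (0.5, -1.5, 12.0)]
observed = [project(p, ORIGIN, 35.0, FILM_BACK, IMAGE_SIZE) for p in points]

# The error is smallest when the candidate focal length matches the true one.
for focal in (30.0, 35.0, 40.0):
    err = rms_reprojection_error(points, observed, ORIGIN, focal,
                                 FILM_BACK, IMAGE_SIZE)
    print(f"focal {focal} mm -> RMS error {err:.2f} px")
```

A full solver would adjust camera rotation, position, internal parameters and the 3D points together to minimise this error across every frame; the sketch only varies the focal length to show the principle.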
Camera calibration and tracking for the entertainment industry has traditionally been an offline process performed after the images have been captured. However, our technology also includes the ability to track cameras in real time using automatically identified natural 2D features or fiducial markers placed in the environment. This real-time capability has an enormous range of potential applications, from on-set feature film pre-visualisation to real-time augmented reality virtual tours in the tourism industry.
We can also calibrate and recover virtual cameras from a set of still images, allowing, for example, measurements of the environment to be made using the image set alone.
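As a rough illustration of measuring from images alone, the sketch below assumes two cameras have already been calibrated, so that each 2D feature corresponds to a known viewing ray in 3D. It triangulates each feature as the midpoint of the closest approach between its two rays, then measures the distance between the two recovered points. The camera positions, rays and function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))

# Hypothetical recovered camera centres, one unit apart.
cam_a = (0.0, 0.0, 0.0)
cam_b = (1.0, 0.0, 0.0)

# Viewing rays towards two scene features, as seen from each camera.
rays_a = [(0.2, 0.1, 4.0), (0.7, 0.1, 4.0)]
rays_b = [(-0.8, 0.1, 4.0), (-0.3, 0.1, 4.0)]

p, q = (triangulate_midpoint(cam_a, ra, cam_b, rb)
        for ra, rb in zip(rays_a, rays_b))
print("distance between features:", math.dist(p, q))
```

The result is expressed in the same units as the camera baseline, so a real measurement needs at least one known distance in the scene to fix the scale.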
2d3's camera tracking capability relies on algorithms developed in the field of computer vision, in particular the technique known as structure from motion, which underlies many 2d3 products and is one of the key components in various 2d3 capabilities.
For a stream of video images or multiple still images, the structure from motion approach involves the automatic identification of hundreds or possibly thousands of distinctive points that appear in areas of high contrast or high texture. Tracking the motion of these salient feature points in two dimensions through multiple video frames, or matching corresponding features in multiple still images, allows their two-dimensional trajectories to be computed.
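One common way to find such distinctive points is to score each pixel by how strongly the image intensity varies in every direction around it. The sketch below uses the minimum-eigenvalue (Shi-Tomasi) corner measure on a tiny synthetic image; this is a textbook detector chosen for illustration, and the source does not state which detector 2d3 uses.

```python
import math

def corner_scores(img, window=1):
    """Minimum-eigenvalue corner score for each pixel of a 2D intensity grid."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; the one-pixel border is left at zero.
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    scores = [[0.0] * w for _ in range(h)]
    for y in range(window, h - window):
        for x in range(window, w - window):
            # Accumulate the structure tensor [[a, b], [b, c]] over the window.
            a = b = c = 0.0
            for dy in range(-window, window + 1):
                for dx in range(-window, window + 1):
                    ix, iy = gx[y + dy][x + dx], gy[y + dy][x + dx]
                    a += ix * ix
                    b += ix * iy
                    c += iy * iy
            # Smaller eigenvalue: large only if intensity varies in both directions.
            scores[y][x] = (a + c) / 2.0 - math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return scores

# Synthetic 12 x 12 image: a bright square whose corner sits at (x, y) = (6, 6).
SIZE = 12
image = [[1.0 if x >= 6 and y >= 6 else 0.0 for x in range(SIZE)]
         for y in range(SIZE)]
scores = corner_scores(image)
best = max(((x, y) for y in range(SIZE) for x in range(SIZE)),
           key=lambda p: scores[p[1]][p[0]])
print("strongest feature at (x, y) =", best)
```

Pixels along a straight edge score close to zero because the intensity changes in only one direction there, which is why corners and textured regions make better features to track than edges.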

Structure from motion enables the three-dimensional movement of the camera to be inferred from the 2D motion observed in the image sequence. Visible in the image above is a red line indicating the inferred trajectory of the camera in 3D space and the camera view frusta for some of the frames.
The quality of a camera solution depends to an extent on the quality of the 2D feature tracks used in the calibration process. By using a large set of hundreds or thousands of automatically identified feature tracks, and applying statistical approaches to identify the primary motion within a sequence and discard tracks whose motion is inconsistent with it, 2d3's camera tracking capabilities offer unparalleled robustness against conflicting motions and fragmented tracks within a scene.
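The idea of keeping the primary motion and discarding inconsistent tracks can be sketched with RANSAC, a widely used robust estimation scheme. The source does not name the statistical method 2d3 uses, so this is an assumption. The example is also deliberately simplified: it models the primary motion as a single 2D translation between two frames, whereas a real solver would test tracks against a camera geometry model. All names and data are hypothetical.

```python
import math
import random

def dominant_motion(displacements, threshold=1.0, iterations=50, seed=0):
    """RANSAC with a 2D translation model: returns (motion, inlier indices)."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: a single track proposes a candidate motion.
        dx, dy = rng.choice(displacements)
        inliers = [i for i, (u, v) in enumerate(displacements)
                   if math.hypot(u - dx, v - dy) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine the estimate by averaging over the supporting tracks.
    n = len(best_inliers)
    mx = sum(displacements[i][0] for i in best_inliers) / n
    my = sum(displacements[i][1] for i in best_inliers) / n
    return (mx, my), best_inliers

# Eight tracks on the static background, three on an independently moving object.
tracks = [
    (5.0, 0.0), (5.2, -0.1), (4.9, 0.1), (5.1, 0.2),
    (4.8, -0.2), (5.0, 0.1), (5.3, 0.0), (4.7, 0.1),
    (-3.0, 4.0), (-3.1, 4.2), (-2.9, 3.9),
]
motion, inliers = dominant_motion(tracks)
rejected = [i for i in range(len(tracks)) if i not in inliers]
print("primary motion:", motion)
print("rejected tracks:", rejected)
```

With many more tracks than the minimal sample needs, the largest consistent group dominates the vote, so tracks on a moving object or mismatched features are rejected rather than skewing the camera solution.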
A by-product of this process is the generation of a sparse representation of the 3D scene structure, a point cloud as shown in the above screenshot. 2D features and 3D feature predictions are indicated by red crosses and coloured circles respectively.
Also recovered are internal camera parameters, such as focal length and lens distortion coefficients, allowing image correction to be performed prior to application of any further processing.
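To show how a recovered distortion coefficient enables image correction, the sketch below assumes the simplest common model: a single radial coefficient k1 applied to normalised image coordinates (measured from the image centre and divided by the focal length). The model choice, the value of k1 and the function names are assumptions for illustration; the source does not specify 2d3's lens model.

```python
def distort(xu, yu, k1):
    """Apply one-coefficient radial distortion to an undistorted point."""
    scale = 1.0 + k1 * (xu * xu + yu * yu)
    return xu * scale, yu * scale

def undistort(xd, yd, k1, iterations=20):
    """Invert the model by fixed-point iteration, starting from the distorted point."""
    xu, yu = xd, yd
    for _ in range(iterations):
        scale = 1.0 + k1 * (xu * xu + yu * yu)
        xu, yu = xd / scale, yd / scale
    return xu, yu

# Hypothetical mild barrel distortion.
K1 = -0.1
original = (0.5, 0.3)
observed_pt = distort(*original, K1)
corrected = undistort(*observed_pt, K1)
print("observed: ", observed_pt)
print("corrected:", corrected)
```

The iteration converges quickly for mild distortion such as this; strongly distorted lenses usually need more coefficients and a more careful inverse.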
This generation of 3D structure from 2D source imagery was the origin of the company name 2d3: from 2D into 3D.
It should be noted that feature point identification is based on intensity rather than colour, so the structure from motion approach is equally applicable to colour, monochrome and thermal imagery at any resolution.
- Multiple View Geometry in Computer Vision
Richard Hartley & Andrew Zisserman, Cambridge University Press, Mar 2004, ISBN 0-5215-4051-8
A basic problem in computer vision is to understand the structure of a real-world scene given several images of it. Techniques for this problem are taken from projective geometry and photogrammetry. Here, the authors cover the geometric principles and their algebraic representation in terms of camera projection matrices, the fundamental matrix and the trifocal tensor.









