Yuchao Dai

Letting the Computer See and Understand


  • Rolling Shutter Relative Pose

The vast majority of modern consumer-grade cameras employ a rolling shutter mechanism. In dynamic geometric computer vision applications such as visual SLAM, the so-called rolling shutter effect therefore needs to be properly taken into account. A dedicated relative pose solver appears to be the first problem to solve, as it is of eminent importance to bootstrap any derivation of multi-view geometry. However, despite its significance, it has received inadequate attention to date.

This paper presents a detailed investigation of the geometry of the rolling shutter relative pose problem. We introduce the rolling shutter essential matrix, and establish its link to existing models such as push-broom cameras, summarized in a clean hierarchy of multi-perspective cameras. The generalization of well-established concepts from epipolar geometry is completed by a definition of the Sampson distance in the rolling shutter case. The work is concluded with a careful investigation of the introduced epipolar geometry for rolling shutter cameras on several dedicated benchmarks.
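For orientation, the classical (global shutter) Sampson distance that the paper generalizes can be sketched as follows. This is a minimal NumPy illustration of the standard perspective case only, not the rolling shutter formulation; the function names `skew` and `sampson_distance` and the synthetic two-view scene are assumptions made for this example.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix [t]_x."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def sampson_distance(E, x1, x2):
    """Squared Sampson error of x2^T E x1 = 0 for (N, 3) homogeneous points."""
    Ex1 = x1 @ E.T    # row i is E @ x1[i]
    Etx2 = x2 @ E     # row i is E.T @ x2[i]
    residual = np.sum(x2 * Ex1, axis=1)
    grad_sq = Ex1[:, 0]**2 + Ex1[:, 1]**2 + Etx2[:, 0]**2 + Etx2[:, 1]**2
    return residual**2 / grad_sq

# Hypothetical synthetic scene: 20 points seen by two perspective cameras.
rng = np.random.default_rng(0)
X1 = rng.uniform(-1, 1, (20, 3)) + np.array([0.0, 0.0, 5.0])  # camera-1 frame
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.5, 0.1, 0.0])
X2 = X1 @ R.T + t                       # same points in the camera-2 frame
x1 = X1 / X1[:, 2:3]                    # normalized image coordinates
x2 = X2 / X2[:, 2:3]
E = skew(t) @ R                         # classical essential matrix

d_clean = sampson_distance(E, x1, x2)
d_noisy = sampson_distance(E, x1, x2 + np.array([0.01, -0.01, 0.0]))
```

With exact correspondences the distances vanish up to rounding, while the perturbed points in `d_noisy` give strictly larger values. A rolling shutter version would replace `E` with the paper's rolling shutter essential matrix, which is not reproduced here.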

Accepted by CVPR 2016

  • Non-Rigid Structure from Motion


This paper proposes a simple “prior-free” method for solving non-rigid structure-from-motion factorization problems. Other than the basic low-rank condition, our method does not assume any extra prior knowledge about the non-rigid scene or about the camera motions. Yet it runs reliably, produces optimal results, and does not suffer from the inherent basis-ambiguity issue that has plagued many conventional non-rigid factorization techniques.

Our method is easy to implement: it involves solving no more than an SDP (semi-definite program) of small and fixed size, and a linear least-squares or trace-norm minimization. Extensive experiments have demonstrated that it outperforms most of the existing linear methods of non-rigid factorization. This paper offers not only new theoretical insight but also a practical, everyday solution to non-rigid structure-from-motion.
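As a rough illustration of the trace-norm building block mentioned above, the sketch below applies singular value thresholding, the closed-form minimizer of 0.5*||X - W||_F^2 + tau*||X||_* over X. It is a generic ingredient of trace-norm solvers and is not claimed to be the paper's algorithm; the function name `svt`, the test matrix, and the threshold are assumptions for this example.

```python
import numpy as np

def svt(W, tau):
    """Singular value thresholding: shrink each singular value of W by tau."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Hypothetical 8 x 6 matrix with known singular values.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(8, 5)))   # orthonormal columns
V, _ = np.linalg.qr(rng.normal(size=(6, 5)))
s_in = np.array([10.0, 5.0, 2.0, 0.5, 0.1])
W = (U * s_in) @ V.T

X = svt(W, tau=1.0)
s_out = np.linalg.svd(X, compute_uv=False)[:5]  # 9, 4, 1, 0, 0 up to rounding
```

Shrinking the spectrum by `tau` zeroes the two smallest singular values, so `X` has rank 3. This is the mechanism by which trace-norm minimization favors low-rank solutions.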

Videos (deformable shape recovery results of our block matrix method compared with the ground truth, where "o" marks the ground truth and "+" marks our recovered shape).

Dance Walking
Face Shark
Drink Yoga
Pickup Stretch


Matlab source code is freely available for academic users on request.



  • Rigid Structure from Motion

Sturm-Triggs iteration is a standard method for solving the projective factorization problem. Like other iterative algorithms, it suffers from some common drawbacks: it requires a good initialization, and the iteration may fail to converge or may converge only to a local minimum. None of the published works offers any sort of global optimality guarantee for the problem. In this paper, an optimal solution to projective factorization for structure and motion is presented, based on the same principle of low-rank factorization. Instead of formulating the problem as matrix factorization, we recast it as element-wise factorization, leading to a convenient and efficient semi-definite program formulation. Our method is thus global: no initial point is needed, and a globally optimal solution can be found (up to some relaxation gap). Unlike traditional projective factorization, our method easily handles difficult real-world cases such as missing data or outliers, all in a unified manner. Extensive experiments on both synthetic and real image data show comparable or superior results compared with existing methods.
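The low-rank principle behind projective factorization can be checked numerically: when every image point is scaled by its true projective depth, the stacked 3F x N measurement matrix has rank at most 4. The sketch below only demonstrates this property on synthetic data and does not implement the SDP formulation; the helper `random_camera` and the scene dimensions are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
F, N = 5, 20                                   # hypothetical: 5 views, 20 points
X = np.vstack([rng.uniform(-1, 1, (3, N)), np.ones((1, N))])  # 4 x N points

def random_camera(rng):
    """Random 3x4 camera [R | t] that keeps the points in front of it."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    Q *= np.sign(np.linalg.det(Q))             # force det(Q) = +1
    t = np.array([0.0, 0.0, 5.0]) + rng.uniform(-0.5, 0.5, 3)
    return np.hstack([Q, t[:, None]])

cameras = [random_camera(rng) for _ in range(F)]

# Rows of P @ X are lambda * (u, v, 1): image points times projective depth.
W_scaled = np.vstack([P @ X for P in cameras])
# Dropping the depths (all lambdas set to 1) leaves plain image coordinates.
W_unit = np.vstack([(P @ X) / (P @ X)[2] for P in cameras])

s_scaled = np.linalg.svd(W_scaled, compute_uv=False)
s_unit = np.linalg.svd(W_unit, compute_uv=False)
ratio_scaled = s_scaled[4] / s_scaled[0]       # numerically zero: rank 4
ratio_unit = s_unit[4] / s_unit[0]             # clearly nonzero: rank > 4
```

Projective factorization methods search for the depths that restore this rank-4 structure; iterative schemes alternate between estimating depths and factorizing.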

Paper in ECCV 2010


Code (SDP Implementation)

Code (Fixed-point continuation)

Code (Alternative Direction Method)

Matlab source code is freely available for academic users on request.


  • Rotation Averaging

We present a method for calibrating the rotation between two cameras in a camera rig in the case of non-overlapping fields of view and in a globally consistent manner. First, rotation averaging strategies are discussed and an L1-optimal rotation averaging algorithm is presented, which is more robust than the L2-optimal mean and the direct least-squares mean. Second, we alternate between rotation averaging across several views and conjugate rotation averaging to achieve a global solution. Various experiments on both synthetic data and a real camera rig are conducted to evaluate the performance of the proposed algorithm. Experimental results suggest that the proposed algorithm achieves global consistency and high-precision estimates.
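To make the robustness of L1 averaging concrete, here is a minimal sketch of a Weiszfeld-style geodesic median on SO(3), compared with the chordal L2 mean. It is a generic illustration and not necessarily the algorithm presented in the paper; all function names and the synthetic data are assumptions, and `log_so3` assumes rotation angles well below pi.

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = skew(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def log_so3(R):
    """Rotation matrix -> rotation vector (angle assumed well below pi)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-12:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta * w / (2.0 * np.sin(theta))

def chordal_mean(Rs):
    """L2 chordal mean: project the sum of rotations back onto SO(3)."""
    U, _, Vt = np.linalg.svd(np.sum(Rs, axis=0))
    return U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt

def geodesic_median(Rs, iters=200, tol=1e-10):
    """Weiszfeld iteration for the L1 (geodesic) median of rotations."""
    S = chordal_mean(Rs)                       # initial estimate
    for _ in range(iters):
        vs = np.array([log_so3(R @ S.T) for R in Rs])
        d = np.maximum(np.linalg.norm(vs, axis=1), 1e-12)
        step = (vs / d[:, None]).sum(axis=0) / (1.0 / d).sum()
        S = exp_so3(step) @ S
        if np.linalg.norm(step) < tol:
            break
    return S

def angle_between(Ra, Rb):
    return np.linalg.norm(log_so3(Ra @ Rb.T))

# Hypothetical data: 7 slightly noisy measurements and 2 gross outliers.
rng = np.random.default_rng(2)
R_true = exp_so3(np.array([0.3, -0.2, 0.5]))
Rs = [exp_so3(0.005 * rng.normal(size=3)) @ R_true for _ in range(7)]
Rs += [exp_so3(np.array([2.0, 0.0, 0.0])) @ R_true,
       exp_so3(np.array([0.0, -2.0, 0.5])) @ R_true]
Rs = np.array(Rs)

err_median = angle_between(geodesic_median(Rs), R_true)
err_chordal = angle_between(chordal_mean(Rs), R_true)
```

In this setup the median stays close to the inlier cluster, whereas the chordal mean is pulled noticeably toward the two outliers.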

  • Trajectory Reconstruction


In this work, we propose generic smoothness-constrained 3D trajectory reconstruction of a moving object from monocular video. By introducing a generic smoothness constraint on the 3D trajectory, we obtain an unconstrained optimization model for 3D trajectory reconstruction and derive a closed-form solution. Compared with predefined-basis methods, such as those based on the Discrete Cosine Transform or a polynomial basis, the proposed method is more generic and can be applied to the incomplete-measurement case, giving it broad applicability. A geometric explanation and the uniqueness of the 3D trajectory reconstruction are given. Experimental results on both synthetic data and real monocular video sequences demonstrate the validity and advantages of the proposed method.
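One way to picture an unconstrained, closed-form formulation of this kind is a regularized least-squares problem: each frame contributes two linear reprojection equations in the unknown 3D point, and a second-difference penalty couples neighboring frames. The sketch below is an assumed formulation for illustration, not necessarily the paper's exact model; `reconstruct_trajectory`, the weight `lam`, and the synthetic cameras are hypothetical.

```python
import numpy as np

def reconstruct_trajectory(Ps, uv, lam=1.0):
    """Minimize ||A x - b||^2 + lam * ||D x||^2 in closed form.

    Ps: list of known 3x4 camera matrices, one per frame.
    uv: (F, 2) observed image points of the moving object.
    """
    F = len(Ps)
    A = np.zeros((2 * F, 3 * F))
    b = np.zeros(2 * F)
    for t, (P, (u, v)) in enumerate(zip(Ps, uv)):
        rows = np.vstack([u * P[2] - P[0], v * P[2] - P[1]])  # 2 x 4
        A[2 * t:2 * t + 2, 3 * t:3 * t + 3] = rows[:, :3]
        b[2 * t:2 * t + 2] = -rows[:, 3]
    D = np.zeros((F - 2, F))                   # second-difference operator
    for k in range(F - 2):
        D[k, k:k + 3] = [1.0, -2.0, 1.0]
    D3 = np.kron(D, np.eye(3))                 # apply to x, y, z jointly
    x = np.linalg.solve(A.T @ A + lam * D3.T @ D3, A.T @ b)
    return x.reshape(F, 3)

# Hypothetical test: constant-velocity object seen by a randomly moving camera.
rng = np.random.default_rng(3)
F = 30
times = np.arange(F)
X_true = np.array([0.0, 0.0, 10.0]) + times[:, None] * np.array([0.05, -0.02, 0.03])
centres = rng.uniform(-1, 1, (F, 3))
Ps = [np.hstack([np.eye(3), -c[:, None]]) for c in centres]
rel = X_true - centres
uv = rel[:, :2] / rel[:, 2:3]                  # noise-free projections

X_est = reconstruct_trajectory(Ps, uv)
```

A constant-velocity trajectory has zero second difference, so with noise-free observations and a sufficiently varied camera motion the solve recovers it essentially exactly. Frames with missing measurements could simply contribute no rows to `A`, which is one way the incomplete-measurement case might be handled.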

  • Dense Scene Flow Estimation

Dense scene flow estimation from an RGB-D camera.

  • Multi-view 3D reconstruction from Uncalibrated Radially-Symmetric Cameras

In this paper, we present a new multi-view 3D Euclidean reconstruction method for arbitrary uncalibrated radially-symmetric cameras, which requires neither calibration nor camera model parameters as long as the cameras are radially symmetric. It is built on the radial 1D camera model [23], a unified mathematical abstraction of different types of radially-symmetric cameras. An efficient implementation based on alternating direction continuation is proposed to handle the scalability issue in real-world applications.

  • Extrinsic Calibration of a Generically Configured RGB-D Camera Rig

With the increasing use of commodity RGB-D cameras for computer vision, robotics, mixed and augmented reality and other areas, it is of significant practical interest to calibrate the relative pose between a depth (D) camera and an RGB camera in these types of setups. In this paper, we propose a new single-shot, correspondence-free method to extrinsically calibrate a generically configured RGB-D camera rig. We formulate the extrinsic calibration problem as one of geometric 2D-3D registration, which exploits scene constraints to achieve single-shot extrinsic calibration. Our method first reconstructs sparse point clouds from a single-view 2D image. These sparse point clouds are then registered with dense point clouds from the depth camera. Finally, we directly optimize the warping quality by evaluating scene constraints in 3D point clouds. Our single-shot extrinsic calibration method does not require correspondences across multiple color images or across different modalities, and it is more flexible than existing methods. The scene constraints can be very simple: we demonstrate that a scene containing three sheets of paper is sufficient to obtain a reliable calibration with a lower geometric error than existing methods.