Research Papers

Improved Skeleton Tracking by Duplex Kinects: A Practical Approach for Real-Time Applications

Author and Article Information
Charlie C. L. Wang

Associate Professor
Fellow ASME
e-mail: cwang@mae.cuhk.edu.hk

Department of Mechanical and
Automation Engineering,
The Chinese University of Hong Kong,
Hong Kong, China

The bone-length can be obtained during the initialization step.

Contributed by the Computers and Information in Engineering Division of ASME for publication in the JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING. Manuscript received April 20, 2013; final manuscript received July 4, 2013; published online October 16, 2013. Editor: Bahram Ravani.

J. Comput. Inf. Sci. Eng 13(4), 041007 (Oct 16, 2013) (10 pages) Paper No: JCISE-13-1078; doi: 10.1115/1.4025404 History: Received April 20, 2013; Revised July 04, 2013

Recent per-frame motion extraction methods can generate the skeleton of a human motion in real time with the help of RGB-D cameras such as the Kinect. This makes an economical device available for providing human motion as input to real-time applications. Because it is generated from a single-view image plus depth information, the extracted skeleton usually suffers from unwanted vibration, bone-length variation, self-occlusion, etc. This paper presents an approach that overcomes these problems by synthesizing the skeletons generated by duplex Kinects, which capture the human motion from different views. The major technical difficulty of this synthesis comes from the inconsistency of the two skeletons. Our algorithm is formulated under a constrained optimization framework, using the bone-lengths as hard constraints and the tradeoff between inconsistent joint positions as soft constraints. Schemes are developed to detect and re-position the problematic joints generated by the per-frame method from duplex Kinects. As a result, we obtain an easy, cheap, and fast approach that improves the skeleton of human motion at an average speed of 5 ms per frame.
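The hard/soft constraint idea in the abstract can be illustrated with a minimal sketch: blend the two observed skeletons per joint (soft constraint), then walk the bone tree root-first and rescale each bone to its target length (hard constraint). This is an assumption-laden toy illustration, not the paper's actual solver; the per-joint confidence weights `wA`/`wB` and the root-first bone list are our own conventions.

```python
import numpy as np

def fuse_skeletons(SA, SB, wA, wB, bones, bone_lengths):
    """Blend two per-frame skeletons, then restore target bone lengths.

    SA, SB       : (J, 3) joint positions from the two Kinects (registered).
    wA, wB       : (J,) per-joint confidence weights (hypothetical scheme).
    bones        : list of (parent, child) joint-index pairs, root first.
    bone_lengths : target length for each bone, from the initialization step.
    """
    w = wA + wB
    # Soft constraint: confidence-weighted average of the two observations.
    S = (wA[:, None] * SA + wB[:, None] * SB) / w[:, None]
    # Hard constraint: rescale each bone (root first) to its target length.
    for (p, c), L in zip(bones, bone_lengths):
        d = S[c] - S[p]
        S[c] = S[p] + d * (L / np.linalg.norm(d))
    return S
```

With equal weights this reduces to averaging followed by bone-length projection; the paper's optimization additionally trades off the inconsistent joint positions, which the sketch does not attempt.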

Copyright © 2013 by ASME




Fig. 2

Inconsistent skeletons extracted by two Kinects. (Top row) The 3D information captured by KA (first column) and KB (second column) only partially overlaps, even after carefully applying a registration procedure; as a result, the extracted skeletons, SA and SB, can be very close to each other but are seldom coincident. (Bottom row) In the view of KA, the elbow joint of SA is misclassified into the region of the waist joint; although the position of this elbow joint on SB is correct, simply averaging SA and SB (i.e., (SA+SB)/2) does not give a result as good as the S* generated by our approach.

Fig. 1

Problems of skeleton tracking by a single Kinect. The viewing direction of the Kinect sensor is indicated by arrows on the photo. (Top-left) Self-occlusion: the left arm is hidden by the main body, so the positions of the elbow and wrist are estimated at incorrect places. (Top-right) Bone-length variation: when viewed from a different direction, the length of the forearm in an open-arm pose (right) changes significantly from its length captured when standing and facing the Kinect camera (left). (Bottom) Artificial vibration: when viewed from a specific angle, the position of the elbow joint exhibits unwanted vibration even when a static pose is held.

Fig. 3

An illustration explaining the observation that the distance between mistracked joints in one viewing plane is generally much shorter than the distance in the other viewing plane

Fig. 4

Our algorithm can correct the positions of problematic joints by resolving inconsistency under our constrained optimization framework while preserving the bone-lengths. The joints in the circles are problematic.

Fig. 5

The statistics of bone-length variation at different parts of the skeletons in a badminton-playing motion, where the dashed, dot-dashed, and solid curves represent the bone-lengths of SA, SB, and S*, respectively. The target bone-lengths, obtained from the initialization step, are displayed as horizontal dotted lines in black.

Fig. 6

The motion of badminton playing: the enhanced skeletons, S*, generated by our algorithm are shown in the third and fifth rows; the skeletons generated by the Microsoft Kinect SDK, SA and SB, are shown in the second and fourth rows, respectively. In our tests, we also used a video camera to capture the motion (top row) so that the real motion can be seen more clearly. The orientations of the two Kinect cameras, KA and KB, are also illustrated in the first row (see the arrow in the first column).

Fig. 7

The motion of basketball playing: the enhanced skeletons in the motion are displayed in the third and fifth rows along the same viewing directions as the Kinect cameras (i.e., KA and KB), whose skeletons are shown in the second and fourth rows. The problematic skeletons extracted independently by KA and KB are circled by dashed lines.

Fig. 8

An illustration of a limitation: the wrist and elbow joints can hardly be separated from the main body in either Kinect camera's view; therefore, the resultant joint positions may not be corrected properly

Fig. 9

The squatting pose is not included in the database of the Kinect SDK, so the positions estimated by the Kinect SDK are poor. As a result, our approach cannot generate a reasonable skeleton from this input.


