Weakly-Supervised RGB-Based 3D human body pose and shape estimation
File: 163458.pdf (18.96 MB, restricted access)
Cite as: hdl:2117/368897
Document type: Master thesis
Date: 2022-04-27
Rights access: Restricted access (author's decision)
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder.
Abstract
Estimating accurate 3D human pose and shape from a single 2D image is an inherently difficult problem. Part of the difficulty arises from ambiguity: when only geometric features are considered, many different 3D solutions are consistent with the same image. Computer vision and artificial intelligence are well suited to this problem, but research in these fields has focused primarily on pose recovery, leaving shape as an afterthought. This thesis explores adaptations of and extensions to a recent human mesh recovery framework that showed a significant improvement on shape metrics over a very popular real-time pose and shape estimator. The framework employs a multi-stage process that refines pose first and then shape through a non-differentiable mesh deformation process. A differentiable alternative to these deformation steps is proposed. In support of this effort, a dataset was compiled that indexes some of the most popular 2D and 3D human datasets and provides a common access format. The mesh recovery framework was retrained on this dataset, which contains an order of magnitude more samples than the dataset used to train the published framework. The new weights matched the performance of the published weights, even though the ground-truth annotations in the compiled dataset are less reliable. In addition, a multi-layer perceptron architecture that has demonstrated state-of-the-art performance at pose parameter regression was trained on millions of ground-truth 3D human meshes to correct perturbations in shape and pose. Techniques for training this network and methods of interfacing it with the mesh recovery framework were investigated and documented.
Subjects: Three-dimensional display systems; Computer vision; Visualització tridimensional (Informàtica); Visió per ordinador
Degree: Master's Degree in Artificial Intelligence (MÀSTER UNIVERSITARI EN INTEL·LIGÈNCIA ARTIFICIAL, Pla 2017)