Faceptor: A Generalist Model for Face Perception

Lixiong Qin (秦立雄), Mei Wang, Xuannan Liu, Yuhang Zhang, Wei Deng, Xiaoshuai Song (宋晓帅), Weiran Xu (徐蔚然), Weihong Deng

March 2024

PDF Code DOI ECCV 2024

Abstract

With comprehensive research now conducted on various face analysis tasks, researchers are increasingly interested in developing a unified approach to face perception. Existing methods mainly discuss unified representation and training, and they lack task extensibility and application efficiency. To tackle this issue, we focus on a unified model structure and explore a face generalist model. As an intuitive design, Naive Faceptor enables tasks with the same output shape and granularity to share the structural design of a standardized output head, achieving improved task extensibility. Furthermore, Faceptor adopts a well-designed single-encoder dual-decoder architecture, allowing task-specific queries to represent newly introduced semantics. This design enhances the unification of the model structure while improving application efficiency in terms of storage overhead. Additionally, we introduce Layer-Attention into Faceptor, enabling the model to adaptively select features from optimal layers to perform the desired tasks. Through joint training on 13 face perception datasets, Faceptor achieves exceptional performance in facial landmark localization, face parsing, age estimation, expression recognition, binary attribute classification, and face recognition, matching or surpassing specialized methods in most tasks. Our training framework can also be applied to auxiliary supervised learning, significantly improving performance in data-sparse tasks such as age estimation and expression recognition. The code and models will be made publicly available at https://github.com/lxq1000/Faceptor.
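The abstract states only that Layer-Attention lets the model adaptively select features from different layers for each task. It does not give the mechanism. The following is a minimal illustrative sketch, not the authors' implementation. It assumes that each task holds one learnable score per encoder layer and that the scores are turned into softmax weights for blending the per-layer features. The names `layer_attention` and `task_logits` are hypothetical and do not come from the Faceptor repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def layer_attention(layer_features, task_logits):
    """Blend per-layer feature vectors with task-specific weights.

    layer_features: one feature vector per encoder layer (equal lengths).
    task_logits: one score per layer for a given task (assumed learnable).
    Returns the fused feature vector and the weights that produced it.
    """
    if len(layer_features) != len(task_logits):
        raise ValueError("need exactly one logit per layer")
    weights = softmax(task_logits)
    fused = [0.0] * len(layer_features[0])
    for weight, feature in zip(weights, layer_features):
        for i, value in enumerate(feature):
            fused[i] += weight * value
    return fused, weights


# Toy features from three encoder layers (values are made up).
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

# Hypothetical scores: one task leans on the last layer, another on the first.
fused_a, weights_a = layer_attention(features, [0.0, 0.0, 4.0])
fused_b, weights_b = layer_attention(features, [4.0, 0.0, 0.0])
```

Under this assumption, two tasks can read different mixtures of the same shared encoder features by changing only a handful of per-layer scores, which is consistent with the storage-efficiency motivation given in the abstract.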

Type

Conference paper

Conference

ECCV 2024

"computer vision" "foundation model"

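Lixiong Qin (秦立雄)

Ph.D. student

Xiaoshuai Song (宋晓帅)

Master's student

Weiran Xu (徐蔚然)

Associate Professor; supervisor of master's and Ph.D. students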