CITATION.cff
cff-version: 1.2.0
title: >-
  HEAD: HEtero-Assists Distillation for Heterogeneous
  Object Detectors
message: Please cite this project using these metadata.
type: software
authors:
  - given-names: Luting
    family-names: Wang
    email: wangluting@buaa.edu.cn
    orcid: 'https://orcid.org/0000-0001-8317-226X'
    affiliation: Beihang University
  - given-names: Xiaojie
    family-names: Li
    email: lixiaojie@senseauto.com
    affiliation: SenseTime
  - given-names: Yue
    family-names: Liao
    email: liaoyue.ai@gmail.com
    affiliation: Beihang University
  - given-names: Zeren
    family-names: Jiang
    email: zeren.jiang99@gmail.com
    affiliation: ETH Zurich
  - given-names: Jianlong
    family-names: Wu
    email: jlwu1992@sdu.edu.cn
    affiliation: Shandong University
  - given-names: Fei
    family-names: Wang
    email: wangfei91@mail.ustc.edu.cn
    affiliation: University of Science and Technology of China
  - given-names: Chen
    family-names: Qian
    email: qianchen@sensetime.com
    affiliation: SenseTime
  - given-names: Si
    family-names: Liu
    email: liusi@buaa.edu.cn
    affiliation: Beihang University
identifiers:
  - type: doi
    value: 10.48550/arXiv.2207.05345
    description: arXiv
repository-code: 'https://github.com/LutingWang/HEAD'
abstract: >-
  Conventional knowledge distillation (KD) methods
  for object detection mainly concentrate on
  homogeneous teacher-student detectors. However, the
  design of a lightweight detector for deployment
  often differs significantly from that of a
  high-capacity detector. We therefore investigate KD
  among heterogeneous teacher-student pairs to widen
  its applicability. We observe that the core
  difficulty in heterogeneous KD (hetero-KD) is the
  significant semantic gap between the backbone
  features of heterogeneous detectors, which stems
  from their different optimization manners.
  Conventional homogeneous KD (homo-KD) methods
  suffer from this gap and struggle to achieve
  satisfactory performance when applied directly to
  hetero-KD. In this paper, we propose the
  HEtero-Assists Distillation (HEAD) framework, which
  leverages heterogeneous detection heads as
  assistants to guide the optimization of the student
  detector and reduce this gap. In HEAD, the
  assistant is an additional detection head,
  homogeneous in architecture to the teacher head,
  attached to the student backbone. Hetero-KD is thus
  transformed into homo-KD, allowing efficient
  knowledge transfer from the teacher to the student.
  Moreover, we extend HEAD into a Teacher-Free HEAD
  (TF-HEAD) framework for cases where a well-trained
  teacher detector is unavailable. Our method
  achieves significant improvement over current
  detection KD methods. For example, on the MS-COCO
  dataset, TF-HEAD helps an R18 RetinaNet achieve
  33.9 mAP (+2.2), while HEAD further pushes the
  limit to 36.2 mAP (+4.5).
keywords:
  - Knowledge Distillation
  - Object Detection
  - Heterogeneous
license: Apache-2.0