Reducing the Teacher-Student Gap via Spherical Knowledge Distillation

Knowledge distillation aims to obtain a compact and effective model by learning the mapping function of a much larger one. Because the student has limited capacity, it tends to underfit the teacher. As a result, student performance can unexpectedly drop when distilling from an oversized teacher, an effect termed the capacity gap problem. We investigate this problem by studying the gap in confidence between teacher and student. We find that the magnitude of confidence is not necessary for knowledge distillation and can harm student performance if the student is forced to learn it. We propose Spherical Knowledge Distillation to eliminate this gap explicitly, which eases the underfitting problem. We find that this knowledge representation improves compact models trained with much larger teachers and is robust to the choice of temperature. We conduct experiments on both CIFAR100 and ImageNet and achieve significant improvements. Specifically, we train ResNet18 to an accuracy of 73.0, which is a substantial improvement over the previous state of the art and is on par with ResNet34, a model almost twice the size of the student. The implementation has been shared at https://github.com/forjiuzhou/Spherical-Knowledge-Distillation.
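
The abstract does not spell out how the confidence gap is removed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation (see the linked repository for that). It assumes that "confidence magnitude" corresponds to the norm of the mean-centered logit vector, and that teacher and student logits are both rescaled onto a sphere of the same radius before the usual softmax and KL-divergence step. The function names `project_to_sphere`, `softmax`, and `spherical_kd_loss`, and the `radius` parameter, are hypothetical.

```python
import math


def project_to_sphere(logits, radius=1.0):
    """Center the logits and rescale them to a fixed L2 norm (hypothetical helper)."""
    mean = sum(logits) / len(logits)
    centered = [z - mean for z in logits]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        # Uniform logits have no direction; leave them at the origin.
        return centered
    return [radius * c / norm for c in centered]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def spherical_kd_loss(student_logits, teacher_logits, radius=1.0, temperature=1.0):
    """KL(teacher || student) after both logit vectors are placed on the same sphere.

    Assumption: only the direction of the logits is matched, so a student is
    not penalized for being less (or more) confident than the teacher.
    """
    p = softmax(project_to_sphere(teacher_logits, radius), temperature)
    q = softmax(project_to_sphere(student_logits, radius), temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# A student whose logits point in the same direction as the teacher's, but with
# ten times less magnitude, incurs (approximately) zero loss under this sketch.
teacher = [10.0, 2.0, 0.0]
low_confidence_student = [1.0, 0.2, 0.0]
same_direction_loss = spherical_kd_loss(low_confidence_student, teacher)

# A student that ranks the classes differently is still penalized.
wrong_order_student = [0.0, 1.0, 2.0]
wrong_order_loss = spherical_kd_loss(wrong_order_student, teacher)
```

Under these assumptions the loss depends only on the direction of the logits, which is one way to make the objective insensitive to how confident the teacher is. In a real training loop the same projection would be applied to batched tensors in a deep learning framework and combined with the usual cross-entropy term.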

Preprint, 2021, arXiv, open access