Open
Description
The caption of Figure 1-10 (Chapter 1 PDF, page 13) could be reworded for clarity.
Given: Distillation of a smaller student model from a larger pre-trained teacher model. Both the teacher’s weights are frozen and the student learns to copy both the ground-truth and the teacher’s outputs on the given training data.
Suggested: Distillation of a smaller student model from a larger pre-trained teacher model. The teacher’s weights are frozen. The student learns to copy both the ground-truth and the teacher’s outputs on the given training data.