DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling (EMNLP 2020)


Abstract

Motivation

Method: DiPair

[Figure 2: overview of the DiPair model architecture]

Dual-Encoder
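
The two texts in a pair are encoded independently, with no cross-attention between them, so the per-item encoder outputs can be precomputed and cached offline, which is essential at trillion-pair scale. Below is a minimal PyTorch sketch of this independent encoding; the stand-in encoder, sizes, and names are illustrative assumptions, not the paper's exact setup:

```python
import torch
import torch.nn as nn

# Stand-in encoder; in the paper this is a BERT-like model whose per-item
# outputs can be precomputed offline. All names and sizes are illustrative.
hidden = 256
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True),
    num_layers=2,
)

def encode_pair(emb_a: torch.Tensor, emb_b: torch.Tensor):
    """Encode each side independently: no cross-attention between the texts."""
    return encoder(emb_a), encoder(emb_b)

# Toy usage: batch of 8, already-embedded token sequences of length 16 and 32.
out_a, out_b = encode_pair(torch.randn(8, 16, hidden), torch.randn(8, 32, hidden))
print(out_a.shape, out_b.shape)  # (8, 16, 256) and (8, 32, 256)
```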

Truncated Output Sequences
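
Only the first few output vectors from each encoder are kept as input to the head; truncating the output sequences is what keeps the downstream computation cheap. A sketch of the truncation step, with arbitrary k values rather than the paper's settings:

```python
import torch

batch, hidden = 8, 256
out_a = torch.randn(batch, 16, hidden)  # encoder outputs for text A
out_b = torch.randn(batch, 32, hidden)  # encoder outputs for text B

# Keep only the first k output vectors per side; k_a and k_b are small
# hyperparameters (illustrative values, not the paper's).
k_a, k_b = 4, 8
trunc_a = out_a[:, :k_a]  # (8, 4, 256)
trunc_b = out_b[:, :k_b]  # (8, 8, 256)
```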

Projection Layer
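
The truncated outputs are then projected to a lower dimension, shrinking the head's input width further. A sketch assuming a plain linear projection; the 256-to-64 sizes are made up:

```python
import torch
import torch.nn as nn

hidden, proj_dim = 256, 64  # made-up sizes
project = nn.Linear(hidden, proj_dim)

trunc_a = torch.randn(8, 4, hidden)   # truncated outputs for text A
trunc_b = torch.randn(8, 8, hidden)   # truncated outputs for text B
small_a = project(trunc_a)            # (8, 4, 64)
small_b = project(trunc_b)            # (8, 8, 64)
```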

Transformer-Based Head
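
A small transformer head then runs over the concatenated truncated-and-projected sequences from both sides, recovering some of the cross-attention that the full cross-encoder teacher has. A minimal sketch; the one-layer head, the sizes, and first-position pooling are all assumptions for illustration:

```python
import torch
import torch.nn as nn

proj_dim = 64
head = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=proj_dim, nhead=4, batch_first=True),
    num_layers=1,  # a deliberately tiny head
)
score = nn.Linear(proj_dim, 1)

small_a = torch.randn(8, 4, proj_dim)  # projected outputs for text A
small_b = torch.randn(8, 8, proj_dim)  # projected outputs for text B

# Concatenating both sides lets the tiny head attend across the pair.
joint = head(torch.cat([small_a, small_b], dim=1))  # (8, 12, 64)
logits = score(joint[:, 0])  # pool the first position (an assumption), (8, 1)
```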

A Two-Stage Training Approach
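
Training is split into two stages: first, only the newly added head is trained (distilled against the teacher's scores) while the dual encoder stays frozen; then the whole model is fine-tuned end to end. A sketch of the freeze-then-unfreeze logic, with toy stand-in modules and MSE against teacher scores as an assumed distillation loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for DiPair's dual encoder and head.
encoder = nn.Linear(256, 256)
head = nn.Linear(256, 1)

def set_trainable(module: nn.Module, flag: bool) -> None:
    for p in module.parameters():
        p.requires_grad = flag

# Stage 1: freeze the encoder, train only the head on teacher scores.
set_trainable(encoder, False)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
x = torch.randn(8, 256)                # stand-in pair features
teacher_scores = torch.randn(8, 1)     # from the BERT-based teacher
loss = F.mse_loss(head(encoder(x)), teacher_scores)  # assumed distillation loss
loss.backward()
opt.step()
opt.zero_grad()

# Stage 2: unfreeze everything and fine-tune end to end with the same loss.
set_trainable(encoder, True)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
```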

Main Results

[Table 2: main results]

Conclusion

In this work, we reveal the importance of customizing models for problems with pairwise/n-ary inputs and propose a new framework, DiPair, as an effective solution. The framework is flexible, and it achieves a more than 350x speedup over a BERT-based teacher model with no significant quality drop.

Joohong Lee

Machine Learning Researcher