Adaptive Input Representations for Neural Language Modeling (ICLR 2019)

Abstract

1. Introduction

3. Adaptive Input Representations

Figure 1

Weight sharing
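
In adaptive input representations, the vocabulary is sorted by frequency and partitioned into clusters: the most frequent cluster keeps the full embedding dimension d, each subsequent cluster's dimension is reduced by a factor k, and every cluster is projected back to d so the model body sees uniformly sized inputs. For weight sharing, the paper ties these per-cluster embeddings and projections with the corresponding parameters of the adaptive softmax output layer. Below is a minimal sketch of the input side; the class name, cutoffs, and reduction factor are illustrative assumptions, not the authors' fairseq implementation.

```python
# A minimal sketch of adaptive input embeddings, assuming a frequency-sorted
# vocabulary. The class name, cutoffs, and reduction factor are illustrative
# choices, not the authors' fairseq implementation.
import torch
import torch.nn as nn

class AdaptiveInput(nn.Module):
    def __init__(self, vocab_size, d_model, cutoffs=(20000, 60000), factor=4):
        super().__init__()
        self.d_model = d_model
        self.cutoffs = list(cutoffs) + [vocab_size]
        self.embeddings = nn.ModuleList()
        self.projections = nn.ModuleList()
        prev = 0
        for i, cutoff in enumerate(self.cutoffs):
            dim = d_model // (factor ** i)          # d, d/4, d/16, ... per cluster
            self.embeddings.append(nn.Embedding(cutoff - prev, dim))
            # every cluster is projected back to the model dimension d
            self.projections.append(nn.Linear(dim, d_model, bias=False))
            prev = cutoff

    def forward(self, tokens):
        # tokens: (batch, seq_len) indices into the frequency-sorted vocabulary
        out = torch.zeros(*tokens.shape, self.d_model, device=tokens.device)
        prev = 0
        for emb, proj, cutoff in zip(self.embeddings, self.projections, self.cutoffs):
            mask = (tokens >= prev) & (tokens < cutoff)
            if mask.any():
                out[mask] = proj(emb(tokens[mask] - prev))
            prev = cutoff
        return out
```

Projecting every cluster back to the same dimension d is what lets the reduced-capacity embeddings of rare words feed the same model body as the full-capacity embeddings of frequent words.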

4. Experimental Setup

4.1. Model

4.2. Datasets

4.3. Batching

4.4. Input and Output Layer Hyperparameters

Embedding size

Figure 7

Character CNN
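
One input baseline in this comparison builds word representations from characters with a CNN: embed the characters of a word, convolve with several kernel widths, and max-pool over time. A generic sketch, with kernel widths, channel counts, and the class name chosen for illustration rather than taken from the paper:

```python
# Generic character-CNN word encoder of the kind used as an input baseline:
# embed characters, convolve, max-pool over time. All sizes are illustrative.
import torch
import torch.nn as nn

class CharCNNWordEncoder(nn.Module):
    def __init__(self, n_chars, char_dim=16, channels=128, kernel_sizes=(3, 4, 5)):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(char_dim, channels, k, padding=k // 2) for k in kernel_sizes]
        )

    def forward(self, char_ids):
        # char_ids: (num_words, max_word_len) character indices, 0 = padding
        x = self.char_emb(char_ids).transpose(1, 2)   # (num_words, char_dim, len)
        # max-over-time pooling per filter, concatenated across kernel widths
        pooled = [conv(x).max(dim=-1).values for conv in self.convs]
        return torch.cat(pooled, dim=-1)  # (num_words, channels * len(kernel_sizes))
```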

Adaptive input representations and adaptive softmax
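
The input and output factorizations use the same frequency bands and the same capacity reduction factor. One way to prototype that pairing is PyTorch's built-in nn.AdaptiveLogSoftmaxWithLoss, whose div_value plays the role of the factor k. The sketch below reuses the AdaptiveInput class from the weight-sharing sketch above and only aligns the cluster structure on both sides; it does not implement the paper's parameter sharing between input and output, and all sizes are illustrative.

```python
# Hypothetical pairing of the AdaptiveInput sketch above with PyTorch's
# built-in adaptive softmax; sizes and cutoffs are illustrative.
import torch
import torch.nn as nn

vocab_size, d_model = 260_000, 1024
cutoffs = [20_000, 60_000]

adaptive_input = AdaptiveInput(vocab_size, d_model, cutoffs=tuple(cutoffs), factor=4)
adaptive_softmax = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=d_model,
    n_classes=vocab_size,
    cutoffs=cutoffs,      # same frequency bands as the input side
    div_value=4.0,        # same capacity reduction factor
)

tokens = torch.randint(0, vocab_size, (8, 32))    # (batch, seq_len)
targets = torch.randint(0, vocab_size, (8, 32))
hidden = adaptive_input(tokens)                   # stand-in for the model body's output
out = adaptive_softmax(hidden.view(-1, d_model), targets.view(-1))
print(out.loss)                                   # mean negative log-likelihood
```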

Sub-word models
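
The sub-word baselines replace the word-level vocabulary with byte-pair encoding units, where BPE repeatedly merges the most frequent adjacent symbol pair in a pre-tokenized corpus. A toy sketch of that merge loop, not the tooling used in the paper:

```python
# Toy sketch of learning BPE merges over a pre-tokenized corpus. `corpus` maps
# space-separated symbol sequences (one word each) to frequencies; everything
# here is illustrative and not the paper's preprocessing pipeline.
import re
from collections import Counter

def pair_counts(corpus):
    counts = Counter()
    for word, freq in corpus.items():
        symbols = word.split()
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts

def apply_merge(pair, corpus):
    # merge the chosen pair into one symbol, respecting symbol boundaries
    pattern = re.compile(r'(?<!\S)' + re.escape(' '.join(pair)) + r'(?!\S)')
    return {pattern.sub(''.join(pair), word): freq for word, freq in corpus.items()}

def learn_bpe(corpus, num_merges):
    merges = []
    for _ in range(num_merges):
        counts = pair_counts(corpus)
        if not counts:
            break
        best = max(counts, key=counts.get)
        merges.append(best)
        corpus = apply_merge(best, corpus)
    return merges

# e.g. learn_bpe({'l o w </w>': 5, 'l o w e r </w>': 2, 'n e w e s t </w>': 6}, 10)
```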

Optimization

5. Experiments

5.1. Main Results

Table 1

Table 2

5.2. Comparison of Input and Output Layer Factorization

Table 3

Table 4

5.3. Analysis

Figure 2

Figure 3

Figure 4

Figure 5

Table 5

5.4. Adaptive Softmax vs. Full Softmax

Table 6

6. Conclusion

Joohong Lee
