Do Language Models Use Their Depth Efficiently?

Arxiv Papers

2 years ago
