Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum

Keyur

1 year ago
