DeepSpeed Ulysses: System Optimizations for Enabling Training of Long Sequence Transformer Models

Arxiv Papers

1 month ago