DeepSpeed Ulysses: System Optimizations for Enabling Training of Long Sequence Transformer Models