Flash Attention 2: Faster Attention with Better Parallelism and Work Partitioning

Data Science Gems

Updated 3 days ago
