FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
