Researcher
Currently, my research focuses on RNNs in large language models (LLMs). I am interested in how transformer-based models, such as the 671B R1, can be adapted to use RNN attention. You can check out my ongoing work on ARWKV here:
https://huggingface.co/papers/2501.15570
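To give a feel for the idea: instead of attending over a growing key/value cache, RNN attention maintains a fixed-size recurrent state that is updated once per token. The sketch below is a generic linear-attention recurrence of the kind RWKV-style models build on; the scalar decay and the exact update rule here are simplifying assumptions for illustration, not ARWKV's actual formulation.

```python
import numpy as np

def linear_attention_rnn(q, k, v, decay=0.9):
    """Toy RNN-style attention: O(1) state per step instead of an O(T) KV cache.

    q, k, v: arrays of shape (T, d), one row per token.
    decay:   scalar forgetting factor (a simplification; real models use
             learned, often per-channel, decays).
    """
    T, d = q.shape
    S = np.zeros((d, d))                       # recurrent state: decayed sum of k_t v_t^T
    outputs = np.empty_like(v)
    for t in range(T):
        S = decay * S + np.outer(k[t], v[t])   # fold the new token into the state
        outputs[t] = q[t] @ S                  # read out with the current query
    return outputs
```

Because the state `S` has a fixed `d × d` size, inference cost per token stays constant in sequence length, which is the main appeal of replacing softmax attention with a recurrence.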
X: https://x.com/xiaolGo
Papers: https://scholar.google.com/citations?user=TPJYxnkAAAAJ