A brand new era for the RWKV-v5 architecture and linear transformers has arrived, with the strongest multilingual model in open source today
RWKV is the SOTA for non-Transformer architecture.
caw caw! Congrats!
Is v4 the same as the paper? What's v5?
The latest iteration of this model EagleX is here: https://substack.recursal.ai/p/eaglex-17t-soaring-past-llama-7b
> Trained on 1.1 Trillion Tokens across 100+ languages
Is the dataset publicly available?
I love the work you guys are doing. Want to collab?