A brand new era for the RWKV-v5 architecture and linear transformers has arrived, with the strongest multilingual model in open source today
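For readers wondering what "linear transformer" means in practice, here is a minimal sketch, under simplifying assumptions, of the attention-free recurrence underlying RWKV: each token updates a fixed-size running state instead of attending over a growing T x T matrix. This is not the official implementation; it omits v4's per-step "bonus" term and v5's multi-headed matrix-valued state, and the function name and decay parameter `w` are illustrative only.

```python
# Minimal sketch (not the official RWKV kernel) of a linear-time,
# attention-free recurrence: a per-channel exponentially decayed
# weighted average of values, weighted by exp(key).
import numpy as np

def wkv_linear_scan(k, v, w):
    """k, v: (T, C) key/value series; w: (C,) positive decay rates.
    Returns (T, C) outputs in O(T*C) time -- no T x T attention matrix."""
    T, C = k.shape
    num = np.zeros(C)            # running weighted sum of values
    den = np.zeros(C)            # running sum of weights
    out = np.empty((T, C))
    decay = np.exp(-w)           # per-channel exponential decay
    for t in range(T):
        weight = np.exp(k[t])
        num = decay * num + weight * v[t]
        den = decay * den + weight
        out[t] = num / (den + 1e-8)
    return out

# Per-token cost is constant, so generation only needs the (num, den)
# state rather than a key/value cache that grows with context length.
out = wkv_linear_scan(np.random.randn(16, 8), np.random.randn(16, 8),
                      np.abs(np.random.randn(8)))
```

The constant-size state is the design point: inference cost and memory stay flat as the context grows, which is what the "linear" in linear transformer refers to.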
RWKV is the SOTA among non-Transformer architectures.
caw caw! Congrats!
Is v4 the same as the paper? What's v5?
The current paper is v4: https://arxiv.org/abs/2305.13048
The paper for v5 is being written now, ETA 1 month!
The latest iteration of this model, EagleX, is here: https://substack.recursal.ai/p/eaglex-17t-soaring-past-llama-7b
> Trained on 1.1 Trillion Tokens across 100+ languages
Is the dataset publicly available?
I love the work you guys are doing. Wanna collab?