RWKV Open Source Development Blog

RWKV-6 Finch 7B World 3 now with 3.1T tokens trained!

Moar training, moar capable!

RWKV · Dec 11, 2024

RWKV-6 model: Finch 7B World 3

Now trained with an expanded and improved multilingual dataset, the latest Finch World 3 is the most capable 7B parameter class RWKV model yet! And you can use it today from either HuggingFace or the ChatRWKV inference runtime.
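If you want to try it from Python, here is a minimal sketch using the `rwkv` pip package, the library behind the ChatRWKV runtime. The checkpoint filename below is only a placeholder; substitute the actual .pth file you download from the Hugging Face repo.

```python
# Minimal sketch: running Finch 7B World 3 via the `rwkv` pip package (pip install rwkv).
# The checkpoint path is a placeholder; point it at the file downloaded from Hugging Face.
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Load the weights; "cuda fp16" keeps the 7B model on the GPU in half precision.
# The path is given without the .pth extension, following the ChatRWKV examples.
model = RWKV(model="path/to/RWKV-x060-World-7B-v3", strategy="cuda fp16")

# The World-series models use the bundled rwkv_vocab_v20230424 tokenizer.
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")

prompt = "User: Explain the RWKV architecture in one paragraph.\n\nAssistant:"
args = PIPELINE_ARGS(temperature=1.0, top_p=0.5)
print(pipeline.generate(prompt, token_count=200, args=args))
```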

Our goal is always to provide high-quality open-source AI models for everyone worldwide, regardless of nationality, language, or economic status. The RWKV architecture is designed to help reduce our impact on the environment, using a fixed amount of power per token regardless of context length. We invite interested developers to help us shape its future on the RWKV Discord server.
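That fixed per-token cost comes from RWKV's recurrent formulation: instead of a key/value cache that grows with the context, generation carries a fixed-size state from token to token. A rough sketch of the idea, reusing the model and pipeline loaded above:

```python
# RWKV generation carries a fixed-size recurrent state, so each new token costs the
# same to produce no matter how long the preceding context is.
state = None
tokens = pipeline.encode("The quick brown fox")

# Feed the prompt; the returned state summarizes everything seen so far.
out, state = model.forward(tokens, state)

generated = []
for _ in range(50):
    token = pipeline.sample_logits(out, temperature=1.0, top_p=0.5)
    generated.append(token)
    # One token in, one constant-size state update out; nothing grows with length.
    out, state = model.forward([token], state)

print(pipeline.decode(generated))
```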

Eval and benchmark

We tested Finch 7B World 3 using the EleutherAI lm-evaluation-harness across a range of typical industry benchmarks. Downstream performance improved significantly: the model now strongly beats Llama2 7B (trained on 2 trillion tokens) and closes in on Mistral 7B v0.1 and even Llama3 8B. We had theorized that the total number of training tokens was the major difference between RWKV-6 models and modern Transformers, and the continued performance improvement from further training reinforces that view. Llama3 8B is a larger model trained on 15 trillion tokens - nearly five times as many as Finch World 3 - and yet the scores are close!
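For anyone who wants to reproduce this kind of run, here is a hedged sketch using the harness's Python API (pip install lm-eval). The Hugging Face repo id and the task list are illustrative assumptions, not our exact benchmark configuration; substitute the HF-format Finch 7B World 3 checkpoint you are testing.

```python
# Sketch of an lm-evaluation-harness run against an HF-format RWKV-6 checkpoint.
# The repo id and task list are placeholders, not the exact setup used for this post.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=RWKV/v6-Finch-7B-HF,trust_remote_code=True,dtype=bfloat16",
    tasks=["lambada_openai", "hellaswag", "arc_easy", "arc_challenge", "winogrande"],
    batch_size=8,
)

# Print the per-task metrics (accuracy, perplexity, etc.) reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```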

We’re looking forward to sharing the upcoming results from our new RWKV-7 architecture “Goose”, which may finally match or eclipse the modern transformer on a tokens-trained basis.


You can find the Finch architecture details in the Eagle and Finch research paper, recently presented at the Conference on Language Modeling.

Finch 7B World 3 has now been trained on a total of 3.1 trillion multilingual tokens. The training was accomplished in two steps: First, the original 1.1 trillion token Eagle (RWKV-5) checkpoint was upgraded to Finch (RWKV-6) and trained up to 1.4 trillion tokens with an expanded World v2.1 dataset. Then, the dataset was expanded again and training was continued up to a total of 3.1 trillion tokens.

We added the following dataset components (in addition to the original World v2 dataset detailed in the Eagle and Finch research paper) to build the World v3 dataset.

Added in World v2.1

• cosmopedia
• adjustments to slimpajama inclusions
• dolma v1.6 reddit 
• Magpie-Align
• glaiveai_glaive-code-assistant-v3 
• cognitivecomputations_SystemChat-2.0_SystemChat 
• migtissera_Tess_tess-v1.5 
• openbmb_UltraInteract_sft 
• m-a-p_Code-Feedback_Code-Feedback

Added in World v3

• fineweb-edu 
• DCLM
• cosmopedia-v2 
• Buzz-V12 
• WebInstructSub 
• SKGInstruct 
• math-ai 
• TemplateGSM 
• all of starcoder
• python-edu (in HuggingFaceTB/smollm-corpus)

For the upcoming RWKV-7 “Goose” training runs, we will be improving and expanding the tokenizer to efficiently handle more world languages, and adding even more new dataset components.

Try out Finch World 3 today!

Acknowledgments

A big thank you to the following groups, who were instrumental in the continued development of the RWKV architecture and models:

  • Recursal AI for its commitment to providing resources and development for the RWKV ecosystem - you can use their featherless.ai platform to easily run RWKV and compare it to other language models

  • EleutherAI for support and guidance, especially on benchmarks and publishing research papers about the RWKV architecture

  • Linux Foundation AI & Data group for supporting and hosting the RWKV project

And of course, a huge thank you to the many developers around the world working hard to improve the RWKV ecosystem and provide environmentally friendly open-source AI for all.

