Meet Baichuan-13B: China’s open source large language model to compete with OpenAI

Wang Xiaochuan, founder of the Chinese search engine Sogou, has released a new large language model, Baichuan-13B, through his startup Baichuan Intelligence. For now, commercial use is limited to programmers and researchers who have obtained permission. Wang recently posted on Weibo that “China needs its own OpenAI,” and the release of Baichuan-13B, the company’s next-generation large language model, brings him one step closer to that vision. Baichuan launched three months ago and quickly raised $50 million from a group of investors. Owing in part to its founder’s strong computer science background, the company is now considered one of China’s most promising developers of large language models.

Baichuan-13B uses the same Transformer architecture as GPT and most homegrown Chinese models. It has 13 billion parameters (the variables a model uses to generate and analyze text) and was trained on both Chinese and English data. According to its GitHub page, the model is open source and can be used commercially.

Following Baichuan-7B, Baichuan Intelligent Technology developed Baichuan-13B, an open source, commercially usable large language model with 13 billion parameters. On authoritative Chinese and English benchmarks, it outperforms competitors of similar size. The release includes both a base version (Baichuan-13B-Base) and an aligned chat version (Baichuan-13B-Chat).

Characteristics

  • Baichuan-13B builds on Baichuan-7B by increasing the parameter count to 13 billion. It was trained on 1.4 trillion tokens of high-quality corpora, 40% more than LLaMA-13B, which makes it the open source model with the most training data at the 13B size. It supports Chinese and English, uses ALiBi positional encoding, and has a context window of 4,096 tokens.
  • The pre-trained model serves as a “base” for developers, while regular users have more demand for a dialog-aligned model. The release therefore also includes the aligned model (Baichuan-13B-Chat), which has strong dialog capabilities, is ready to use, and can be deployed with only a few lines of code.
  • To encourage wider adoption, the researchers are also releasing quantized int8 and int4 versions, which are more efficient for inference. These can be deployed on consumer-grade graphics cards such as the Nvidia 3090, whereas the non-quantized version requires significantly more powerful hardware.
  • Free for commercial use: a developer who applies for and receives an official commercial license by email can use Baichuan-13B for commercial purposes at no cost.
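The ALiBi positional encoding mentioned in the feature list adds a distance-based penalty to attention scores instead of adding position embeddings to the input. The following is a minimal sketch of that idea, not Baichuan's implementation: it uses the slope formula from the ALiBi paper for a power-of-two number of heads, and the function names and the 8-head example are illustrative assumptions.

```python
from typing import List


def alibi_slopes(num_heads: int) -> List[float]:
    """Per-head slopes: a geometric sequence starting at 2^(-8/num_heads).

    Assumption: num_heads is a power of two (the simple case in the ALiBi paper).
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def alibi_bias(num_heads: int, seq_len: int) -> List[List[List[float]]]:
    """bias[h][i][j] is added to the attention score of query i and key j.

    Past keys (j <= i) get a penalty proportional to their distance i - j.
    Future keys (j > i) get -inf, which folds in the usual causal mask.
    """
    return [
        [
            [-m * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in alibi_slopes(num_heads)
    ]


if __name__ == "__main__":
    print(alibi_slopes(8)[:3])      # slopes of the first three heads
    print(alibi_bias(8, 4)[0][3])   # biases seen by the last query in head 0
```

Because the penalty depends only on the distance between query and key, no learned position table is needed, which is one reason the ALiBi authors propose it for handling longer inputs.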

Baichuan-13B was trained on about 1.4 trillion tokens. GPT-3, according to OpenAI, was trained on 300 billion tokens. Baichuan’s team has doubled in size in three months to fifty members, and last month it publicly demonstrated Baichuan-7B, a model with seven billion parameters. Baichuan-13B, released two days ago, is the company’s foundational model. It is now offered free to researchers and to programmers who have been granted legal permission to put it to commercial use. It is not yet known when the model will be officially released for general public use.
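The training-data comparisons can be checked with simple arithmetic. The sketch below takes the 1.4 trillion token figure from the feature list and the 300 billion figure cited for GPT-3. The 1.0 trillion value for LLaMA-13B is an assumption implied by the "40% more" claim, and the variable and function names are illustrative.

```python
# Token counts used in the article's comparisons.
BAICHUAN_13B_TOKENS = 1.4e12  # 1.4 trillion, from the feature list
LLAMA_13B_TOKENS = 1.0e12     # assumption: implied by "40% more than LLaMA-13B"
GPT3_TOKENS = 300e9           # 300 billion, as cited for GPT-3


def relative_increase(a: float, b: float) -> float:
    """Return how much larger a is than b, as a fraction of b."""
    return a / b - 1.0


if __name__ == "__main__":
    vs_llama = relative_increase(BAICHUAN_13B_TOKENS, LLAMA_13B_TOKENS)
    vs_gpt3 = BAICHUAN_13B_TOKENS / GPT3_TOKENS
    print(f"vs LLaMA-13B: {vs_llama:.0%} more tokens")
    print(f"vs GPT-3: {vs_gpt3:.1f}x the tokens")
```

Under these figures, Baichuan-13B saw roughly four to five times as many training tokens as GPT-3.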

The base model, Baichuan-13B-Base, is freely available to researchers and to programmers who have obtained the necessary permission for commercial use. In light of recent US restrictions on AI chip sales to China, the fact that variants of this model can run on consumer hardware such as Nvidia’s 3090 graphics cards is particularly noteworthy.
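A rough estimate shows why quantization matters for consumer hardware. The sketch below counts memory for the weights only, ignoring activations and attention caches, so real requirements are higher. It assumes a nominal 13 billion parameters, 16-bit storage for the non-quantized model, and the 24 GB of memory on an Nvidia RTX 3090. All names are illustrative.

```python
NUM_PARAMS = 13e9        # nominal parameter count from the article
GPU_MEMORY_GB = 24.0     # Nvidia RTX 3090 memory

# Bytes needed to store one weight at each precision.
# Assumption: the non-quantized model stores weights in 16 bits.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def weights_fit(num_params: float, bytes_per_param: float, gpu_gb: float) -> bool:
    """True if the weights alone fit in the given GPU memory."""
    return weight_memory_gb(num_params, bytes_per_param) <= gpu_gb


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(NUM_PARAMS, size)
        fits = weights_fit(NUM_PARAMS, size, GPU_MEMORY_GB)
        print(f"{name}: {gb:.1f} GB of weights, fits on a 3090: {fits}")
```

Under these assumptions the 16-bit weights alone (about 26 GB) exceed a single 3090, while the int8 (about 13 GB) and int4 (about 6.5 GB) versions leave room for inference overhead.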

Baichuan Intelligent Technology’s researchers state that the team has not built any Baichuan-13B-based apps for any platform, including iOS, Android, or the web. Users are advised not to use the Baichuan-13B model for any illegal or harmful purpose, such as compromising national or social security. Users are also asked not to use the model for Internet services that have not undergone the necessary security review and filing. The researchers count on everyone to follow these rules so that technological progress stays within the bounds of the law.




Dhanshree Shenwai is a software engineer with experience at FinTech companies in the finance, cards and payments, and banking domains, and a keen interest in AI applications. Dhanshree is enthusiastic about exploring new technologies and advancements that make everyone’s life easier in today’s changing world.


Image Source : www.marktechpost.com