Baobab, an AI learning data construction company, launches a RAG dataset creation service to improve the accuracy of LLMs (large language models): free distribution of sample data

*Baobab Co., Ltd.*
Press release: January 17, 2024
Baobab Co., Ltd. (Headquarters: Shibuya-ku, Tokyo; President and CEO: Miori Sagara; hereinafter "Baobab"), which provides AI learning data creation services, will begin offering a dataset construction service for implementing Retrieval-Augmented Generation (RAG) in LLMs (large language models) on January 17, 2024, and will also begin distributing sample data free of charge.
Reducing inaccurate output from generative AI with RAG (Retrieval-Augmented Generation)
Research and development of LLMs (large language models) is progressing rapidly both in Japan and abroad, and their use is in demand across the public, private, and academic sectors.
While LLMs are expected to write fluently and to have knowledge at a common-sense level, they sometimes present fabricated or inaccurate information in contexts where specialized knowledge, non-public information, or factual accuracy matters. This behavior, known as hallucination, is one of the risks that most concerns companies considering the introduction of generative AI.
Baobab has focused on this issue and will begin providing dataset construction services for implementing Retrieval-Augmented Generation (RAG) in LLMs. As of 2024, RAG is considered an important technique for avoiding hallucinations.
What is RAG (Retrieval-Augmented Generation)?
Retrieval-Augmented Generation (RAG) is a method that combines an LLM with external knowledge sources such as databases. By using the information retrieved from the knowledge source together with the context entered by the user, the model can output correct information or answer that no appropriate information is available.
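The flow described above can be illustrated with a minimal sketch. It is not Baobab's implementation: the in-memory knowledge list, the word-overlap retrieval, and the names `retrieve` and `build_prompt` are all assumptions made for illustration, and a real system would query a database or search index and pass the prompt to an LLM.

```python
import re

# Hypothetical in-memory knowledge source (assumption for illustration);
# a real system would query a database or search index instead.
KNOWLEDGE = [
    "Baobab Co., Ltd. was established in July 2010.",
    "Baobab is headquartered in Shibuya-ku, Tokyo.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages, min_overlap=2):
    """Return passages sharing at least min_overlap words with the question, best first."""
    q = tokenize(question)
    scored = [(len(q & tokenize(p)), p) for p in passages]
    scored = [(s, p) for s, p in scored if s >= min_overlap]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored]

def build_prompt(question, passages):
    """Combine retrieved information with the user's question into one prompt."""
    hits = retrieve(question, passages)
    if hits:
        context = "\n".join("- " + h for h in hits)
    else:
        # Lets the model answer that no appropriate information exists.
        context = "(no relevant information found)"
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

print(build_prompt("When was Baobab established?", KNOWLEDGE))
```

The prompt either carries the retrieved passages or states explicitly that nothing relevant was found, which corresponds to the two outcomes named above: a grounded answer, or an answer that no appropriate information is available.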
To use RAG with an LLM, it is necessary not only to design prompts that perform RAG but also to prepare a high-quality dataset for tuning the LLM for RAG.
*Information included in the RAG dataset*
・User’s question text
・Queries that extract information that matches the user’s question from knowledge sources
・Information extracted from knowledge sources
・Language model answer text
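As a rough illustration, one record holding the four items listed above might look like the following. The release does not specify the dataset's file format or field names, so the `RagRecord` class, its field names, and the JSON Lines serialization are hypothetical.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List

@dataclass
class RagRecord:
    """One hypothetical RAG dataset entry; field names are assumptions."""
    question: str                                        # user's question text
    query: str                                           # query sent to the knowledge source
    retrieved: List[str] = field(default_factory=list)   # information extracted from the source
    answer: str = ""                                     # language model answer text

def to_jsonl_line(record):
    """Serialize a record as one JSON line, keeping non-ASCII text readable."""
    return json.dumps(asdict(record), ensure_ascii=False)

example = RagRecord(
    question="When was Baobab established?",
    query="Baobab established",
    retrieved=["Baobab Co., Ltd. was established in July 2010."],
    answer="Baobab was established in July 2010.",
)

print(to_jsonl_line(example))
```

Keeping the query and the retrieved text alongside the question and answer makes it possible to check each answer against the information it was based on.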
Drawing on more than 10 years of know-how in building text datasets, Baobab forms task-specific teams and delivers high-quality RAG datasets quickly. We also offer consulting on LLM development from experts with extensive experience and insight in AI development for natural language processing.
Free sample data distribution
Sample data will be distributed free of charge in conjunction with the launch of the RAG dataset construction service.
*Sample data overview*
・QA dataset using Wikipedia database
・Number of responses created: 1150
・Working days: 12 days
・Distribution method: Download from the link below
Download sample dataset
*About Baobab Co., Ltd.*
Since its founding, Baobab has developed learning data construction services for AI, and provides a variety of annotation services for image recognition, dialogue scenarios, and multimodal data, including dataset construction for LLMs (large language models). The high-quality learning data we achieve through our own training of project partners (Baopart) and through detailed workflows, organization, and systems has been highly praised by universities, academic institutions, research institutes, and others both in Japan and abroad. In 2023, we were also selected by the Ministry of Economy, Trade and Industry as a J-Startup Impact company, a designation for companies aiming to solve social and environmental issues, realize a new vision, and achieve sustainable economic growth.
Going forward, Baobab will continue to provide the high-quality learning data that is essential for high-quality AI models that solve social and customer issues. We aim to create a society where people are
Baobab Co., Ltd.
Established: July 2010
Address: 39F Shibuya Scramble Square, 2-24-12 Shibuya, Shibuya-ku, Tokyo 150-6139
Business content: Learning data creation service for AI
Representative: Miori Sagara, Representative Director and President
URL: https://baobab-trees.com/
-For media inquiries regarding this matter-
Email address: pr_saki@baobab-trees.com
*Details about this release*
https://prtimes.jp/main/html/rd/p/000000004.000112000.html

*Download press release materials*
https://prtimes.jp/im/action.php?run=html&page=releaseimage&company_id=112000&release_id=4

