Improving Answer Accuracy with RAG
What is RAG
Retrieval-Augmented Generation (RAG) is a technique that combines text generation by a large language model (LLM) with retrieval of external information. Grounding answers in the retrieved documents suppresses output that is not based on facts and improves the accuracy of AI responses.
Because ChatGPT has a token limit, a large number of documents cannot be supplied to it as-is.
ailia DX Insight therefore uses RAG to retrieve the relevant information first, and the answer is generated from only that part of the information.
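The retrieve-then-generate flow described above can be sketched as follows. This is a minimal illustration only, not the implementation used by ailia DX Insight: the word-count cosine similarity, the function names (tokenize, cosine, retrieve, build_prompt), and the prompt wording are all assumptions made for the example. A real system would typically use embedding vectors instead of word counts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two word-count vectors (Counter objects).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    # Rank document chunks by similarity to the question and keep the best top_k.
    q = Counter(tokenize(question))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=2):
    # Only the retrieved chunks are placed in the prompt, not the whole document set.
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The string returned by build_prompt is what would be sent to the LLM, so the model sees only the few chunks most related to the question.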
About Max Token Length
chatgpt-3.5 : 2k
chatgpt-3.5-turbo (merged with the 16k variant) : 16k
chatgpt-4 : 8k
chatgpt-4-turbo : 128k
A larger token limit allows more information chunks to be fed to the model together, which improves accuracy. However, responses become slower and costs increase accordingly.
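The relationship between the token limit and the number of chunks can be sketched as follows. This is an illustration under stated assumptions: estimate_tokens is a crude stand-in that counts whitespace-separated words (real tokenizers count differently, especially for Japanese text), and pack_chunks and reserved_for_answer are hypothetical names, not ailia DX Insight settings.

```python
def estimate_tokens(text):
    # Assumption: roughly one token per whitespace-separated word.
    # A real tokenizer would give different counts.
    return len(text.split())


def pack_chunks(ranked_chunks, max_tokens, reserved_for_answer=0):
    # Add chunks in ranked order until the next one would exceed the budget.
    budget = max_tokens - reserved_for_answer
    selected = []
    used = 0
    for chunk in ranked_chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break
        selected.append(chunk)
        used += cost
    return selected
```

With the same ranked chunks, a model with a larger limit receives more of them, which is why accuracy, response time, and cost all grow together.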
RAG Settings
- On the initial screen of ailia DX Insight, click the gear icon in the upper right to display the settings window.
- Select "RAG" from the items on the left.
- You can set rerank and TOPK.
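As a rough guide to what these two settings usually mean in a RAG pipeline, the sketch below shortlists chunks with a cheap score, optionally reorders the shortlist with a second score (rerank), and keeps the best top_k. How ailia DX Insight applies these settings internally is not described here, so everything in the example is an assumption: the function names, the candidates parameter, and the two toy scoring functions (shared words for the first stage, shared word pairs for the rerank stage).

```python
def word_overlap(question, chunk):
    # First-stage score: number of distinct words shared with the question.
    q = set(question.lower().split())
    return len(q & set(chunk.lower().split()))


def bigram_overlap(question, chunk):
    # Rerank score: number of shared adjacent word pairs, which rewards matching word order.
    def bigrams(text):
        words = text.lower().split()
        return set(zip(words, words[1:]))

    return len(bigrams(question) & bigrams(chunk))


def select_chunks(question, chunks, top_k=3, rerank=True, candidates=10):
    # Stage 1: cheap scoring over all chunks, keeping a shortlist of candidates.
    shortlist = sorted(chunks, key=lambda c: word_overlap(question, c), reverse=True)
    shortlist = shortlist[:candidates]
    # Stage 2 (optional): reorder the shortlist with the more precise score.
    if rerank:
        shortlist = sorted(shortlist, key=lambda c: bigram_overlap(question, c), reverse=True)
    # Only the best top_k chunks are passed on to the LLM.
    return shortlist[:top_k]
```

In this sketch, a larger top_k passes more chunks to the model, and enabling rerank changes which chunks come first when the first-stage scores are tied or misleading.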