Retrieval-augmented generation (RAG) and semantic search engine as a combined solution

Retrieval-augmented generation with semantic search (SRAG) is a method for reducing or eliminating the major disadvantages of GPT models (such as ChatGPT) when they are applied to local documents.

Figure 1. Architecture of an SRAG system. Follow the arcs in numerical order.

1st problem: The topics of local documents are strongly underrepresented in, or completely unknown to, GPT. RAG with semantic search (SRAG) feeds semantically suitable passages from those documents into the GPT query.
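A minimal sketch of this augmentation step follows, under clearly stated assumptions: the functions semantic_search and local_gpt_complete are hypothetical placeholders for the retrieval component and the locally hosted GPT, not the actual SEMPRIA-Search interface.

```python
# Minimal sketch of the SRAG augmentation step (1st problem).
# semantic_search and local_gpt_complete are hypothetical placeholders,
# not real APIs of SEMPRIA-Search or any specific GPT product.

def semantic_search(query: str, top_k: int = 5) -> list[dict]:
    """Return the top_k semantically matching passages from local documents."""
    raise NotImplementedError  # provided by the retrieval component

def local_gpt_complete(prompt: str) -> str:
    """Send the prompt to the locally hosted GPT and return its answer."""
    raise NotImplementedError  # provided by the on-premises LLM

def answer_with_srag(user_query: str) -> str:
    passages = semantic_search(user_query, top_k=5)
    # Augmentation: feed semantically suitable passages into the GPT query.
    context = "\n\n".join(p["text"] for p in passages)
    prompt = (
        "Answer the question using only the passages below.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {user_query}"
    )
    return local_gpt_complete(prompt)
```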

2nd problem: GPT violates data protection regulations and confidentiality. With RAG with semantic search, a GPT runs on a dedicated computer in a German data center or on-premises at the customer's location. The augmentation with passages from local documents is carried out on the same computer. Everything stays in-house!

3rd problem: GPT hallucinates, producing false statements and fake news. SRAG requests and provides evidence from your own documents.

4th problem: GPT results cannot be verified. RAG with semantic search retains the reference to the verifying text passage. This link can be formatted and displayed according to customer requirements.
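The sketch below shows how each answer could keep the link to its verifying passages. It reuses the hypothetical placeholders from the first sketch, and the field names (document, section, text) are illustrative assumptions, not the actual SEMPRIA-Search response format.

```python
# Sketch of keeping the verifying references with the answer (3rd and 4th problem).
# Reuses the hypothetical semantic_search and local_gpt_complete placeholders above.

def build_prompt(user_query: str, passages: list[dict]) -> str:
    context = "\n\n".join(p["text"] for p in passages)
    return (
        "Answer the question using only the passages below.\n\n"
        f"Passages:\n{context}\n\nQuestion: {user_query}"
    )

def answer_with_references(user_query: str) -> dict:
    passages = semantic_search(user_query, top_k=5)
    answer = local_gpt_complete(build_prompt(user_query, passages))
    # Each reference points back to the verifying text passage; it can be
    # formatted and displayed according to customer requirements.
    references = [
        {"document": p["document"], "section": p["section"], "snippet": p["text"]}
        for p in passages
    ]
    return {"answer": answer, "references": references}
```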

5th problem: GPT ignores access rights. In enterprise search and other areas, access rights, especially read rights (who is allowed to see what?), play a major role. A large language model (LLM), as used in GPT, cannot replicate access rights. The solution in SRAG is that the selection of document snippets for the GPT query already takes access rights into account, because the semantic search itself enforces them. Only text snippets that the current user is allowed to see are passed to GPT.
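The following sketch illustrates this idea under the assumption that the retrieval step receives the identity of the current user; the function name and the rights model are hypothetical and would in practice come from the customer's existing access control lists.

```python
# Sketch of access-right-aware retrieval (5th problem): the rights check is part
# of the search itself, so GPT only ever sees snippets the current user may read.

def semantic_search_for_user(user_id: str, query: str, top_k: int = 5) -> list[dict]:
    """Return only passages that user_id is allowed to read (hypothetical;
    backed by the access control lists of the customer's environment)."""
    raise NotImplementedError

def answer_for_user(user_id: str, user_query: str) -> dict:
    allowed = semantic_search_for_user(user_id, user_query, top_k=5)
    # GPT never receives a snippet the current user is not allowed to see.
    answer = local_gpt_complete(build_prompt(user_query, allowed))
    return {"answer": answer, "references": allowed}
```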

The strength of the combined solution depends, of course, on the strength of both components: the retrieval-augmented generation and the retrieval component. We offer the proven cognitive search engine SEMPRIA-Search as the retrieval component. We have closely coupled it with a GPT specialized in German and English (including the LLM) that runs on local hardware. Every search engine from SEMPRIA can be quickly expanded to include retrieval-augmented generation. Depending on the search query, RAG with semantic search is activated to optimize the search results.
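How such a query-dependent switch might look is sketched below. The heuristic in needs_generation is purely an assumption for illustration; the article only states that RAG is activated depending on the search query.

```python
# Sketch of query-dependent activation of RAG with semantic search.

def needs_generation(user_query: str) -> bool:
    # Hypothetical heuristic: treat question-like queries as candidates for RAG.
    q = user_query.strip().lower()
    return q.endswith("?") or q.startswith(("how", "why", "what", "wie", "warum", "was"))

def handle_query(user_id: str, user_query: str) -> dict:
    if needs_generation(user_query):
        # Activate RAG with semantic search and return an answer with references.
        return answer_for_user(user_id, user_query)
    # Otherwise return classic semantic search results only.
    return {"results": semantic_search_for_user(user_id, user_query, top_k=10)}
```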

If you are struggling with your current internal search engine, simply switch to a modern search engine and make a big leap into the AI world of GPT, but of course with a solution to all five problems mentioned above. If you want to experiment with a test system, you can contact us.

The diagram in Figure 1 shows what a combined system looks like and how it works.