RAG Overview

Retrieval-Augmented Generation (RAG) combines the strengths of neural language models with retrieval from a large corpus of documents to produce more accurate, better-informed outputs. The RAG system retrieves highly relevant documents using natural language search and scales to billions of records, functioning much like a traditional database. A key advantage of RAG is that it does not rely on context stuffing: instead of packing every potentially relevant document into the prompt in advance, only the context relevant to the current query is pulled from the database.

RAG is particularly effective at addressing hallucinations in large language models (LLMs), instances where the model generates factually incorrect or nonsensical information. By using vector databases that store text embeddings, RAG anchors the model to real-world, structured data. When a query is made, the system retrieves semantically relevant content from the vector database, which the LLM then uses to generate its response. This grounds the output in content that has previously been validated and stored in the database.
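As a minimal sketch of the retrieval step in plain JavaScript, assuming the chunk embeddings are already available as in-memory vectors (a real deployment would use a vector database and a hosted embedding model):

// Rank stored chunks by cosine similarity to the query embedding.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function retrieveTopK(queryEmbedding, documents, k = 3) {
  return documents
    .map((doc) => ({ ...doc, score: cosineSimilarity(queryEmbedding, doc.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// The retrieved text is placed in the prompt so the answer stays grounded in stored content.
function buildGroundedPrompt(question, retrievedDocs) {
  const context = retrievedDocs.map((doc) => doc.text).join("\n---\n");
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}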

The architecture of RAG involves several components working together. An application sends a query to an embedding model, which converts the query into a vector representation. This vector is then matched against a vector database via an API to find the most relevant documents or chunks of text. The results are filtered by metadata to enforce relevance and access controls. The selected content is streamed into the LLM's context window, giving it the grounding it needs to generate an informed response.
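A sketch of how these components might be wired together; embedQuery, searchVectors, and generateAnswer are illustrative placeholders for your embedding model, vector database API, and LLM client, not parts of the actual system:

// Query-time pipeline: embed the query, search the vector DB with metadata filters,
// then feed the selected chunks into the LLM's context window.
async function answerQuery(query, user, { embedQuery, searchVectors, generateAnswer }) {
  const queryVector = await embedQuery(query);                  // query -> vector representation
  const hits = await searchVectors(queryVector, {
    topK: 5,
    filter: { allowedRoles: user.roles },                       // metadata / access-control filter
  });
  const context = hits.map((hit) => hit.text).join("\n---\n");  // selected chunks
  return generateAnswer({ query, context });                    // grounded LLM response
}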

Additionally, RAG incorporates a continuous cycle of ingestion and updating. New documents are crawled, chunked, and their embeddings are added to the vector database, so the knowledge base keeps expanding and stays up to date with the latest information. This dynamic nature makes RAG a robust choice for applications that need access to a vast, evolving pool of data and must provide accurate, context-aware responses.
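A simplified ingestion loop might look like the following, where fetchDocument, embedChunk, and upsertVector are placeholders for your crawler, embedding model, and vector database client:

// Split long text into overlapping chunks so each one fits the embedding model comfortably.
function chunkText(text, chunkSize = 800, overlap = 100) {
  const chunks = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}

// Crawl, chunk, embed, and upsert each document so the knowledge base stays current.
async function ingestDocuments(urls, { fetchDocument, embedChunk, upsertVector }) {
  for (const url of urls) {
    const text = await fetchDocument(url);
    for (const [index, chunk] of chunkText(text).entries()) {
      const embedding = await embedChunk(chunk);
      await upsertVector({
        id: `${url}#${index}`,
        embedding,
        text: chunk,
        metadata: { source: url, chunkIndex: index },
      });
    }
  }
}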

Messaging Interface

We have worked hard to provide a first-class communication interface for developing RAG AI Agents. Each channel is exposed through a GraphQL API that supports subscriptions as well as queries and mutations against the RAG system. You can use this API to build your own custom UI or integrate with your existing systems. The subscription below streams a channel's messages:


// gql import — assuming Apollo Client here; use graphql-tag or your own client's equivalent.
import { gql } from "@apollo/client";

// Live subscription for a channel's messages, ordered chronologically.
const GET_MY_CHANNEL_MESSAGES = gql`
  subscription (
    $last_received_id: String
    $last_received_ts: String
    $first_received_date: date
    $chat_id: uuid
  ) {
    message(
      order_by: { date: asc, timestamp: asc }
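      # Exclude the last message already received, keep only newer ones from named users in this chat.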
      where: {
        _and: {
          id: { _neq: $last_received_id }
          timestamp: { _gte: $last_received_ts }
          date: { _gte: $first_received_date }
          _and: {
            user: { username: { _neq: "null" } }
            _and: { chat_id: { _eq: $chat_id } }
          }
        }
      }
      limit: 200
    ) {
      id
      body
      username
      from
      time
      timestamp
      date
      created_at
      isCode
      likes
      channel_id
      isCommerce
      isReason
      chartContent
      image_url
      user {
        id
        username
        avatar
      }
    }
  }
`;
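
As a usage sketch, assuming Apollo Client in a React application with a WebSocket link already configured for subscriptions (the variable values shown are placeholders):

import { useSubscription } from "@apollo/client";

function ChannelMessages({ chatId }) {
  // Re-renders whenever the subscription pushes new messages for this chat.
  const { data, loading, error } = useSubscription(GET_MY_CHANNEL_MESSAGES, {
    variables: {
      chat_id: chatId,
      last_received_id: "0",                  // placeholder cursor values
      last_received_ts: "1970-01-01T00:00:00Z",
      first_received_date: "1970-01-01",
    },
  });

  if (loading) return "Loading messages...";
  if (error) return `Subscription error: ${error.message}`;
  return data.message.map((msg) => `${msg.user.username}: ${msg.body}`).join("\n");
}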