RAG AI Tool

RAG AI Tools allow agents to perform LLM-based semantic search.

The Retrieval-Augmented Generation (RAG) framework combines a large language model (LLM) with an external knowledge base to provide more accurate and up-to-date responses. It works by first retrieving relevant information from sources (such as documents or databases) and then passing that information to the LLM, enhancing its output with the specific context. RAG helps AI generate answers that are grounded in factual, real-time data, reducing "hallucinations".

In the Emporix Agentic AI, Retrieval-Augmented Generation is a capability facilitated by the RAG AI tools that you can use within an AI agent.

These native tools enable agents to perform LLM-based semantic search across various domain-specific entities stored in vector databases.

The RAG AI tools transform user queries into vector embeddings and match them against pre-computed entity embeddings using similarity metrics (for example, cosine similarity).

Crucially, RAG AI tools operate on semantic meaning. This differs from traditional keyword-based search, as RAG enables more accurate and context-aware retrieval even when there is no explicit keyword overlap.
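As an illustration of the similarity matching described above (a minimal sketch, not the Emporix implementation), cosine similarity scores how close two embedding vectors are in direction, independent of their length:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional embeddings; real embeddings have hundreds or
# thousands of dimensions. Semantically close texts map to nearby vectors.
query = [0.9, 0.1, 0.3]
product_a = [0.8, 0.2, 0.4]   # similar meaning -> high score
product_b = [-0.7, 0.9, 0.1]  # unrelated meaning -> low score

assert cosine_similarity(query, product_a) > cosine_similarity(query, product_b)
```

Because the query and the indexed entities are compared in the same embedding space, a query like "tool for tightening screws" can match "Precision Screwdriver" with no keyword overlap.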

To enhance search for indexed entities (like products) with RAG, an agent must have a previously defined RAG AI tool attached to it. The tool configuration requires a prompt, which tells the agent when to trigger the tool. The Agentic AI provides two tool types designed for RAG workflows:

  • RAG_EMPORIX

  • RAG_CUSTOM

RAG_EMPORIX tool type

A tool of type RAG_EMPORIX enables configuration and execution of the indexing and retrieval pipeline for selected Emporix-managed entities. This tool type leverages the native Emporix Vector Database, meaning there is no need to manage external infrastructure.

Configuration

The configuration of this tool includes:

  • Tool Type - The type of AI tool; select RAG Emporix.

  • Tool ID - The identifier of the tool.

  • Tool Name - The tool name to be displayed.

  • Prompt - The instruction that tells the agent when to invoke the tool once it is attached.

  • Provider - The LLM provider to be used for creating embeddings. The supported types are OpenAI, Self-Hosted Ollama, and Emporix OpenAI. See the LLM providers section.

    • Model - The specific model of the chosen LLM.

    • Dimensions - The size of the embedding vector. The value must be within the 128-4096 range.

    • URL - The URL where the LLM model is hosted (applicable to Self-Hosted Ollama only).

    • Token - The token to be used with the chosen LLM provider.

  • Entity Type - The type to be indexed and retrieved.

  • Indexed Fields - The fields that are part of the embedding.
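Putting the configuration items above together, a RAG_EMPORIX tool definition might look like the following sketch. The property names mirror the list above but are illustrative only, not necessarily the exact Emporix API schema; the tool ID, prompt, and token are hypothetical placeholders:

```python
# Illustrative sketch only -- field names mirror the configuration list,
# not necessarily the exact Emporix API schema.
rag_emporix_tool = {
    "toolType": "RAG_EMPORIX",
    "toolId": "product-semantic-search",      # hypothetical ID
    "toolName": "Product Semantic Search",
    "prompt": "Use this tool whenever the user asks about products.",
    "provider": {
        "type": "OPENAI",                     # OpenAI | Self-Hosted Ollama | Emporix OpenAI
        "model": "text-embedding-3-small",
        "dimensions": 1536,                   # must fall within the 128-4096 range
        "token": "<llm-provider-token>",      # placeholder, not a real token
    },
    "entityType": "PRODUCT",
    "indexedFields": [
        {"key": "code"},
        {"key": "name.en", "name": "name"},   # optional alias used in the embedded text
        {"key": "description.en"},
    ],
}

assert 128 <= rag_emporix_tool["provider"]["dimensions"] <= 4096
```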

LLM providers

The configuration for creating embeddings includes choosing the relevant LLM provider. When indexing products using a RAG tool, the system generates appropriate embeddings by making an HTTP call to the chosen LLM.

There are three supported LLM providers you can use for creating RAG embeddings:

  • Emporix OpenAI - This provider doesn't require any further configuration on your side, as default settings are used. It uses the OpenAI text-embedding-3-small model with embedding dimensions set to 1536. The number of tokens used by each operation is registered in the Emporix system, and usage is capped at the configured limit. When an AI agent uses the RAG tool, the same LLM configuration is used to create embeddings for user queries to perform search operations.

  • OpenAI - Use your own OpenAI account within the RAG tool. Provide the specific model type, dimensions, and token.

  • Self-Hosted Ollama - Use a custom Ollama LLM provider and model. In addition to the model, dimensions, and token, you also need to provide the URL of the hosted model to enable valid HTTP communication.
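For the Self-Hosted Ollama case, the HTTP call for creating an embedding might be shaped like the sketch below. The endpoint and payload follow Ollama's public embeddings API; the exact call Emporix makes is internal and may differ, and the host and model names here are hypothetical:

```python
# Sketch of an embedding request to a self-hosted Ollama instance.
# The /api/embeddings endpoint and payload shape follow Ollama's public
# API; the host, model, and token below are illustrative assumptions.
def build_ollama_embedding_request(base_url, model, text):
    """Return the URL and JSON payload for an Ollama embedding request."""
    return {
        "url": f"{base_url.rstrip('/')}/api/embeddings",
        "json": {"model": model, "prompt": text},
    }

req = build_ollama_embedding_request(
    "http://ollama.internal:11434",   # hypothetical self-hosted URL
    "nomic-embed-text",               # hypothetical embedding model
    "code: 103155592\nname.en: Precision Screwdriver",
)
# e.g. requests.post(req["url"], json=req["json"],
#                    headers={"Authorization": f"Bearer {token}"})
```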

Indexed fields

The values of the indexed fields are later concatenated and converted into embeddings. Each line in this concatenated text corresponds to a single field and follows the structure: {key}: {content}.

For example, if code, name.en, and description.en are included in the configuration, the resulting concatenated content used for embedding looks as follows:

code: 103155592
name.en: Precision Screwdriver
description.en: A high-quality set of precision screwdrivers designed for industrial and professional applications, ensuring durability and precise performance.

The indexed fields list consists of objects with two properties: key and name. The key property is required, while name is optional and can serve as an alias for the key. If you provide the name, it is used in place of the key in the content that is transformed into embeddings.
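The concatenation rule can be sketched as follows. For simplicity, this sketch looks up dotted keys like name.en directly in a flat dictionary, whereas real product data is nested:

```python
def build_embedding_text(entity, indexed_fields):
    """Concatenate indexed field values into '{key}: {content}' lines.
    If a field defines the optional 'name' alias, it replaces the key."""
    lines = []
    for field in indexed_fields:
        content = entity.get(field["key"])
        if content is not None:
            label = field.get("name", field["key"])
            lines.append(f"{label}: {content}")
    return "\n".join(lines)

# Flat keys for simplicity; real entities nest localized fields.
product = {
    "code": "103155592",
    "name.en": "Precision Screwdriver",
}
fields = [{"key": "code"}, {"key": "name.en", "name": "name"}]
print(build_embedding_text(product, fields))
# code: 103155592
# name: Precision Screwdriver
```

Note how the name alias replaces name.en in the second line of the embedded text.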

Limit

The maximum number of indexed characters in a single product is 16,000. If the content exceeds this limit, the first 16,000 characters are embedded and the remaining characters are ignored.
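In other words, the concatenated text is simply cut off at the limit before embedding, as this minimal sketch shows:

```python
MAX_INDEXED_CHARS = 16_000  # per-product limit stated above

def truncate_for_embedding(text):
    """Keep only the first 16,000 characters; the rest is ignored."""
    return text[:MAX_INDEXED_CHARS]

long_text = "x" * 20_000
assert len(truncate_for_embedding(long_text)) == 16_000
assert truncate_for_embedding("short") == "short"  # under the limit: unchanged
```

A practical consequence: put the most discriminating fields (code, name) early in the indexed fields list so they are never cut off by a long description.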

Indexing stage

When the tool configuration is ready, the entities can be indexed. Each time you modify a product, the RAG tool reindexes it in the background and recalculates its embedding if necessary. To reindex all products in the database, use the Reindex option.

Retrieving stage

To search for the indexed entities, you need to attach the previously defined RAG tool to an agent. The agent uses the available tool according to the specified prompt definition.

Example RAG_EMPORIX AI tool flow

The flowchart represents the process flow of the RAG_EMPORIX tool.

Example search results

When you apply the RAG tool in an agent available at the storefront, a user might, for example, query about products of interest using natural language. In the background, the agent triggers the RAG tool to enhance the search outcomes with RAG embeddings, match the query, and return all available matches to the user.
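The matching step behind such a query can be sketched as ranking the pre-computed entity embeddings against the query embedding and returning the best matches. The toy vectors and product names below are illustrative only:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def retrieve(query_embedding, indexed, top_k=3):
    """Rank pre-computed entity embeddings against the query embedding."""
    scored = [(cosine(query_embedding, emb), name) for name, emb in indexed.items()]
    return [name for score, name in sorted(scored, reverse=True)[:top_k]]

# Toy index: entity name -> pre-computed embedding.
indexed = {
    "Precision Screwdriver": [0.9, 0.1, 0.2],
    "Garden Hose":           [0.1, 0.8, 0.3],
    "Torque Wrench":         [0.8, 0.2, 0.3],
}
print(retrieve([0.95, 0.1, 0.25], indexed, top_k=2))
# ['Precision Screwdriver', 'Torque Wrench']
```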

RAG_CUSTOM tool type

A tool of type RAG_CUSTOM enables integration with an external (custom) vector database, not managed by the Emporix platform. This allows the system to perform semantic search operations using embeddings stored outside of the native infrastructure.

This tool type is intended for advanced use cases where organizations prefer complete control over their vector storage, scalability, performance tuning, or cost management.

Configuration

Configuring a RAG_CUSTOM tool requires providing more details to enable it properly. The configuration properties must be valid and accessible from the running environment for Retrieval-Augmented Generation (RAG) queries to execute successfully.

The configuration of this tool includes:

  • Tool Type - The type of AI tool; select RAG Custom.

  • Tool ID - The identifier of the tool.

  • Tool Name - The tool name to be displayed.

  • Prompt - The instruction that tells the agent when to trigger the tool once it is attached.

  • Max Results - The maximum number of documents returned by an agent that uses this tool.

  • Database URL - The URL of the database where data is stored.

  • Database Type - Type of the database.

  • Database Entity Type - Specifies the entity type to be retrieved.

  • Database Collection Name - The name of the collection where products are stored.

  • Database Token - The token for authentication to the database.

  • Embedding Model - The model used to compute embeddings for entities in a given collection.

  • Embedding Token - The token necessary for computing embeddings.
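Putting these properties together, a RAG_CUSTOM tool definition might look like the sketch below. The property names mirror the list above but are illustrative, not necessarily the exact Emporix API schema; the IDs, URL, and tokens are hypothetical placeholders:

```python
# Illustrative sketch only -- property names mirror the configuration
# list, not necessarily the exact Emporix API schema.
rag_custom_tool = {
    "toolType": "RAG_CUSTOM",
    "toolId": "external-product-search",           # hypothetical ID
    "toolName": "External Product Search",
    "prompt": "Use this tool whenever the user asks about products.",
    "maxResults": 5,
    "databaseUrl": "https://vectors.example.com",  # hypothetical external vector DB
    "databaseType": "<vector-db-type>",
    "databaseEntityType": "PRODUCT",
    "databaseCollectionName": "products",
    "databaseToken": "<database-token>",           # placeholder
    "embeddingModel": "text-embedding-3-small",
    "embeddingToken": "<embedding-token>",         # placeholder
}
```

Unlike RAG_EMPORIX, both the database connection and the embedding model must be supplied, because neither is managed by the Emporix platform.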

Retrieving stage

To enhance search for the indexed products with custom-defined RAG, attach the RAG_CUSTOM tool to an agent. The agent uses the available tool according to the specified prompt definition.

Example RAG_CUSTOM AI tool flow

The flowchart represents the process flow of the RAG_CUSTOM tool.
