# Content Retrievers

Once you have stored your Documents, Content Retrievers are how you get that content back out and inject it into your prompts.

### Adding a retriever to your agent

The `@PeoplelogicAgent` annotation supports defining a single Content Retriever for your agent. Pass a Spring bean name to the `contentRetriever` parameter on the annotation and the agent will automatically search that retriever each time you query it.

```java
@PeoplelogicAgent(value="thirdAgent",
        name = "Third Agent", contentRetriever = "apmContentRetriever",
        persona = "Funny but a bit snarky")
@PeoplelogicAgentInstructions("Your job is to say the current date.")
public interface ThirdAgent extends WorkerAgent {
    @SystemMessage(BASE_WORKER_PROMPT)
    Result<PeoplelogicResult> acceptWork(@MemoryId String userId, @UserMessage String query, @V("PreviousResponse") String agentResponse);
}
```

You can find the bean names of the built-in retrievers below.

{% hint style="info" %}
A future version will add support for including multiple retrievers in a default query router.
{% endhint %}

### Retrieval Augmentors and Query Routers

A Retrieval Augmentor is a LangChain4j concept that allows you to route queries between multiple Content Retrievers. Because the Agent SDK handles multi-tenancy, we provide an implementation called `PeoplelogicRetrievalAugmentor` that automatically carries your tenant across the different threads.

Creating a new RetrievalAugmentor is just like creating any other Spring bean (add this to a Configuration class or anywhere else you define your Beans):

```java
@Bean(value = "PeoplelogicKnowledgeRetrievalAugmentor")
public RetrievalAugmentor retrievalAugmentor() {
    // Note: model, apmContentRetriever and trainingContentRetriever are not declared in
    // this snippet; they are assumed to be fields injected into the enclosing class.

    // Let's create a query router that will route each query to both retrievers.
    // This does a quick lookup to verify that we need to use the RAG first
    return PeoplelogicRetrievalAugmentor.builder()
            // Expand each query into several variations using the LLM.
            .queryTransformer(ExpandingQueryTransformer.builder()
                    .chatModel(model)
                    .build())
            .queryRouter(new DefaultQueryRouter(apmContentRetriever, trainingContentRetriever))
            .build();
}
```

You'll notice something new: the `queryTransformer` method. This particular Query Transformer uses the LLM to try several variations of the prompt to get the best matches. LangChain4j provides these for you, and you can [read more about them in their docs](https://docs.langchain4j.dev/tutorials/rag#query-transformer).
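To make the interplay between the transformer and the router more concrete, here is a minimal plain-Java sketch of the idea. It does not use LangChain4j or the SDK: `Retriever`, `expand`, `keywordRetriever` and `retrieve` are hypothetical stand-ins, and the "expansion" is hard-coded where a real transformer would ask the LLM for rephrasings.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Conceptual sketch only: hypothetical stand-ins, not LangChain4j or SDK classes.
final class QueryRoutingSketch {

    /** Hypothetical stand-in for a content retriever: query text in, matching snippets out. */
    interface Retriever extends Function<String, List<String>> {}

    /** Toy retriever that returns its snippet when the query contains the keyword. */
    static Retriever keywordRetriever(String keyword, String snippet) {
        return query -> query.contains(keyword) ? List.of(snippet) : List.of();
    }

    /** Stand-in for a query transformer; a real one would ask the LLM for variations. */
    static List<String> expand(String query) {
        return List.of(query, query.toLowerCase(), query + " examples");
    }

    /** Sends every query variation to every retriever and merges the results. */
    static List<String> retrieve(String query, List<Retriever> retrievers) {
        Set<String> merged = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (String variation : expand(query)) {
            for (Retriever retriever : retrievers) {
                merged.addAll(retriever.apply(variation));
            }
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        List<Retriever> retrievers = List.of(
                keywordRetriever("okr", "APM article on OKRs"),
                keywordRetriever("examples", "Training deck: OKR examples"));
        // The literal query "OKR" matches neither retriever; its variations match both.
        System.out.println(retrieve("OKR", retrievers));
    }
}
```

In this sketch the original query alone would return nothing, which is the kind of miss that trying several variations is meant to reduce.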

Now just add your new `RetrievalAugmentor` to your agent instead of the `ContentRetriever` directly and you're off to the races.

### Pre-built Components

The Agent SDK ships several Content Retrievers with built-in content that you can use as-is. All of the Content Retrievers are Spring beans and can be autowired into your classes.

<table><thead><tr><th width="223.984375">Bean </th><th width="317.94140625">Type</th><th>Description</th></tr></thead><tbody><tr><td>apmContentRetriever</td><td>PeoplelogicClasspathContentRetriever</td><td>Contains a series of articles around Agile Performance Management.  Great for building HR coaching applications.</td></tr><tr><td>trainingContentRetriever</td><td>PeoplelogicClasspathContentRetriever</td><td>Contains presentations and content around OKRs and leadership training.</td></tr><tr><td>handbookContentRetriever</td><td>PeoplelogicClasspathContentRetriever</td><td>Contains multiple samples of different handbooks to help facilitate the handbook creation tool.</td></tr><tr><td>policyContentRetriever</td><td>PeoplelogicClasspathContentRetriever</td><td>Contains a collection of policy examples and individual development plans.</td></tr><tr><td>customer-content-retriever</td><td>CustomerKnowledgeContentRetriever</td><td>Searches any of the content that your organization has uploaded that was shared with the organization.</td></tr><tr><td>personal-content-retriever</td><td>PersonalContentRetriever</td><td>Searches content that was uploaded in the course of executing a particular task (like an OKR cycle export for analysis).</td></tr></tbody></table>

### Building your own Content Retriever

In addition to the built-in retrievers, we have made it easier to build new content retrievers that can handle multiple users and even multiple customers. These are handled through several abstract implementations that you can extend.

#### NamespaceAwareContentRetriever

Building a `NamespaceAwareContentRetriever` typically involves passing in some values in a constructor and overriding a single method:

```java
public abstract String getNamespaceKey();
```

This namespace key is then used to separate documents as they are ingested. By default this retriever will filter results down to documents that belong to your organization (your `tenant`). Overriding the constructor is what allows you to modify some of the other default settings, such as the filter and the maximum segment size. Let's take a look at the `PersonalContentRetriever`:

```java
public PersonalContentRetriever(@Value("${peoplelogic.agent.rag.path:/tmp/personal-uploads}")
                                    String ragBasePath,
                                    @Value("${peoplelogic.agent.rag.store.type:memory}")
                                    String ragStoreType,
                                    @Value("${peoplelogic.agent.rag.store.key:}")
                                    String ragStoreKey,
                                    @Value("${peoplelogic.agent.rag.store.host:}")
                                    String ragStoreHost,
                                    EmbeddingModel embeddingModel,
                                    DirectoryUtils directoryUtils) {
        this.ragBasePath = ragBasePath;
        this.ragStoreType = ragStoreType;
        this.ragStoreKey = ragStoreKey;
        this.ragStoreHost = ragStoreHost;
        this.embeddingModel = embeddingModel;
        this.indexName = "personal-content-retriever";
        this.minScore = 0.0;
        this.maxResults = 100; // Let's increase this just to be sure we get the whole review for example
        this.maxCharsInSegment = 5000; // Longer documents typically
        this.directoryUtils = directoryUtils;
        this.defaultFilter = (query) -> metadataKey("user").isEqualTo("" + TokenSecurityUtils.getCurrentUserId())
                .and(metadataKey("file_name").isIn(SearchFileContext.getCurrentFiles()));
    }

    @Override
    public String getNamespaceKey() {
        return TenantContext.getCurrentTenant() + "-" + TokenSecurityUtils.getCurrentUserId();
    }
```

The main piece to pay attention to here is the `defaultFilter`. This filter is what limits the queries to certain sections of the vector store. In this example, we require the `user` metadata to match the current user AND the `file_name` metadata to be one of a specific set of uploaded files. You're probably wondering how we use that in practice, so let's take a look at a tool:

{% tabs %}
{% tab title="SearchFileContext.setCurrentFiles()" %}

```java
String userPrompt = "Summarize the files named '" + filenames + "'.  Multiple files are separated by a comma.";

// We build a new instance of this agent so we can empty out the tools.
SearchFileContext.setCurrentFiles(filenames.split(",")); // This lets us narrow the search to a specific set of files
if (!waitForUpload(userPrompt, personalContentRetriever)) {
    return "There was a problem uploading the files to analyze.  Please try again.";
}

return getAgent().answerWithPrompt(memoryId +"_summary", userPrompt, systemPrompt);
```

{% endtab %}

{% tab title="getAgent()" %}

```java
public HRAnalystAgent getAgent() {
        if (hrAnalystAgent == null) {
            hrAnalystAgent = AiServices.builder(HRAnalystAgent.class)
                    .retrievalAugmentor(PeoplelogicRetrievalAugmentor.builder()
                            .queryTransformer(ExpandingQueryTransformer.builder().chatModel(chatLanguageModel).build())
                            .queryRouter(new DefaultQueryRouter(personalContentRetriever, apmContentRetriever, trainingContentRetriever))
                            .build())
                    .chatMemoryProvider(chatMemoryProvider)
                    .chatModel(chatLanguageModel)
                    .tools(Collections.emptyList()).build();
        }
        
        return hrAnalystAgent;
}
```

{% endtab %}
{% endtabs %}

What you're seeing here is a specialized instance of the `HRAnalystAgent`, designed to look up content without calling tools (for use when we're already inside a tool). We wait for the upload to finish (Pinecone can take a bit of time to process new documents!) and then have the agent build a response using *just* the files that were recently uploaded.
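If it helps to picture what the namespace key and the `defaultFilter` are doing, the following plain-Java sketch mimics their shape with ordinary predicates. It is not the SDK's filter API: `Segment`, `namespaceKey`, `personalFilter` and `search` are hypothetical names, and the tenant, user and file values are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Conceptual sketch only: plain-Java stand-ins, not the SDK's or LangChain4j's filter classes.
final class MetadataFilterSketch {

    /** Hypothetical stored segment: text plus the metadata recorded at ingestion time. */
    static final class Segment {
        final String text;
        final Map<String, String> metadata;

        Segment(String text, Map<String, String> metadata) {
            this.text = text;
            this.metadata = metadata;
        }
    }

    /** Mirrors the shape of getNamespaceKey(): one namespace per tenant and user. */
    static String namespaceKey(String tenant, String userId) {
        return tenant + "-" + userId;
    }

    /** Mirrors the shape of defaultFilter: same user AND file name in the current set. */
    static Predicate<Segment> personalFilter(String userId, Set<String> currentFiles) {
        Predicate<Segment> sameUser = s -> userId.equals(s.metadata.get("user"));
        Predicate<Segment> inCurrentFiles = s -> {
            String fileName = s.metadata.get("file_name");
            return fileName != null && currentFiles.contains(fileName);
        };
        return sameUser.and(inCurrentFiles);
    }

    /** Returns the text of every segment that passes the filter. */
    static List<String> search(List<Segment> store, Predicate<Segment> filter) {
        return store.stream().filter(filter).map(s -> s.text).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Segment> store = List.of(
                new Segment("Q3 review", Map.of("user", "42", "file_name", "review.pdf")),
                new Segment("Old notes", Map.of("user", "42", "file_name", "notes.pdf")),
                new Segment("Someone else's review", Map.of("user", "7", "file_name", "review.pdf")));

        System.out.println(namespaceKey("acme", "42"));
        // Only the segment owned by user 42 and coming from review.pdf passes the filter.
        System.out.println(search(store, personalFilter("42", Set.of("review.pdf"))));
    }
}
```

Both conditions have to hold, so a file with the right name uploaded by another user is excluded, as is another file from the same user.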

#### PeoplelogicClasspathContentRetriever

This ContentRetriever is much simpler. It takes a file on the classpath (usually in the `resources` folder) and loads it using the in-memory embedding store. It can then be used anywhere that ContentRetrievers are used. Any files in these resources will still be run through all `DocumentProcessor` instances, but remember that the in-memory vector store is somewhat less sophisticated and results may be simpler!
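As a rough mental model of an in-memory retriever, the sketch below keeps segments in a list and ranks them against the query. It makes several simplifying assumptions: word overlap stands in for embedding similarity, the document text is passed in as a string rather than read from the classpath, and `InMemoryRetrieverSketch` is a hypothetical class, not part of the SDK. The `maxResults` and `minScore` parameters are used here with the usual meaning of "at most this many results" and "at least this relevant".

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Conceptual sketch only: word overlap stands in for embedding similarity.
final class InMemoryRetrieverSketch {
    private final List<String> segments;
    private final int maxResults;
    private final double minScore;

    InMemoryRetrieverSketch(String document, int maxResults, double minScore) {
        // Treat each blank-line-separated paragraph as one segment.
        this.segments = Arrays.stream(document.split("\\n\\s*\\n"))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        this.maxResults = maxResults;
        this.minScore = minScore;
    }

    private static Set<String> words(String text) {
        Set<String> words = new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
        words.remove(""); // split can yield an empty leading token
        return words;
    }

    /** Fraction of the query's words that also occur in the segment (0.0 to 1.0). */
    static double score(String query, String segment) {
        Set<String> queryWords = words(query);
        if (queryWords.isEmpty()) {
            return 0.0;
        }
        Set<String> segmentWords = words(segment);
        long shared = queryWords.stream().filter(segmentWords::contains).count();
        return (double) shared / queryWords.size();
    }

    /** Returns the best-scoring segments: at most maxResults, each scoring at least minScore. */
    List<String> retrieve(String query) {
        return segments.stream()
                .filter(segment -> score(query, segment) >= minScore)
                .sorted(Comparator.comparingDouble((String segment) -> score(query, segment)).reversed())
                .limit(maxResults)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String handbook = "Vacation policy: employees accrue paid leave monthly.\n\n"
                + "Expense policy: submit receipts within 30 days.\n\n"
                + "Office hours are flexible.";
        InMemoryRetrieverSketch retriever = new InMemoryRetrieverSketch(handbook, 2, 0.5);
        // The vacation paragraph ranks first, the expense paragraph second, the third is dropped.
        System.out.println(retriever.retrieve("vacation policy"));
    }
}
```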

