> ## Documentation Index
> Fetch the complete documentation index at: https://docs.bey.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Agents

> Understanding conversational AI agents and how they work

## What are Agents?

Agents are AI systems that can perform tasks and make decisions autonomously. When these agents are designed for real-time dialogue and natural conversation with users, they become **conversational agents**—combining large language models (LLMs) with real-time communication capabilities to create interactive experiences.

At Beyond Presence, we focus on conversational agents—and specifically **conversational video agents** powered by our ultra-realistic [avatar models](/get-started/avatars).

### Agent Modalities

Agents can operate across different communication channels:

* **Text agents**: Traditional chatbots that respond via written messages
* **Voice agents**: AI assistants that speak and listen, like Siri or Alexa
* **Video agents**: The next evolution—AI that communicates with lifelike visual presence

## Why Video Agents?

While text and voice agents are useful, video agents create deeper engagement by:

* Establishing human-like connections through visual presence
* Conveying emotions and personality through facial expressions
* Building trust faster than disembodied voices or text
* Providing a more natural interaction paradigm

## Implementation Approaches

### Managed Agents

Use Beyond Presence's fully managed infrastructure where agents run on our servers. No framework setup required—just configure your agent through our dashboard or API and deploy instantly.

### Self-Hosted Agents

Build and manage your own conversational infrastructure using frameworks like [LiveKit Agents](https://github.com/livekit/agents) for real-time audio/video communication. This approach gives you complete control but requires significant engineering effort for media handling, scaling, and infrastructure management.

## Agent Components

Understanding agent components helps you optimize performance and customize functionality.
Whether configuring managed agents or building custom implementations, these components form the foundation of every conversational system.

<AccordionGroup>
  <Accordion title="Core Intelligence" icon="lightbulb">
    The core reasoning components that power your agent's conversational abilities.

    <AccordionGroup>
      <Accordion title="Language Model" icon="brain">
        The "brain" that understands user input and generates intelligent responses. This is the core reasoning engine that makes your agent conversational and context-aware.
      </Accordion>

      <Accordion title="System Prompt" icon="align-left">
        Instructions that define your agent's behavior, tone, and role. The system prompt contains the guidelines your agent follows during conversations.
      </Accordion>

      <Accordion title="Knowledge Base" icon="database">
        Domain-specific information your agent can reference to provide accurate, relevant responses. Upload documents, FAQs, or data to enhance your agent's expertise.
      </Accordion>
    </AccordionGroup>
  </Accordion>

  <Accordion title="Media Processing" icon="microphone">
    Components that handle audio and video processing for real-time interactions.

    <AccordionGroup>
      <Accordion title="Speech-to-Text (STT)" icon="ear-listen">
        Converts user speech into text that the language model can process.
      </Accordion>

      <Accordion title="Text-to-Speech (TTS)" icon="volume-high">
        Converts language model responses into natural-sounding speech.
      </Accordion>

      <Accordion title="Turn Detection" icon="timer">
        Detects when users finish speaking and when to respond, enabling natural conversation flow.
      </Accordion>

      <Accordion title="Avatar Rendering" icon="face-smile">
        Turns text or speech responses into a lifelike video of a person.
      </Accordion>
    </AccordionGroup>
  </Accordion>

  <Accordion title="Transport" icon="network-wired">
    Manages real-time audio and video streaming between users and agents. Popular transport options in the space include [LiveKit](https://docs.livekit.io) and [Pipecat](https://docs.pipecat.ai).
  </Accordion>

  <Accordion title="External Tools" icon="wrench">
    Connections to external systems, APIs, and services that expand your agent's capabilities beyond conversation.
  </Accordion>
</AccordionGroup>

## Next Steps

<CardGroup cols={2}>
  <Card title="Avatars" icon="user" href="/get-started/avatars">
    Learn how avatars transform agents into video experiences
  </Card>

  <Card title="Dashboard" icon="grip" href="/get-started/dashboard">
    Build managed video agents without code
  </Card>

  <Card title="API" icon="code" href="/get-started/api">
    Integrate video agents programmatically
  </Card>
</CardGroup>
