> ## Documentation Index
>
> Fetch the complete documentation index at: https://docs.bey.dev/llms.txt
> Use this file to discover all available pages before exploring further.

# Agents

> Understanding conversational AI agents and how they work

## What are Agents?

Agents are AI systems that can perform tasks and make decisions autonomously. When these agents are designed for real-time dialogue and natural conversation with users, they become **conversational agents**, which combine large language models (LLMs) with real-time communication capabilities to create interactive experiences.

At Beyond Presence, we focus on conversational agents, specifically **conversational video agents** powered by our ultra-realistic [avatar models](/get-started/avatars).

### Agent Modalities

Agents can operate across different communication channels:

* **Text agents**: Traditional chatbots that respond via written messages
* **Voice agents**: AI assistants that speak and listen, like Siri or Alexa
* **Video agents**: The next evolution: AI that communicates with lifelike visual presence

## Why Video Agents?

While text and voice agents are useful, video agents create deeper engagement by:

* Establishing human-like connections through visual presence
* Conveying emotions and personality through facial expressions
* Building trust faster than disembodied voices or text
* Providing a more natural interaction paradigm

## Implementation Approaches

### Managed Agents

Use Beyond Presence's fully managed infrastructure, where agents run on our servers. No framework setup is required: configure your agent through the Studio or our API and deploy instantly.

### Self-Hosted Agents

Build and manage your own conversational infrastructure using frameworks like [LiveKit Agents](https://github.com/livekit/agents) for real-time audio/video communication. This approach gives you complete control but requires significant engineering effort for media handling, scaling, and infrastructure management.
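As a rough illustration of what a self-hosted setup has to wire together, the following Python sketch chains speech-to-text, a language model, and text-to-speech for a single conversational turn. Every name in it (`ConversationTurnPipeline`, `fake_stt`, `fake_llm`, `fake_tts`) is hypothetical and stands in for a real service. It is not the Beyond Presence or LiveKit Agents API, and it leaves out turn detection, avatar rendering, and transport entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, List


# Hypothetical sketch: none of these names come from Beyond Presence or LiveKit.
@dataclass
class ConversationTurnPipeline:
    speech_to_text: Callable[[bytes], str]           # user audio -> transcript
    language_model: Callable[[str, List[str]], str]  # (system prompt, history) -> reply
    text_to_speech: Callable[[str], bytes]           # reply text -> agent audio
    system_prompt: str = "You are a helpful assistant."
    history: List[str] = field(default_factory=list)

    def handle_turn(self, user_audio: bytes) -> bytes:
        """Run one user turn through the STT -> LLM -> TTS chain."""
        user_text = self.speech_to_text(user_audio)
        self.history.append(f"user: {user_text}")
        reply_text = self.language_model(self.system_prompt, self.history)
        self.history.append(f"agent: {reply_text}")
        return self.text_to_speech(reply_text)


# Stub components so the sketch runs without any external service.
def fake_stt(audio: bytes) -> str:
    # Pretend the audio bytes are already UTF-8 text.
    return audio.decode("utf-8")


def fake_llm(system_prompt: str, history: List[str]) -> str:
    # Echo the latest user message instead of calling a real model.
    last_user_text = history[-1].split(": ", 1)[1]
    return f"You said: {last_user_text}"


def fake_tts(text: str) -> bytes:
    # Pretend UTF-8 bytes are synthesized audio.
    return text.encode("utf-8")


pipeline = ConversationTurnPipeline(fake_stt, fake_llm, fake_tts)
reply_audio = pipeline.handle_turn(b"hello")
print(reply_audio.decode("utf-8"))  # prints "You said: hello"
```

In a real deployment each callable would presumably wrap a streaming service rather than a blocking function, and the resulting audio would feed an avatar renderer and a transport layer. That extra wiring is the engineering effort a managed agent takes off your hands.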
## Agent Components

Understanding agent components helps you optimize performance and customize functionality. Whether configuring managed agents or building custom implementations, these components form the foundation of every conversational system.

The core reasoning components that power your agent's conversational abilities:

* **Language model**: The "brain" that understands user input and generates intelligent responses. This is the core reasoning engine that makes your agent conversational and context-aware.
* **System prompt**: Instructions that define your agent's behavior, tone, and role. The system prompt contains the guidelines your agent follows during conversations.
* **Knowledge base**: Domain-specific information your agent can reference to provide accurate, relevant responses. Upload documents, FAQs, or data to enhance your agent's expertise.

Components that handle audio and video processing for real-time interactions:

* **Speech-to-text**: Converts user speech into text that the language model can process.
* **Text-to-speech**: Converts language model responses into natural-sounding speech.
* **Turn detection**: Detects when users finish speaking and when to respond, enabling natural conversation flow.
* **Avatar**: Turns text or speech responses into a lifelike video of a person.
* **Transport**: Manages real-time audio and video streaming between users and agents. Popular transport options in the space include [LiveKit](https://docs.livekit.io) and [Pipecat](https://docs.pipecat.ai).

Agents can also use connections to external systems, APIs, and services that expand their capabilities beyond conversation.

## Next Steps

* Learn how avatars transform agents into video experiences
* Build managed video agents without code
* Integrate video agents programmatically