Beyond Presence integrates natively with LiveKit Agents, an open-source framework for building multimodal conversational agents. You can easily add Beyond Presence avatars to your LiveKit pipelines to give them a video layer.

If you are new to LiveKit Agents, we recommend starting with the official integration example.

Quickstart

If you already have a LiveKit audio agent pipeline, all you need to do is:

1. Install the Beyond Presence (`bey`) Plugin

Beyond Presence has an official LiveKit plugin that you can install via pip:

pip install livekit-plugins-bey
2. Set your Beyond Presence API key

Create a Beyond Presence API Key and set it as an environment variable:

export BEY_API_KEY="..."  # Your Beyond Presence API Key
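If the key is missing, errors may only surface once a call starts. To fail fast at startup instead, you can check the variable yourself before connecting; a minimal sketch (the `require_bey_api_key` helper is illustrative, not part of the plugin):

```python
import os


def require_bey_api_key() -> str:
    """Return the Beyond Presence API key, failing fast if it is not set."""
    key = os.environ.get("BEY_API_KEY")
    if not key:
        raise RuntimeError(
            "BEY_API_KEY is not set; create a key in the Beyond Presence "
            "dashboard and export it before starting the agent"
        )
    return key
```

Calling this at the top of your entrypoint reports misconfiguration before the agent ever joins a room.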
3. Add a Beyond Presence avatar into your room as soon as a user joins

When starting up your LiveKit agent, use the plugin abstraction to send a REST API request with the room information to the Beyond Presence API, which starts an avatar worker for that room:

agent.py
from livekit.agents import JobContext
from livekit.plugins.bey import start_bey_avatar_session


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    # Keep your existing agent logic here
    # agent = ... PipelineAgent(...)

    # Start a Beyond Presence avatar session
    session = await start_bey_avatar_session(
        ctx,
        avatar_id="...",  # ID of the Beyond Presence avatar to use
        avatar_agent_name="...",  # How the avatar will be named in the call
    )

    # Make sure the avatar has joined the call
    await session.wait_for_avatar_agent()
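The avatar worker joins the room asynchronously, so an await like the one above can hang if the worker never arrives (for example, on an invalid avatar ID or a network issue). A small sketch for bounding any such await with `asyncio.wait_for` (plain asyncio, not part of the plugin):

```python
import asyncio
from typing import Awaitable


async def await_with_timeout(waiter: Awaitable[None], timeout_s: float = 15.0) -> bool:
    """Await `waiter`, returning False on timeout instead of hanging forever."""
    try:
        await asyncio.wait_for(waiter, timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        return False
```

For example, `ok = await await_with_timeout(session.wait_for_avatar_agent())` lets you end the call or retry when `ok` is `False`.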
4. Change the audio sink of your agent to send output audio to the data channel

Instead of streaming your agent’s output audio back into the LiveKit room to the user, send the audio over the data channel to the Beyond Presence avatar worker:

agent.py
from livekit.agents import JobContext
from livekit.plugins.bey import start_bey_avatar_session


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    # Keep your existing agent logic here
    # agent = ... PipelineAgent(...)

    # Start a Beyond Presence avatar session
    # session = ...

    # Make sure the avatar has joined the call
    # ...

    # Send all audio generated by the local agent to the avatar agent
    agent.output.audio = session.local_agent_output_audio

    # Start local agent with room input but only text room output
    # (the avatar agent will handle audio output)
    await agent.start(
        room=ctx.room,
        room_output_options=session.local_agent_room_output_options,
    )
5. (Optional) Hide your agent from the user

By default, both your agent and the avatar worker join the LiveKit room, which can give the user the impression of a three-way call. To improve the user experience, we recommend hiding your agent from the user in your frontend.

One simple way to do this is by filtering out all tracks coming from agents that do not have their camera enabled:

VideoConference.tsx
  const filteredTracks = tracks.filter(
    (track) => !track.participant.isAgent || track.participant.isCameraEnabled,
  );

How it works

  • The LiveKit agent and the Beyond Presence avatar worker both join the same LiveKit room as the user.
  • The LiveKit agent listens to the user and generates a conversational response, as usual.
  • However, instead of publishing audio directly into the room, the agent sends it over a WebRTC data channel to the Beyond Presence avatar worker.
  • The avatar worker listens only to the audio from the data channel, generates the corresponding avatar video, synchronizes audio and video, and publishes both back into the room for the user to experience.

For more details, check out the open-source integration code and the official integration usage example.