Blue Guardrails

Detect hallucinations with the live detection API

Send messages directly to the API and get hallucination annotations back in real time, without sending traces first.

This guide shows you how to use the live detection API to check assistant messages for hallucinations on demand.

The standard detection flow requires you to instrument your application with OpenTelemetry and wait for async processing. Live detection skips that. You send a conversation directly to the API and receive annotations in the response. This is useful when you want to check a response before showing it to a user or integrate detection into a CI pipeline.

Live detection uses the same detection engine as async detection. It supports the same workspace configuration, including custom labels and domain context.

Prerequisites

  • A Blue Guardrails account with a workspace and credits
  • An API key (workspace-scoped or user-scoped)

Send a detection request

POST your conversation to /v1/detect-hallucinations. Messages follow the OpenAI chat format, so you can pass the same messages you'd send to a chat completion endpoint. The last message must be from the assistant. That's the message Blue Guardrails evaluates.

curl -X POST https://api.blueguardrails.com/v1/detect-hallucinations \
  -H "Authorization: Bearer $API_KEY" \
  -H "x-workspace-id: $WORKSPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "Answer using only the provided documents."},
      {"role": "user", "content": "Context: Revenue in Q1 was $85 billion.\nWhat was Q1 revenue?"},
      {"role": "assistant", "content": "Q1 revenue was $90 billion."}
    ]
  }'

The response contains annotations for the assistant message and usage information:

{
  "annotations": [
    {
      "text": "$90 billion",
      "label": "fabrication",
      "explanation": "The context states revenue was $85 billion, not $90 billion.",
      "message_part_index": 0,
      "start_offset": 15,
      "end_offset": 26
    }
  ],
  "usage": {
    "input_tokens": 142,
    "cost_cents": 1
  },
  "evaluation_id": "a1b2c3d4-...",
  "conversation_id": "e5f6g7h8-..."
}

Each annotation includes the exact text span, a label, an explanation, and character offsets within the message part.
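
As an illustration of how the offsets might be used, the sketch below wraps each flagged span in a visible marker. It assumes that start_offset and end_offset are zero-based with an exclusive end, that the message content is a single string (one message part), and that spans do not overlap. The guide does not confirm any of this, and mark_annotations is a hypothetical helper name. The sample offsets are computed locally rather than copied from an API response.

```python
def mark_annotations(text: str, annotations: list[dict]) -> str:
    """Wrap each annotated span of `text` in [[label: span]] markers.

    Assumes zero-based, end-exclusive offsets and non-overlapping spans.
    """
    marked = text
    # Work from the last span to the first so earlier offsets stay valid.
    for ann in sorted(annotations, key=lambda a: a["start_offset"], reverse=True):
        start, end = ann["start_offset"], ann["end_offset"]
        marked = f"{marked[:start]}[[{ann['label']}: {marked[start:end]}]]{marked[end:]}"
    return marked


message = "Q1 revenue was $90 billion."
span = "$90 billion"
start = message.index(span)  # offset computed locally for this sample
sample = [{
    "text": span,
    "label": "fabrication",
    "start_offset": start,
    "end_offset": start + len(span),
}]
print(mark_annotations(message, sample))
```

If the offsets turn out to follow a different convention, comparing text[start:end] with the annotation's text field is a cheap way to notice.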

Include tool calls and tool results

If your assistant uses tools, include the full tool interaction in the messages array. Use the OpenAI tool call format:

{
  "messages": [
    {"role": "user", "content": "What is the weather in NYC?"},
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_1",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"city\": \"NYC\"}"
          }
        }
      ]
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "Sunny, 72F"},
    {"role": "assistant", "content": "The weather in NYC is rainy and 55F."}
  ]
}

Blue Guardrails evaluates the last assistant message against the full conversation context, including tool results. In this example, it would flag the response because the tool returned "Sunny, 72F" but the assistant said "rainy and 55F".
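
Before sending a conversation with tool calls, it can help to catch malformed input on the client side. The sketch below checks two things: that the last message is from the assistant, and that every tool result refers to a tool call declared earlier. check_messages is a hypothetical helper, and the API may enforce different or additional rules.

```python
def check_messages(messages: list[dict]) -> list[str]:
    """Return a list of problems found in an OpenAI-style messages array."""
    problems = []
    if not messages or messages[-1].get("role") != "assistant":
        problems.append("last message must be from the assistant")

    declared_call_ids = set()
    for msg in messages:
        if msg.get("role") == "assistant":
            # `tool_calls` may be absent or null on plain text replies.
            for call in msg.get("tool_calls") or []:
                declared_call_ids.add(call["id"])
        elif msg.get("role") == "tool":
            call_id = msg.get("tool_call_id")
            if call_id not in declared_call_ids:
                problems.append(f"tool result references unknown call id {call_id!r}")
    return problems


weather_messages = [
    {"role": "user", "content": "What is the weather in NYC?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "NYC"}'},
        }],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "Sunny, 72F"},
    {"role": "assistant", "content": "The weather in NYC is rainy and 55F."},
]
print(check_messages(weather_messages))
```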

Store results in your workspace

By default, live detection stores the conversation, messages, and evaluation so that they appear in your workspace dashboard. If you don't want to store live detection data, set store_conversation to false:

{
  "messages": [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Berlin."}
  ],
  "store_conversation": false
}

When the conversation isn't stored, evaluation_id and conversation_id are null in the response:

{
  "annotations": [...],
  "usage": {"input_tokens": 82, "cost_cents": 1},
  "evaluation_id": null,
  "conversation_id": null
}
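
Client code that links back to the dashboard needs to handle the unstored case. A minimal sketch, assuming the response has already been parsed into a dict; stored_ids is a hypothetical helper that treats a missing key and a null value the same way.

```python
from typing import Optional


def stored_ids(result: dict) -> Optional[tuple[str, str]]:
    """Return (evaluation_id, conversation_id), or None if nothing was stored."""
    evaluation_id = result.get("evaluation_id")
    conversation_id = result.get("conversation_id")
    if evaluation_id is None or conversation_id is None:
        return None
    return evaluation_id, conversation_id


unstored = {"annotations": [], "evaluation_id": None, "conversation_id": None}
print(stored_ids(unstored))
```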

Attach generation metadata

If you store results, you can include metadata about the LLM call that produced the assistant message. Pass a generation_info object:

{
  "messages": [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."}
  ],
  "generation_info": {
    "model": "gpt-4o",
    "provider": "openai",
    "input_tokens": 100,
    "output_tokens": 50
  }
}

This metadata appears alongside the evaluation in your workspace dashboard. All fields are optional. If you omit model, it defaults to "live-detection".
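
If the assistant message came from an OpenAI-style chat completion, the metadata can be derived from that response. The sketch below assumes the completion is available as a plain dict with a model field and a usage object containing prompt_tokens and completion_tokens. generation_info_from_completion is a hypothetical helper, and the provider value is supplied by the caller.

```python
def generation_info_from_completion(completion: dict, provider: str) -> dict:
    """Map a Chat Completions style response dict to a generation_info object."""
    usage = completion.get("usage") or {}
    info = {
        "model": completion.get("model"),
        "provider": provider,
        "input_tokens": usage.get("prompt_tokens"),
        "output_tokens": usage.get("completion_tokens"),
    }
    # All fields are optional, so drop the ones that could not be filled.
    return {key: value for key, value in info.items() if value is not None}


completion = {
    "model": "gpt-4o",
    "usage": {"prompt_tokens": 100, "completion_tokens": 50},
}
print(generation_info_from_completion(completion, provider="openai"))
```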

Override the hallucination config

If your workspace has a custom hallucination config, live detection uses it. You can also override the config per request by passing a config object with custom labels and domain context:

{
  "messages": [
    {"role": "user", "content": "What are the side effects of Ibuprofen?"},
    {"role": "assistant", "content": "Common side effects include headache and dizziness."}
  ],
  "config": {
    "labels": [
      {"name": "fabrication", "description": "Information not found in the SmPC."},
      {"name": "omission", "description": "Relevant information from the SmPC is missing."},
      {"name": "dosage_error", "description": "Incorrect dosage information."}
    ],
    "domain_context": "This is a pharmaceutical Q&A assistant. All answers must be grounded in the Summary of Product Characteristics (SmPC)."
  }
}

The override applies only to this request. It doesn't change your workspace config.
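
When the same override is reused across many requests, it can be convenient to attach it programmatically. A minimal sketch, assuming labels are kept in a name-to-description mapping; with_config_override is a hypothetical helper, and the guide does not say whether both labels and domain_context must be present.

```python
def with_config_override(body: dict, labels: dict[str, str], domain_context: str) -> dict:
    """Return a copy of the request body with a per-request config attached."""
    config = {
        "labels": [
            {"name": name, "description": description}
            for name, description in labels.items()
        ],
        "domain_context": domain_context,
    }
    # Build a new dict so the caller's body is left unchanged.
    return {**body, "config": config}


base_body = {"messages": [
    {"role": "user", "content": "What are the side effects of Ibuprofen?"},
    {"role": "assistant", "content": "Common side effects include headache and dizziness."},
]}
request_body = with_config_override(
    base_body,
    labels={"dosage_error": "Incorrect dosage information."},
    domain_context="This is a pharmaceutical Q&A assistant.",
)
print(request_body["config"])
```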

Use live detection from Python

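The following script sends the same request as the curl example, using httpx:

import os

import httpx

# Read credentials from the environment, using the same variable names as the curl example.
api_key = os.environ["API_KEY"]
workspace_id = os.environ["WORKSPACE_ID"]

response = httpx.post(
    "https://api.blueguardrails.com/v1/detect-hallucinations",
    headers={
        "Authorization": f"Bearer {api_key}",
        "x-workspace-id": workspace_id,
    },
    json={
        "messages": [
            {"role": "system", "content": "Answer using only the provided documents."},
            {"role": "user", "content": "Context: Revenue in Q1 was $85 billion.\nWhat was Q1 revenue?"},
            {"role": "assistant", "content": "Q1 revenue was $90 billion."},
        ]
    },
    # httpx defaults to a 5-second timeout; 30 seconds is an arbitrary, more generous choice.
    timeout=30.0,
)
response.raise_for_status()  # Raise on 4xx/5xx instead of parsing an error body
result = response.json()

# Print each flagged span with its label and explanation.
for annotation in result["annotations"]:
    print(f"[{annotation['label']}] {annotation['text']}")
    print(f"  {annotation['explanation']}")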
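
To act on the result, for example to hold back a reply or fail a CI job, the parsed response can be reduced to a yes/no decision. The sketch below is one possible policy, not a documented feature: blocking_annotations and choose_reply are hypothetical helpers, and which labels should block is an application decision.

```python
from typing import Optional


def blocking_annotations(result: dict, blocking_labels: Optional[set[str]] = None) -> list[dict]:
    """Return the annotations that should block a response.

    With blocking_labels=None, every annotation blocks.
    """
    annotations = result.get("annotations", [])
    if blocking_labels is None:
        return list(annotations)
    return [ann for ann in annotations if ann["label"] in blocking_labels]


def choose_reply(result: dict, reply: str, fallback: str) -> str:
    """Show the assistant reply only if nothing was flagged."""
    return fallback if blocking_annotations(result) else reply


flagged = {"annotations": [{"label": "fabrication", "text": "$90 billion"}]}
print(choose_reply(flagged, "Q1 revenue was $90 billion.", "Sorry, I could not verify that."))
```

In a CI script, the same check could end with sys.exit(1) when blocking_annotations(result) is non-empty, so that the job fails on any flagged response.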

Billing

Each live detection request consumes credits. The usage field in the response shows the input token count and cost in cents. The cost is based on the number of tokens in the full conversation context, at the same rate as async detection.
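
To track spend across a batch of requests, such as a CI run, the usage fields can be summed on the client. A minimal sketch, assuming each parsed response carries a usage object with numeric input_tokens and cost_cents as in the examples above; total_usage is a hypothetical helper.

```python
def total_usage(results: list[dict]) -> dict:
    """Sum input tokens and cost (in cents) over several detection responses."""
    return {
        "input_tokens": sum(r["usage"]["input_tokens"] for r in results),
        "cost_cents": sum(r["usage"]["cost_cents"] for r in results),
    }


batch = [
    {"usage": {"input_tokens": 142, "cost_cents": 1}},
    {"usage": {"input_tokens": 82, "cost_cents": 1}},
]
print(total_usage(batch))
```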
