
Overview

Classification and routing are common AI tasks in which a prediction about an input determines what happens next. Tracing these operations shows why a given routing decision was made and makes escalation logic easier to debug.

Basic Classification

Here’s a support ticket classifier with tracing:
from artanis import Artanis

artanis = Artanis(api_key="sk_...")

def classify_ticket(ticket: dict) -> dict:
    trace = artanis.trace("classify-ticket")

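    # Capture configuration.
    # `workspace` and `classifier` are assumed to be objects defined elsewhere
    # in your application; they are not imported from artanis.
    trace.state("config", {
        "model": "gpt-5.1",
        "categories": workspace.categories,
        "confidence_threshold": workspace.confidence_threshold
    })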

    # Classify
    trace.input(message=ticket["message"])

    classification = classifier.predict(ticket["message"])

    trace.output({
        "category": classification.category,
        "confidence": classification.confidence
    })

    return {
        "category": classification.category,
        "confidence": classification.confidence
    }
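
To exercise a function like this in a unit test without a network call, one option is a small in-memory stand-in for the trace object. The sketch below assumes only the three methods used in this guide (state, input, output). RecordingTrace and classify_with_trace are hypothetical names, not part of the Artanis SDK, and the keyword rule is a placeholder for a real model.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RecordingTrace:
    """Hypothetical test double that mirrors only the calls used in this guide."""
    name: str
    states: dict[str, Any] = field(default_factory=dict)
    inputs: dict[str, Any] = field(default_factory=dict)
    outputs: list[Any] = field(default_factory=list)

    def state(self, key: str, value: Any) -> None:
        self.states[key] = value

    def input(self, **kwargs: Any) -> None:
        self.inputs.update(kwargs)

    def output(self, value: Any) -> None:
        self.outputs.append(value)


def classify_with_trace(trace: RecordingTrace, message: str) -> dict:
    # Placeholder keyword rule instead of a real model, so the sketch runs offline.
    category = "billing" if "invoice" in message.lower() else "general"
    result = {"category": category, "confidence": 1.0}
    trace.input(message=message)
    trace.output(result)
    return result


trace = RecordingTrace("classify-ticket")
classify_with_trace(trace, "Where is my invoice?")
```

After the call, trace.inputs and trace.outputs hold exactly what a real trace would have been given, so a test can assert on them directly.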

Escalation Logic

Decide when to escalate to humans vs. handle with AI:
def handle_ticket(ticket: dict) -> dict:
    trace = artanis.trace("handle-ticket")

    trace.input(message=ticket["message"])

    # Classify the ticket intent
    classification = classifier.predict(ticket["message"])

    # Decide if we should escalate to human
    should_escalate = (
        classification.confidence < workspace.confidence_threshold
        or classification.intent in workspace.escalate_intents
    )

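    if should_escalate:
        # Record what triggered the escalation so the trace explains the decision
        trace.output({
            "action": "escalated",
            "intent": classification.intent,
            "confidence": classification.confidence
        })
        return {"action": "escalated_to_human"}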

    # Generate AI response (only if not escalated)
    response = generator.create(ticket, classification)
    trace.output({"action": "responded", "message": response})

    return {"action": "ai_response", "message": response}
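
The escalation check combines two conditions, and a trace is easier to read when it says which one fired. A minimal sketch of that idea follows. The Classification dataclass, the escalation_reason helper, and the reason strings are all hypothetical names chosen for illustration; only the two conditions themselves come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classification:
    """Hypothetical stand-in for the classifier's result object."""
    intent: str
    confidence: float


def escalation_reason(
    classification: Classification,
    confidence_threshold: float,
    escalate_intents: frozenset,
) -> Optional[str]:
    """Return why a ticket should escalate, or None if AI can handle it."""
    if classification.intent in escalate_intents:
        return "intent_requires_human"
    if classification.confidence < confidence_threshold:
        return "low_confidence"
    return None


reason = escalation_reason(
    Classification(intent="billing", confidence=0.95),
    confidence_threshold=0.7,
    escalate_intents=frozenset({"billing", "legal"}),
)
# The reason could then be added to the traced output, for example:
# trace.output({"action": "escalated", "reason": reason})
```

Keeping the decision in a pure function also means the thresholds can be tested without creating a trace.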

Multi-Label Classification

Handle documents with multiple categories:
def classify_document(document: str) -> list[str]:
    trace = artanis.trace("multi-label-classify")

    trace.input(document_length=len(document))

    # Classify into multiple labels
    predictions = multi_label_classifier.predict(document)

    # Keep only predictions whose confidence exceeds the 0.5 threshold
    above_threshold = [p for p in predictions if p.confidence > 0.5]

    trace.output({"labels": [p.label for p in above_threshold]})

    return [p.label for p in above_threshold]
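
When a label is unexpectedly missing, it helps to know which predictions fell just under the cut-off. The sketch below partitions predictions into kept and dropped sets so both can be recorded. Prediction, split_by_threshold, and the "near_misses" key are hypothetical, and the sample scores are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical stand-in for one multi-label prediction."""
    label: str
    confidence: float


def split_by_threshold(predictions, threshold=0.5):
    """Partition predictions into (kept, dropped), highest confidence first."""
    ranked = sorted(predictions, key=lambda p: p.confidence, reverse=True)
    kept = [p for p in ranked if p.confidence > threshold]
    dropped = [p for p in ranked if p.confidence <= threshold]
    return kept, dropped


kept, dropped = split_by_threshold([
    Prediction("contract", 0.91),
    Prediction("invoice", 0.48),
    Prediction("legal", 0.62),
])

# A payload like this could be passed to trace.output(...)
payload = {
    "labels": [p.label for p in kept],
    "near_misses": [(p.label, p.confidence) for p in dropped],
}
```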

Intent Classification with Fallback

Handle multiple classification strategies:
def classify_intent(message: str) -> str:
    trace = artanis.trace("intent-classification")

    trace.input(message=message)

    # Try ML classifier first
    ml_result = ml_classifier.predict(message)

    if ml_result.confidence > 0.8:
        trace.output({"intent": ml_result.intent, "method": "ml"})
        return ml_result.intent

    # Fallback to LLM if confidence is low
    llm_result = llm_classifier.predict(message, model="gpt-5.1")
    trace.output({"intent": llm_result.intent, "method": "llm"})

    return llm_result.intent
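
The same pattern generalizes to more than two strategies. The following is a sketch under stated assumptions: classify_with_fallback, IntentResult, and the two stub classifiers are hypothetical, and the stubs return fixed values so the example runs without a model. As in the example above, the last strategy's result is used even when it does not clear the confidence bar.

```python
from typing import Callable, NamedTuple


class IntentResult(NamedTuple):
    intent: str
    confidence: float


Classifier = Callable[[str], IntentResult]


def classify_with_fallback(message, strategies, min_confidence=0.8):
    """Try each (name, classifier) pair in order.

    Stops at the first result whose confidence exceeds min_confidence.
    If none does, the last strategy's result is returned.
    """
    if not strategies:
        raise ValueError("at least one strategy is required")
    attempts = []
    for name, classify in strategies:
        result = classify(message)
        attempts.append({"method": name, "confidence": result.confidence})
        if result.confidence > min_confidence:
            break
    return result.intent, name, attempts


def ml_stub(message: str) -> IntentResult:
    # Fixed low-confidence answer, standing in for a fast ML model.
    return IntentResult("unknown", 0.3)


def llm_stub(message: str) -> IntentResult:
    # Fixed answer, standing in for an LLM call.
    return IntentResult("refund_request", 0.6)


intent, method, attempts = classify_with_fallback(
    "I want my money back",
    [("ml", ml_stub), ("llm", llm_stub)],
)
# attempts lists every strategy tried, which could be included in trace.output(...)
```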

What to Trace

Track the settings used for classification:
trace.state("config", {
    "categories": workspace.categories,
    "confidence_threshold": 0.7,
    "escalate_intents": ["billing", "legal"],
    "model": "gpt-5.1"
})
Record the classification output with confidence:
trace.output({
    "category": result.category,
    "confidence": result.confidence,
    "all_scores": result.all_scores
})
Track which confidence tier led to each escalation decision:
# High confidence: AI handles it
if classification.confidence > 0.9:
    trace.output({"action": "ai_response", "confidence": "high"})
# Medium confidence: need review
elif classification.confidence > 0.5:
    trace.output({"action": "flagged_for_review", "confidence": "medium"})
# Low confidence: escalate immediately
else:
    trace.output({"action": "escalated", "confidence": "low"})
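
The tiering can also be factored into a pure function, so the cut-offs are testable on their own. This is a sketch: confidence_tier is a hypothetical helper, while the 0.9 and 0.5 cut-offs and the payload shapes are taken from the snippet above. Note that both comparisons are strict, so a confidence of exactly 0.9 lands in the medium tier.

```python
def confidence_tier(confidence: float, high: float = 0.9, medium: float = 0.5) -> dict:
    """Map a confidence score to the action payload used for tracing."""
    if confidence > high:
        return {"action": "ai_response", "confidence": "high"}
    if confidence > medium:
        return {"action": "flagged_for_review", "confidence": "medium"}
    return {"action": "escalated", "confidence": "low"}


tier = confidence_tier(0.72)
# The result could be passed straight through, e.g. trace.output(tier)
```
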
Capture which path was taken and why:
trace.output({
    "route": selected_route,
    "reason": routing_reason,
    "alternatives": other_routes
})
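
One way to produce those three fields is to rank candidate routes by score and report everything except the winner as alternatives. The sketch below assumes routes are scored in a plain dict; select_route, the route names, and the scores are hypothetical.

```python
def select_route(scores: dict) -> dict:
    """Pick the highest-scoring route and list the rest as alternatives."""
    if not scores:
        raise ValueError("no candidate routes")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best, best_score = ranked[0]
    return {
        "route": best,
        "reason": f"highest score ({best_score:.2f})",
        "alternatives": [name for name, _ in ranked[1:]],
    }


decision = select_route({"billing_team": 0.72, "tech_support": 0.21, "sales": 0.07})
# decision has the same keys as the payload above and could be passed to trace.output(...)
```
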

Debugging Classification with Traces

Use traces to debug common classification issues:
| Issue | What to Check in Trace |
|---|---|
| Wrong category | Check confidence: is it too low? |
| Inconsistent routing | Check config: did thresholds change? |
| Too many escalations | Check confidence_threshold: is it too high? |
| Missing classifications | Check all_scores: what were the alternatives? |
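
For the "too many escalations" case, one approach is to replay recorded confidence values against a few candidate thresholds and compare escalation rates. The sketch below assumes the confidences have already been collected from traces into a list; how they are exported is outside its scope. The escalation_rate helper is hypothetical and the values are illustrative, not real data.

```python
def escalation_rate(confidences, threshold):
    """Fraction of tickets whose confidence falls below the threshold."""
    if not confidences:
        return 0.0
    return sum(c < threshold for c in confidences) / len(confidences)


# Illustrative values only, standing in for confidences read from traces
recorded = [0.95, 0.82, 0.64, 0.58, 0.91, 0.40, 0.77, 0.69]

rates = {t: escalation_rate(recorded, t) for t in (0.5, 0.7, 0.9)}
```

A steep jump in rate between two adjacent thresholds suggests that many tickets sit in that confidence band, which is where a threshold change has the most effect.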

Next Steps