Using trustyai_fms with Llama Stack
Overview
trustyai_fms is an out-of-tree remote safety provider for Llama Stack that brings comprehensive AI safety and content filtering capabilities to your GenAI applications. This implementation combines the FMS Guardrails Orchestrator with a suite of community-developed detectors to provide robust content filtering and safety monitoring.
The following detectors can be used with trustyai_fms:
- Regex Detectors: pattern-based content detection for structured safety rules
- Hugging Face Content Detectors: content detection compatible with most Hugging Face AutoModelForSequenceClassification models, such as granite-guardian-hap-38m or deberta-v3-base-prompt-injection-v2
- vLLM Detector Adapter: content detection compatible with Hugging Face AutoModelForCausalLM models such as ibm-granite/granite-guardian-3.1-2b
Prerequisites
- Local environment:
  - Python >=3.12
- AI Platform:
  - OpenDataHub 2.29 (or Red Hat OpenShift AI 2.20.0 provided by Red Hat, Inc.):
    - in the DataScienceInitialization resource, set the value of managementState for the serviceMesh component to Removed
    - in the default-dsc, ensure:
      - trustyai managementState is set to Managed
      - kserve is set to:

        kserve:
          defaultDeploymentMode: RawDeployment
          managementState: Managed
          nim:
            managementState: Managed
          rawDeploymentServiceConfig: Headless
          serving:
            ingressGateway:
              certificate:
                type: OpenshiftDefaultIngress
            managementState: Removed
            name: knative-serving
- Model Serving:
  - Red Hat OpenShift Service Mesh 2 (2.6.7-0 provided by Red Hat, Inc.)
  - Red Hat OpenShift Serverless (1.35.1 provided by Red Hat)
- Authentication:
  - Red Hat - Authorino Operator (1.2.1 provided by Red Hat)
Minimal example
This example shows how to use the built-in detectors of the GuardrailsOrchestrator as Llama Stack safety guardrails (see the Getting Started with GuardrailsOrchestrator tutorial for setup details). For simplicity, we will use a local Llama Stack server running in a virtual environment.
Step 0: Set up the environment
Ensure you have Python 3.12 or higher installed, then create and activate a virtual environment:
python -m venv .venv && source .venv/bin/activate
Install the required packages:
pip install llama-stack==0.2.11 llama-stack-provider-trustyai-fms==0.1.3
Step 1: Create a new OpenShift project
PROJECT_NAME="lls-minimal-example" && oc new-project $PROJECT_NAME
Step 2: Deploy the orchestrator with built-in regex detectors
Deploy the GuardrailsOrchestrator with configuration for regex-based PII detection:
cat <<EOF | oc apply -f -
kind: ConfigMap
apiVersion: v1
metadata:
name: fms-orchestr8-config-nlp
data:
config.yaml: |
detectors:
regex:
type: text_contents
service:
hostname: "127.0.0.1"
port: 8080
chunker_id: whole_doc_chunker
default_threshold: 0.5
---
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: GuardrailsOrchestrator
metadata:
name: guardrails-orchestrator
spec:
orchestratorConfig: "fms-orchestr8-config-nlp"
enableBuiltInDetectors: true
enableGuardrailsGateway: false
replicas: 1
EOF
Step 3: Create Llama Stack configuration
Create the directory structure for the Llama Stack distribution:
mkdir -p lls-distribution/providers.d/remote/safety
Create the main configuration file:
cat <<EOF > lls-distribution/run.yaml
version: '2'
image_name: minimal_example
apis:
- safety
- shields
providers:
safety:
- provider_id: trustyai_fms
provider_type: remote::trustyai_fms
config:
orchestrator_url: \${env.FMS_ORCHESTRATOR_URL}
shields: {}
shields: []
server:
port: 8321
tls_certfile: null
tls_keyfile: null
external_providers_dir: lls-distribution/providers.d
EOF
Create the provider configuration:
cat <<EOF > lls-distribution/providers.d/remote/safety/trustyai_fms.yaml
adapter:
adapter_type: trustyai_fms
pip_packages: ["llama_stack_provider_trustyai_fms"]
config_class: llama_stack_provider_trustyai_fms.config.FMSSafetyProviderConfig
module: llama_stack_provider_trustyai_fms
api_dependencies: ["safety"]
optional_api_dependencies: ["shields"]
EOF
Step 4: Start the local Llama Stack server
Set the orchestrator URL and start the server:
export FMS_ORCHESTRATOR_URL="https://$(oc get routes guardrails-orchestrator -o jsonpath='{.spec.host}')"
python -m llama_stack.distribution.server.server --config lls-distribution/run.yaml --port 8321
Step 5: Register the built-in regex detectors
Ensure your server is running, then in a separate terminal dynamically register a shield that uses regex patterns to detect personally identifiable information (PII):
curl -X 'POST' \
'http://localhost:8321/v1/shields' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"shield_id": "regex_detector",
"provider_shield_id": "regex_detector",
"provider_id": "trustyai_fms",
"params": {
"type": "content",
"confidence_threshold": 0.5,
"message_types": ["system", "user"],
"detectors": {
"regex": {
"detector_params": {
"regex": ["email", "ssn", "credit-card"]
}
}
}
}
}'
Step 6: Inspect the registered shield
curl -s http://localhost:8321/v1/shields | jq '.'
You should see the following output, indicating that the shield has been registered successfully:
{
"data": [
{
"identifier": "regex_detector",
"provider_resource_id": "regex_detector",
"provider_id": "trustyai_fms",
"type": "shield",
"params": {
"type": "content",
"confidence_threshold": 0.5,
"message_types": [
"system",
"user"
],
"detectors": {
"regex": {
"detector_params": {
"regex": [
"email",
"ssn",
"credit-card"
]
}
}
}
}
}
]
}
Step 7: Test the shield with some sample messages
Test email detection
curl -X POST http://localhost:8321/v1/safety/run-shield \
-H "Content-Type: application/json" \
-d '{
"shield_id": "regex_detector",
"messages": [
{
"content": "My email is test@example.com",
"role": "user"
}
]
}' | jq '.'
This should return a response indicating that the email was detected:
{
"violation": {
"violation_level": "error",
"user_message": "Content violation detected by shield regex_detector (confidence: 1.00, 1/1 processed messages violated)",
"metadata": {
"status": "violation",
"shield_id": "regex_detector",
"confidence_threshold": 0.5,
"summary": {
"total_messages": 1,
"processed_messages": 1,
"skipped_messages": 0,
"messages_with_violations": 1,
"messages_passed": 0,
"message_fail_rate": 1.0,
"message_pass_rate": 0.0,
"total_detections": 1,
"detector_breakdown": {
"active_detectors": 1,
"total_checks_performed": 1,
"total_violations_found": 1,
"violations_per_message": 1.0
}
},
"results": [
{
"message_index": 0,
"text": "My email is test@example.com",
"status": "violation",
"score": 1.0,
"detection_type": "pii",
"individual_detector_results": [
{
"detector_id": "regex",
"status": "violation",
"score": 1.0,
"detection_type": "pii"
}
]
}
]
}
}
}
Test SSN detection
curl -X POST http://localhost:8321/v1/safety/run-shield \
-H "Content-Type: application/json" \
-d '{
"shield_id": "regex_detector",
"messages": [
{
"content": "My SSN is 123-45-6789",
"role": "user"
}
]
}' | jq '.'
This should return a response indicating that the SSN was detected:
{
"violation": {
"violation_level": "error",
"user_message": "Content violation detected by shield regex_detector (confidence: 1.00, 1/1 processed messages violated)",
"metadata": {
"status": "violation",
"shield_id": "regex_detector",
"confidence_threshold": 0.5,
"summary": {
"total_messages": 1,
"processed_messages": 1,
"skipped_messages": 0,
"messages_with_violations": 1,
"messages_passed": 0,
"message_fail_rate": 1.0,
"message_pass_rate": 0.0,
"total_detections": 1,
"detector_breakdown": {
"active_detectors": 1,
"total_checks_performed": 1,
"total_violations_found": 1,
"violations_per_message": 1.0
}
},
"results": [
{
"message_index": 0,
"text": "My SSN is 123-45-6789",
"status": "violation",
"score": 1.0,
"detection_type": "pii",
"individual_detector_results": [
{
"detector_id": "regex",
"status": "violation",
"score": 1.0,
"detection_type": "pii"
}
]
}
]
}
}
}
Test credit card detection
curl -X POST http://localhost:8321/v1/safety/run-shield \
-H "Content-Type: application/json" \
-d '{
"shield_id": "regex_detector",
"messages": [
{
"content": "My credit card number is 4111-1111-1111-1111",
"role": "user"
}
]
}' | jq '.'
This should return a response indicating that the credit card number was detected:
{
"violation": {
"violation_level": "error",
"user_message": "Content violation detected by shield regex_detector (confidence: 1.00, 1/1 processed messages violated)",
"metadata": {
"status": "violation",
"shield_id": "regex_detector",
"confidence_threshold": 0.5,
"summary": {
"total_messages": 1,
"processed_messages": 1,
"skipped_messages": 0,
"messages_with_violations": 1,
"messages_passed": 0,
"message_fail_rate": 1.0,
"message_pass_rate": 0.0,
"total_detections": 1,
"detector_breakdown": {
"active_detectors": 1,
"total_checks_performed": 1,
"total_violations_found": 1,
"violations_per_message": 1.0
}
},
"results": [
{
"message_index": 0,
"text": "My credit card number is 4111-1111-1111-1111",
"status": "violation",
"score": 1.0,
"detection_type": "pii",
"individual_detector_results": [
{
"detector_id": "regex",
"status": "violation",
"score": 1.0,
"detection_type": "pii"
}
]
}
]
}
}
}
Key takeaways
It is possible to register shields dynamically using the /v1/shields endpoint once the server is running.
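The same registration can also be scripted. The sketch below is a minimal Python equivalent of the curl call in Step 5, using only the standard library; it assumes a server on http://localhost:8321 (as started in Step 4), and the helper names (build_shield_payload, register_shield) are illustrative, not part of the provider API.

```python
# Sketch: registering the built-in regex shield from Python instead of curl.
# The payload mirrors the Step 5 request body exactly.
import json
import urllib.request

def build_shield_payload(shield_id: str, patterns: list[str]) -> dict:
    """Build a /v1/shields registration body for the built-in regex detector."""
    return {
        "shield_id": shield_id,
        "provider_shield_id": shield_id,
        "provider_id": "trustyai_fms",
        "params": {
            "type": "content",
            "confidence_threshold": 0.5,
            "message_types": ["system", "user"],
            "detectors": {"regex": {"detector_params": {"regex": patterns}}},
        },
    }

def register_shield(base_url: str, payload: dict) -> int:
    """POST the registration body to a running Llama Stack server."""
    req = urllib.request.Request(
        f"{base_url}/v1/shields",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    payload = build_shield_payload("regex_detector", ["email", "ssn", "credit-card"])
    register_shield("http://localhost:8321", payload)
```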
Shield Registration
POST /v1/shields
Registers a content shield for text analysis and violation detection.
Request Body Schema
{
"shield_id": "string",
"provider_shield_id": "string",
"provider_id": "trustyai_fms",
"params": {
"type": "content",
"confidence_threshold": 0.5,
"message_types": ["user", "system", "completion"],
"detectors": {
"detector_name": {
"detector_params": {
"param_key": "param_value"
}
}
}
}
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| shield_id | string | Yes | Unique identifier for the shield |
| provider_shield_id | string | Yes | Internal provider identifier (typically same as shield_id) |
| provider_id | string | Yes | Must be trustyai_fms |
| params.type | string | Yes | Must be content |
| params.confidence_threshold | number | Optional | Threshold 0.0-1.0 to trigger violations |
| params.message_types | array | Yes | Message roles to analyze: user, system, completion |
| params.detectors | object | Yes | Map of detector configurations |
Shield Execution
POST /v1/safety/run-shield
Executes a registered shield against messages.
Request Body Schema
{
"shield_id": "string",
"messages": [
{
"content": "string",
"role": "user|system|completion|tool"
}
]
}
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| shield_id | string | Yes | ID of registered shield to execute |
| messages | array | Yes | Messages to analyze |
| messages[].content | string | Yes | Text content to check |
| messages[].role | string | Yes | Message role (must match shield's message_types) |
Response Types
No Violation (Content Passed):
{
"violation": {
"violation_level": "info",
"user_message": "Content verified by shield shield_name (N messages processed)",
"metadata": {
"status": "pass",
"shield_id": "shield_name",
"confidence_threshold": 0.5,
"summary": { ... },
"results": [ ... ]
}
}
}
Violation Detected:
{
"violation": {
"violation_level": "error",
"user_message": "Content violation detected by shield shield_name (confidence: X.XX, N/N processed messages violated)",
"metadata": {
"status": "violation",
"shield_id": "shield_name",
"confidence_threshold": 0.5,
"summary": { ... },
"results": [ ... ]
}
}
}
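In application code, the violation_level field distinguishes the two response shapes above: "info" means the content passed, "error" means at least one message violated the shield. A minimal sketch of interpreting a run-shield response (the helper names are illustrative, not part of any client library):

```python
# Sketch: interpreting a /v1/safety/run-shield response, based on the
# "No Violation" and "Violation Detected" shapes documented above.
def shield_allows(response: dict) -> bool:
    """Return True if the shield passed the content, False on a violation."""
    violation = response.get("violation") or {}
    return violation.get("violation_level") != "error"

def violating_messages(response: dict) -> list[int]:
    """Indices of flagged messages, read from metadata.results."""
    results = response.get("violation", {}).get("metadata", {}).get("results", [])
    return [r["message_index"] for r in results if r.get("status") == "violation"]
```

For example, the email-detection response in Step 7 would yield shield_allows(...) == False and violating_messages(...) == [0].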