Getting Started with GuardrailsOrchestrator
GuardrailsOrchestrator is a service for large language model guardrailing, underpinned by the open-source project fms-guardrails-orchestrator. It is a component of the TrustyAI Kubernetes Operator. In this tutorial, you will learn how to create a GuardrailsOrchestrator CR to perform detections on text generation input and output.
GuardrailsOrchestrator is available in RHOAI 2.19+ via KServe Raw Deployment mode. To use it on Open Data Hub or OpenShift AI, first enable KServe Raw Deployment: in the DSCInitialization resource, set the value of managementState for the serviceMesh component to Removed.
---
serviceMesh:
  auth:
    audiences:
      - 'https://kubernetes.default.svc'
  controlPlane:
    metricsCollection: Istio
    name: data-science-smcp
    namespace: istio-system
  managementState: Removed
---
Next, in the DataScienceCluster resource, under the spec.components section, set the value of kserve.serving.managementState to Removed.
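For reference, the relevant portion of the DataScienceCluster resource then looks like the following minimal sketch (only the kserve.serving fields from this step are shown; the rest of your DataScienceCluster spec stays unchanged):
---
spec:
  components:
    kserve:
      serving:
        managementState: Removed
---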
The GuardrailsOrchestrator Service
The GuardrailsOrchestrator service defines a new Custom Resource Definition named GuardrailsOrchestrator
. GuardrailsOrchestrator
objects are monitored by the TrustyAI Kubernetes operator. A GuardrailsOrchestrator object represents an orchestration service that invokes detectors on text generation input/output and standalone detections.
Here is a minimal example of a GuardrailsOrchestrator
object:
---
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: GuardrailsOrchestrator
metadata:
  name: gorch-sample
spec:
  orchestratorConfig: "fms-orchestr8-config-nlp" (1)
  replicas: 1 (2)
---
1 | The orchestratorConfig field specifies a ConfigMap object that contains generator, detector, and chunker arguments. |
2 | The replicas field specifies the number of replicas for the orchestrator. |
Here is a minimal example of an orchestratorConfig object:
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: fms-orchestr8-config-nlp
data:
  config.yaml: |
    generation: (1)
      service:
        hostname: llm-predictor.guardrails-test.svc.cluster.local
        port: 8033
    detectors: (2)
      hap: (2.1)
        service:
          hostname: http://detector-host/api/v1/text/contents
          port: 8000
        chunker_id: whole_doc_chunker
        default_threshold: 0.5
---
1 | The generation field specifies the hostname and port of the Large Language Model (LLM) predictor service. |
2 | The detectors field specifies the name, hostname, and port of the detector service, the chunker ID, and the default threshold. (2.1) The name of the detector. In this example, we specify a Hateful and Profane (HAP) detector. |
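If you want to sanity-check a detector before wiring it into the orchestrator, you can call its detections API directly. The snippet below is a sketch that assumes the HAP detector above follows the standard detector contents API; the detector-host hostname and the example text are placeholders:
---
curl -X 'POST' \
  "http://detector-host/api/v1/text/contents" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": ["You dotard, I really hate this stuff"]
  }'
---
A successful response contains one list of detection results per entry in contents, each with the same detection, score, and span fields that the orchestrator later aggregates.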
After you apply the example orchestratorConfig
ConfigMap and GuardrailsOrchestrator
CR above, you can guardrail against your LLM inputs and outputs:
Verify the orchestrator pod is up and running:
---
oc get pods -n <TEST_NAMESPACE> | grep gorch-sample
---
The expected output is:
---
gorch-sample-6776b64c58-xrxq9 3/3 Running 0 4h19m
---
Retrieve the external HTTP route for the orchestrator:
---
GORCH_ROUTE_HTTP=$(oc get routes gorch-sample-http -o jsonpath='{.spec.host}' -n <TEST_NAMESPACE>)
---
Send a request to the v2/chat/completions-detection endpoint, specifying detections against HAP content in input text and generated outputs.
---
curl -X 'POST' \
  "https://$GORCH_ROUTE_HTTP/api/v2/chat/completions-detection" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llm",
    "messages": [
      {
        "content": "You dotard, I really hate this stuff",
        "role": "user"
      }
    ],
    "detectors": {
      "input": {
        "hap": {}
      },
      "output": {
        "hap": {}
      }
    }
  }'
---
Example output with HAP content detected:
---
{
  "id": "086980692dc1431f9c32cd56ba607067",
  "object": "",
  "created": 1743084024,
  "model": "llm",
  "choices": [],
  "usage": {"prompt_tokens": 0, "total_tokens": 0, "completion_tokens": 0},
  "detections": {
    "input": [{
      "message_index": 0,
      "results": [{
        "start": 0,
        "end": 36,
        "text": "<explicit_text>, I really hate this stuff",
        "detection": "sequence_classifier",
        "detection_type": "sequence_classification",
        "detector_id": "hap",
        "score": 0.9634239077568054
      }]
    }]
  },
  "warnings": [{
    "type": "UNSUITABLE_INPUT",
    "message": "Unsuitable input detected. Please check the detected entities on your input and try again with the unsuitable input removed."
  }]
}
---
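The orchestrator also supports the standalone detections mentioned earlier, which run detectors against raw text without invoking the LLM. As a sketch, assuming the same hap detector and route from this tutorial, a standalone request to the api/v2/text/detection/content endpoint looks like:
---
curl -X 'POST' \
  "https://$GORCH_ROUTE_HTTP/api/v2/text/detection/content" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "detectors": {
      "hap": {}
    },
    "content": "You dotard, I really hate this stuff"
  }'
---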
Details of GuardrailsOrchestrator
In this section, let’s review all the possible parameters for the GuardrailsOrchestrator
object and their usage.
Parameter | Description |
---|---|
replicas | The number of orchestrator pods to spin up |
orchestratorConfig | The name of the ConfigMap object that contains generator, detector, and chunker arguments |
enableBuiltInDetectors | Boolean value to inject the regex detector sidecar container into the orchestrator pod. The regex detector is a lightweight HTTP server designed to parse text using predefined patterns or custom regular expressions. |
enableGuardrailsGateway | Boolean value to enable controlled interaction with the orchestrator service by enforcing stricter access to its exposed endpoints. It provides a mechanism for configuring fixed detector pipelines, and then provides a unique /v1/chat/completions endpoint per configured detector pipeline. |
guardrailsGatewayConfig | The name of the ConfigMap object that specifies gateway configurations |
otelExporter | List of paired name and value arguments for configuring OpenTelemetry traces and/or metrics |
Configuring the Regex Detector and Guardrails Gateway
The regex detector and guardrails gateway are two sidecar images that can be used with the GuardrailsOrchestrator service, either individually or together. They are enabled via the GuardrailsOrchestrator CR.
Here is an example of a GuardrailsOrchestrator CR that references the regex detector and guardrails gateway images:
---
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: GuardrailsOrchestrator
metadata:
  name: gorch-sample
spec:
  orchestratorConfig: "fms-orchestr8-config-nlp"
  enableBuiltInDetectors: True (1)
  enableGuardrailsGateway: True (2)
  guardrailsGatewayConfig: "fms-orchestr8-config-gateway" (3)
  replicas: 1
---
1 | The enableBuiltInDetectors field, if set to True, injects the regex detector as a sidecar container into the orchestrator pod |
2 | The enableGuardrailsGateway field, if set to True, injects the guardrails gateway as a sidecar container into the orchestrator pod |
3 | The guardrailsGatewayConfig field specifies a ConfigMap that reroutes the orchestrator and regex detector routes to specific paths |
Here is an example orchestratorConfig named fms-orchestr8-config-nlp. Note that it differs from the previous example: it uses chat_generation instead of generation, and it registers the built-in regex detector, which runs as a sidecar on localhost.
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: fms-orchestr8-config-nlp
data:
  config.yaml: |
    chat_generation:
      service:
        hostname: llm-predictor.guardrails-test.svc.cluster.local
        port: 8032
    detectors:
      regex:
        type: text_contents
        service:
          hostname: "127.0.0.1"
          port: 8080
        chunker_id: whole_doc_chunker
        default_threshold: 0.5
---
Here is an example of a guardrailsGatewayConfig named fms-orchestr8-config-gateway
:
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: fms-orchestr8-config-gateway
  labels:
    app: fmstack-nlp
data:
  config.yaml: |
    orchestrator:
      host: "localhost"
      port: 8032
    detectors:
      - name: regex
        input: true
        output: true
        detector_params:
          regex:
            - email
            - ssn
      - name: other_detector
    routes:
      - name: pii
        detectors:
          - regex
      - name: passthrough
        detectors:
---
Let’s review all the required arguments for the guardrailsGatewayConfig:
Parameter | Description |
---|---|
orchestrator | The orchestrator service |
detectors | A list of preconfigured regexes for common detection actions |
routes | The resulting endpoints for detections |
Send a request to the /v1/chat/completions endpoint, specifying detections against PII content in input text and generated outputs.
---
curl "https://$GORCH_ROUTE_HTTP/pii/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-1.5B-Instruct",
    "messages": [
      {
        "role": "user",
        "content": "say hello to me at someemail@somedomain.com"
      },
      {
        "role": "user",
        "content": "btw here is my social 123456789"
      }
    ]
  }'
---
Example output with PII content detected:
---
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "audio": null,
        "content": "I'm sorry, I'm afraid I can't do that.",
        "refusal": null,
        "role": "assistant",
        "tool_calls": null
      }
    }
  ],
  "created": 1741182848,
  "detections": {
    "input": null,
    "output": [
      {
        "choice_index": 0,
        "results": [
          {
            "detection": "EmailAddress",
            "detection_type": "pii",
            "detector_id": "regex-language",
            "end": 176,
            "score": 1.0,
            "start": 152,
            "text": "someemail@somedomain.com"
          }
        ]
      }
    ]
  },
  "id": "16a0abbf4b0c431e885be5cfa4ff1c4b",
  "model": "Qwen/Qwen2.5-1.5B-Instruct",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 83,
    "prompt_tokens": 61,
    "total_tokens": 144
  },
  "warnings": [
    {
      "message": "Unsuitable output detected.",
      "type": "UNSUITABLE_OUTPUT"
    }
  ]
}
---
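The gateway exposes one endpoint per route defined in the guardrailsGatewayConfig, with the route name as the path prefix. For example, the passthrough route defined above has no detectors attached, so requests sent to it reach the model with no detections applied:
---
curl "https://$GORCH_ROUTE_HTTP/passthrough/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-1.5B-Instruct",
    "messages": [
      {
        "role": "user",
        "content": "say hello to me at someemail@somedomain.com"
      }
    ]
  }'
---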
Configuring the OpenTelemetry Exporter for Metrics & Tracing
Traces and metrics are provided for the observability of the GuardrailsOrchestrator service via the OpenTelemetry Operator.
Pre-requisites:
- Install the Red Hat OpenShift distributed tracing platform from the OperatorHub. Create a Jaeger instance using the default settings.
- Install the Red Hat build of OpenTelemetry from the OperatorHub. Create an OpenTelemetry instance.
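For the second pre-requisite, a minimal OpenTelemetry instance can look like the following sketch. This is an illustrative assumption, not a prescribed configuration: an OpenTelemetryCollector that receives OTLP over HTTP on port 4318 (matching the otlpEndpoint used below) and writes telemetry to the collector log via the debug exporter.
---
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel
spec:
  config:
    receivers:
      otlp:
        protocols:
          http: {}
    exporters:
      debug: {}
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          exporters: [debug]
        traces:
          receivers: [otlp]
          exporters: [debug]
---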
Here is a minimal example of a GuardrailsOrchestrator
object that has OpenTelemetry configured:
---
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: GuardrailsOrchestrator
metadata:
  name: gorch-test
spec:
  orchestratorConfig: "fms-orchestr8-config-nlp"
  replicas: 1
  otelExporter:
    protocol: "http"
    otlpEndpoint: "localhost:4318"
    otlpExport: "metrics"
---