Saliency explanations with KServe
This tutorial walks you through setting up and using TrustyAI to provide saliency explanations for model predictions via KServe's built-in explainer capabilities. We will deploy a model, configure the environment, and demonstrate how to obtain predictions and their explanations.
Prerequisites
- An operational Kubernetes cluster.
- KServe 0.11 installed on the cluster. [1]
- The kubectl command-line tool installed.
Setup
Install InferenceService
Create a new namespace for your explainer tests. We will refer to this namespace as $NAMESPACE throughout the tutorial. We will also assume that the namespace into which KServe was deployed is kserve.
export NAMESPACE="trustyai-test"
export KSERVE_NAMESPACE="kserve"
We first verify that KServe is installed and running by checking the status of its operator.
kubectl get pods -n $KSERVE_NAMESPACE -l control-plane=kserve-controller-manager
We can now proceed to create the model deployment namespace with
kubectl create namespace $NAMESPACE
The InferenceService to deploy will have a new key (explainer), and will be similar to the one in the following example.
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "explainer-test-lime"
spec:
  predictor: (1)
    model:
      modelFormat:
        name: sklearn
      protocolVersion: v2
      runtime: kserve-sklearnserver
      storageUri: https://github.com/trustyai-explainability/model-collection/raw/bank-churn/model.joblib (2)
  explainer: (3)
    containers:
      - name: explainer
        image: quay.io/trustyai/trustyai-kserve-explainer:latest (4)
1 | The predictor field is the same as you would use for a regular InferenceService. |
2 | In this case we are using a scikit-learn model, specified by its storage URI, but you can use any other model format supported by KServe. |
3 | The explainer field is a new key that specifies the explainer to use. In this case, the LIME explainer is used by default. |
4 | The image of the KServe TrustyAI explainer must be specified in the image field of the explainer container. |
Save the manifest above as inference-service-explainer-lime.yaml and deploy it with
kubectl apply -f inference-service-explainer-lime.yaml -n $NAMESPACE
Then wait for the InferenceService to be ready:
kubectl get pods -n $NAMESPACE
NAME READY STATUS RESTARTS AGE
explainer-test-lime-explainer-00001-deployment-c6fff8b4-5x4qg 2/2 Running 0 41m
explainer-test-lime-predictor-00001-deployment-dfd47bb47-lwl5b 2/2 Running 0 41m
You will see that in addition to the predictor pod, there is also an explainer pod running. These two pods are responsible for the prediction and the explanation, respectively.
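If you prefer not to poll the pods, you can also wait on the InferenceService directly. This is a sketch assuming the default Ready condition that KServe exposes on the resource; the timeout value is arbitrary:
kubectl wait --for=condition=Ready inferenceservice/explainer-test-lime -n $NAMESPACE --timeout=5m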
Making predictions
Predictions are performed in the same way as with regular InferenceService deployments. We can use curl to send a request to the model.
How to expose the InferenceService is left to the user's discretion, as it depends on the Kubernetes cluster configuration; some options are described in the KServe documentation. For simplicity, the rest of this tutorial uses port-forwarding and assumes local access.
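For example, if your cluster uses the default Istio ingress for KServe, you could forward the ingress gateway to localhost. The service name and namespace below are the Istio defaults and may differ on your cluster:
kubectl port-forward -n istio-system svc/istio-ingressgateway 8080:80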
We will use the following payload to make a prediction; we know from the test data that it will produce a positive prediction.
{
"instances": [
[
404,
1,
1,
20,
1,
144481.56,
1,
56482.48,
1,
372,
0,
0,
1,
2
]
]
}
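The curl commands below read this payload from a local file named payload.json. One way to create it is a compact, single-line equivalent of the payload above:
cat <<'EOF' > payload.json
{"instances": [[404, 1, 1, 20, 1, 144481.56, 1, 56482.48, 1, 372, 0, 0, 1, 2]]}
EOF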
To request a prediction, we can use the following command:
curl -s -H "Host: explainer-test-lime.${NAMESPACE}.example.com" \
-H "Content-Type: application/json" \
"http://localhost:8080/v1/models/explainer-test-lime:predict" \
-d @payload.json
and the result should be:
{"predictions":[1]}
Requesting explanations
To request an explanation, we can use a very similar command and payload, simply replacing predict with explain in the URL.
curl -s -H "Host: explainer-test-lime.${NAMESPACE}.example.com" \
-H "Content-Type: application/json" \
"http://localhost:8080/v1/models/explainer-test-lime:explain" \ (1)
-d @payload.json
1 | The verb predict is replaced with explain. |
This produces a saliency map similar to:
{
"timestamp": "2024-05-06T21:42:45.307+00:00",
"type": "explanation",
"saliencies": {
"outputs-0": [
{
"name": "inputs-12",
"score": 0.8496797810357467,
"confidence": 0
},
{
"name": "inputs-5",
"score": 0.6830766647546147,
"confidence": 0
},
{
"name": "inputs-7",
"score": 0.6768475400887952,
"confidence": 0
},
{
"name": "inputs-9",
"score": 0.018349706373627164,
"confidence": 0
},
{
"name": "inputs-3",
"score": 0.10709513039521452,
"confidence": 0
},
{
"name": "inputs-11",
"score": 0,
"confidence": 0
}
]
}
}
From the above saliency map, we can see that the most important feature is inputs-12.
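As a sketch of how such a response might be post-processed (the field names match the example above; the explanation.json filename and the script itself are illustrative, not part of any published TrustyAI client API), the following Python snippet ranks the features of each output by the magnitude of their saliency score:
import json

# Load the explanation returned by the :explain endpoint, e.g. saved with:
#   curl ... :explain -d @payload.json > explanation.json
with open("explanation.json") as f:
    explanation = json.load(f)

# For each output, sort features by absolute saliency score, largest first.
for output_name, saliencies in explanation["saliencies"].items():
    ranked = sorted(saliencies, key=lambda s: abs(s["score"]), reverse=True)
    print(f"Feature importance for {output_name}:")
    for s in ranked:
        print(f"  {s['name']}: {s['score']:+.4f}")
Run against the example response, this prints inputs-12 first, matching the observation above.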