Last Updated: March 11, 2025

By using our integrated custom LLM prompts, you can identify hate speech in your transcriptions, along with a confidence rating and a category. To use this feature, you just need to add the custom_prompt input parameter to your transcription request.

Custom input parameters

In this example, we’ll use a Python script to transcribe an audio file, passing the custom_prompt input parameter to identify any hate speech. We’ll use the following input parameters:
payload = {"input": {
        "custom_prompt": "Detect whether there is any hate speech in the content. respond only with a confidence rating, where low confidence means you are unsure, and a high number means you are sure, and category in the following format: 'confidence\': x.xx, \'category\': \'x\'",
        "url": audio_file_url,
        "language_code": "en",
        "return_as_file": False
    }}

Python script

We’ll use these parameters in the following Python script:
import requests
import time

audio_file_url = "url/to/audio/file.mp3" # replace with your audio file URL

salad_api_key = "YOUR_SALAD_API_KEY" # replace with your Salad API key
organization_name = "YOUR_ORGANIZATION_NAME" # replace with your organization name

payload = {"input": {
        "custom_prompt": "Detect whether there is any hate speech in the content. respond only with a confidence rating, where low confidence means you are unsure, and a high number means you are sure, and category in the following format: 'confidence\': x.xx, \'category\': \'x\'",
        "url": audio_file_url,
        "language_code": "en",
        "return_as_file": False
    }}
headers = {
    "Salad-Api-Key": salad_api_key,
    "Content-Type": "application/json"
}


url = f"https://api.salad.com/api/public/organizations/{organization_name}/inference-endpoints/transcribe/jobs"

# Submit the transcription job
response = requests.post(url, json=payload, headers=headers)

response = response.json()
job_id = response["id"]
print(f"Job ID: {job_id}")

# Poll the job every 5 seconds until it completes or fails
while True:
    time.sleep(5)
    try:
        result = requests.get(f"{url}/{job_id}", headers=headers)
        result = result.json()
        if result["status"] in ("created", "pending", "started", "running"):
            print(f'Current job status is {result["status"]}')
        elif result["status"] == "failed":
            print("Job failed")
            break
        elif result["output"]:
            # The LLM's answer to the custom prompt is returned in llm_result
            print(result["output"]["llm_result"])
            break
        else:
            print(f'Current job status is {result["status"]}')
    except Exception as e:
        print(f"Error retrieving transcription result: {e}")
        break
In this script, we’re telling the LLM to identify any hate speech in the transcription content and categorize it. The LLM's answer is returned in the llm_result field of the response, and we print that field to the console. Once the job finishes, the output will look something like this (a parsing sketch follows the sample output):

Output

Job ID: abcdef6c-1234-41e8-b81b-ecc1109033ed
Current job status is running
Current job status is running
Current job status is running
Current job status is running
Current job status is running
Current job status is running
Current job status is running
Current job status is running
Current job status is running
"confidence": 0.90, "category": "Hate speech targeting physical characteristics and self-description"