Reasonings

While basic ratings provide a binary evaluation of each criterion, reasonings explain in detail why a response meets or fails to meet it. Reasonings give you:

- Detailed Analysis: Instead of just true/false, you get a textual explanation of how the response meets or fails each criterion
- Actionable Feedback: Clear explanations help identify specific areas for improvement
- Transparency: Makes the evaluation process more transparent and helps build trust in the rating system

The following example shows how to get detailed reasonings with elluminate:

from dotenv import load_dotenv
from elluminate import Client
from elluminate.schemas import RatingMode

load_dotenv(override=True)

client = Client()

prompt_template, _ = client.prompt_templates.get_or_create(
    "Write three sentences of increasing complexity about {{topic}}. Use one very odd word in each sentence.",
    name="Complex sentences",
)

criteria = [
    "Are the sentences of clearly increasing complexity?",
    "Are the sentences about {{topic}}?",
    "Does each sentence contain one very odd word?",
]  # (1)!
client.criteria.add_many(criteria, prompt_template=prompt_template, delete_existing=True)


template_variables = client.template_variables.add_to_collection(
    template_variables={"topic": "LLM Evaluation"},
    collection=prompt_template.default_template_variables_collection,
)

response = client.responses.generate(
    prompt_template,
    template_variables=template_variables,
)

ratings = client.ratings.rate(response, rating_mode=RatingMode.DETAILED)  # (2)!

# Print results
print(f"Response: {response.response}")
for rating in ratings:
    print(f"Criteria: {rating.criterion.criterion_str}")
    print(f"Reasoning: {rating.reasoning}")
    print(f"Rating: {rating.rating}\n")

1. Criteria can also contain template variables, which are replaced when a criterion is evaluated against a response to a given prompt.

2. To get proper reasoning for your ratings, you need to use `RatingMode.DETAILED`. Note that this takes more time than basic ratings with `RatingMode.FAST`.
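
To make the first note concrete, the substitution of template variables in a criterion can be sketched in plain Python. This only illustrates the idea and is not elluminate's actual implementation: `render_criterion` is a hypothetical helper, and the regex-based replacement is an assumption.

```python
import re


def render_criterion(criterion: str, template_variables: dict) -> str:
    """Replace each {{name}} placeholder with its value from template_variables.

    Hypothetical helper for illustration; not part of the elluminate SDK.
    """

    def substitute(match):
        name = match.group(1)
        # Leave unknown placeholders untouched instead of raising an error.
        return template_variables.get(name, match.group(0))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, criterion)


rendered = render_criterion(
    "Are the sentences about {{topic}}?",
    {"topic": "LLM Evaluation"},
)
print(rendered)  # Are the sentences about LLM Evaluation?
```

With the template variables from the example above, the second criterion is therefore evaluated as a question about "LLM Evaluation" rather than about the literal placeholder.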

Use detailed reasonings when you:

- Need to debug prompt performance
- Want to improve prompt quality systematically
- Are training new team members
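
When debugging, it can help to pull out only the criteria that failed, together with their reasoning. The sketch below assumes rating objects shaped like the ones printed in the example above (`criterion.criterion_str`, `reasoning`, `rating`). The `Criterion` and `Rating` dataclasses are stand-ins so that the snippet runs offline, the sample data is made up, and treating `rating` as a boolean is an assumption. If the client returns a different type, pass a suitable `is_failure` predicate.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Criterion:
    """Stand-in for the criterion object; not the elluminate class."""

    criterion_str: str


@dataclass
class Rating:
    """Stand-in for a rating; attribute names mirror the example above."""

    criterion: Criterion
    reasoning: str
    rating: bool  # Assumption: True means the criterion was met.


def failed_with_reasoning(
    ratings: Iterable,
    is_failure: Callable = lambda r: not r.rating,
) -> List[Tuple[str, str]]:
    """Return (criterion, reasoning) pairs for every rating that failed."""
    return [
        (r.criterion.criterion_str, r.reasoning)
        for r in ratings
        if is_failure(r)
    ]


# Made-up sample data, for illustration only.
sample = [
    Rating(Criterion("Are the sentences about the topic?"), "All three are on topic.", True),
    Rating(Criterion("Does each sentence contain one very odd word?"), "Sentence two has none.", False),
]

for criterion, reasoning in failed_with_reasoning(sample):
    print(f"FAILED: {criterion}\n  Why: {reasoning}")
```

In real use, the `ratings` returned by `client.ratings.rate(...)` would be passed in place of `sample`.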

Note that generating reasonings takes more time than basic ratings, so use `RatingMode.FAST` for quick evaluations and `RatingMode.DETAILED` when you need the additional context. You can see the reasoning in the detail view of a rating in the UI.

From there, you can also manually change the rating as well as the reasoning.