Skip to content

Batch Operations

Batch operations allow you to process multiple items in a single API call rather than making separate calls for each item. In Elluminate, this means you can generate multiple responses or perform multiple ratings in parallel, significantly improving efficiency and reducing overhead.

For example, instead of making 10 separate API calls to generate responses for 10 different sets of template variables, you can send all 10 sets in a single batch request. The server processes these requests concurrently and returns all results together.

Batch operations are particularly useful when you need to:

  • Bulk Process Responses: Generate responses for multiple prompts efficiently
  • Batch Evaluate: Rate multiple responses against criteria in parallel
from dotenv import load_dotenv
from elluminate import Client
from elluminate.schemas import RatingMode

load_dotenv(override=True)

client = Client()

prompt_template, _ = client.prompt_templates.get_or_create(
    "Explain why {{language}}'s {{quirk}} is considered surprising or counterintuitive, "
    + "and demonstrate it with a short code example that might trip up even experienced developers.",
    name="Programming Quirks",
)

client.criteria.get_or_generate_many(prompt_template)

# Create a collection for our variables
collection, _ = client.collections.get_or_create(
    name="Programming Quirks Variables", description="Template variables for programming quirks examples"
)

values = [
    {"language": "Python", "quirk": "mutable default arguments"},
    {"language": "JavaScript", "quirk": "type coercion"},
]

template_variables = []
for value in values:
    variables = client.template_variables.add_to_collection(
        template_variables=value,
        collection=collection,
    )
    template_variables.append(variables)

# Create an experiment for tracking responses and ratings
experiment, _ = client.experiments.get_or_create(
    "Programming Quirks Analysis",
    prompt_template=prompt_template,
    collection=collection,
    description="Evaluating explanations of programming language quirks",
)  # (1)!

responses = client.responses.generate_many(
    prompt_template=prompt_template, template_variables=template_variables, experiment=experiment
)  # (2)!

# Generate ratings for the `responses`
batch_ratings = client.ratings.rate_many(responses, rating_mode=RatingMode.FAST)  # (3)!

for i, response in enumerate(responses):
    ratings = batch_ratings[i]
    print(f"\n\nResponse: {response.messages[-1].content}")
    for rating in ratings:
        print(f"Criteria: {rating.criterion.criterion_str}")
        print(f"Rating: {rating.rating}\n")
  1. Create an experiment to track the batch responses and ratings. This experiment will be used to associate all generated responses with this evaluation run.

  2. Efficiently generates multiple responses from a single prompt template with different template variables. The experiment parameter links all generated responses to the experiment.

  3. Efficiently rate multiple responses against their respective criteria. The ratings are automatically collected in the experiment because the responses were linked to it during generation. The rating mode determines the type of rating strategy to employ.

Performance Considerations

Batch operations offer several performance benefits:

  • Reduced network overhead
  • Parallel processing on the server
  • Lower total latency for multiple operations
  • More efficient resource utilization

However, be mindful of:

  • Memory usage with large batches
  • Timeout limits for long-running operations
  • API rate limits and quotas
  • Error handling complexity