Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift - 🌎 Remote

Posted on: 20 August, 2025

Job Overview

Job Title: Evaluation Scenario Writer - AI Agent Testing Specialist

Company: Mindrift

Location: Poland

Job Type: Part-Time, Remote

Category: AI & Machine Learning

Date Posted: 2025-08-18

Experience Level: Mid-Senior level (2-5 years)

Remote Status: Remote (Specified Country)

🚀 Role Summary

  • Enhancement Note: This role focuses on designing evaluation scenarios for LLM-based agents, simulating human tasks, and defining gold-standard behaviors. It requires a strong analytical mindset, attention to detail, and adaptability to complex guidelines.
  • Collaborate with AI specialists and developers to create structured test scenarios that mimic real-world tasks.
  • Define golden paths and acceptable agent behaviors, annotating task steps, expected outputs, and edge cases.
  • Review agent outputs and adapt tests as needed to ensure they remain relevant and challenging.
  • Work on a part-time, remote basis, fitting the role around primary professional or academic commitments.

📈 Primary Responsibilities

  • Enhancement Note: The primary responsibilities revolve around creating evaluation scenarios, defining gold-standard behaviors, and collaborating with developers to refine tests.
  • Design structured test scenarios based on real-world tasks.
  • Define golden paths and acceptable agent behaviors.
  • Annotate task steps, expected outputs, and edge cases.
  • Collaborate with developers to test scenarios and improve clarity.
  • Review agent outputs and adapt tests accordingly.
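
To make the responsibilities above concrete, here is a minimal sketch of how a structured test scenario with a golden path, expected outputs, and edge cases might be represented and scored. The posting does not describe Mindrift's actual format or tooling, so the `Step` and `Scenario` classes, the field names, and the scoring rule are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One annotated step on the golden path (hypothetical schema)."""
    action: str           # action the agent is expected to take
    expected_output: str  # what a correct result should contain


@dataclass
class Scenario:
    """A structured test scenario (illustrative, not Mindrift's format)."""
    task: str
    golden_path: list[Step]
    edge_cases: list[str] = field(default_factory=list)

    def score(self, agent_actions: list[str]) -> float:
        """Fraction of golden-path actions the agent performed in order."""
        if not self.golden_path:
            return 0.0
        matched = 0
        pos = 0  # search position; only moves forward, so order matters
        for step in self.golden_path:
            try:
                pos = agent_actions.index(step.action, pos) + 1
                matched += 1
            except ValueError:
                continue  # step missing; keep checking later steps
        return matched / len(self.golden_path)


# Example scenario with invented task and action names.
scenario = Scenario(
    task="Book a meeting room for tomorrow at 10:00",
    golden_path=[
        Step("search_rooms", "list of available rooms"),
        Step("book_room", "booking confirmation"),
        Step("notify_user", "confirmation message"),
    ],
    edge_cases=["no rooms available", "ambiguous date"],
)

partial = scenario.score(["search_rooms", "notify_user"])  # skips booking
full = scenario.score(["search_rooms", "book_room", "notify_user"])
```

In this sketch an agent that skips a step still gets credit for the later steps it performs in the right order. A real rubric might instead fail the whole run, or weight steps differently.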

🎓 Skills & Qualifications

Education: A Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, Artificial Intelligence, Computational Linguistics, Information Systems, or a related field.

Experience: 3+ years of experience in a relevant field.

Required Skills:

  • Analytical mindset
  • Attention to detail
  • Familiarity with AI agents and testing methodologies
  • Ability to learn new methods and adapt to complex guidelines
  • Advanced English language skills (C1 or above)

Preferred Skills:

  • Experience with LLM-based agents
  • Knowledge of natural language processing (NLP)
  • Familiarity with software engineering and data science principles
  • Experience in computational linguistics or machine learning

📊 Campaign Portfolio & Results Requirements

Portfolio Essentials:

  • Previous experience in AI testing or a related field.
  • Examples of test scenarios or evaluation frameworks.
  • Demonstrated ability to define gold-standard behaviors and edge cases.

Campaign Documentation:

  • Detailed documentation of test scenarios, including task steps, expected outputs, and edge cases.
  • Evidence of collaboration with developers to refine and improve tests.
  • Examples of agent outputs and how tests were adapted in response.

💵 Compensation & Benefits

Salary Range: The salary range for this role is not specified. However, based on market research, the average salary for a similar role in Poland is approximately PLN 12,000 - 18,000 per month (gross). This range takes into account the experience level and remote nature of the role.

Benefits:

  • Flexible, part-time work schedule.
  • Opportunity to work on advanced AI projects and enhance your portfolio.
  • Influence on the future of AI in your field of expertise.

Working Hours: The working hours for this role are flexible, with a total of 40 hours per month.

🎯 Team & Company Context

🏒 Company Culture

Industry: Mindrift operates in the AI and machine learning industry, focusing on ethical AI development and collective intelligence.

Company Size: Mindrift is a growing startup, which means a dynamic and agile work environment.

Founded: The company was founded with a mission to unlock the potential of generative AI by tapping into real-world expertise from across the globe.

Team Structure:

  • The team consists of AI specialists, developers, and project managers.
  • Collaboration and knowledge sharing are key aspects of the company culture.

Methodology:

  • Mindrift uses a project-based methodology, with each project having its own unique requirements and goals.
  • The company emphasizes collective intelligence and ethical AI development.

Company Website: Mindrift

📈 Career & Growth Analysis

Marketing Career Level: This role is not a marketing role, but it offers opportunities for career progression in AI and machine learning.

Reporting Structure: As a freelance contributor, you will report directly to the project manager or AI specialist leading the project you are working on.

Marketing Impact: While this role does not directly impact marketing, it contributes to the development of AI models that can be used in various applications, including marketing.

🌐 Work Environment

Office Type: This is a remote, part-time role with no physical office requirements.

Office Location(s): The role can be performed from anywhere in the specified country.

Workspace Context:

  • As a remote worker, you will need a laptop, internet connection, and a quiet workspace.
  • You will collaborate with the Mindrift team using various communication and project management tools.

Work Schedule: The work schedule is flexible, allowing you to work around your primary professional or academic commitments.

📄 Application & Portfolio Review Process

Interview Process:

  • Apply through the application link.
  • Complete a qualification process to ensure your skills and experience match the role's requirements.
  • If qualified, you will be invited to contribute to AI projects aligned with your skills and interests.

Portfolio Review Tips:

  • Highlight your experience in AI testing or a related field.
  • Showcase your ability to design test scenarios and define gold-standard behaviors.
  • Demonstrate your adaptability to complex guidelines and your ability to learn new methods.

Challenge Preparation:

  • Familiarize yourself with LLM-based agents and AI testing methodologies.
  • Brush up on your English language skills, as an advanced level is required for this role.

ATS Keywords: (For resume optimization)

  • AI Agent Testing
  • Evaluation Scenario Design
  • LLM-based Agents
  • Test Case Creation
  • Gold-standard Behavior Definition
  • Edge Case Annotation
  • Developer Collaboration
  • AI Project Contribution
  • Freelance AI Specialist
  • Part-time Remote Work

🛠 Tools & Technology Stack

Primary Tools:

  • Collaboration tools (e.g., Slack, Microsoft Teams)
  • Project management tools (e.g., Asana, Trello)
  • AI agent frameworks and model platforms used in testing (e.g., LangChain, Hugging Face)

Analytics & Attribution:

  • Not applicable, as this role focuses on AI agent testing rather than marketing or sales performance tracking.

Campaign Management & Automation:

  • Not applicable, as this role focuses on AI agent testing rather than marketing campaign management.

👥 Team Culture & Values

Marketing Values: (Not applicable, as this role is not a marketing role.)

Collaboration Style:

  • Mindrift values collaboration and knowledge sharing, with a focus on collective intelligence.
  • As a freelance contributor, you will work closely with the Mindrift team to ensure your test scenarios are clear, well-scored, and easy to execute and reuse.

Interview Preparation

Strategy Questions:

  • Be prepared to discuss your experience in AI testing or a related field.
  • Explain your approach to designing test scenarios and defining gold-standard behaviors.
  • Demonstrate your ability to adapt to complex guidelines and learn new methods.

Company & Culture Questions:

  • Research Mindrift's mission and values to show your understanding of the company's focus on ethical AI development and collective intelligence.
  • Prepare questions to ask the interview panel about the projects you might work on and the team you will collaborate with.

📌 Application Steps

To apply for this AI Agent Testing Specialist role:

  • Submit your application through the application link.
  • Prepare your portfolio, highlighting your experience in AI testing or a related field.
  • Research Mindrift's mission and values to demonstrate your understanding of the company's focus on ethical AI development and collective intelligence.
  • Prepare for the interview by brushing up on your English language skills and familiarizing yourself with LLM-based agents and AI testing methodologies.

Tags:
ai
ml