Mindrift

🌎 Remote

Posted on: 20 August, 2025

Evaluation Scenario Writer - AI Agent Testing Specialist

📍 Job Overview

Job Title: Evaluation Scenario Writer - AI Agent Testing Specialist

Company: Mindrift

Location: Poland

Job Type: Part-Time, Remote

Category: AI & Machine Learning

Date Posted: 2025-08-18

Experience Level: Mid-Senior level (2-5 years)

Remote Status: Remote (Poland)

🚀 Role Summary

📝 Enhancement Note: This role focuses on designing evaluation scenarios for LLM-based agents, simulating human tasks, and defining gold-standard behaviors. It requires a strong analytical mindset, attention to detail, and adaptability to complex guidelines.
Collaborate with AI specialists and developers to create structured test scenarios that mimic real-world tasks.
Define golden paths and acceptable agent behaviors, annotating task steps, expected outputs, and edge cases.
Review agent outputs and adapt tests as needed to ensure they remain relevant and challenging.
Work on a part-time, remote basis, fitting the role around primary professional or academic commitments.

📈 Primary Responsibilities

📝 Enhancement Note: The primary responsibilities revolve around creating evaluation scenarios, defining gold-standard behaviors, and collaborating with developers to refine tests.
Design structured test scenarios based on real-world tasks.
Define golden paths and acceptable agent behaviors.
Annotate task steps, expected outputs, and edge cases.
Collaborate with developers to test scenarios and improve clarity.
Review agent outputs and adapt tests accordingly.

🎓 Skills & Qualifications

Education: A Bachelor’s or Master’s degree in Computer Science, Software Engineering, Data Science, Artificial Intelligence, Computational Linguistics, Information Systems, or a related field.

Experience: 3+ years of experience in a relevant field.

Required Skills:

Analytical mindset
Attention to detail
Familiarity with AI agents and testing methodologies
Ability to learn new methods and adapt to complex guidelines
Advanced English language skills (C1 or above)

Preferred Skills:

Experience with LLM-based agents
Knowledge of natural language processing (NLP)
Familiarity with software engineering and data science principles
Experience in computational linguistics or machine learning

📊 Portfolio & Results Requirements

Portfolio Essentials:

Previous experience in AI testing or a related field.
Examples of test scenarios or evaluation frameworks.
Demonstrated ability to define gold-standard behaviors and edge cases.

Scenario Documentation:

Detailed documentation of test scenarios, including task steps, expected outputs, and edge cases.
Evidence of collaboration with developers to refine and improve tests.
Examples of agent outputs and how tests were adapted in response.

💵 Compensation & Benefits

Salary Range: The posting does not specify a salary range. Market research suggests that similar roles in Poland pay approximately PLN 12,000 - 18,000 per month (gross). This estimate takes into account the experience level and remote nature of the role.

Benefits:

Flexible, part-time work schedule.
Opportunity to work on advanced AI projects and enhance your portfolio.
Influence on the future of AI in your field of expertise.

Working Hours: The working hours for this role are flexible, with a total of 40 hours per month.

🎯 Team & Company Context

🏢 Company Culture

Industry: Mindrift operates in the AI and machine learning industry, focusing on ethical AI development and collective intelligence.

Company Size: Mindrift is a growing startup, which means a dynamic and agile work environment.

Founded: The company was founded with a mission to unlock the potential of generative AI by tapping into real-world expertise from across the globe.

Team Structure:

The team consists of AI specialists, developers, and project managers.
Collaboration and knowledge sharing are key aspects of the company culture.

Methodology:

Mindrift uses a project-based methodology, with each project having its own unique requirements and goals.
The company emphasizes collective intelligence and ethical AI development.

Company Website: Mindrift

📈 Career & Growth Analysis

Career Level: This role offers opportunities for career progression in AI and machine learning.

Reporting Structure: As a freelance contributor, you will report directly to the project manager or AI specialist leading the project you are working on.

Impact: This role contributes to the development of AI models that can be used in various applications, including marketing.

🌐 Work Environment

Office Type: This is a remote, part-time role with no physical office requirements.

Office Location(s): The role can be performed from anywhere in Poland.

Workspace Context:

As a remote worker, you will need a laptop, an internet connection, and a quiet workspace.
You will collaborate with the Mindrift team using various communication and project management tools.

Work Schedule: The work schedule is flexible, allowing you to work around your primary professional or academic commitments.

📄 Application & Portfolio Review Process

Interview Process:

Apply through the application link.
Complete a qualification process to ensure your skills and experience match the role’s requirements.
If qualified, you will be invited to contribute to AI projects aligned with your skills and interests.

Portfolio Review Tips:

Highlight your experience in AI testing or a related field.
Showcase your ability to design test scenarios and define gold-standard behaviors.
Demonstrate your adaptability to complex guidelines and your ability to learn new methods.

Challenge Preparation:

Familiarize yourself with LLM-based agents and AI testing methodologies.
Brush up on your English language skills, as an advanced level is required for this role.

ATS Keywords: (For resume optimization)

AI Agent Testing
Evaluation Scenario Design
LLM-based Agents
Test Case Creation
Gold-standard Behavior Definition
Edge Case Annotation
Developer Collaboration
AI Project Contribution
Freelance AI Specialist
Part-time Remote Work

🛠 Tools & Technology Stack

Primary Tools:

Collaboration tools (e.g., Slack, Microsoft Teams)
Project management tools (e.g., Asana, Trello)
AI frameworks and platforms (e.g., LangChain, Hugging Face)

👥 Team Culture & Values

Collaboration Style:

Mindrift values collaboration and knowledge sharing, with a focus on collective intelligence.
As a freelance contributor, you will work closely with the Mindrift team to ensure your test scenarios are clear, well-scored, and easy to execute and reuse.

📝 Interview Preparation

Strategy Questions:

Be prepared to discuss your experience in AI testing or a related field.
Explain your approach to designing test scenarios and defining gold-standard behaviors.
Demonstrate your ability to adapt to complex guidelines and learn new methods.

Company & Culture Questions:

Research Mindrift’s mission and values to show your understanding of the company’s focus on ethical AI development and collective intelligence.
Prepare questions to ask the interview panel about the projects you might work on and the team you will collaborate with.

📌 Application Steps

To apply for this AI Agent Testing Specialist role:

Submit your application through the application link.
Prepare your portfolio, highlighting your experience in AI testing or a related field.
Research Mindrift’s mission and values to demonstrate your understanding of the company’s focus on ethical AI development and collective intelligence.
Prepare for the interview by brushing up on your English language skills and familiarizing yourself with LLM-based agents and AI testing methodologies.

Tags:

ai

ml
