Job Description
Key Responsibilities
AI Agent Evaluation
- Write evaluation rubrics with objective pass/fail criteria
- Debug agent traces to identify failure patterns
- Stress test agents against edge cases, prompt injection, and tool misuse
Technical Assessment
- Assess production-grade modular software architecture
- Analyse multi-turn system interactions and behaviours
- Provide high-density technical feedback for LLM training
Project Workflow
- Create an account and upload a resume/ID
- Complete the onboarding assessment
- Start earning through flexible task assignments
Qualifications
- Experience in backend engineering, AI automation, or complex systems integration
- Proven ability to build and maintain production-grade software with modular separation (e.g., distinct services for data parsing, logic processing, and reporting)
- Strong command of at least two major languages (e.g., Python, JavaScript, Go, or Java) and experience working with SQL databases
- Practical experience building for live, non-mocked environments and handling multi-turn system interactions
Preferred (Nice to Have)
- Experience integrating agents with live tools such as Supabase, Gmail, and other APIs
- Familiarity with persistent state and session-tracking patterns
- Experience identifying privacy leaks, authority escalation, or indirect prompt injection vulnerabilities
Compensation
- Hourly compensation ranges from USD $30 to $50, depending on experience and task complexity
- Payments are issued weekly via supported payout platforms (e.g., PayPal or AirTM)
- Full compensation details are provided prior to task acceptance
Are you interested in this position?
Apply by clicking on the “Apply Now” button below!
Apply Now