Job Description
About the Role
You will own prompt design and optimization workflows for production-grade AI systems powering internal tools and customer-facing features. This is not a role for surface-level experimentation—we’re looking for someone who thinks in language and systems, understands token economics, and can reverse-engineer model behavior by reading between the (model-generated) lines.
Key Responsibilities
- Design token-efficient prompts tailored to specific model families (GPT-4, Claude, Gemini, Mistral, etc.) and use cases (e.g., structured outputs, constrained creativity, reasoning under chain-of-thought).
- Build and maintain internal prompt libraries, annotated with use-case benchmarks, failure modes, and fallback strategies.
- Run prompt evaluations across deterministic and temperature-varied settings using scripting tools (LangChain, Guidance, etc.) and fine-grained metrics (latency, hallucination frequency, instruction-following precision).
- Collaborate with LLMOps and product engineers to embed prompt logic into APIs, microservices, or user flows—sometimes writing intermediate parsing logic or few-shot scaffolding functions.
- Prototype multi-step prompt chains with role switching, embedded memory, and self-critique steps—then optimize to reduce latency and token bloat without losing outcome fidelity.
- Contribute to internal tooling, such as test harnesses, evaluation dashboards, or prompt diffing utilities, ideally in Python or TypeScript.
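The evaluation work described above can be sketched as a tiny harness. This is an illustrative outline only: `stub_model` and the JSON-validity check stand in for a real LLM API call and a real metric, and all names here are our own, not a prescribed internal API.

```python
import json
from dataclasses import dataclass

@dataclass
class EvalResult:
    prompt_id: str
    temperature: float
    passed: bool

def run_eval(prompts, model_fn, temperatures, check_fn):
    """Run each prompt variant at each temperature and score it with check_fn."""
    results = []
    for pid, prompt in prompts.items():
        for temp in temperatures:
            output = model_fn(prompt, temperature=temp)
            results.append(EvalResult(pid, temp, check_fn(output)))
    return results

# Stand-in for a real model call; a production harness would wrap an LLM API here.
def stub_model(prompt, temperature=0.0):
    return '{"answer": 42}' if "JSON" in prompt else "forty-two"

def returns_valid_json(output):
    """One example metric: does the output parse as JSON?"""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

prompts = {
    "v1": "Answer the question.",
    "v2": "Answer the question. Respond only with JSON.",
}
results = run_eval(prompts, stub_model, [0.0, 0.7], returns_valid_json)
pass_rate = {pid: sum(r.passed for r in results if r.prompt_id == pid)
             for pid in prompts}
```

Swapping in real metrics (latency, hallucination frequency, instruction-following precision) means replacing `check_fn` with richer scorers while keeping the same sweep structure.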
Required Skills
- Strong command of prompt engineering across multiple LLMs (OpenAI, Anthropic, Cohere, open-weight models), including quirks and undocumented behaviors.
- Working knowledge of tokenization strategies, length limits, and generation settings across APIs (e.g., max_tokens, stop, top_p, temperature).
- Experience using prompt evaluation tools (e.g., PromptLayer, Rebuff, LMQL, or custom-built eval scripts).
- Familiarity with structured output generation, JSON schema prompting, and mitigating failure when models “hallucinate structure.”
- Able to debug and iterate on prompts via model introspection, not just trial-and-error.
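Mitigating "hallucinated structure," as mentioned above, often comes down to defensive parsing. The following is a minimal sketch (the helper name is our own) of recovering a JSON object when a model wraps it in markdown fences or prepends prose:

```python
import json
import re

def parse_json_output(raw, fallback=None):
    """Try to recover a JSON object from model output.

    Models often wrap JSON in markdown fences or prepend prose; this
    strips common wrappers before parsing and falls back rather than
    crashing when the structure is hallucinated.
    """
    # Strip markdown code fences if present.
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        # Last resort: grab the first {...} span in the text.
        match = re.search(r"\{.*\}", cleaned, re.DOTALL)
        if match:
            try:
                return json.loads(match.group(0))
            except json.JSONDecodeError:
                pass
    return fallback
```

A fallback value (rather than an exception) lets downstream code degrade gracefully, which matters when the parser sits inside a user-facing flow.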
Preferred Qualifications
- Prior experience shipping LLM-powered features in production (not just research or demo environments).
- Comfortable working with chain-of-thought prompting, retrieval-augmented generation (RAG), and few-shot prompting.
- Knowledge of LLMOps principles (caching, versioning, telemetry logging for prompts).
- Some exposure to embedding models, vector databases, and hybrid retrieval+prompting pipelines.
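The hybrid retrieval+prompting pipelines mentioned above can be illustrated with a toy example. The three-dimensional "embeddings" below are placeholders for a real embedding model's output, and the function names are illustrative, not a reference implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, corpus, k=2):
    """Rank (text, embedding) pairs by similarity to the query and keep top k."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

def build_rag_prompt(question, context_chunks):
    """Assemble retrieved chunks into a grounded prompt."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

corpus = [
    ("Refunds are processed within 5 business days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on public holidays.", [0.1, 0.9, 0.0]),
    ("Refund requests require an order number.", [0.8, 0.2, 0.1]),
]
top = retrieve([1.0, 0.0, 0.0], corpus, k=2)
prompt = build_rag_prompt("How do refunds work?", top)
```

In practice the corpus vectors come from an embedding model and live in a vector database; the prompt-assembly step is where retrieval meets the prompt-engineering skills above.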
Bonus if You…
- Have written internal docs or published resources analyzing LLM behavior with actual prompt logs.
- Understand how prompting interacts with safety layers, content filters, or moderation systems.
- Know when to recommend a code rewrite over a prompt hack.
Are you interested in this position?
Apply by clicking on the “Apply Now” button below!