
LLM Testing and Validation Engineer - AI & Data Solutions

  • Full Time, onsite
  • Computer Enterprises, Inc.
  • Remote, United States of America
Salary undisclosed

Title: LLM Testing and Validation Engineer - AI & Data Solutions

Summary:
Our entertainment client is seeking a talented, detail-oriented LLM Tester to join their AI & Data Solutions team. This role will be instrumental in validating data pipelines and model training processes and in ensuring data quality within their Generative AI platform. The ideal candidate will have a strong background in testing data-driven AI workflows, with expertise in retrieval-augmented generation (RAG) systems, knowledge-base techniques, and compliance standards. This position offers an opportunity to contribute directly to the performance and reliability of their Large Language Models (LLMs).
Job at a glance:

  • 18-month contract
  • 100% remote in the US
  • No C2C
Responsibilities:
  • Model Validation and Testing: Conduct rigorous testing of LLM training processes, including validation of data selection, ingestion, and preprocessing steps for the Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) stages.
  • Data Quality Assurance: Implement and maintain a robust data testing framework to validate data integrity, quality, and relevance, focusing on both text and non-text data sources. Identify any data inconsistencies, gaps, or anomalies within multi-stage data pipelines.
  • Embedding and Retrieval Testing: Evaluate and test vector embeddings, similarity search algorithms, and retrieval techniques, ensuring model accuracy and efficient data retrieval. Validate the implementation of various vector stores, embedding techniques, and search methodologies.
  • Pipeline Functionality Testing: Validate data processing workflows, including chunking, indexing, ingestion, and vectorization, across various embedding algorithms and vector store configurations.
  • Automated Data Annotation and Tagging: Test and validate auto-tagging systems and data preparation tools to ensure accuracy and efficiency in both manual and automated data tagging processes.
  • Performance Monitoring and Benchmarking: Collaborate with the engineering team to develop benchmarks and validate model performance across a range of generation settings, including temperature, top-k, and repeat penalty. Measure and analyze key performance metrics to assess model accuracy, relevance, and robustness.
  • Compliance and Data Privacy: Conduct audits of data pipelines and model outputs to ensure adherence to data privacy and ethical AI development standards.
  • Cross-functional Collaboration: Work closely with Data Engineers, AI Researchers, and cross-functional teams to align testing protocols with business needs and ensure that data pipelines support high-quality, scalable AI solutions.
Required Skills:
  • 5+ years of experience in data science/engineering.
  • Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
  • 2+ years of experience in data quality testing, AI model validation, or related roles within an AI/ML environment.
  • Proficiency in testing and validation methodologies for AI models and LLMs, including an understanding of LLM architectures, RAG systems, and knowledge base integration.
  • Experience working with vector databases, embedding techniques, and information retrieval algorithms, with the ability to evaluate their accuracy and performance.
  • Familiarity with data lakehouse architectures and optimizing data storage and retrieval within these frameworks.
  • Knowledge of data privacy regulations and best practices for handling sensitive data in AI pipelines.
  • Strong problem-solving skills with the ability to troubleshoot issues within data pipelines and LLM outputs.
  • Proficiency in Python and other tools relevant to data processing, testing, and validation.
Preferred Skills:
  • Hands-on experience with popular LLM/RAG frameworks such as LangChain, LlamaIndex, Semantic Kernel, or OpenAI functions.
  • Familiarity with distributed computing platforms (e.g., Apache Spark, Dask).
  • Experience with cloud platforms (AWS, Google Cloud Platform, or Azure) for large-scale data processing and testing.
  • Knowledge of data versioning and experiment tracking tools for monitoring model performance over time.
  • Understanding of metrics and methodologies for evaluating LLM outcomes, including tools for performance benchmarking and accuracy testing.
Interpersonal & Professional Skills:
  • Excellent communication skills with the ability to document and present testing results, insights, and recommendations clearly.
  • Strong attention to detail and commitment to ensuring data and model quality in a fast-paced, innovative environment.

CEI delivers solutions that help our customers transform their businesses and achieve meaningful results as a trusted technology partner. From strategy and custom application development through application management - our technology and digital experience services are tailored to meet each unique need of our customers. Our staffing solutions bring specialized skills to complement our customers' workforce and project requirements.
