LLM Testing and Validation Engineer - AI & Data Solutions
Summary:
Our entertainment client is seeking a talented, detail-oriented LLM Tester to join their AI & Data Solutions team. This role will be instrumental in validating data pipelines and model training processes and in ensuring data quality within their Generative AI platform. The ideal candidate will have a strong background in testing data-driven AI workflows, with expertise in retrieval-augmented generation (RAG) systems, knowledge-base techniques, and compliance standards. This position offers an opportunity to contribute directly to the performance and reliability of their Large Language Models (LLMs).
Job at a glance:
- 18-month contract
- 100% remote in the US
- No C2C
Responsibilities:
- Model Validation and Testing: Conduct rigorous testing of LLM model training processes, including validation of data selection, ingestion, and preprocessing steps for the Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) stages.
- Data Quality Assurance: Implement and maintain a robust data testing framework to validate data integrity, quality, and relevance, focusing on both text and non-text data sources. Identify any data inconsistencies, gaps, or anomalies within multi-stage data pipelines.
- Embedding and Retrieval Testing: Evaluate and test vector embeddings, similarity search algorithms, and retrieval techniques, ensuring model accuracy and efficient data retrieval. Validate the implementation of various vector stores, embedding techniques, and search methodologies.
- Pipeline Functionality Testing: Validate data processing workflows, including chunking, indexing, ingestion, and vectorization, across various embedding algorithms and vector store configurations.
- Automated Data Annotation and Tagging: Test and validate auto-tagging systems and data preparation tools to ensure accuracy and efficiency in both manual and automated data tagging processes.
- Performance Monitoring and Benchmarking: Collaborate with the engineering team to develop benchmarks and validate model performance across generation settings such as temperature, top-k, and repeat penalty. Measure and analyze key performance metrics to assess model accuracy, relevance, and robustness.
- Compliance and Data Privacy: Conduct audits of data pipelines and model outputs to ensure adherence to data privacy and ethical AI development standards.
- Cross-functional Collaboration: Work closely with Data Engineers, AI Researchers, and cross-functional teams to align testing protocols with business needs and ensure that data pipelines support high-quality, scalable AI solutions.
Requirements:
- 5+ years of experience in data science/engineering.
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- 2+ years of experience in data quality testing, AI model validation, or related roles within an AI/ML environment.
- Proficiency in testing and validation methodologies for AI and LLM models, including an understanding of LLM architectures, RAG systems, and knowledge base integration.
- Experience working with vector databases, embedding techniques, and information retrieval algorithms, with the ability to evaluate their accuracy and performance.
- Familiarity with data lakehouse architectures and optimizing data storage and retrieval within these frameworks.
- Knowledge of data privacy regulations and best practices for handling sensitive data in AI pipelines.
- Strong problem-solving skills with the ability to troubleshoot issues within data pipelines and LLM model outputs.
- Proficiency in Python and other tools relevant to data processing, testing, and validation.
- Hands-on experience with popular LLM/RAG frameworks such as LangChain, LlamaIndex, Semantic Kernel, or OpenAI functions.
- Familiarity with distributed computing platforms (e.g., Apache Spark, Dask).
- Experience with cloud platforms (AWS, Google Cloud Platform, or Azure) for large-scale data processing and testing.
- Knowledge of data versioning and experiment tracking tools for monitoring model performance over time.
- Understanding of metrics and methodologies for evaluating LLM outcomes, including tools for performance benchmarking and accuracy testing.
- Excellent communication skills with the ability to document and present testing results, insights, and recommendations clearly.
- Strong attention to detail and commitment to ensuring data and model quality in a fast-paced, innovative environment.
As a trusted technology partner, CEI delivers solutions that help our customers transform their businesses and achieve meaningful results. From strategy and custom application development through application management, our technology and digital experience services are tailored to meet the unique needs of each customer. Our staffing solutions bring specialized skills to complement our customers' workforce and project requirements.
#INDCEI