Data Scientist

Full Time, onsite
Remote, United States of America

Salary undisclosed

RESPONSIBILITIES
Develop and implement LLM-based applications tailored for in-house legal teams
Fine-tune and deploy large language models to enhance their performance on legal text processing tasks
Evaluate and help maintain our data assets and training/evaluation data sets
Design and build pipelines for preprocessing, annotating, and managing legal document datasets
Collaborate with legal experts to understand requirements and ensure models meet domain-specific needs
Conduct experiments and evaluate model performance to drive continuous improvements
Interface with other technical personnel or team members to finalize requirements
Work closely with other development team members to understand moderately complex product requirements and translate them into software designs
Successfully implement development processes, coding best practices, and code reviews for production environments

REQUIREMENTS
Formal training in machine learning: dimensionality reduction, clustering, embeddings, and sequence classification algorithms
Experience with deep learning frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers
Practical experience in Natural Language Processing methods and libraries such as spaCy, word2vec, TensorFlow, Keras, PyTorch, Flair, BERT
Practical experience with large language models, prompt engineering, fine-tuning and benchmarking, using frameworks such as LangChain and LlamaIndex
Strong Python background
Knowledge of AWS, Google Cloud Platform, Azure, or another cloud platform
Understanding of data modeling principles and complex data models
Proficiency with relational and NoSQL databases as well as vector stores (e.g., Postgres, Elasticsearch/OpenSearch, ChromaDB)
Knowledge of Scala, Spark, Ray, or other distributed computing systems highly preferred
Knowledge of API development, containerization, and machine learning deployment highly preferred
Experience with MLOps/AIOps highly preferred

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
