Data Scientist

Salary undisclosed


Job Title: Data Scientist
Location: Remote and Baltimore, MD
Duration: 9 months
Description:

  • Formulate, design, and deliver AI/ML-based decision-making frameworks and models to achieve business outcomes.
  • Develop and fine-tune AI models for NLP tasks, including Named Entity Recognition (NER), text classification, and sentiment analysis with a focus on unstructured clinical data.
  • Analyze and preprocess large datasets, particularly unstructured medical records (e.g., physician notes, discharge summaries) using tools like Pandas, NLTK, and SpaCy.
  • Create custom NLP algorithms and annotators for extracting insights from medical records.
  • Work with human-in-the-loop systems, incorporating clinician feedback to refine and improve AI models.
  • Conduct experiments to evaluate AI model performance, employing metrics such as precision, recall, and F1-score.
  • Collaborate with cross-functional teams, including software engineers, to ensure AI solutions are integrated into the broader system architecture.
  • Create custom tools to enable analysts to perform data research.
  • Maintain comprehensive technical documentation and provide ongoing system support.
  • Develop data pipelines for efficient data processing, integrating with SQL databases (PostgreSQL, MySQL) and NoSQL stores (MongoDB, Elasticsearch).
  • Deploy AI models using cloud platforms (AWS, Azure) and containerization technologies like Docker, and manage them using CI/CD pipelines.

Requirements:

  • 5+ years of experience in AI/ML development, with a strong focus on NLP using frameworks such as TensorFlow, PyTorch, and Hugging Face.
  • Mastery of Python frameworks and libraries such as Transformers, NLTK, SpaCy, and Gensim, and of data manipulation tools such as Pandas and NumPy.
  • Proven experience with human-in-the-loop systems that incorporate clinician feedback to improve model performance.
  • Experience with cloud platforms (AWS, Azure), containerization (Docker), and CI/CD pipelines for machine learning model deployment.
  • Solid understanding of statistical modeling, data analysis, and performance evaluation metrics.
  • Expertise in analyzing and processing unstructured clinical data using techniques such as tokenization, lemmatization, and text representations (e.g., TF-IDF vectors, BERT embeddings).
  • Knowledge of healthcare data formats and standards such as HL7, FHIR, ICD codes, and SNOMED.
  • Ability to effectively articulate technical challenges and solutions, with excellent written and verbal communication skills.
  • Familiarity with SQL and NoSQL databases and structuring data pipelines for efficient processing.
  • Master's degree in Data Science, AI, Computer Science, or a related field plus 10 years of experience; or a PhD plus 4 years of experience.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.