Data specialist/curator for toxicological metadata
Posted 18 hours 41 minutes ago by ITech Consult
Data specialist/curator for Large Language Models-derived toxicological (meta)data (m/f/d) - Data Validation / Quality Assurance / Toxicology / Documentation / English
Project:
For our customer, a large pharmaceutical company in Basel, we are looking for a highly qualified Data specialist/curator for Large Language Models-derived toxicological (meta)data (m/f/d).
Background:
We believe it's urgent to deliver medical solutions right now - even as we develop innovations for the future. We are passionate about transforming patients' lives and we are fearless in both decision and action. And we believe that good business means a better world. We commit ourselves to scientific rigor, unassailable ethics, and access to medical innovations for all. We do this today to build a better tomorrow. Pharmaceutical Sciences (PS) is a global function within Roche Pharma Research and Early Development (pRED).
As a team member in the Prediction Modelling (PM) Chapter of PS, you will work in close collaboration with toxicologists as well as other scientists in pRED, having access to state-of-the-art bioinformatics and biostatistics tools and methods and gaining toxicological insights from experts in the field.
Large Language Models (LLMs) have evolved beyond simple text completion tools into sophisticated systems capable of data extraction, summarization, enrichment, and knowledge capture. These advancements enable the retrieval of information previously locked within documents, reports, presentations, and meeting records, making it possible to repurpose this data and reverse translate knowledge from the past to inform future insights.
The position in question focuses on transforming historical toxicology documents into structured, high-quality datasets through AI-powered extraction while supporting the application of LLMs for toxicological data understanding and enrichment from historical documents. The extracted data will be used to enhance existing data repositories and serve as a foundation for creating new ones tailored to the specific needs of toxicologists.
While LLMs hold immense potential, they are also prone to generating inaccurate or fabricated information, commonly referred to as "hallucinations." To address this, rigorous data curation is essential. Curating and validating subsets of the extracted data will not only improve the quality and reliability of the outputs but also contribute to refining and enhancing the performance of the models themselves.
The perfect candidate:
We are looking for a highly skilled and detail-oriented Toxicological Data Curator with a background in biology, toxicology, drug development, veterinary medicine, or a related field. In this role, you will ensure the accuracy, consistency, and scientific validity of toxicological data captured by large language models (LLMs). You will cross-check data outputs against original sources, evaluate the reliability of AI-derived information, and contribute to the development of high-quality datasets to support drug safety and development efforts.
Tasks & Responsibilities:
Data Validation:
Review toxicological data output generated by LLMs and validate it against original sources (e.g., research articles, regulatory documents, toxicology databases).
Identify and document discrepancies, errors, or ambiguities in AI-derived data.
Build a pipeline or dataset for toxicological evaluation tasks.
Quality Assurance:
Ensure consistency, completeness, and scientific accuracy of curated toxicological datasets.
Adhere to established quality control protocols and contribute to improving workflows as needed.
Toxicology Expertise:
Apply toxicological knowledge to evaluate data related to preclinical and clinical toxicities, mechanisms of toxicity, safety biomarkers, and risk assessments.
Assess relevance and applicability of curated data to specific drug development scenarios.
Collaboration:
Work closely with computational toxicologists, data scientists, and cross-functional teams to align on curation standards and requirements.
Provide feedback to enhance LLM performance based on identified gaps or inaccuracies in the extracted data.
Documentation and Reporting:
Maintain detailed records of validation processes and outcomes.
Prepare periodic reports summarizing curation progress, data quality metrics, and key findings.
Must Haves:
Master's degree (preferred) in Biology, Pharmacology, Toxicology, Drug Development, Biotechnology, or a related field.
Experience with data curation, annotation, or systematic review methodologies is a plus.
Basic understanding of machine learning, LLMs, or natural language processing (NLP) tools is desirable but not required.
Fluent English (min. C1 level).
Strong attention to detail and commitment to data accuracy.
Excellent critical thinking and problem-solving skills, particularly in evaluating scientific information.
Ability to synthesize complex toxicological data and present clear, actionable conclusions.
Effective written and verbal communication skills for reporting and collaboration.
Reference No.: 923799TP
Role: Data specialist/curator for Large Language Models-derived toxicological (meta)data (m/f/d)
Industry: Pharma
Workplace: Basel
Workload: 80-100%
Start: 01.03.2025
Duration: 6
Deadline: 10.02.2025
If you are interested in this position, please send us your complete dossier.
About us:
ITech Consult is an ISO 9001:2015 certified Swiss company with offices in Germany and Ireland. ITech Consult specialises in the placement of highly qualified candidates for recruitment in the fields of IT, Life Science & Engineering.
We offer staff leasing and payroll services. These are free of charge for our candidates; we do not charge any additional fees for payroll either.