Company Overview
Isomorphic Labs is a new Alphabet company that is reimagining drug discovery through a computational- and AI-first approach.
We are on a mission to accelerate the speed, increase the efficacy and lower the cost of drug discovery. You’ll be working at the cutting edge of the new era of ‘digital biology’ to deliver a transformative social impact for the benefit of millions of people.
Come and be part of an interdisciplinary team driving groundbreaking innovation, and play a meaningful role in achieving our ambitious goals within an inspiring, collaborative and entrepreneurial culture.
Your impact
This is an exciting opportunity for you to work on a greenfield ML-based software platform that will transform the biopharmaceutical world as we know it.
Working in a highly creative, iterative environment, you will be partnering with leading engineers, scientists and ML researchers to build the critical platform driving that transformation. This is a newly created role and you will need to use your previous experience and show initiative in order to fully carve out your contribution.
What you will do
- Develop and operate a bioinformatics platform for the ingestion, harmonisation, and curation of biological reference data, including external and internal datasets.
- Develop and curate robust data models, controlled vocabularies, ontologies, and data integration strategies to ensure data consistency, quality, and interoperability.
- Develop and maintain versioned release processes for curated biological data, ensuring traceability and reproducibility of analyses.
- Develop metrics and tools to monitor biological data quality and completeness.
- Develop efficient data stores for machine learning models and other data-driven products.
- Develop and execute processes for effective and secure data handling, including interoperation with trusted research environments (TREs) and other controlled environments.
- Develop and operate pipelines for complex and large biological dataset ingestion, harmonisation and curation.
- Contribute to the wider development of the data and computational platform.
- Collaborate with computational biology and chemistry groups, providing guidance and support for their initiatives.
- Provide documentation, guidance, and training on data resources, data curation processes, and ontologies to the wider organisation.
Skills and qualifications
Essential
- Extensive programming experience writing production code using mainstream programming languages such as Python or C++.
- Strong experience with database technologies (SQL and NoSQL) and data modelling.
- Experience working with biological databases (e.g., NCBI, UniProt, Ensembl).
- Knowledge of biological ontologies (e.g., GO, KEGG, Reactome) and controlled vocabularies.
- Experience with large-scale bioinformatics data analysis and quality control.
- Experience building and maintaining bioinformatics pipelines.
- MSc degree in Bioinformatics, Computational Biology, Computer Science, or a related technical field, or equivalent practical experience.
Nice to have
- Strong experience building, deploying and maintaining production systems on GCP.
- Experience with data governance principles and practices.
- Experience with statistical analysis and data visualisation.
- Experience with modern ML systems, frameworks, and the data lifecycle.
- PhD in Bioinformatics, Computational Biology, or a related field, or equivalent experience.