Francisco Zorrilla

Computational biologist building production tools for microbiome-scale science.

Co-leading the NCCR Microbiomes Work Package 5 Flagship Project at ETH Zürich in the Sunagawa Lab. Built metaGEM, an open-source bioinformatics workflow used by labs worldwide. 7+ years of experience spanning metagenomics, metabolic modeling, structural bioinformatics, and ML approaches.

Collaboration map · 12 publications & 14 key co-authors.
About

Overview

I'm a computational biologist co-leading the NCCR Microbiomes Work Package 5 Flagship Project at ETH Zürich, in the Sunagawa Lab, where my team builds the computational infrastructure for designing microbial communities that survive real-world deployment. Before ETH, I joined the Patil Lab as a computational biologist at EMBL Heidelberg, then moved with the group to the MRC Toxicology Unit at the University of Cambridge, where I completed my PhD. Throughout, my research has focused on translating raw metagenomic data into mechanistic, community-level metabolic predictions.

My published work has consistently aimed to deliver tools and frameworks that outlive a single paper.

Each project shipped with reproducible code, models, and datasets on GitHub and Zenodo (4,100+ downloads to date).

I'm now pivoting from academia to industry to apply this toolkit (production-quality bioinformatics, large-scale multi-omics, constraint-based modelling, AI for biology) to commercially relevant problems. Available October 2026 for industry research roles in agtech, microbial biotech, or AI-for-biology. Open to roles in the Zurich/Basel corridor or fully remote within Switzerland.

Curriculum Vitae

CV

Computational biologist · ETH Zürich · ex-EMBL · PhD Cambridge

Contact

Email
fzorrill@ethz.ch
Location
Zürich, Switzerland · Zurich/Basel corridor or fully remote in CH
Work authorization
Swiss B permit · Italian citizen · EU work-eligible
Languages
English (C2/native) · Spanish (C2/native)
Italian (B1) · French (A1) · German (A1)

Experience

Postdoctoral Researcher & NCCR Microbiomes Work Package 5 Flagship Project Lead

ETH Zürich · October 2024 – Present

  • Scientific leadership: Co-leading the multi-center NCCR Microbiomes Work Package 5 Flagship Project. Designed the 4-year computational roadmap for resilient SynCom design across diverse microbiome types.
  • AI & structural bioinformatics: Co-designed scalable functional annotation pipelines combining language models and structure search.
  • Data-driven SynCom design: Applying constraint-based metabolic modelling, pangenomics, and ML to identify trophic interactions across soil, plant, gut, marine, and bee microbiomes.

Doctoral Researcher · Computational Biologist

University of Cambridge & EMBL Heidelberg · August 2019 – August 2024

Education

2020–2024
Ph.D., Biology · University of Cambridge
Cambridge Trust scholarship · MRC Toxicology Unit
2017–2019
M.Sc., Biotechnology · Chalmers University of Technology
iGEM 2018: gold medal & best graduate modelling nominee
2013–2017
B.Sc., Biological Systems Engineering · UC Davis
12 Publications

Work

Problem

Reconstructing genome-scale metabolic models from metagenomes was a multi-week, error-prone process that required deep expertise across a dozen separate tools, putting community-level metabolic insight out of reach for most labs.

Approach

Architected and released metaGEM, a Snakemake-orchestrated end-to-end pipeline that automates QC, assembly, binning, MAG consolidation, taxonomy, GEM reconstruction, and community simulation. Designed for HPC clusters and reproducible execution.

Contribution

Lead developer and corresponding author. Designed the architecture, wrote the pipeline, authored the documentation, and have maintained the project through five major releases.

Tools

Snakemake · Python · CarveMe · SMETANA · MEGAHIT · CONCOCT · MaxBin2 · MetaBAT2 · metaWRAP · GTDB-Tk · MEMOTE · Slurm HPC

Problem

Cheese flavor emerges from cross-feeding between microbial species in the starter culture. Empirical optimization is slow and doesn't generalize across products, so industrial fermenters needed a mechanistic predictor.

Approach

The team (led by first author Chrats Melkonian, Wageningen University) built community-level genome-scale metabolic models for the cheese microbiome, integrated metabolomic and transcriptomic data, and simulated cross-feeding to predict which species pairings drive flavor compounds.

My contribution

Second author. I supported the lead by contributing to the metabolic-modelling component: helping curate models for the relevant species and running community simulations. The wet-lab metabolomics, transcriptomics, and the principal modelling design were driven by Melkonian and the rest of the team.

Tools

COBRA Toolbox · Python · Multi-omics · FBA · Community simulation

Problem

Plastic pollution outpaces our catalog of enzymes that can degrade it. Wet-lab discovery is slow; we needed a way to mine global metagenomic data for novel plastic-degrading enzyme candidates.

Approach

The team (led by first author Jure Zrimec, Chalmers) built a deep-learning HMM classifier for plastic-degrading enzyme potential, applied it across global ocean and soil metagenomes, and correlated predicted enzyme abundance with regional pollution levels.

My contribution

Co-author. My role was the upstream metagenomic analysis: short-read processing through MAGs, contigs, and abundance estimates that the lead authors then used as input for the HMM-based plastic-degrading enzyme screen.

Tools

Python · Deep learning · Metagenomics · Statistical modeling

Problem

SynComs for agricultural inoculants and environmental restoration fail unpredictably. A core reason: we don't know which strains are obligately dependent on cross-fed metabolites from their neighbors.

Approach

Built a framework combining genome-scale metabolic modelling, metagenomics, and ecological co-occurrence analysis to flag obligate cross-feeding relationships across soil microbiome datasets.

My contribution

Co-first author. Co-led the metabolic modelling and integration with the metagenomics analysis. The most directly agtech-relevant paper in the portfolio.

Tools

metaGEM · Python · Constraint-based modeling · Pangenomics · Statistical inference

Attention

Preprint · live altmetric attention

Problem

OD600 is a universal microbiology readout but conversion to absolute cell count varies wildly between labs and instruments. This is a chronic source of irreproducibility in synthetic biology.

Approach

An iGEM-led consortium coordinated 244 laboratories to test three simple, low-cost OD calibration protocols on eight strains of constitutive GFP-expressing E. coli, establishing a community standard.

My contribution

Co-author. I participated as one of the 244 contributing labs (as a member of the Chalmers iGEM 2018 team), running the wet-lab cultures and OD/flow-cytometry measurements that fed into the consortium dataset.

Tools

Statistical analysis · Wet-lab calibration · Multi-lab QC

Attention

Highly cited reference paper

Problem

Antimicrobial drug failure in real microbiomes can't be explained by single-strain resistance alone. Why do communities tolerate drugs better than monocultures?

Approach

The Patil & Ralser teams analysed auxotroph distributions across 12,000+ Earth Microbiome Project communities, then built self-establishing metabolically cooperating yeast communities (SeMeCos) and ran wet-lab drug screens to test the mechanism.

My contribution

Co-author. My specific role was the auxotrophy statistical reanalysis at the start of the paper: re-examining auxotroph distributions across the Earth Microbiome Project, and layering Lisa Maier's drug-on-bug screen data through an auxotroph-vs-prototroph lens to test whether auxotrophic state correlated with drug tolerance.

Tools

Statistical reanalysis · Earth Microbiome Project · Auxotroph annotation · Multi-omics integration

Problem

Enterobacteriaceae bloom in the gut signals dysbiosis and pathogen risk, but the global ecology of which commensal species protect against it was unknown at scale.

Approach

The team re-analysed 12,238 public gut metagenomes spanning 45 countries to identify co-colonisers and co-excluders of Enterobacteriaceae, surfacing a genus-wide colonisation-resistance signal in Faecalibacterium.

My contribution

Co-author. I served as a subject-matter-expert contributor on the metabolic and metagenomic-analysis design, providing guidance during the analysis.

Tools

Metagenomics · Statistical ecology · Co-occurrence analysis

Problem

iGEM teams generate mountains of fluorescence data each year. Without consistent calibration, year-over-year results aren't directly comparable.

Approach

The iGEM consortium compared three multi-lab studies measuring fluorescence from identical engineered constructs across years, identifying calibrant preparation as the key driver of inter-year variance.

My contribution

Co-author. As a member of the Chalmers iGEM 2018 team, I contributed wet-lab fluorescence data from our setup; the cross-year analysis and write-up were led by the Beal-coordinated consortium.

Tools

Plate readers · Flow cytometry · Statistical analysis

Problem

Genome-scale metabolic models published in literature often fail to reproduce: only ~40% in BioModels could be re-run without intervention. The community needed a standard.

Approach

The FROG consortium defined and tooled FROG (Flux variability, Reaction deletion, Objective function, Gene deletion) as a reproducibility checkpoint. BioModels integrated it into their submission workflow.

My contribution

Co-author on the consortium paper. My role was contributing model validation results and tool-integration feedback from the metaGEM side; the FROG specification itself was driven by the broader BioModels community.

Tools

SBML · COBRA · BioModels · FBA / FVA

Attention

Preprint · live altmetric attention

Problem

Hansenula polymorpha (Ogataea polymorpha) is an industrially relevant methylotrophic yeast, but new groups working on it lacked a clear protocol for building a usable GEM.

Approach

Wrote a step-by-step protocol chapter using RAVEN (MATLAB-based reconstruction toolbox) with a homology-based approach. Released the resulting hanpo-GEM publicly.

Contribution

First author of the protocol chapter; built the draft hanpo-GEM as the worked example.

Tools

RAVEN · MATLAB · SBML · Homology-based reconstruction

Attention

Methods chapter · attention from industrial-yeast community

Problem

Recurrent C. difficile infection responds to fecal transplant, but the active ingredients of "colonization resistance" are unclear. Which metabolic interactions actually keep the pathogen out?

Approach

The team (Patil Lab, Cambridge) built a synthetic 14-commensal community and probed it against C. difficile invasion to dissect emergent metabolic interactions and their role in colonisation resistance.

My contribution

Co-author. I reconstructed and simulated GEMs from context-specific genomes, compared metabolomics profiles between suppressive and non-suppressive samples, and identified genomic signatures consistent with the observed metabolomic differences.

Tools

Synthetic communities · Constraint-based modeling · Metabolomics

Attention

Preprint · live altmetric attention

Scope

This thesis sits at the intersection of microbial ecology, metagenomics, and metabolic modelling, outlining the development and applications of omics-driven metabolic-modelling approaches to understand microbial community metabolism in diverse ecological contexts.

Methodological backbone

I developed metaGEM, a workflow for reconstructing context-specific genome-scale metabolic models (GEMs) from metagenome-assembled genomes (MAGs) and predicting nutritional dependencies directly from shotgun metagenomes. I applied it across five biomes (synthetic lab cultures, human gut, plant-associated, bulk soil, and ocean metagenomes), reconstructing over 14,000 MAGs and corresponding GEMs.

Three applied case studies

  • Auxotrophy distributions across microbiomes: high amino-acid auxotroph frequency, higher in host-associated than free-living samples, with evidence that auxotrophy correlates with drug tolerance. A second collaboration showed that purely genome-based annotations over-estimate auxotrophies while metagenomic models under-estimate them, with insertion-sequence enrichment in auxotrophic genomes hinting at a gene-loss mechanism.
  • Cheese-flavour formation (industrial collaboration): metabolic models from fermentation-culture genomes, simulated under varying media, revealed amino-acid auxotrophies in three community members rescued by a fourth, and strain-specific contributions to flavour.
  • C. difficile invasion resistance: GEMs from a defined 14-member community, plus metabolomics and genomic-signature analysis, to dissect why some sub-communities suppress the pathogen.

gutDBX

A separate methodological chapter introduces gutDBX: a metagenomics-driven database of 141,556 non-dereplicated microbial sequences associated with xenobiotic and pharmacologically-relevant metabolism in human and mouse gut, validated against the MetaCardis cohort. Hits correlated with serum cholesterol, Shannon diversity, and total prescribed-drug count.

Outlook

The closing chapter discusses future directions for metaGEM, including global plastic-degrading-potential surveys, long-read sequencing of understudied environments, and paleometagenomics.

Tools

metaGEM · CarveMe · SMETANA · FBA / FVA · eggNOG / KEGG · Multi-omics

Flagship Product

metaGEM

Reconstruct genome-scale metabolic models directly from metagenomes, at scale.

What it does

metaGEM is an end-to-end Snakemake workflow that turns raw metagenomic reads into community-level metabolic predictions. Used by labs worldwide for environmental genomics, synthetic ecology, and industrial fermentation research.

Pipeline stages: QC (fastp) → Assembly (MEGAHIT) → Binning (CONCOCT, MaxBin2, MetaBAT2) → MAG consolidation (metaWRAP) → Taxonomy (GTDB-Tk) → Metabolic reconstruction (CarveMe) → Model QC (MEMOTE) → Community simulation (SMETANA).
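For readers who prefer to see the stage order as data, the listing above can be sketched in Python. This is illustrative only: the stage and tool names are copied from the list, while the `Stage` type, the `PIPELINE` constant, and the `listed_before` helper are hypothetical names and not part of metaGEM's code or API. The sketch also treats the arrows as a simple linear order, which is a simplification of how a Snakemake workflow resolves dependencies.

```python
from typing import NamedTuple


class Stage(NamedTuple):
    """One pipeline stage and the tool(s) named for it in the list above."""
    name: str
    tools: tuple[str, ...]


# Hypothetical constant: stage/tool pairs copied from the stage listing,
# in the order the arrows give them. Not metaGEM's internal representation.
PIPELINE = (
    Stage("QC", ("fastp",)),
    Stage("Assembly", ("MEGAHIT",)),
    Stage("Binning", ("CONCOCT", "MaxBin2", "MetaBAT2")),
    Stage("MAG consolidation", ("metaWRAP",)),
    Stage("Taxonomy", ("GTDB-Tk",)),
    Stage("Metabolic reconstruction", ("CarveMe",)),
    Stage("Model QC", ("MEMOTE",)),
    Stage("Community simulation", ("SMETANA",)),
)


def listed_before(stage_name: str) -> list[str]:
    """Return the names of the stages listed before `stage_name`."""
    names = [stage.name for stage in PIPELINE]
    if stage_name not in names:
        raise ValueError(f"unknown stage: {stage_name}")
    return names[: names.index(stage_name)]


if __name__ == "__main__":
    # Print a numbered overview of the stages and their tools.
    for step, stage in enumerate(PIPELINE, start=1):
        print(f"{step}. {stage.name}: {', '.join(stage.tools)}")
    # Stages that appear earlier in the list than metabolic reconstruction.
    print(listed_before("Metabolic reconstruction"))
```

In the actual workflow, ordering comes from Snakemake rules and their input/output files; see the repository for the real rule definitions.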

From an industry standpoint, metaGEM demonstrates two engineering capabilities that translate directly to commercial bioinformatics work: shipping a production-quality pipeline that researchers outside the original group can actually run, and sustaining it through real-world user load across multiple years and release cycles.

Quick start

mamba create -n metagem -c bioconda metagem

Full setup, tutorials, and HPC profiles at github.com/franciscozorrilla/metaGEM.

Applications in the wild

Selected published work where independent groups ran the metaGEM pipeline as a core part of their analysis (not just cited it). Each entry below has been verified against the published methods section.

2022 · Environmental Microbiome
Chiciudean et al. ran metaGEM through assembly, binning, GEM reconstruction (CarveMe) and SMETANA simulation to map cooperation and competition in a sulfidic Romanian cave analogous to deep-sea hydrothermal vents.
2021 · F1000Research
Werbin et al. adopted metaGEM as the backbone pipeline for processing the National Ecological Observatory Network's continental-scale soil metagenome dataset.
2026 · Cell Host & Microbe
Applied metaGEM end-to-end to show that protist predation shifts soil bacteria from competitive to cooperative metabolic regimes.
2024 · Communications Biology
Used metaGEM as the binning workflow ("metagenomic reads were binned into MAGs using the metaGEM pipeline workflow") in a domestication study taking a phenolic-metabolism consortium from lab to industrial scale.
2025 · bioRxiv (preprint)
828 MAGs across 47 US soil cores; "metagenome construction processes, including quality-control, contig assembly and binning, were performed using a suite of tools available in the MetaGEM pipeline."

Selected from 125+ works citing metaGEM. Browse the full citation list on Semantic Scholar or Europe PMC.

BibTeX citation
@article{zorrilla2021metagem,
  title={metaGEM: reconstruction of genome scale metabolic models directly from metagenomes},
  author={Zorrilla, Francisco and Buric, Filip and Patil, Kiran R and Zelezniak, Aleksej},
  journal={Nucleic Acids Research},
  volume={49}, number={21}, pages={e126},
  year={2021}, doi={10.1093/nar/gkab815}
}
Talks · Teaching · Mentorship

Talks & Teaching

50+ international researchers trained across 7 countries. All teaching materials open-source.

Activity

Date | Title | Venue | Role
2026 · Jan | NCCR Microbiomes Winter School | UNIL · Lausanne | instructor
2025 · Fall | Microbial Community Genomics (551-1119-00L) · Sunagawa Lab Block Course | ETH Zürich | co-tutor
2025 | Master's thesis supervision | University of Oxford | supervisor
2024 · Oct | Metabolite and species dynamics in microbial communities | EMBO Practical Course, Bangalore | co-instructor
2024 · Aug | Metabolic modelling for microbial ecology | 9th COBRA Conference, San Diego | poster
2024 · Jan | Flux balance analysis and metabolic modelling | Systems Biology Undergrad Course, Cambridge | instructor
2022 · Oct | Metabolic modelling of community interactions | EMBO Micro Com Course (virtual) | trainer
2022 · Oct | Metagenomics-driven metabolic modeling for microbial ecology | EMBO EvoEco Conference, Heidelberg | poster + flash
2022 · Sep | Metagenomics-driven metabolic modeling for microbial ecology | 8th COBRA Conference, Galway | selected talk
2022 · Jun | Applications of genome scale metabolic models | Summer School in Metabolic Modeling (virtual) | invited talk
2022 · Feb | From metagenomics to metabolic interactions | SymbNET 2022 Course (EBI virtual) | trainer
2021 · Mar | metaGEM: reconstruction of genome scale metabolic models directly from metagenomes | 7th COBRA Conference (virtual) | poster
2018 · Oct | iGEM 2018 · Gold Medal & Best Graduate Modeling nominee | iGEM Foundation, Boston | team member

Open-source teaching materials

Every course taught ships with a public repo so materials outlive the course.

🧬 Microbial Community Genomics (551-1119-00L)
2025 · ETH Zürich · Fall block course
GEMs, community modelling, AI-based annotation. Co-tutor with Sunagawa, Miravet Verde, Sperfeld.
🇮🇳 EMBOMicroCom2
2024 · EMBO Practical, Bangalore · 4 ★
Metabolic modelling tutorial covering FBA & GEMs.
🇬🇧 systems-biology-fba-practical
2024 · Cambridge · 8 ★
Flux balance analysis practical exercises for systems biology undergraduates.
🌍 EMBOMicroCom
2022 · EMBO Practical (virtual) · 13 ★
FBA, GEMs, microbial ecology tutorial.
🇪🇺 SymbNET
2022 · EBI virtual · 14 ★
End-to-end metagenomes to community metabolic models walkthrough.
🦠 unseenbio_metaGEM
2020 · Public demo
Run-through of metaGEM on the Unseen Bio dataset.
May 2026

Now

What I'm working on and what comes next.

NCCR Microbiomes flagship project

Scalable SynCom design across diverse microbiomes

Co-leading the NCCR Microbiomes Work Package 5 Flagship Project at ETH Zürich in the Sunagawa Lab: scalable functional annotation pipelines and SynCom design across diverse microbiome types including soil, plant, gut, marine, and bee. A multi-center, 4-year effort with WP5 partners across ETH Zürich, EPFL, UZH, UNIL, and CHUV (see the NCCR research overview for the full consortium structure). Goal: establish mechanistic rules for designing synthetic microbial communities that survive field deployment without loss of critical strains.

My team is building the computational infrastructure: constraint-based metabolic models, pangenomics, and machine learning to identify trophic dependencies and niche-filling strains across the diverse microbiome contexts the consortium covers.

Looking forward · industry pivot

Availability and location preferences are covered in the Overview. Ideal next-step projects:

  • Agricultural biotechnology: soil microbiomes, SynCom design, crop resilience.
  • Microbial biotech: strain engineering, fermentation optimization, bioremediation.
  • AI for biology: structural prediction, enzyme discovery, systems design.

The academic-to-industry transition is, for me, a natural extension of work I'm already doing: I've maintained metaGEM through five major releases, mentored 50+ trainees, and contributed to commercial partnerships, including one with Chr. Hansen.

Side interests

  • Open-source sustainability: what keeps scientific software tools alive in the wild, and what kills them.
  • Long-distance running: half-marathons and trail routes around Zürich.

Want to grab coffee?

Happy to chat about computational biology, open-source tools, or just meet someone new in biotech. Email is fastest.