Pharmaceutical Discovery
Questions from pharma R&D, natural product discovery, and BD teams - answered plainly.
IsoGentiX is building a specimen-level, Nagoya Protocol-compliant, cryptographically auditable multi-omics database of Madagascar's endemic vascular flora. Fewer than 10 of Madagascar's 12,000 endemic vascular plant species have chromosome-level genome assemblies in any public database. For pharmaceutical R&D, that means access to novel natural product chemistry - alkaloids, terpenoids, bioactive peptides - from a flora that has never been systematically screened.
Each dataset is delivered with the complete provenance documentation required for commercial development under the Nagoya Protocol and EU Regulation 511/2014: Prior Informed Consent from Madagascar's national authority, Mutually Agreed Terms, and a blockchain-anchored Internationally Recognised Certificate of Compliance (IRCC). The science and the legal infrastructure are both designed to be commercially deployable from day one.
Catharanthus roseus - the rosy periwinkle, endemic to Madagascar - produced vinblastine and vincristine, two chemotherapy drugs with combined peak annual revenues exceeding $1.4 billion. That discovery came from one species, screened opportunistically with 1950s phytochemical methods. Less than 0.08% of Madagascar's endemic flora has been examined to comparable depth.
IsoGentiX is applying modern 8-layer multi-omics methods systematically to the remaining 99.92%. The same evolutionary conditions that produced Catharanthus chemistry - 160 million years of island isolation, extreme diversity of biomes, long divergence from continental flora - apply across the entire endemic flora. The Catharanthus case is not exceptional. It is what the data looks like when you look.
Each IsoGentiX specimen dataset includes 8 integrated data layers, all linked to the specimen GUID:
- Genome - whole-genome sequencing to EBP reference-grade standards (Merqury QV ≥40, BUSCO ≥90%)
- Transcriptome - RNA-seq capturing active gene expression at the time of collection
- Metabolome - LC-MS/MS targeted and untargeted metabolomics, plus NIR metabolite fingerprinting
- Proteome - protein expression profiling bridging transcriptional and metabolic data
- Epigenome - methylation profiling capturing epigenetic regulation of biosynthetic gene clusters
- Microbiome - rhizosphere and endophyte community profiling
- Soil XRF - elemental chemistry of the substrate at the collection site (accuracy ±2% vs ISO 17294)
- Habitat data - GPS coordinates, altitude, phenological state, aspect, and voucher photography
All layers are delivered with provenance metadata compliant with EBP and Darwin Core standards. Data formats: FASTA/FASTQ (genome/transcriptome), GFF3 (annotations), mzML (metabolomics spectra), JSON-LD (metadata).
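As a rough illustration of how a bioinformatics team might take stock of a delivery, the sketch below groups file paths by the formats named above. The file extensions, the handling of a trailing .gz, and the function name group_by_format are assumptions made for this example; they are not a description of IsoGentiX's actual delivery layout.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Formats are the ones named in the text; the extensions are assumptions.
FORMAT_BY_EXTENSION = {
    ".fasta": "FASTA",
    ".fa": "FASTA",
    ".fastq": "FASTQ",
    ".fq": "FASTQ",
    ".gff3": "GFF3",
    ".mzml": "mzML",
    ".jsonld": "JSON-LD",
}


def group_by_format(paths):
    """Group file paths by data format, ignoring a trailing .gz suffix."""
    groups = defaultdict(list)
    for path in paths:
        suffixes = [s.lower() for s in PurePosixPath(path).suffixes]
        if suffixes and suffixes[-1] == ".gz":
            suffixes = suffixes[:-1]  # look through compression
        ext = suffixes[-1] if suffixes else ""
        groups[FORMAT_BY_EXTENSION.get(ext, "unknown")].append(path)
    return dict(groups)
```

Anything that lands under "unknown" is a file the format list does not account for and is worth querying before analysis starts.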
IsoGentiX enforces Earth BioGenome Project (EBP) reference-grade standards throughout:
- Merqury QV score ≥40 (fewer than 1 error per 10,000 bases)
- BUSCO completeness ≥90% against the relevant plant lineage database
- Sequencing depth ≥40× for short reads; N50 scaffold length ≥10 Mb for long reads
- Metabolomics coverage ≥95% of known metabolite classes for each plant family
- Soil XRF elemental accuracy within ±2% of certified reference standards (ISO 17294)
Every dataset undergoes automated QC pipelines followed by CSO sign-off before release. Quality certificates accompany every data delivery. Datasets that do not meet standards are resequenced rather than released below threshold.
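For teams that want to re-check genome metrics on their side, a threshold gate along these lines is one way to do it. This is a minimal sketch, not IsoGentiX's pipeline: it assumes each listed threshold is a minimum, and the GenomeQC class, its field names, and the example values are hypothetical. The QV conversion is the standard Phred relationship, under which QV 40 corresponds to 1 error per 10,000 bases.

```python
from dataclasses import dataclass

# Thresholds from the standards listed above, read as minimums (an assumption).
MIN_QV = 40.0
MIN_BUSCO_PCT = 90.0
MIN_DEPTH_X = 40.0
MIN_SCAFFOLD_N50_BP = 10_000_000  # 10 Mb


@dataclass
class GenomeQC:
    """Hypothetical per-specimen genome metrics; field names are illustrative."""
    merqury_qv: float
    busco_complete_pct: float
    short_read_depth_x: float
    scaffold_n50_bp: int


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled QV to a per-base error probability."""
    return 10 ** (-qv / 10)


def failed_checks(qc: GenomeQC) -> list[str]:
    """Return the names of metrics below threshold (empty list if all pass)."""
    checks = [
        ("merqury_qv", qc.merqury_qv, MIN_QV),
        ("busco_complete_pct", qc.busco_complete_pct, MIN_BUSCO_PCT),
        ("short_read_depth_x", qc.short_read_depth_x, MIN_DEPTH_X),
        ("scaffold_n50_bp", qc.scaffold_n50_bp, MIN_SCAFFOLD_N50_BP),
    ]
    return [name for name, value, minimum in checks if value < minimum]


# Illustrative values only, not real IsoGentiX results.
example = GenomeQC(merqury_qv=38.2, busco_complete_pct=94.1,
                   short_read_depth_x=52.0, scaffold_n50_bp=14_200_000)
print(failed_checks(example))  # ['merqury_qv']
```

Returning the list of failing metrics, rather than a single pass/fail flag, makes it clear which measurement would trigger resequencing.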
IsoGentiX prioritises families with established alkaloid-producing capacity where the Malagasy endemic members remain largely uncharacterised:
- Apocynaceae - the family of Catharanthus roseus. Madagascar has 200+ endemic Apocynaceae species. The alkaloid profiles of fewer than 15 have been systematically screened. This family is the highest-priority pharmaceutical target in the programme.
- Rubiaceae - the family of quinine. Madagascar's endemic Rubiaceae are poorly characterised; the alkaloid chemistry of most species is unknown.
- Loganiaceae - produces strychnine and related alkaloids. Malagasy endemic members have not been screened at metabolome depth.
- Menispermaceae - multiple bioactive alkaloid classes; most Malagasy species uncharacterised.
Priority within families is determined by phylogenetic proximity to characterised alkaloid producers, degree of geographic isolation (tsingy and high-endemism zones first), and intraspecies chemical diversity signals from preliminary NIR screening.
Domain licensing gives a single licensee exclusive access to the IsoGentiX multi-omics dataset from a defined domain - geographic (e.g. a specific biome or collection zone) or taxonomic (e.g. a plant family or genus) - before the data has been screened for specific commercial targets.
A Founding Partner acquiring a domain licence receives: all multi-omics data generated from specimens in that domain, as it becomes available; exclusive rights to any discoveries made using that data; Nagoya-compliant provenance documentation for downstream development; and priority access to additional specimens as the collection programme expands.
Domains are allocated on a first-come basis. Once a domain is licensed, IsoGentiX does not offer the same data to other commercial partners. This creates a genuine first-mover advantage: the licensee has exclusive access to a defined chemical space that no other party can access through IsoGentiX.
Yes. IsoGentiX offers a structured data preview process for qualified pharmaceutical partners: a curated sample dataset from a defined taxonomic group, delivered under a non-disclosure agreement, sufficient to evaluate data quality, format, and scientific value before committing to a domain licence.
Preview datasets include representative specimens from the target family, with all 8 data layers, quality certificates, and example provenance documentation. The preview process is designed to allow your bioinformatics and regulatory teams to assess the data independently before any commercial decision is made.
Contact us via the contact page to initiate a data preview conversation. We will ask for a brief description of your discovery programme focus to ensure the preview dataset is appropriately targeted.
Domain licensing is a first-come, first-served allocation. Domains that have been licensed are not available to subsequent partners. The window to acquire exclusive access to high-priority domains - particularly the Apocynaceae and the tsingy karst chemistry - is a function of when other pharmaceutical partners engage, not of when the data collection is complete.
Additionally, Madagascar's endemic flora faces ongoing habitat pressure. Collection capacity in some high-priority zones is time-limited by access conditions and conservation status. Species collected now, under current permit frameworks and Free, Prior and Informed Consent (FPIC) agreements, generate datasets that may not be collectable under the same conditions in future years.
Ready to discuss pharmaceutical discovery access?
Contact us to request a data preview, discuss domain licensing terms, or arrange a scientific briefing with our CSO.