Équipe CSTB - Systèmes Complexes, Bioinformatique Translationnelle

Carlos Bermejo Das Neves



THESIS TITLE: From genetics of rare diseases to personalized medicine

THESIS SUPERVISORS: Olivier POCH, Julie THOMPSON (co-supervision)

THESIS HOST TEAM: Bioinformatique théorique, Fouille de données et Optimisation stochastique, 100%

KEYWORDS: bioinformatics, molecular networks, systems biology, myopathies


With recent advances in the sequencing of human genomes and exomes, the genetics of Rare Diseases (RD) has entered a new era. Approximately 7000–8000 different conditions are now categorized, and a good level of knowledge is available for as many as 1200 of them. These technologies have enabled us to identify RD genes, allowing early diagnosis and providing a basis for understanding gene function and developing novel approaches to therapy. Understanding the causes of RD has provided fundamental insights into basic biological processes and into the causes of common, multifactorial disorders such as cancer, heart disease, diabetes, or susceptibility to infectious diseases [2]. Our ability to explain the environmental, genetic and other biological sources of human variation will lead to more accurate disease diagnosis and more efficient drug development, and will open the way to personalized health treatment strategies.

The work will be performed in the context of our collaboration with the Association Française contre les Myopathies (AFM), which has developed internationally recognized expertise in the pathophysiology, molecular mechanisms and therapeutic approaches of several neuromuscular diseases. Recently, next-generation sequencing (NGS) and comparative genomic hybridization (CGH) arrays have provided thousands of genetic variants (single nucleotide changes, insertions, deletions…), of which only a few are clearly causal for disease. These data are complemented by detailed phenotypic descriptions and functional genomics (transcriptomic and proteomic) data. Predicting phenotype and/or clinical progression from this information remains a challenge: individuals carrying identical alleles at a given locus rarely show identical phenotypes, owing to incomplete penetrance, a genetic background in which multiple genes modify the phenotype with minor effects, environmental and lifestyle factors, etc.

The project will build on the team's previous work on the development of an integrated and intelligent infrastructure (SM2PH-Central), in the context of AFM’s Décrypthon programme (decrypthon.igbmc.fr/sm2ph), for the analysis of the relationships between human genotypes and disease phenotypes. The objective is to integrate information from the different ’omics’ platforms at the individual, family and population levels, in order to better understand the relationships between certain anomalies, their underlying genetic and functional causes and their evolution.

For several years, AFM has built up a cohort of patients who have been followed using systematic clinical protocols to specify their detailed phenotypes, and for whom exomes (from selected families), biobanking facilities, physiopathological data and family histories are accessible. This constitutes a unique resource for developing and applying original data mining techniques to extract common features, as well as rare patterns or events, at multiple levels, including genetic mutations, proteins, cellular networks, organs/tissues and phenotypes. Specific goals are:

  • Extension of SM2PH, currently centred on SNPs, to micro-insertions/deletions and large-scale rearrangements, which are an important source of biochemical, metabolic and phenotypic variations. Some of the genomic variations are known to cause RD; others are important factors in susceptibility to common diseases.
  • Integration and mapping of the detailed genotypic variation with existing knowledge represented in molecular networks, metabolic and signaling pathways, literature, etc.

  • Appropriate quality control, standardization and statistical treatment of data.
  • Identification of specific patterns or profiles in the information networks that are correlated with patient-related phenotypes or affected/healthy


In addition to improving our fundamental system-level understanding of the complex, dynamic relationships between genotype and phenotype in human disease, this work will contribute to better diagnosis, prognosis and patient care.


Bermejo-Das-Neves C, Nguyen HN, Poch O, Thompson JD. A comprehensive study of small non-frameshift insertions/deletions in proteins and prediction of their phenotypic effects by a machine learning method (KD4i). BMC Bioinformatics. 2014;15:111.