Research
Participation in research projects
XAI and Human-XAI interaction (postdoctoral work, current)
Discovery of molecules with optimization and machine learning (PhD work, 2019-2022)
The purpose of this research project is to develop methods for the automatic generation of molecules that satisfy desired properties, with a focus on the chemistry of organic molecular materials. I proposed an evolutionary algorithm named EvoMol, which is designed to be interpretable and generic so that it can be used in various subdomains of chemistry [J3]. As evolutionary algorithms tend to generate unrealistic molecules, I also worked on a filter-based approach which favors the generation of realistic molecules [J5].
Most properties of interest in the field of organic molecular materials depend on costly quantum chemistry computations (DFT calculations). This motivates the use of machine learning algorithms as fast estimators of these properties. I worked on machine learning methods that predict the DFT-optimized geometry of molecules, which is closely related to the target electronic properties [C1, B1]. I also worked with a postdoctoral researcher to measure the importance of chemical diversity in the training datasets of machine learning models of molecular properties. We showed that models trained on a commonly used synthetic dataset suffer from a lack of diversity [J1]. We further proposed an efficient method based on EvoMol to maximize various measures of chemical diversity, which we used to obtain a large and diverse dataset of molecules [J4].
I also proposed combining an optimization method with a machine learning model, in the form of a surrogate-based black-box optimization method. The surrogate function is a machine learning model that estimates the values of a costly molecular property and is used to select solutions in the search space. I showed that this approach is more efficient than an evolutionary search for the optimization of a costly electronic property [C2]. Finally, the use of ML models for molecular chemistry raises questions about their interpretability. I proposed an approach based on EvoMol to generate counterfactual explanations for any binary classification model of molecules [W1].
I also published a review of the state of the art in de novo molecular generation [B2].
Dereplication in plant-based chemistry (2019)
I worked with a group of scientists in plant-based chemistry during my MSc studies. The aim was to improve a pre-existing tool that uses NMR spectra to identify compounds in a mixture. I refactored the source code, formalized the matching algorithm, and improved its efficiency. The tool was later published [J2].
Publications
Publications in journals with peer-review
[J5]\ 📄 “Definition and Exploration of Realistic Chemical Spaces Using the Connectivity and Cyclic Features of ChEMBL and ZINC”\ 📖 Digital Discovery (Q1 in chemistry)\ 👥 Thomas Cauchy, Jules Leguy, and Benoit Da Mota\ 🌐 2023 DOi: 10.1039/D2DD00092J\ GitHub: BenoitDamota/gcf\
[J4]\ 📄 "Scalable estimator of the diversity for de novo molecular generation resulting in a more robust QM dataset (OD9) and a more efficient molecular optimization"\ 📖 Journal of Cheminformatics (Q1 in computer science applications)\ 👥 Jules Leguy, Marta Glavatskikh, Thomas Cauchy, and Benoit Da Mota. \ 🌐 Oct. 2021 DOi: 10.1186/s13321-021-00554-8\
[J3]\ 📄 “EvoMol: a flexible and interpretable evolutionary algorithm for unbiased de novo molecular generation”\ 📖 Journal of Cheminformatics (Q1 in computer science applications)\ 👥 Jules Leguy, Thomas Cauchy, Marta Glavatskikh, Béatrice Duval, and Benoit Da Mota.\ 🌐 Sept. 2020 DOi: 10.1186/s13321-020-00458-z\
[J2]\ 📄 “MixONat, a Software for the Dereplication of Mixtures Based on 13C NMR Spectroscopy”\ 📖 Analytical Chemistry (Q1 in analytical chemistry)\ 👥 Antoine Bruguière, Séverine Derbré, Joël Dietsch, Jules Leguy, Valentine Rahier, Quentin Pottier, Dimitri Bréard, Sorphon Suor‑Cherer, Guillaume Viault, Anne‑Marie Le Ray, Frédéric Saubion, and Pascal Richomme\ 🌐 July 2020 DOi: 10.1021/acs.analchem.0c00193\
[J1]\ 📄 “Dataset’s chemical diversity limits the generalizability of machine learning predictions”\ 📖 Journal of Cheminformatics (Q1 in computer science applications)\ 👥 Marta Glavatskikh, Jules Leguy, Gilles Hunault, Thomas Cauchy, and Benoit Da Mota\ 🌐 Dec. 2019 DOi: 10.1186/s13321-019-0391-2\
Publications in conferences with peer-review
[C2]\ 📄 “Surrogate‑Based Black‑Box Optimization Method for Costly Molecular Properties”\ 📖 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI) (Rank A2 Qualis/B ERA)\ 👥 Jules Leguy, Béatrice Duval, Benoit Da Mota, and Thomas Cauchy\ 🌐 Nov. 2021. DOi: 10.1109/ICTAI52525.2021.00124\
[C1]\ 📄 “Des réseaux de neurones pour prédire des distances interatomiques extraites d’une base de données ouverte de calculs en chimie quantique”\ 📖 Extraction et Gestion des connaissances, EGC 2019, Metz, France\ 👥 Jules Leguy, Thomas Cauchy, Béatrice Duval, and Benoit Da Mota\ 🌐 2019. URL : https://editions-rnti.fr/?inprocid=1002464\
Publications of book chapters with peer-review
[B2]\ 📄 "Goal‑directed generation of new molecules by AI methods" (Chapter 2)\ 📖 Computational and Data‑Driven Chemistry Using Artificial Intelligence\ 👥 Jules Leguy, Thomas Cauchy, Béatrice Duval, and Benoit Da Mota\ 🌐 Jan. 2022. DOi: 10.1016/B978-0-12-822249-2.00004-9\
[B1]\ 📄 "Predicting Interatomic Distances of Molecular Quantum Chemistry Calculations" (long version of [C1])\ 📖 Advances in Knowledge Discovery and Management: Volume 9. Studies in Computational Intelligence. Springer International Publishing \ 👥 Jules Leguy, Thomas Cauchy, Béatrice Duval, and Benoit Da Mota\ 🌐 2022 (Submitted 2019). DOi: 10.1007/978-3-030-90287-2_8\
Publications in conference workshops with light peer-review
[W1]\ 📄 “Génération d’explications contre‑factuelles pour la chimie moléculaire”\ 📖 Workshop EXPLAIN’AI hosted at EGC 2022, Blois, France\ 👥 Jules Leguy, Bryan Garreau, Thomas Cauchy, Benoit Da Mota, and Béatrice Duval\ 🌐 Jan. 2022. URL : https://univ-angers.hal.science/hal-04034150\
Posters
Preprints
[P1]\ 📄 "WebXAII: An Open-Source Web Framework to Study Human-XAI Interaction"\ 📖 arXiv:2506.14777 [cs.HC]\ 👥 Jules Leguy, Pierre-Antoine Jean, Felipe Torres Figueroa, and Sébastien Harispe\ 🌐 May 2025. DOi : 10.48550/arXiv.2506.14777\
PhD thesis
My PhD thesis manuscript, entitled "Recherche combinatoire guidée par apprentissage artificiel en chimie moléculaire" (2022), is available here.