New paper submission
We submitted a new manuscript to Cladistics
We are very pleased to announce that our manuscript entitled “Machine learning models accurately predict clades of proteocephalidean tapeworms (Onchoproteocephalidea) based on host and biogeographical data” was submitted for publication in Cladistics. Wish us good luck with the reviews!
The collaboration
This is the first of two manuscripts that our lab is preparing in collaboration with Dr. Philippe Vieira Alves.
Dr. Alves received a Research Internship Abroad (BEPE) Award from the São Paulo Research Foundation (FAPESP; Process No. 2023/00714-5). This award funded Dr. Alves’ research in collaboration with the Phyloinformatics Lab from July 1, 2023, to June 30, 2024. The funded research project was entitled “Mitogenome organization and diversity of proteocephalid tapeworms (Cestoda) unveiled by genome skimming.”
Here is our complete list of eight authors from five institutions in four countries:
- Philippe Vieira Alves (UNESP, Brazil)
- Reinaldo José da Silva (UNESP, Brazil)
- Tomáš Scholz (Biology Centre of the Czech Academy of Sciences, Czech Republic)
- Alain de Chambrier (Natural History Museum, Switzerland)
- José Luis Luque (UFRRJ, Brazil)
- Anastasiia Duchenko (UNC Charlotte, USA)
- Daniel Janies (UNC Charlotte, USA)
- Denis Jacob Machado (UNC Charlotte, USA)
Shout out to Anastasiia Duchenko, one of our co-authors, who is a first-semester student in our Ph.D. program in Bioinformatics and Computational Biology.
What the manuscript is about
In this study, we reviewed the phylogenetic relationships of proteocephalid tapeworms (Cestoda) by analyzing hundreds of publicly available and newly generated sequences from the nuclear 28S rRNA and mitochondrial MT-CO1 genes. These sequences were combined for a comprehensive analysis of 537 terminals, providing a much-needed update on the phylogenetic relationships within this group.
While we found that optimizing biogeographical or host data on our phylogenetic tree was not an effective strategy for defining clades, we explored alternative ways in which these data might correlate with the group’s estimated evolutionary history. To this end, we employed machine learning to evaluate the predictive power of these data in classifying terminals into selected clades. Our manuscript presents the results of this analysis, offering insights into why biogeographical and host information is correlated with our phylogenetic tree even though these data cannot be easily optimized.
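For readers unfamiliar with this kind of classification, here is a minimal sketch of the general idea. It is not the models or data used in the manuscript. It assumes a toy categorical naive Bayes classifier written in plain Python, and every name in it (the records in `TRAIN`, the labels such as `host_A`, `region_1`, and `clade_X`, and the functions `fit` and `predict`) is a hypothetical placeholder invented for illustration.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy records: (host group, biogeographic region) -> clade label.
# All values are invented placeholders, NOT data from the manuscript.
TRAIN = [
    (("host_A", "region_1"), "clade_X"),
    (("host_A", "region_1"), "clade_X"),
    (("host_A", "region_2"), "clade_X"),
    (("host_B", "region_2"), "clade_Y"),
    (("host_B", "region_2"), "clade_Y"),
    (("host_B", "region_1"), "clade_Y"),
]


def fit(records):
    """Count clade frequencies and per-feature value frequencies within each clade."""
    clade_counts = Counter(label for _, label in records)
    # value_counts[clade][feature_index][value] -> number of training records
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # feature_index -> set of values seen in training
    for features, label in records:
        for i, value in enumerate(features):
            value_counts[label][i][value] += 1
            vocab[i].add(value)
    return clade_counts, value_counts, vocab


def predict(model, features, alpha=1.0):
    """Return the clade with the highest Laplace-smoothed log posterior."""
    clade_counts, value_counts, vocab = model
    total = sum(clade_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in clade_counts.items():
        score = math.log(n / total)  # log prior of the clade
        for i, value in enumerate(features):
            seen = value_counts[label][i][value]
            # Smoothing keeps unseen host/region values from zeroing the score.
            score += math.log((seen + alpha) / (n + alpha * len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = fit(TRAIN)
prediction_a = predict(model, ("host_A", "region_2"))
prediction_b = predict(model, ("host_B", "region_1"))
```

In this toy setup, the host attribute separates the two placeholder clades better than the region does, so a terminal is assigned to the clade that matches its host even when its region is more typical of the other clade. A real analysis would also need held-out terminals to estimate predictive accuracy.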
Novel applications of AI in phylogenetics
To the best of our knowledge, this is the first study to apply supervised machine learning to test whether host and biogeographical attributes of non-model parasitic organisms can accurately predict clade affiliation within a family-level framework. Consequently, we believe this manuscript will be of particular interest to taxonomists studying parasitic flatworms, aligning well with the scope of Cladistics.
Learn more
We do not have a preprint and will advertise the paper after its publication. However, we have already communicated our main results to the community at the 2024 meeting of the American Society of Parasitologists (ASP).
Our talk was entitled “Machine learning models accurately predict clades of proteocephalidean tapeworms (Onchoproteocephalidea) based on host and zoogeographical data” and was authored by Philippe Vieira Alves, Reinaldo J. da Silva, Alain de Chambrier, José L. Luque, Anastasiia Duchenko, Daniel Janies, and Denis Jacob Machado. The presentation for the 99th Annual Meeting of the ASP is available on Zenodo.