There are fewer and fewer experts on the taxonomy of particular groups, particularly among early-career paleontologists and (paleo)biologists – this also includes ammonoid cephalopods. Techniques cannot replace this taxonomic expertise (Engel et al. 2021), but machine learning approaches can make taxonomy more efficient and reproducible, as well as make passing it on more sustainable. Initially, ammonoid taxonomy was a black box, with small differences sometimes deemed sufficient to erect different species and with rather idiosyncratic groupings of superficially similar specimens (see De Baets et al. 2015 for a review). In the meantime, scientists have embraced more quantitative assessments of conch shape and morphology more generally (see Klug et al. 2015 for a more recent review). These approaches still rely on important but time-intensive collection work and on working through daisy chains of more or less accessible papers and monographs, without really knowing how well they perform (other than by expert opinion). In addition, younger scientists are usually trained by more experienced scientists, but this practice is becoming more and more difficult, which makes it hard to close the taxonomic gap. This relates to the fact that fewer and fewer experienced researchers with this kind of expertise are employed, and that graduate students or postdocs choose different research or job avenues after their initial training, effectively leading to a leaky pipeline and a taxonomic impediment.
Robust taxonomy and stratigraphy are the basis for all other studies we do as paleontologists/paleobiologists, so Foxon (2021) takes a first step by using supervised and unsupervised machine-learning approaches and testing their efficiency on ammonoid conch properties. This pilot study demonstrates that machine learning approaches can be reasonably accurate (60–70%) in identifying ammonoid species (Foxon, 2021) – at least similar to other mollusk taxa (e.g., Klinkenbuß et al. 2020) – and might also assist in cases where more traditional methods are not feasible. Novel approaches might even allow the accuracy to be improved further, as has been demonstrated for other research objects such as pollen (Romero et al. 2020). Further application of machine learning approaches to larger datasets and additional morphological features (e.g., the suture line) is now necessary in order to test and improve the robustness of these approaches for ammonoids, as well as to test their performance more broadly within paleontology.
References
De Baets K, Bert D, Hoffmann R, Monnet C, Yacobucci M, and Klug C (2015). Ammonoid intraspecific variability. In: Ammonoid Paleobiology: From anatomy to ecology. Ed. by Klug C, Korn D, De Baets K, Kruta I, and Mapes R. Vol. 43. Topics in Geobiology. Dordrecht: Springer, pp. 359–426.
Engel MS, Ceríaco LMP, Daniel GM, Dellapé PM, Löbl I, Marinov M, Reis RE, Young MT, Dubois A, Agarwal I, Lehmann AP, Alvarado M, Alvarez N, Andreone F, Araujo-Vieira K, Ascher JS, Baêta D, Baldo D, Bandeira SA, Barden P, Barrasso DA, Bendifallah L, Bockmann FA, Böhme W, Borkent A, Brandão CRF, Busack SD, Bybee SM, Channing A, Chatzimanolis S, Christenhusz MJM, Crisci JV, D’elía G, Da Costa LM, Davis SR, De Lucena CAS, Deuve T, Fernandes Elizalde S, Faivovich J, Farooq H, Ferguson AW, Gippoliti S, Gonçalves FMP, Gonzalez VH, Greenbaum E, Hinojosa-Díaz IA, Ineich I, Jiang J, Kahono S, Kury AB, Lucinda PHF, Lynch JD, Malécot V, Marques MP, Marris JWM, Mckellar RC, Mendes LF, Nihei SS, Nishikawa K, Ohler A, Orrico VGD, Ota H, Paiva J, Parrinha D, Pauwels OSG, Pereyra MO, Pestana LB, Pinheiro PDP, Prendini L, Prokop J, Rasmussen C, Rödel MO, Rodrigues MT, Rodríguez SM, Salatnaya H, Sampaio Í, Sánchez-García A, Shebl MA, Santos BS, Solórzano-Kraemer MM, Sousa ACA, Stoev P, Teta P, Trape JF, Dos Santos CVD, Vasudevan K, Vink CJ, Vogel G, Wagner P, Wappler T, Ware JL, Wedmann S, and Zacharie CK (2021). The taxonomic impediment: a shortage of taxonomists, not the lack of technical approaches. Zoological Journal of the Linnean Society 193, 381–387. doi: 10.1093/zoolinnean/zlab072
Foxon F (2021). Ammonoid taxonomy with supervised and unsupervised machine learning algorithms. PaleorXiv ewkx9, ver. 3, peer-reviewed by PCI Paleo. doi: 10.31233/osf.io/ewkx9
Klinkenbuß D, Metz O, Reichert J, Hauffe T, Neubauer TA, Wesselingh FP, and Wilke T (2020). Performance of 3D morphological methods in the machine learning assisted classification of closely related fossil bivalve species of the genus Dreissena. Malacologia 63, 95. doi: 10.4002/040.063.0109
Klug C, Korn D, Landman NH, Tanabe K, De Baets K, and Naglik C (2015). Ammonoid conchs. In: Ammonoid Paleobiology: From anatomy to ecology. Ed. by Klug C, Korn D, De Baets K, Kruta I, and Mapes RH. Vol. 43. Dordrecht: Springer, pp. 3–24.
Romero IC, Kong S, Fowlkes CC, Jaramillo C, Urban MA, Oboh-Ikuenobe F, D’Apolito C, and Punyasena SW (2020). Improving the taxonomy of fossil pollen using convolutional neural networks and superresolution microscopy. Proceedings of the National Academy of Sciences 117, 28496–28505. doi: 10.1073/pnas.2007324117
The author improved the manuscript in a satisfactory way:
- an additional method has been included
- discussion has been extended to propose future developments and orientations for paleontologists
- previous minor comments have been considered.
I still think that more specimens, species, and morphological traits are needed to quantify the effectiveness of these methods. This being said, such developments in quantitative taxonomy and systematics are really needed in our field, so I recommend this paper, which is a very stimulating step.
Hereafter, a few very minor points:
Table 1. Maybe “property” is not the best word. Why not just “parameter” as in the legend, or “variable”?
p.2
“the” should precede “PBDB” when used as a name. Several occurrences throughout the text.
P.5
"becase" -> "because"
p.9
Future studies should seek to replicate these findings which (-> With?) richer data sources in both sample size and species count (->s).
DOI or URL of the preprint: https://doi.org/10.31233/osf.io/ewkx9
The manuscript documents an interesting pilot study applying established machine learning approaches to classify ammonoids based on standard conch properties. I feel it is an important way forward to standardize and test ammonoid taxonomy, but some minor yet crucial points need to be addressed before I can officially recommend this manuscript.
The main points:
Focus on conch parameters: It is ok to focus on conch parameters, as these are easier to obtain and to analyze in a biologically meaningful way, but a bit more discussion of why this is the case, as well as of how adding further parameters (e.g., suture line, ornamentation such as ribbing) might improve the statistical power to separate species, would be crucial. Many species are not defined by conch parameters alone, so it would be important to point out that you are working with only a subset of the characters used to define species, namely those that are more readily available in the literature and easier to analyze quantitatively (see also comments by reviewer 1).
Data: As pointed out in the manuscript, the dataset is limited to 11 species entered into the Paleobiology Database – the sample sizes of individual species are ok (>50 – could still be better – some authors have suggested having >100 specimens available when including multiple ontogenetic stages, etc.). As an ammonoid worker who has worked on intraspecific variation, I can highlight that data for many more species are available in the primary literature (a substantial part is still missing from the PBDB). I must admit that, particularly in older literature, measurements would need to be extracted from graphs, and we still have some way to go before all paleontologists make this kind of data available as standard practice. Ideally, you should try to compile some additional data from the primary literature; this would help you to better understand what I mean and would broaden the scope of your analysis. As this is a pilot study, focusing on 11 species with samples >50 could still be ok, but it would be crucial to highlight which primary references yielded data for particular species. This also matters because, for some species, data from multiple references are merged, representing multiple stratigraphic and geographic intervals (and likely also different degrees of preservation). This could, for example, explain the poorer performance for particular species like Owenites koeneni, whose data derive from different localities and might also represent different preservations and ages. Please also write species names in italics, as is customary.
Performance of particular methods and species: The original authors might have assigned all their specimens to a particular species (e.g., Owenites koeneni), but mostly did not statistically evaluate how the conch parameters of their specimens compared with those from other localities, and some even highlight qualitative differences with material from other localities. The homogeneity of conch parameters and their use to define species might therefore be compromised to some degree even before applying machine learning approaches. To place the performance of the methods into context for particular species, it would be crucial to add at least the primary reference providing the data, their age range (single bed, biozone, etc.) as well as the geographic scope (same locality, continent, etc.), so that such potential issues can be assessed more transparently. In the discussion you focus on the performance of methods, but it would also be crucial to highlight which species are consistently picked up and which ones are not, to better understand the impact of species definition: which ones are often/sometimes merged and which ones are sometimes/often oversplit (a per-species confusion matrix, as sketched below, would make this explicit). This would allow a better discussion and understanding of how species definition and the homogeneity of conch parameters might affect the performance of the methods. At first glance, Owenites koeneni in particular seems to perform peculiarly, and it is also one of the species whose measurements derive from several continents and publications, so it would be crucial to discuss this at greater length in the discussion.
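To illustrate what such a per-species breakdown could look like, here is a minimal sketch assuming scikit-learn and pandas; y_true and y_pred stand in for the species labels and predictions from any of the fitted classifiers, and the species names are placeholders, not the actual dataset:

```python
# Minimal sketch: per-species confusion matrix to see which taxa are
# merged (rows leaking specimens into another column) or oversplit
# (a column attracting specimens from several rows).
# y_true / y_pred are placeholder labels, not the manuscript's data.
import pandas as pd
from sklearn.metrics import confusion_matrix

y_true = ["sp_A", "sp_A", "sp_B", "sp_B", "sp_C", "sp_C"]
y_pred = ["sp_A", "sp_B", "sp_B", "sp_B", "sp_C", "sp_A"]

species = sorted(set(y_true))
cm = pd.DataFrame(confusion_matrix(y_true, y_pred, labels=species),
                  index=species, columns=species)

# Row-normalise: each row shows where specimens of that species end up,
# so off-diagonal concentrations flag species that are being merged.
print(cm.div(cm.sum(axis=1), axis=0).round(2))
```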
Code availability and reproducibility: It has become standard practice to share the code at least upon publication (see Reviewer 2). Ideally, this should even be done during the review process, as it would allow reviewers to verify the results, but I can to some degree understand the reluctance to do so before publication. Dedicated repositories (e.g., GitHub) are, however, available for this purpose and allow embargoes and restrictions to be placed on the availability of the data.
Please address these and other points raised by the reviewers and myself (see annotated pdf). I look forward to seeing the revised manuscript.
The manuscript provides a persuasive example of how to combine the morphological data of fossil taxa in existing databases with machine learning. Given how little machine learning is used in the paleontological sciences – despite the exponential growth of this computer science field – any contribution provides a valuable first step.
Overall, I have no major concerns with the work conducted. I am not familiar with all of the unsupervised methods used (listed in Table 5), though I have used the k-means approach in some of my previous papers. However, from my understanding of the literature, all the techniques applied are fairly standard in machine learning research, with many of these algorithms in use for over a decade.
My main concern with the manuscript is the reporting of the work. It has become routine in this field to publish code in a public repository like GitHub and to at least mention the platform used for the analysis (e.g. PyTorch) so that others can replicate, corroborate, and build on the work. The methodological detail in this manuscript is very minimal. Detailed procedures would be especially helpful for the paleontological community, as they would give others guidance on how to apply machine learning to their own morphological datasets.
Additionally, while the results appear promising, higher accuracies may be possible with newer machine learning approaches, e.g. convolutional neural networks, and with a dataset that includes more morphological features. Many of the morphological traits that we use in taxon identifications are chosen as much for their ease of measurement and reproducibility as for their diagnostic strength. Computer vision has the potential to expand the morphological traits we can use in taxonomic determinations.
In short, this is a solid first step in applying machine learning to fossil data. However, the methods used are not among the most cutting edge in a field that is evolving rapidly. I strongly urge the author to publish their code and for the PCI Paleo editors to make it part of the publication process.
This paper by Floe Foxon investigates the use of machine learning algorithms for ammonoid taxonomy. This is a really engaging topic, as ammonoid taxonomy suffers from many biases that could be avoided by using quantified methods to recognize taxa and identify species. The paper is clear and well-written. The “Results” and “Discussion” sections need reorganization and the discussion has to be expanded. I cannot properly evaluate the language as I am not a native English speaker. In my opinion, as the author mentions, this study is more a proof of concept than a demonstration that machine learning may be useful with these parameters. The algorithms are well parametrized and the procedures to quantify overfitting are ok.
Indeed, my main concern is that the somewhat good results in this paper come from the very low number of species treated. The author mentions the low number of specimens for training as a limitation but, to me, the real weakness is the number of species. Using machine learning methods for species identification across a higher number of species will require many more descriptors. The parameters used in this paper are the most common ones, and the corresponding measurements are provided in the majority of papers describing ammonoids. Given that these measurements are readily available and that many people have substantial databases of such measurements, I would have expected more than 11 species to demonstrate that these parameters, combined with machine learning methods, could provide useful tools to identify ammonites and build a robust taxonomy.
Moreover, I would like to see more insights into the very general use of supervised vs unsupervised methods for ammonoid taxonomy. Supervised methods assume that the target variable is true; they are thus suited to identifying specimens given a robust taxonomy. Unsupervised methods, if performed on an already settled taxonomy, quantify the congruence/difference between, on the one side, species definitions and taxonomic attributions of specimens and, on the other side, their morphological clustering. They could also be used to create taxonomy, but much remains to be done to properly include stratigraphic time, ontogeny, and other kinds of morphological features. All these points could be addressed in the discussion.
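To make the contrast concrete, here is a minimal sketch assuming scikit-learn: a supervised classifier is scored against the existing species labels (taking them as true), whereas an unsupervised clustering ignores the labels and is only compared with them afterwards, e.g. via the adjusted Rand index. The conch-parameter values and species names are synthetic placeholders:

```python
# Minimal sketch: supervised identification vs unsupervised clustering
# on conch parameters. All data below are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
# Fake conch ratios (e.g., ww/dm, wh/dm, uw/dm) for three "species".
X = np.vstack([rng.normal(loc=m, scale=0.05, size=(50, 3))
               for m in ([0.3, 0.4, 0.2], [0.5, 0.3, 0.3], [0.4, 0.5, 0.1])])
y = np.repeat(["sp_A", "sp_B", "sp_C"], 50)

# Supervised: assumes the species labels are correct and learns to reproduce them.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("supervised test accuracy:", clf.score(X_te, y_te))

# Unsupervised: ignores the labels, then measures congruence with them.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("congruence with existing taxonomy (ARI):", adjusted_rand_score(y, clusters))
```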
To conclude, I think this study has the advantage of stimulating an underexplored topic, even if the results are limited and the discussion deserves strengthening. Hereafter, some detailed comments.
Abstract - The verb “taxonomize” is rarely used. I am not sure of its meaning; I would recommend being more specific.
Introduction
- Ammonoids is not the name of the subclass; it is Ammonoidea. Using Linnean nomenclature is very controversial now (even if still correct); I would replace “subclass” with “clade” or simply “group”.
- The two parts of this sentence are somewhat redundant: “ammonoids are crucial index fossils for biostratigraphy (Cox, 1995), therefore ammonoid taxonomy is useful for the study of stratigraphic subdivision”.
- “conch morphology, coiling, and aperture shape.” Actually, coiling and aperture shape are parts of conch morphology. Better to say: “conch morphology such as coiling and aperture shape”.
- “ribs (their direction, spacing, and type) may be used for family classification”. I would remove “family”; ribs are useful at every level of taxonomy. By the way, I would also recommend removing Linnean ranks as much as possible.
- Despite all the great things Dieter Korn did and does on ammonoid description, I am not sure he defined the numerous parameters in his 2010 paper as written. It seems to me that his contribution is more a summary and formalization.
- “Since ammonoids exhibit intraspecific variation (De Baets et al., 2015), it follows that each species has a typical range of conch proportions which are diagnostic of taxonomy.” This is one of the main problems. Ammonite species are usually not built on advanced quantitative diagnoses. Most of the time, several species (usually a lot) will have overlapping morphologies. Moreover, many species are partly defined on stratigraphy itself. I think that the way ammonite species are built, and the variability in this practice, are of prime importance for properly using machine learning algorithms.
- Supervised (e.g., discriminant analyses) and unsupervised (clustering) methods have already been used on ammonoids; I expect the introduction and/or the discussion to review what has already been done on the topic (e.g. Hohenegger and Tatzreiter 1992, Meister et al. 2011, Bardin et al. 2015).
- To me, the priority is to define robust species by the use of clustering methods given a clear definition of the paleontological species. For now, I think that using supervised algorithms on few parameters is largely flawed due to problems in current species definitions.
Data
- The number of specimens and species may be a weakness of this work. I am not surprised to see such results for 11 species.
- The author mentions in the “Limitations” section that he was not able to differentiate juveniles and adults, but given that the diameter is used, there are ways to infer it.
Discussion
- “A nearest neighbours algorithm was then implemented to calculate the average distance between each point and its m nearest neighbours”. Use m as defined before (see the sketch after these comments).
Supervised models
- A large part of the discussion would be better suited to the results section. Please reorganize the two sections (Results and Discussion).
- I don’t understand the comparison of test accuracies to the accuracy of majority-class prediction. Could you be more specific and explain why it is your baseline hypothesis?
- A naïve question: is the test accuracy sufficient to choose between methods, or do we need to consider the difference between test and train accuracies? In other words, do we care about important over-fitting if the test accuracy is really good?
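For the nearest-neighbours point above, a minimal sketch of the kind of computation being described, assuming scikit-learn; X is a placeholder matrix of conch parameters and m stands for the neighbour count as defined in the manuscript:

```python
# Minimal sketch: average distance from each point to its m nearest
# neighbours (excluding the point itself). X is a placeholder matrix.
import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.random.default_rng(0).normal(size=(200, 4))  # placeholder conch parameters
m = 5

nn = NearestNeighbors(n_neighbors=m + 1).fit(X)  # +1: each point is its own nearest neighbour
distances, _ = nn.kneighbors(X)
avg_dist = distances[:, 1:].mean(axis=1)         # drop the zero self-distance, average over m neighbours
print(avg_dist[:10].round(3))
```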
Additional references:
- Bardin J, Rouget I, Benzaggagh M, Fürsich FT, and Cecca F (2015). Lower Toarcian (Jurassic) ammonites of the South Riffian ridges (Morocco): systematics and biostratigraphy. Journal of Systematic Palaeontology 13, 471–501.
- Hohenegger J and Tatzreiter F (1992). Morphometric methods in determination of ammonite species, exemplified through Balatonites shells (Middle Triassic). Journal of Paleontology 66, 801–816.
- Meister C, Dommergues JL, Dommergues C, Lachkar N, and El Hariri K (2011). Les ammonites du Pliensbachien du jebel Bou Rharraf (Haut Atlas oriental, Maroc). Geobios 44, 117.e1–117.e60.