An AI-powered knowledge base for scientific abstracts: a case study on environmental DNA (eDNA) in biomonitoring

dc.contributor.author: Baev, Vesselin
dc.contributor.author: Gecheva, Gana
dc.date.accessioned: 2025-12-16T10:14:21Z
dc.date.available: 2025-12-16T10:14:21Z
dc.date.issued: 2025-12-05
dc.description.abstract: Environmental DNA (eDNA) refers to genetic material shed by organisms into their environment, such as water, soil, or air. As a non-invasive biomonitoring method, eDNA has revolutionized biodiversity assessment by enabling the detection of species presence without direct observation or capture. This approach is especially critical for tracking invasive, elusive, or endangered species and monitoring ecosystem changes due to climate or anthropogenic pressures. Over the past decade, a growing body of scientific literature has explored eDNA applications, resulting in a fragmented but rich landscape of domain-specific knowledge. Navigating this information is increasingly challenging for researchers and policymakers. To address this, we developed BioTrace, an AI-powered knowledge base designed to support conversational exploration of scientific abstracts focused on eDNA in biodiversity monitoring. BioTrace leverages a Retrieval-Augmented Generation (RAG) architecture, integrating the mistral-saba-24b large language model via the Groq API for ultra-fast, low-latency inference. Scientific abstracts are indexed using a vector store, and retrieved passages are reranked using the all-MiniLM-L6-v2 model to improve answer relevance. Users can query the system in natural language and receive grounded, context-aware responses that synthesize findings across multiple studies. So far, the knowledge base includes more than 4000 abstracts on eDNA studies. This work demonstrates the potential of large language models (LLMs) to distil scientific literature into accessible, structured knowledge. BioTrace empowers users with real-time, interpretable insights into eDNA research, serving as a blueprint for future AI-based tools in ecological and environmental sciences.
dc.identifier.issn: 1313-9940
dc.identifier.uri: https://doi.uni-plovdiv.bg/handle/store/838
dc.language.iso: en
dc.publisher: Plovdiv University Press "Paisii Hilendarski"
dc.subject: AI
dc.subject: LLM models
dc.subject: RAG
dc.subject: eDNA
dc.subject: biomonitoring
dc.title: An AI-powered knowledge base for scientific abstracts: a case study on environmental DNA (eDNA) in biomonitoring
dc.type: Article
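
The abstract describes a retrieve, rerank, then generate flow. The following is a minimal sketch of that flow, not the BioTrace implementation: the corpus, the function names, and the prompt wording are all hypothetical, a bag-of-words cosine score stands in for the all-MiniLM-L6-v2 reranker, a token-overlap search stands in for the vector store, and no call to the Groq API or mistral-saba-24b is made. It only shows how retrieved abstracts could be narrowed down and packed into a grounded prompt.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; the real knowledge base holds more than 4000 abstracts.
ABSTRACTS = [
    "eDNA metabarcoding detects invasive fish species in river water samples.",
    "Soil eDNA reveals fungal diversity shifts under climate warming.",
    "Airborne eDNA sampling tracks elusive mammals without capture.",
    "Deep learning improves satellite image classification of land cover.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Stand-in embedding: bag-of-words counts (NOT all-MiniLM-L6-v2)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=3):
    """First stage (stand-in for the vector store): rank by shared tokens."""
    q = set(tokenize(query))
    ranked = sorted(docs, key=lambda d: len(q & set(tokenize(d))), reverse=True)
    return ranked[:k]


def rerank(query, candidates, top_n=2):
    """Second stage: reorder the candidates by cosine similarity to the query."""
    q = embed(query)
    ranked = sorted(candidates, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_n]


def build_prompt(query, passages):
    """Pack the reranked abstracts into a prompt that asks for grounded answers."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the abstracts below and cite them by number.\n\n"
        f"{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    question = "Which invasive fish species were detected with eDNA in river water?"
    passages = rerank(question, retrieve(question, ABSTRACTS))
    # In a full system this prompt would be sent to the language model.
    print(build_prompt(question, passages))
```

Splitting retrieval from reranking keeps the first pass cheap over the whole corpus and reserves the more expensive similarity model for a handful of candidates; the numbered context lets the generated answer point back to specific abstracts.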
Files
Original bundle
Name: eb20252154.pdf
Size: 721.63 KB
Format: Adobe Portable Document Format