This study evaluates the effectiveness of a Large Language Model (LLM)-based agent designed to support the veterinary clinical process, aiming to alleviate the administrative burdens of documentation and the complexities of clinical decision-making, which are primary contributors to high workload and professional burnout among veterinarians. A mixed-methods research design was employed.
In the first phase, a survey of 100 practicing veterinarians assessed the need for AI technology. The results showed that support for the clinical reasoning (Assessment) phase, such as preliminary diagnosis support (M=6.00), was the most urgent need, rated higher than administrative conveniences such as charting automation (M=5.45). Based on these findings, a prototype system was developed that integrates Retrieval-Augmented Generation (RAG) for academic information retrieval, Electronic Medical Record (EMR) integration, and multi-step reasoning capabilities.
In the second phase, 30 veterinarians evaluated the system's effectiveness in a controlled scenario. The system significantly reduced cognitive load as measured by the NASA-TLX: the overall mean cognitive load score decreased from 5.57 to 3.49 (p<.001). Notably, the Performance sub-scale score increased from 4.00 to 5.40 (p=.003), indicating that the system not only reduced perceived effort but also enhanced users’ perception of task performance.
Furthermore, the study confirmed improvements in the objective quality of clinical records. An expert panel (N=5) evaluated 20 anonymized clinical records (10 manual vs. 10 AI-assisted) and rated the AI-assisted records significantly higher across all criteria, including completeness, logical consistency, and readability (overall score: manual M=4.75 vs. AI-assisted M=6.63, p<.001), with readability (M=6.52) showing the greatest improvement.
The system also demonstrated high usability, scoring highly on items such as "usable without separate learning" (M=6.10) and "intuitive and easy to use" (M=5.80). However, qualitative feedback identified seamless EMR integration, voice recognition, and enhanced RAG reliability as key challenges for future commercialization.
In conclusion, the proposed LLM-based agent proved to be an effective prototype in terms of quality, efficiency, and user satisfaction. These findings suggest that AI can function as a collaborative partner rather than a replacement, enabling clinicians to focus on higher-level clinical reasoning. With further development in workflow integration, this approach holds significant potential to reduce veterinarian burnout and improve the standardization of clinical care.