ROMJIST Volume 29, No. 1, 2026, pp. 3-14, DOI: 10.59277/ROMJIST.2026.1.01
Shariq BASHIR
Reference Recommendation for Large Language Models-Generated Text Using Deep Textual Representations
ABSTRACT: The increasing use of large language models (LLMs) in conversational systems raises concerns about the credibility and verifiability of the information they generate. These models often produce fluent and convincing responses that may lack a factual basis or supporting evidence. To address this challenge, a reference recommendation approach is proposed that retrieves relevant citations for LLM-generated text. The approach treats the LLM output as a query and employs Sentence-BERT to create deep contextual embeddings of documents. Retrieval performance is further enhanced by integrating Siamese and Triplet neural network architectures to model semantic similarity, and by applying a submodular scoring function to balance relevance and diversity in the recommended references. Experiments on a domain-specific dataset show that the proposed approach outperforms traditional retrieval approaches and recent baselines on standard evaluation metrics, including F1@k and mean reciprocal rank (MRR). This work offers a scalable and effective solution for improving the reliability of AI-generated content through evidence-based support.
KEYWORDS: Conversational information system; information retrieval; information system; large language models; reference recommendation; Sentence-BERT
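
The selection and evaluation steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual scoring function, so the objective below (a weighted sum of query relevance and a facility-location coverage term, which is monotone submodular for non-negative similarities) is an assumption chosen for illustration. The function names, the `lam` trade-off weight, and the use of plain Python lists in place of Sentence-BERT embeddings are likewise hypothetical. MRR and F1@k follow their standard definitions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def greedy_select(query_vec, doc_vecs, k, lam=0.7):
    """Greedily pick up to k document indices.

    Assumed objective (not taken from the paper):
        f(S) = lam * sum of relevance over S
             + (1 - lam) * sum over all docs j of max similarity between j and S
    """
    n = len(doc_vecs)
    # Negative similarities are clipped to 0 so the objective stays monotone.
    rel = [max(cosine(query_vec, d), 0.0) for d in doc_vecs]
    sim = [[max(cosine(a, b), 0.0) for b in doc_vecs] for a in doc_vecs]
    covered = [0.0] * n  # best similarity of each doc to anything selected so far
    selected = []
    while len(selected) < min(k, n):
        best, best_gain = None, -1.0
        for i in range(n):
            if i in selected:
                continue
            # Marginal coverage gain: how much candidate i improves each doc's coverage.
            cover_gain = sum(max(sim[i][j] - covered[j], 0.0) for j in range(n))
            gain = lam * rel[i] + (1 - lam) * cover_gain
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
        covered = [max(covered[j], sim[best][j]) for j in range(n)]
    return selected


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for pos, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / pos
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_list, relevant_set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def f1_at_k(ranked, relevant, k):
    """Harmonic mean of precision@k and recall@k."""
    hits = len(set(ranked[:k]) & set(relevant))
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, setting `lam` close to 1 reduces the selection to a plain relevance ranking, while smaller values favour references that cover different parts of the candidate pool. In a real pipeline the vectors would come from a sentence encoder rather than being written by hand.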
