Pipeline infrastructure faces increasing challenges from corrosion, materials degradation, and environmental hazards. Current inspection and risk management practices generate large volumes of heterogeneous data spread across inspection reports, corrosion assessments, incident records, and spatial datasets. Advanced analytical tools are needed to interpret this multimodal field inspection data in support of pipeline integrity. This study explores the use of Large Language Models (LLMs) combined with document-based databases through Retrieval-Augmented Generation (RAG). The approach draws on a hybrid repository of structured and unstructured data that includes corrosion risk factors, geohazard risk factors, and historical incident reports. The LLM framework enables semantic retrieval, schema-aware validation, automated reasoning, and domain-specific summarization. It extracts key indicators of degradation, identifies inconsistencies in the compiled corrosion datasets against schema and regulatory requirements, and produces clear, domain-specific summaries and reports. It also delivers comprehensive, context-aware retrieval of corrosion and geological risk data and produces explainable insights. By combining retrieval with generative inference, the approach supports informed risk management decision-making for structural integrity assessments. The results highlight the value of LLM-database integration as a scalable and adaptable framework for delivering transparent, explainable risk information to advance data-driven pipeline integrity management.
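As a rough illustration of two of the components named above, schema-aware validation and retrieval ahead of generation, the following minimal Python sketch may help. It is not the study's implementation. The field names in `SCHEMA`, the record and document layouts, and the functions `validate`, `retrieve`, and `build_prompt` are all hypothetical. Token-overlap (Jaccard) scoring stands in for the embedding-based semantic retrieval the framework would use, and the call to the LLM itself is left out.

```python
import re

# Hypothetical schema (an assumption, not taken from the study):
# field name -> (expected type, (lower bound, upper bound) or None)
SCHEMA = {
    "segment_id": (str, None),
    "wall_loss_pct": (float, (0.0, 100.0)),
    "soil_resistivity_ohm_cm": (float, (0.0, None)),
}


def validate(record):
    """Return a list of human-readable schema violations for one record."""
    issues = []
    for field, (ftype, bounds) in SCHEMA.items():
        if field not in record:
            issues.append(f"missing field: {field}")
            continue
        value = record[field]
        if not isinstance(value, ftype):
            issues.append(
                f"{field}: expected {ftype.__name__}, got {type(value).__name__}"
            )
            continue
        if bounds is not None:
            low, high = bounds
            too_low = low is not None and value < low
            too_high = high is not None and value > high
            if too_low or too_high:
                issues.append(f"{field}: value {value} out of range")
    return issues


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Rank documents by Jaccard token overlap with the query.

    This is a simple stand-in for embedding-based semantic retrieval.
    Documents with no overlap at all are dropped.
    """
    query_tokens = tokenize(query)
    scored = []
    for doc in documents:
        doc_tokens = tokenize(doc["text"])
        union = query_tokens | doc_tokens
        score = len(query_tokens & doc_tokens) / len(union) if union else 0.0
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble the retrieved context and the question into one prompt string."""
    hits = retrieve(query, documents, k)
    context = "\n".join(f"[{doc['id']}] {doc['text']}" for doc in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In a pipeline of this kind, records that fail `validate` would be flagged before they enter the repository, and the string returned by `build_prompt` would be passed to the LLM, with the bracketed document ids letting the generated summary be traced back to its sources.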