Retrieval-Augmented Generation (RAG) for Network Automation using LangChain and OpenAI

Retrieval-Augmented Generation (RAG) for network automation enables AI to work with real network data as its knowledge base, rather than relying solely on general AI knowledge. LangChain is a powerful Python framework that helps implement RAG using OpenAI’s models. In this section, we’ll use the LangChain framework to build a RAG workflow that allows AI to deliver more accurate, context-aware, and intelligent responses for network automation tasks.

Understanding RAG and LangChain

Before building anything, it helps to understand what RAG and LangChain are and how their components work together.

Normally, AI models answer questions using only the knowledge they learned during training.
However, this knowledge is static — it doesn’t include your specific data or the most recent information.

Retrieval-Augmented Generation (RAG) changes that. It makes AI “smarter” by allowing it to retrieve real, external information before generating a response. Instead of relying only on pre-trained knowledge, the model can pull information from documents, databases, network logs, or other sources to add real, dynamic, and domain-specific context.

With RAG, the AI can search and read actual data — such as network configurations, logs, or manuals — before answering your question.
This means its responses are more accurate, up-to-date, and relevant to your specific environment, not just based on general knowledge.

RAG stands for:

Retrieval → The model fetches relevant external data (e.g., documents, logs, configs).
Augmented → The retrieved data is added to the prompt, giving the model context it did not see during training.
Generation → The model uses this enhanced context to generate precise, context-aware answers.
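
The three stages can be sketched without any external service. The following is only an illustrative sketch: the sample lines are made up, the word-overlap scoring stands in for real similarity search, and the helper names retrieve and build_prompt are hypothetical. The generation step is left as a prompt that a real pipeline would send to a model.

```python
import re

# Made-up sample lines standing in for real network data (assumption).
KNOWLEDGE = [
    "R1 interface Gi0/1 changed state to down",
    "R1 BGP neighbor 10.0.0.2 went down: hold timer expired",
    "R2 user admin logged in from 192.0.2.10",
]

def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, k=1):
    """Retrieval: rank documents by how many words they share with the question."""
    q = tokens(question)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, context):
    """Augmentation: place the retrieved text in the prompt ahead of the question."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}"

question = "Why did the BGP neighbor go down?"
context = retrieve(question, KNOWLEDGE)
prompt = build_prompt(question, context)
# Generation: in a real pipeline this prompt would be sent to an LLM.
print(prompt)
```

Only the BGP line is selected here, because it shares the most words with the question. A real retriever compares meaning rather than exact words, but the flow is the same.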

Example use cases:

“What interfaces are configured on router X?” → Answered from the actual show run output.
“What caused the BGP flap yesterday?” → Answered using show logging data.
Querying internal documentation or checking network health with pyATS can also be powered by RAG, enabling real-time, data-driven insights.

What is LangChain?

LangChain is a Python framework designed to build RAG pipelines and other applications that use Large Language Models (LLMs), such as OpenAI’s GPT models.

The name comes from:

Lang → Referring to language models.
Chain → Linking together modular components like retrievers, data loaders, and models into a unified workflow.

Typical LangChain Components:

Sources → Data from logs, configs, documents, or APIs.
Document Loaders → Import and prepare data for use.
Text Splitters → Break large files into smaller, searchable chunks.
Embeddings → Convert text into numeric vectors for similarity search.
Vector Stores → Databases (like ChromaDB or Pinecone) that store embeddings.
Retrievers & LLMs → Retrieve relevant chunks and generate natural-language responses.
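
To make the embedding and vector store ideas more concrete, here is a small sketch using only the Python standard library. It is a simplification under stated assumptions: word counts stand in for real embeddings (which are dense vectors produced by a model), a plain list stands in for a vector database, and the names embed and cosine are hypothetical helpers, not LangChain functions.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a sparse word-count vector (real embeddings are dense model outputs)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# A tiny in-memory "vector store": each chunk is kept next to its vector.
chunks = [
    "interface GigabitEthernet0/1 is down",
    "ospf neighbor 10.1.1.2 is full",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

# Embed the query the same way, then pick the most similar stored chunk.
query_vector = embed("which interface is down")
scores = {chunk: cosine(query_vector, vector) for chunk, vector in store}
best = max(scores, key=scores.get)
print(best)
```

The query and the first chunk share several words, so that chunk scores highest. Model-based embeddings follow the same compare-vectors idea but can also match text that uses different wording.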

If you’re building an AI assistant for your network from these components, you’re essentially creating a LangChain-powered RAG system: one that lets the AI answer from your real network data, not just generic information.

Simple RAG Pipeline Example using LangChain and OpenAI

To better understand how RAG can assist in network automation, we’ll use LangChain and OpenAI to create a simple example script that demonstrates how the RAG pipeline works.

Context-Aware AI using LangChain and OpenAI in Network Automation

This example shows how to use LangChain and OpenAI to build a simple Retrieval-Augmented Generation (RAG) pipeline. The goal is to enable AI to answer questions using your own data — such as logs or command outputs — rather than relying solely on its pre-trained general knowledge.

Steps:

Load a text file (e.g., a log file).
Split the content into smaller chunks.
Create embeddings and store them in a Chroma vector database.
Retrieve the most relevant chunks for a given query.
Generate an answer using OpenAI, based on the retrieved data.

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA

# 1️⃣ Load documents
loader = TextLoader("logging.txt")  # your text file
documents = loader.load()

# 2️⃣ Split documents into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# 3️⃣ Create embeddings and store them in a Chroma vector database
embeddings = OpenAIEmbeddings()
db = Chroma.from_documents(docs, embeddings)

# 4️⃣ Create a retriever
retriever = db.as_retriever(search_kwargs={"k": 4})

# 5️⃣ Initialize the OpenAI LLM
llm = ChatOpenAI(
    model="gpt-4o-mini",  # modern model name
    temperature=0
)

# 6️⃣ Create a RetrievalQA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    chain_type="stuff"
)

# 7️⃣ Ask a question
#query = "Who created Python?"
query = "what are the most important messages in the log file in summary?"
result = qa_chain.invoke({"query": query})

# 8️⃣ Print the result
print(f"Question: {query}")
print(f"Answer: {result['result']}")

The following paragraphs describe each section of the script. In the first section, we use LangChain’s TextLoader to load the contents of logging.txt as documents for the RAG pipeline. The input could be any other text data, such as network device configurations, routing tables, ARP tables, MAC tables, log files, or network documentation.

# 1️⃣ Load your data (e.g., network logs or command outputs)
# You can replace 'logging.txt' with any text file, such as:
# - A 'show run' output from a router
# - A system log file
# - A configuration dump
# Example: loader = TextLoader("show_run_R1.txt")
loader = TextLoader("logging.txt")
documents = loader.load()
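
It can help to picture what the loaded documents look like: a list of objects, each holding the file text plus metadata about where it came from. The sketch below imitates that shape with the standard library only. The Doc class and load_text function are hypothetical stand-ins written for this example, not LangChain code, and the sample file content is made up.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class Doc:
    """Stand-in for a loaded document: the text plus metadata about its origin."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def load_text(path):
    """Read one text file and wrap it in a single-element list of Doc objects."""
    text = Path(path).read_text(encoding="utf-8")
    return [Doc(page_content=text, metadata={"source": str(path)})]

# Write a small throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "logging.txt"
    sample.write_text("line one\nline two\n", encoding="utf-8")
    documents = load_text(sample)

print(len(documents), repr(documents[0].page_content))
```

One file becomes one document at this stage; splitting it into smaller pieces is a separate step.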

Then we use LangChain’s CharacterTextSplitter to divide the loaded documents into smaller chunks for more efficient retrieval and processing in the RAG pipeline. The parameters chunk_size=1000 and chunk_overlap=0 set a target of at most 1000 characters per chunk, with no content shared between consecutive chunks. Note that CharacterTextSplitter only cuts at its separator (a blank line, "\n\n", by default), so a stretch of text that contains no separator can end up in a chunk longer than chunk_size.

# 2️⃣ Split the data into smaller, manageable chunks
# This helps when working with large files.
# Each chunk will later be embedded and stored for efficient search.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
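
The effect of the two parameters is easier to see on a tiny string. The sketch below uses plain fixed-width slicing, which is a deliberate simplification: CharacterTextSplitter itself splits on a separator rather than at fixed positions, and split_fixed is a hypothetical helper written only for this illustration.

```python
def split_fixed(text, chunk_size, chunk_overlap=0):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

sample = "abcdefghij"  # 10 characters
print(split_fixed(sample, chunk_size=4))                   # no shared characters
print(split_fixed(sample, chunk_size=4, chunk_overlap=2))  # neighbours share 2 characters
```

With an overlap of 0, text cut at a chunk boundary appears in only one chunk. A small overlap repeats the boundary text in both neighbours, which can help when a log entry or config stanza straddles the cut, at the cost of storing more chunks.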

Then we create vector embeddings of the document chunks using OpenAIEmbeddings, which convert text into numerical representations for similarity search. These embeddings are then stored in a Chroma vector database (Chroma.from_documents) to enable efficient retrieval of relevant content during the RAG process.

# 3️⃣ Create embeddings and store them in a Chroma vector database
# Embeddings convert text into numerical vectors that capture meaning.
# Chroma is a lightweight local vector database for storing and searching them.
embeddings = OpenAIEmbeddings()
db = Chroma.from_documents(docs, embeddings)

In this section, we create a retriever from the Chroma vector database using db.as_retriever(). The retriever is responsible for fetching the most relevant document chunks for a given query. The argument search_kwargs={"k": 4} specifies that the retriever should return the top 4 most similar chunks for each query.

# 4️⃣ Create a retriever to find relevant information
# The retriever uses the embeddings database to find chunks
# that are most relevant to a given query.
retriever = db.as_retriever(search_kwargs={"k": 4})
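
Conceptually, the k setting is a top-k selection over similarity scores. The sketch below shows only that selection step, using the standard library. The scores are invented for illustration and top_k is a hypothetical helper; a real vector store computes the scores from embeddings.

```python
import heapq

def top_k(scored_chunks, k):
    """Return the k chunks with the highest similarity score, best first."""
    best = heapq.nlargest(k, scored_chunks, key=lambda pair: pair[1])
    return [chunk for chunk, _score in best]

# Hypothetical (chunk, similarity) pairs, as a vector store might compute for one query.
scored = [
    ("chunk A", 0.12),
    ("chunk B", 0.91),
    ("chunk C", 0.47),
    ("chunk D", 0.80),
    ("chunk E", 0.05),
]
print(top_k(scored, k=4))
```

A larger k gives the model more context to work with but also produces a longer prompt, so the value is a trade-off between coverage and cost.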

In this section, we initialize the language model using ChatOpenAI. The model="gpt-4o-mini" argument specifies which OpenAI model to use, while temperature=0 minimizes randomness in sampling, so the answers are focused and largely repeatable, although identical output across runs is not guaranteed.

# 5️⃣ Initialize the OpenAI model (LLM)
# This is the generative part that will produce the answer.
# 'gpt-4o-mini' is a fast, cost-effective model.
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0  # 0 = least random, most repeatable output
)

Then, we create a RetrievalQA chain using RetrievalQA.from_chain_type, which connects the retriever and the language model to form the RAG pipeline. The llm=llm argument specifies the language model to generate answers, retriever=retriever provides the relevant document chunks, and chain_type="stuff" indicates that all retrieved chunks will be combined together (“stuffed”) as context for the model to generate a response.

# 6️⃣ Create the RetrievalQA chain
# This connects the retriever with the LLM.
# The retriever finds the relevant text, and the LLM generates the answer.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    chain_type="stuff"  # simplest chain type: all retrieved chunks are “stuffed” into one prompt
)
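
The idea behind the "stuff" approach can be shown in a few lines of plain Python. This is a sketch under assumptions: the prompt wording, the blank-line separator, and the stuff_prompt helper are illustrative choices made here, not the exact template LangChain uses.

```python
# Illustrative template (assumption); the real chain has its own wording.
PROMPT_TEMPLATE = (
    "Use the following context to answer the question.\n\n"
    "{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)

def stuff_prompt(chunks, question):
    """Join every retrieved chunk into one context block and fill a single prompt."""
    context = "\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)

retrieved = ["log line group 1", "log line group 2"]
prompt = stuff_prompt(retrieved, "What happened?")
print(prompt)
```

Because every chunk lands in one prompt, the number of retrieved chunks multiplied by the chunk size has to fit in the model's context window. With 4 chunks of roughly 1000 characters each, that is a small prompt.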

Finally, we submit a query to the RAG pipeline using qa_chain.invoke(). The query contains the question we want the AI to answer (e.g., summarizing key log messages), and the chain retrieves relevant document chunks and generates a context-aware response based on the actual content of the log file.

# 7️⃣ Ask your question
# Try different questions depending on your input file.
# Example: "What caused the BGP flap yesterday?"
# or "What are the configured interfaces on R1?"
query = "What are the most important messages in the log file in summary?"
result = qa_chain.invoke({"query": query})

Then, we print the query and the AI-generated answer to the console, displaying both the user’s question and the context-aware response produced by the RAG pipeline.

# 8️⃣ Display the answer
print(f"Question: {query}")
print(f"Answer: {result['result']}")

Running a LangChain RAG Pipeline for Log File Summarization

Before running the script, you first need to set your OpenAI API key using the export command, which is required to authenticate and communicate with OpenAI’s services. Once the key is set and the script is executed, it loads the log file, processes its content into searchable chunks, and applies a Retrieval-Augmented Generation (RAG) pipeline to analyze the data. The script then takes your query — in this example, asking for a summary of the most important log messages — retrieves the most relevant text segments from the data, and generates a concise, context-aware summary. The output demonstrates how AI can automatically interpret and explain key events from real network logs, such as user logins, system activities, and overall system status, rather than relying solely on general pre-trained knowledge.

(majid) majid@majid-ubuntu:~/devnet/pyats$ export OPENAI_API_KEY="sk-proj-DtLXjN....................................kvF7bT3BlbkFJlnCtbATF79Lg56NTgJtUyvxnHM4k-I5rHXP0qrgvdllMzFjA0QouBWpF-RexqqAE-8Wa7ZoBMA"

(majid) majid@majid-ubuntu:~/devnet/pyats$ python 11.2.RAG_langchain_openai_.py
Question: what are the most important messages in the log file in summary?
Answer: The log file contains several important messages related to user authentication and system events. Here are the key points summarized:

1. **Login Success**: The most notable message is the successful login of a user named "majid" from the IP address 10.100.7.223 on October 9, 2025, at 20:46:24 UTC.

2. **User Creation and Deletion**: There are multiple entries indicating the creation and deletion of user sessions, particularly for the terminal interface "tty5". This includes messages about binding and freeing user resources.

3. **System Activity**: The log shows a continuous stream of activity related to user authentication, including parsing names, binding interfaces, and memory management for user sessions.

4. **No Errors or Warnings**: The log does not indicate any errors or warnings, suggesting that the system is functioning normally during the recorded time.

Overall, the log primarily documents user login activities and system resource management without any critical issues.
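
Since the run depends on the OPENAI_API_KEY environment variable being exported first, a script can check for it up front and fail with a clear message. The guard below is an optional sketch, not part of the original script; require_api_key is a hypothetical helper, and the demonstration passes a stand-in dictionary so it runs without a real key.

```python
import os

def require_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            'OPENAI_API_KEY is not set; run: export OPENAI_API_KEY="<your key>"'
        )
    return key

# Demonstration with a stand-in mapping so the example runs without a real key.
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In a real script, calling require_api_key() with no argument before creating the embeddings and the model would read the actual environment.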
