LightRAG: Simple and Fast Retrieval-Augmented Generation, More Powerful Than Microsoft's GraphRAG

🚀 LightRAG: Simple and Fast Retrieval-Augmented Generation

This repository hosts the code of LightRAG. The code structure is based on nano-graphrag.

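🎉 News

[2024.10.29]🎯📢LightRAG now supports multiple file types, including PDF, DOC, PPT, and CSV, via textract.
[2024.10.20]🎯📢We've added a new feature to LightRAG: graph visualization.
[2024.10.18]🎯📢We've added a link to a LightRAG introduction video. Thanks to the author!
[2024.10.17]🎯📢We have created a Discord channel! Welcome to join for sharing and discussions! 🎉🎉
[2024.10.16]🎯📢LightRAG now supports Ollama models!
[2024.10.15]🎯📢LightRAG now supports Hugging Face models!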

Algorithm Flowchart

[Figure: LightRAG algorithm flowchart]

Install

Install from source (recommended)

cd LightRAG
pip install -e .

Install from PyPI

pip install lightrag-hku

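Quick Start

A video demo of running LightRAG locally.
All the code can be found in the examples directory.
If you use OpenAI models, set your OpenAI API key in the environment: export OPENAI_API_KEY="sk-...".
Download the demo text, "A Christmas Carol" by Charles Dickens:

curl https://raw.githubusercontent.com/gusye1234/nano-graphrag/main/tests/mock_data.txt > ./book.txt

Use the following Python snippet (in a script) to initialize LightRAG and perform queries: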

import os
from lightrag import LightRAG, QueryParam
from lightrag.llm import gpt_4o_mini_complete, gpt_4o_complete

#########
# Uncomment the below two lines if running in a jupyter notebook to handle the async nature of rag.insert()
# import nest_asyncio
# nest_asyncio.apply()
#########

WORKING_DIR = "./dickens"

if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=gpt_4o_mini_complete  # Use gpt_4o_mini_complete LLM model
    # llm_model_func=gpt_4o_complete  # Optionally, use a stronger model
)

with open("./book.txt") as f:
    rag.insert(f.read())

# Perform naive search
print(rag.query("What are the top themes in this story?", param=QueryParam(mode="naive")))

# Perform local search
print(rag.query("What are the top themes in this story?", param=QueryParam(mode="local")))

# Perform global search
print(rag.query("What are the top themes in this story?", param=QueryParam(mode="global")))

# Perform hybrid search
print(rag.query("What are the top themes in this story?", param=QueryParam(mode="hybrid")))

Using OpenAI-like APIs

LightRAG also supports OpenAI-like chat/embedding APIs:

import os
import numpy as np
from lightrag import LightRAG
from lightrag.llm import openai_complete_if_cache, openai_embedding
from lightrag.utils import EmbeddingFunc

# WORKING_DIR is assumed to be defined as in the Quick Start snippet.

async def llm_model_func(
    prompt, system_prompt=None, history_messages=[], **kwargs
) -> str:
    return await openai_complete_if_cache(
        "solar-mini",
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        api_key=os.getenv("UPSTAGE_API_KEY"),
        base_url="https://api.upstage.ai/v1/solar",
        **kwargs
    )

async def embedding_func(texts: list[str]) -> np.ndarray:
    return await openai_embedding(
        texts,
        model="solar-embedding-1-large-query",
        api_key=os.getenv("UPSTAGE_API_KEY"),
        base_url="https://api.upstage.ai/v1/solar"
    )

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=4096,
        max_token_size=8192,
        func=embedding_func
    )
)

Using Hugging Face Models

If you want to use Hugging Face models, you only need to set up LightRAG as follows:

from lightrag import LightRAG
from lightrag.llm import hf_model_complete, hf_embedding
from lightrag.utils import EmbeddingFunc
from transformers import AutoModel, AutoTokenizer

# Initialize LightRAG with Hugging Face model
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=hf_model_complete,  # Use Hugging Face model for text generation
    llm_model_name='meta-llama/Llama-3.1-8B-Instruct',  # Model name from Hugging Face
    # Use Hugging Face embedding function
    embedding_func=EmbeddingFunc(
        embedding_dim=384,
        max_token_size=5000,
        func=lambda texts: hf_embedding(
            texts,
            tokenizer=AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2"),
            embed_model=AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
        )
    ),
)

Using Ollama Models

Overview

If you want to use Ollama models, you need to pull the model you plan to use as well as an embedding model, for example nomic-embed-text.

Then you only need to set up LightRAG as follows:

from lightrag import LightRAG
from lightrag.llm import ollama_model_complete, ollama_embedding
from lightrag.utils import EmbeddingFunc

# Initialize LightRAG with Ollama model
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=ollama_model_complete,  # Use Ollama model for text generation
    llm_model_name='your_model_name',  # Your model name
    # Use Ollama embedding function
    embedding_func=EmbeddingFunc(
        embedding_dim=768,
        max_token_size=8192,
        func=lambda texts: ollama_embedding(
            texts,
            embed_model="nomic-embed-text"
        )
    ),
)

Increasing Context Size

For LightRAG to work properly, the context should be at least 32k tokens. By default, Ollama models have a context size of 8k. You can increase it in one of two ways:

Increase the num_ctx parameter in the Modelfile.

Pull the model:

ollama pull qwen2

Display the model file:

ollama show --modelfile qwen2 > Modelfile

Edit the model file by adding the following line:

PARAMETER num_ctx 32768

Create the modified model:

ollama create -f Modelfile qwen2m

Set num_ctx via the Ollama API.

You can use the llm_model_kwargs parameter to configure Ollama:

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=ollama_model_complete,  # Use Ollama model for text generation
    llm_model_name='your_model_name',  # Your model name
    llm_model_kwargs={"options": {"num_ctx": 32768}},
    # Use Ollama embedding function
    embedding_func=EmbeddingFunc(
        embedding_dim=768,
        max_token_size=8192,
        func=lambda texts: ollama_embedding(
            texts,
            embed_model="nomic-embed-text"
        )
    ),
)
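The Modelfile edit from the first approach can also be scripted. The sketch below is only an illustration: `set_num_ctx` is a hypothetical helper (not part of LightRAG or Ollama) that takes the text produced by `ollama show --modelfile`, drops any existing `PARAMETER num_ctx` line, and appends the new value.

```python
def set_num_ctx(modelfile_text: str, num_ctx: int = 32768) -> str:
    """Return Modelfile text containing exactly one `PARAMETER num_ctx` line."""
    kept = [
        line
        for line in modelfile_text.splitlines()
        # Drop any existing num_ctx setting (instruction names are matched case-insensitively).
        if [part.lower() for part in line.split()[:2]] != ["parameter", "num_ctx"]
    ]
    kept.append(f"PARAMETER num_ctx {num_ctx}")
    return "\n".join(kept) + "\n"


# Hypothetical Modelfile content used only for demonstration.
sample = "FROM qwen2\nPARAMETER num_ctx 8192\nPARAMETER temperature 0.7\n"
updated = set_num_ctx(sample)
print(updated)
```

The result can be written back to `Modelfile` before running `ollama create`.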

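Fully Functional Example

There is a fully functional example, examples/lightrag_ollama_demo.py, that uses the gemma2:2b model, runs only 4 requests in parallel, and sets the context size to 32k.

Low RAM GPUs

To run this experiment on a low-RAM GPU, choose a small model and tune the context window (a larger context increases memory consumption). For example, running this Ollama example on a repurposed mining GPU with 6 GB of RAM required setting the context size to 26k while using gemma2:2b. It was able to find 197 entities and 19 relations in book.txt.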

Query Param

from dataclasses import dataclass
from typing import Literal

@dataclass
class QueryParam:
    mode: Literal["local", "global", "hybrid", "naive"] = "global"
    only_need_context: bool = False
    response_type: str = "Multiple Paragraphs"
    # Number of top-k items to retrieve; corresponds to entities in "local" mode and relationships in "global" mode.
    top_k: int = 60
    # Number of tokens for the original chunks.
    max_token_for_text_unit: int = 4000
    # Number of tokens for the relationship descriptions
    max_token_for_global_context: int = 4000
    # Number of tokens for the entity descriptions
    max_token_for_local_context: int = 4000

Batch Insert

# Batch Insert: Insert multiple texts at once
rag.insert(["TEXT1", "TEXT2",...])

Incremental Insert

# Incremental Insert: Insert new documents into an existing LightRAG instance
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=embedding_dimension,
        max_token_size=8192,
        func=embedding_func,
    ),
)

with open("./newText.txt") as f:
    rag.insert(f.read())

Multi-file Type Support

textract supports reading file types such as TXT, DOCX, PPTX, CSV, and PDF.

import textract

file_path = 'TEXT.pdf'
text_content = textract.process(file_path)

rag.insert(text_content.decode('utf-8'))

Graph Visualization

Graph visualization with HTML

The following code can be found in examples/graph_visual_with_html.py:

import networkx as nx
from pyvis.network import Network

# Load the GraphML file
G = nx.read_graphml('./dickens/graph_chunk_entity_relation.graphml')

# Create a Pyvis network
net = Network(notebook=True)

# Convert NetworkX graph to Pyvis network
net.from_nx(G)

# Save and display the network
net.show('knowledge_graph.html')
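Before rendering, it can be useful to check how many entities and relations a GraphML file contains. The following is a minimal sketch using only the standard library; the `SAMPLE` document and the `count_graph_elements` helper are hypothetical stand-ins, and the real graph_chunk_entity_relation.graphml carries additional attributes that are ignored here.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace, needed to locate <node> and <edge> elements.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hand-written sample standing in for a generated .graphml file.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="undirected">
    <node id="SCROOGE"/>
    <node id="MARLEY"/>
    <node id="BOB CRATCHIT"/>
    <edge source="SCROOGE" target="MARLEY"/>
    <edge source="SCROOGE" target="BOB CRATCHIT"/>
  </graph>
</graphml>
"""


def count_graph_elements(graphml_text: str) -> tuple[int, int]:
    """Return (number of nodes, number of edges) found in a GraphML document."""
    root = ET.fromstring(graphml_text)
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    return len(nodes), len(edges)


node_count, edge_count = count_graph_elements(SAMPLE)
print(f"{node_count} entities, {edge_count} relations")
```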

Graph visualization with Neo4j

The following code can be found in examples/graph_visual_with_neo4j.py:

import os
import json
from lightrag.utils import xml_to_json
from neo4j import GraphDatabase

# Constants
WORKING_DIR = "./dickens"
BATCH_SIZE_NODES = 500
BATCH_SIZE_EDGES = 100

# Neo4j connection credentials
NEO4J_URI = "bolt://localhost:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "your_password"

def convert_xml_to_json(xml_path, output_path):
    """Converts XML file to JSON and saves the output."""
    if not os.path.exists(xml_path):
        print(f"Error: File not found - {xml_path}")
        return None

    json_data = xml_to_json(xml_path)
    if json_data:
        with open(output_path, 'w', encoding='utf-8') as f:
            json.dump(json_data, f, ensure_ascii=False, indent=2)
        print(f"JSON file created: {output_path}")
        return json_data
    else:
        print("Failed to create JSON data")
        return None

def process_in_batches(tx, query, data, batch_size):
    """Process data in batches and execute the given query."""
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        tx.run(query, {"nodes": batch} if "nodes" in query else {"edges": batch})

def main():
    # Paths
    xml_file = os.path.join(WORKING_DIR, 'graph_chunk_entity_relation.graphml')
    json_file = os.path.join(WORKING_DIR, 'graph_data.json')

    # Convert XML to JSON
    json_data = convert_xml_to_json(xml_file, json_file)
    if json_data is None:
        return

    # Load nodes and edges
    nodes = json_data.get('nodes', [])
    edges = json_data.get('edges', [])

    # Neo4j queries
    create_nodes_query = """
    UNWIND $nodes AS node
    MERGE (e:Entity {id: node.id})
    SET e.entity_type = node.entity_type,
        e.description = node.description,
        e.source_id = node.source_id,
        e.displayName = node.id
    REMOVE e:Entity
    WITH e, node
    CALL apoc.create.addLabels(e, [node.entity_type]) YIELD node AS labeledNode
    RETURN count(*)
    """

    create_edges_query = """
    UNWIND $edges AS edge
    MATCH (source {id: edge.source})
    MATCH (target {id: edge.target})
    WITH source, target, edge,
         CASE
            WHEN edge.keywords CONTAINS 'lead' THEN 'lead'
            WHEN edge.keywords CONTAINS 'participate' THEN 'participate'
            WHEN edge.keywords CONTAINS 'uses' THEN 'uses'
            WHEN edge.keywords CONTAINS 'located' THEN 'located'
            WHEN edge.keywords CONTAINS 'occurs' THEN 'occurs'
            ELSE REPLACE(SPLIT(edge.keywords, ',')[0], '\"', '')
         END AS relType
    CALL apoc.create.relationship(source, relType, {
        weight: edge.weight,
        description: edge.description,
        keywords: edge.keywords,
        source_id: edge.source_id
    }, target) YIELD rel
    RETURN count(*)
    """

    set_displayname_and_labels_query = """
    MATCH (n)
    SET n.displayName = n.id
    WITH n
    CALL apoc.create.setLabels(n, [n.entity_type]) YIELD node
    RETURN count(*)
    """

    # Create a Neo4j driver
    driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))

    try:
        # Execute queries in batches
        with driver.session() as session:
            # Insert nodes in batches
            session.execute_write(process_in_batches, create_nodes_query, nodes, BATCH_SIZE_NODES)

            # Insert edges in batches
            session.execute_write(process_in_batches, create_edges_query, edges, BATCH_SIZE_EDGES)

            # Set displayName and labels
            session.run(set_displayname_and_labels_query)

    except Exception as e:
        print(f"Error occurred: {e}")

    finally:
        driver.close()

if __name__ == "__main__":
    main()
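Two pieces of logic in the script above can be tried without a database. The sketch below mirrors them in plain Python: `iter_batches` reproduces the slicing used by process_in_batches, and `relationship_type` reproduces the Cypher CASE expression that picks a relationship type from an edge's keywords. Both function names are hypothetical and exist only for illustration.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Keywords checked in order, matching the WHEN clauses of the Cypher CASE expression.
PRIORITY_KEYWORDS = ("lead", "participate", "uses", "located", "occurs")


def relationship_type(keywords: str) -> str:
    """Pick a relationship type the same way the CASE expression does."""
    for word in PRIORITY_KEYWORDS:
        if word in keywords:  # substring test, like Cypher's CONTAINS
            return word
    # Fallback: first comma-separated keyword with double quotes removed.
    return keywords.split(",")[0].replace('"', "")


print(list(iter_batches(list(range(5)), 2)))
print(relationship_type("leadership, mentor"))
print(relationship_type('"friendship", trust'))
```

Note that the match is a substring test, so a keyword such as "leadership" is mapped to "lead".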

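API Server Implementation

LightRAG also provides a FastAPI-based server implementation for RESTful API access to RAG operations. This lets you run LightRAG as a service and interact with it through HTTP requests.

Setting Up the API Server

Setup instructions
1. First, make sure you have the required dependencies:

pip install fastapi uvicorn pydantic

2. Set up your environment variables:

export RAG_DIR="your_index_directory"  # Optional: Defaults to "index_default"

3. Run the API server:

python examples/lightrag_api_openai_compatible_demo.py

The server will start at http://0.0.0.0:8020.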

API Endpoints

The API server provides the following endpoints:

Query endpoint

URL: /query
Method: POST
Body (mode can be "naive", "local", "global", or "hybrid"):

{
    "query": "Your question here",
    "mode": "hybrid"
}

Example:

curl -X POST "http://127.0.0.1:8020/query" \
    -H "Content-Type: application/json" \
    -d '{"query": "What are the main themes?", "mode": "hybrid"}'

Insert text endpoint

URL: /insert
Method: POST
Body:

{
    "text": "Your text content here"
}

Example:

curl -X POST "http://127.0.0.1:8020/insert" \
    -H "Content-Type: application/json" \
    -d '{"text": "Content to be inserted into RAG"}'

Insert file endpoint

URL: /insert_file
Method: POST
Body:

{
    "file_path": "path/to/your/file.txt"
}

Example:

curl -X POST "http://127.0.0.1:8020/insert_file" \
    -H "Content-Type: application/json" \
    -d '{"file_path": "./book.txt"}'

Health check endpoint

URL: /health
Method: GET
Example:

curl -X GET "http://127.0.0.1:8020/health"
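The same requests can be prepared from Python. The sketch below uses only the standard library and builds the request objects without sending them; `build_request` and `BASE_URL` are illustrative names, and the payload fields are the ones listed for the endpoints above.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:8020"  # address used in the curl examples


def build_request(endpoint: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request when there is no payload, otherwise a JSON POST request."""
    url = f"{BASE_URL}{endpoint}"
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


query_req = build_request("/query", {"query": "What are the main themes?", "mode": "hybrid"})
health_req = build_request("/health")
print(query_req.get_method(), query_req.full_url)
print(health_req.get_method(), health_req.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen(...)` while the server is running.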

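Configuration

The API server can be configured using environment variables:

RAG_DIR: directory for storing the RAG index (default: "index_default")
API keys and base URLs should be configured in the code for your specific LLM and embedding model providers

Error Handling

The API includes comprehensive error handling:

File not found errors (404)
Processing errors (500)
Support for multiple file encodings (UTF-8 and GBK)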
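As a small illustration of the default described above, the directory can be resolved from the environment as follows. `resolve_rag_dir` is a hypothetical helper; the demo server may read the variable differently.

```python
import os


def resolve_rag_dir(environ=os.environ) -> str:
    """Return RAG_DIR if it is set, otherwise the documented default."""
    return environ.get("RAG_DIR", "index_default")


print(resolve_rag_dir({}))
print(resolve_rag_dir({"RAG_DIR": "your_index_directory"}))
```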
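Supporting several encodings usually means trying them in order. The following is a minimal sketch of that idea, not the server's actual code: `decode_with_fallback` is a hypothetical helper that tries UTF-8 first and then GBK.

```python
def decode_with_fallback(raw: bytes, encodings=("utf-8", "gbk")) -> str:
    """Decode bytes with the first encoding that succeeds."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    raise ValueError(f"could not decode input with any of: {', '.join(encodings)}")


gbk_bytes = "中文".encode("gbk")  # not valid UTF-8, so the GBK fallback is used
print(decode_with_fallback(gbk_bytes))
print(decode_with_fallback("plain text".encode("utf-8")))
```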

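Evaluation

Dataset

The dataset used in LightRAG can be downloaded from TommyChien/UltraDomain.

Generate Query

LightRAG uses a dedicated prompt to generate high-level queries; the corresponding code is in examples/generate_query.py.

Batch Eval

To evaluate the performance of two RAG systems on high-level queries, LightRAG uses a dedicated evaluation prompt; the corresponding code is in examples/batch_eval.py.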

Overall Performance Table

	Agriculture		CS		Legal		Mix
	NaiveRAG	LightRAG	NaiveRAG	LightRAG	NaiveRAG	LightRAG	NaiveRAG	LightRAG
Comprehensiveness	32.69%	67.31%	35.44%	64.56%	19.05%	80.95%	36.36%	63.64%
Diversity	24.09%	75.91%	35.24%	64.76%	10.98%	89.02%	30.76%	69.24%
Empowerment	31.35%	68.65%	35.48%	64.52%	17.59%	82.41%	40.95%	59.05%
Overall	33.30%	66.70%	34.76%	65.24%	17.46%	82.54%	37.59%	62.40%
	RQ-RAG	LightRAG	RQ-RAG	LightRAG	RQ-RAG	LightRAG	RQ-RAG	LightRAG
Comprehensiveness	32.05%	67.95%	39.30%	60.70%	18.57%	81.43%	38.89%	61.11%
Diversity	29.44%	70.56%	38.71%	61.29%	15.14%	84.86%	28.50%	71.50%
Empowerment	32.51%	67.49%	37.52%	62.48%	17.80%	82.20%	43.96%	56.04%
Overall	33.29%	66.71%	39.03%	60.97%	17.80%	82.20%	39.61%	60.39%
	HyDE	LightRAG	HyDE	LightRAG	HyDE	LightRAG	HyDE	LightRAG
Comprehensiveness	24.39%	75.61%	36.49%	63.51%	27.68%	72.32%	42.17%	57.83%
Diversity	24.96%	75.34%	37.41%	62.59%	18.79%	81.21%	30.88%	69.12%
Empowerment	24.89%	75.11%	34.99%	65.01%	26.99%	73.01%	45.61%	54.39%
Overall	23.17%	76.83%	35.67%	64.33%	27.68%	72.32%	42.72%	57.28%
	GraphRAG	LightRAG	GraphRAG	LightRAG	GraphRAG	LightRAG	GraphRAG	LightRAG
Comprehensiveness	45.56%	54.44%	45.98%	54.02%	47.13%	52.87%	51.86%	48.14%
Diversity	19.65%	80.35%	39.64%	60.36%	25.55%	74.45%	35.87%	64.13%
Empowerment	36.69%	63.31%	45.09%	54.91%	42.81%	57.19%	52.94%	47.06%
Overall	43.62%	56.38%	45.98%	54.02%	45.70%	54.30%	51.86%	48.14%

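The table compares LightRAG against NaiveRAG, RQ-RAG, HyDE, and GraphRAG on four datasets: Agriculture, CS, Legal, and Mix. LightRAG beats NaiveRAG and RQ-RAG on all four datasets in comprehensiveness, diversity, empowerment, and overall score. It also beats HyDE on every metric and dataset; its narrowest margin is empowerment on Mix (54.39% vs. 45.61%). Against GraphRAG, LightRAG loses only on the Mix dataset, in comprehensiveness, empowerment, and overall score, and wins everywhere else.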
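The comparison can be checked mechanically. The sketch below copies the GraphRAG vs. LightRAG win rates for the Mix dataset from the table and reports the winner per metric; the `winners` helper is hypothetical and only compares the two percentages.

```python
# Win rates (%) from the table: (GraphRAG, LightRAG) on the Mix dataset.
MIX_GRAPHRAG_VS_LIGHTRAG = {
    "Comprehensiveness": (51.86, 48.14),
    "Diversity": (35.87, 64.13),
    "Empowerment": (52.94, 47.06),
    "Overall": (51.86, 48.14),
}


def winners(results: dict, baseline_name: str) -> dict:
    """Map each metric to the system with the higher win rate."""
    return {
        metric: "LightRAG" if lightrag > baseline else baseline_name
        for metric, (baseline, lightrag) in results.items()
    }


mix_winners = winners(MIX_GRAPHRAG_VS_LIGHTRAG, "GraphRAG")
for metric, winner in mix_winners.items():
    print(f"{metric}: {winner}")
```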

Reproduce

All the code can be found in the ./reproduce directory.

Step 0: Extract Unique Contexts

First, we need to extract the unique contexts from the datasets.

Code

import os
import json
import glob

def extract_unique_contexts(input_directory, output_directory):
    os.makedirs(output_directory, exist_ok=True)

    jsonl_files = glob.glob(os.path.join(input_directory, '*.jsonl'))
    print(f"Found {len(jsonl_files)} JSONL files.")

    for file_path in jsonl_files:
        filename = os.path.basename(file_path)
        name, ext = os.path.splitext(filename)
        output_filename = f"{name}_unique_contexts.json"
        output_path = os.path.join(output_directory, output_filename)

        # A dict is used as an insertion-ordered set of contexts.
        unique_contexts_dict = {}

        print(f"Processing file: {filename}")

        try:
            with open(file_path, 'r', encoding='utf-8') as infile:
                for line_number, line in enumerate(infile, start=1):
                    line = line.strip()
                    if not line:
                        continue
                    try:
                        json_obj = json.loads(line)
                        context = json_obj.get('context')
                        if context and context not in unique_contexts_dict:
                            unique_contexts_dict[context] = None
                    except json.JSONDecodeError as e:
                        print(f"JSON decoding error in file {filename} at line {line_number}: {e}")
        except FileNotFoundError:
            print(f"File not found: {filename}")
            continue
        except Exception as e:
            print(f"An error occurred while processing file {filename}: {e}")
            continue

        unique_contexts_list = list(unique_contexts_dict.keys())
        print(f"There are {len(unique_contexts_list)} unique `context` entries in the file {filename}.")

        try:
            with open(output_path, 'w', encoding='utf-8') as outfile:
                json.dump(unique_contexts_list, outfile, ensure_ascii=False, indent=4)
            print(f"Unique `context` entries have been saved to: {output_filename}")
        except Exception as e:
            print(f"An error occurred while saving to the file {output_filename}: {e}")

    print("All files have been processed.")
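The core of this step is order-preserving deduplication with a dict. The sketch below isolates that idea on in-memory JSONL lines; `unique_contexts` and the sample records are hypothetical and stand in for the dataset files.

```python
import json


def unique_contexts(jsonl_lines):
    """Return the distinct `context` values in first-seen order."""
    seen = {}  # dicts keep insertion order, so this acts as an ordered set
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        context = json.loads(line).get("context")
        if context:
            seen.setdefault(context, None)
    return list(seen)


# Hypothetical records: two of the three share the same context.
sample_lines = [
    '{"context": "Chapter one text", "input": "q1"}',
    '{"context": "Chapter two text", "input": "q2"}',
    "",
    '{"context": "Chapter one text", "input": "q3"}',
]
print(unique_contexts(sample_lines))
```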

Step 1: Insert Contexts

We insert the extracted contexts into the LightRAG system.

Code

import json
import time

def insert_text(rag, file_path):
    with open(file_path, mode='r') as f:
        unique_contexts = json.load(f)

    retries = 0
    max_retries = 3
    while retries < max_retries:
        try:
            rag.insert(unique_contexts)
            break
        except Exception as e:
            retries += 1
            print(f"Insertion failed, retrying ({retries}/{max_retries}), error: {e}")
            time.sleep(10)
    if retries == max_retries:
        print("Insertion failed after exceeding the maximum number of retries")
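The retry loop above can be factored into a reusable helper. This is a sketch under assumptions: `call_with_retries` is a hypothetical function, and the `sleep` parameter is injected so the delay can be skipped in a demonstration.

```python
import time


def call_with_retries(func, max_retries=3, delay_seconds=10, sleep=time.sleep):
    """Call func until it succeeds or max_retries attempts have failed."""
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception as exc:
            print(f"Attempt {attempt}/{max_retries} failed: {exc}")
            if attempt < max_retries:
                sleep(delay_seconds)  # wait before the next attempt
    raise RuntimeError(f"Failed after {max_retries} attempts")


# Demonstration with a function that fails twice and then succeeds.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky, sleep=lambda seconds: None)
print(result)
```

With a LightRAG instance, the call would look like `call_with_retries(lambda: rag.insert(unique_contexts))`.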

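Step 2: Generate Queries

We extract tokens from the first half and the second half of each context in the dataset, then combine them as the dataset description used to generate queries.

Code

from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

def get_summary(context, tot_tokens=2000):
    tokens = tokenizer.tokenize(context)
    half_tokens = tot_tokens // 2

    # Skip the first 1000 tokens, then take half of the budget.
    start_tokens = tokens[1000:1000 + half_tokens]
    # Take the other half, ending 1000 tokens before the end of the context.
    end_tokens = tokens[-(1000 + half_tokens):-1000]

    summary_tokens = start_tokens + end_tokens
    summary = tokenizer.convert_tokens_to_string(summary_tokens)

    return summary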
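The windowing is easier to see on plain integers than on tokens. The sketch below assumes the intent is to skip a fixed offset from each end of the context and take half of the token budget from each side; `summary_window` is a hypothetical helper and `offset` must be positive.

```python
def summary_window(tokens, tot_tokens=2000, offset=1000):
    """Take tot_tokens // 2 items after `offset` from the start and the same number before `offset` from the end."""
    half = tot_tokens // 2
    start = tokens[offset:offset + half]
    end = tokens[-(offset + half):-offset]
    return start + end


tokens = list(range(5000))  # stand-in for a tokenized context
window = summary_window(tokens)
print(len(window), window[0], window[999], window[1000], window[-1])
```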

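Step 3: Query

We extract the queries generated in Step 2 and use them to query LightRAG.

Code

import re

def extract_queries(file_path):
    with open(file_path, 'r') as f:
        data = f.read()

    # Remove bold markers so the pattern below matches.
    data = data.replace('**', '')

    queries = re.findall(r'- Question \d+: (.+)', data)

    return queries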
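The pattern can be tried on an in-memory string. The sample below is a hypothetical guess at what a generated query file might look like; only the "- Question N: ..." shape matters to the regular expression.

```python
import re

# Hypothetical generated output; the real file layout may differ.
SAMPLE_OUTPUT = """
- User 1: A literature student
    - Task 1: Summarize the story
        - **Question 1:** What are the main themes?
        - **Question 2:** How does the protagonist change?
"""


def extract_queries_from_text(data: str) -> list:
    """Strip bold markers, then collect the text after each 'Question N:' label."""
    data = data.replace("**", "")
    return re.findall(r"- Question \d+: (.+)", data)


queries = extract_queries_from_text(SAMPLE_OUTPUT)
print(queries)
```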

Code Structure

.
├── examples
│   ├── batch_eval.py
│   ├── generate_query.py
│   ├── graph_visual_with_html.py
│   ├── graph_visual_with_neo4j.py
│   ├── lightrag_api_openai_compatible_demo.py
│   ├── lightrag_azure_openai_demo.py
│   ├── lightrag_bedrock_demo.py
│   ├── lightrag_hf_demo.py
│   ├── lightrag_lmdeploy_demo.py
│   ├── lightrag_ollama_demo.py
│   ├── lightrag_openai_compatible_demo.py
│   ├── lightrag_openai_demo.py
│   ├── lightrag_siliconcloud_demo.py
│   └── vram_management_demo.py
├── lightrag
│   ├── __init__.py
│   ├── base.py
│   ├── lightrag.py
│   ├── llm.py
│   ├── operate.py
│   ├── prompt.py
│   ├── storage.py
│   └── utils.py
├── reproduce
│   ├── Step_0.py
│   ├── Step_1_openai_compatible.py
│   ├── Step_1.py
│   ├── Step_2.py
│   ├── Step_3_openai_compatible.py
│   └── Step_3.py
├── .gitignore
├── .pre-commit-config.yaml
├── LICENSE
├── README.md
├── requirements.txt
└── setup.py


Citation

@article{guo2024lightrag,
title={LightRAG: Simple and Fast Retrieval-Augmented Generation},
author={Zirui Guo and Lianghao Xia and Yanhua Yu and Tu Ao and Chao Huang},
year={2024},
eprint={2410.05779},
archivePrefix={arXiv},
primaryClass={cs.IR}
}