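Cross-Collection and Cross-Database Queries in MongoDB
- 1. Data preparation:
- 2. Cross-collection query
- 3. Cross-database query
- What should we do?
A simple example of the right way to run cross-collection and cross-database queries against MongoDB from Python.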
1. Data preparation:
use order_db
db.createCollection("orders");
db.orders.insertMany([
  {"_id": 1, "order_number": "ORD123", "product": "Laptop", "customer_id": 101},
  {"_id": 2, "order_number": "ORD124", "product": "Smartphone", "customer_id": 102},
  {"_id": 3, "order_number": "ORD125", "product": "Tablet", "customer_id": 103}
]);

use customer_db
db.createCollection("customers");
db.customers.insertMany([
  {"_id": 101, "name": "John Doe", "email": "john@example.com", "address": "123 Main St"},
  {"_id": 102, "name": "Jane Smith", "email": "jane@example.com", "address": "456 Oak Ave"},
  {"_id": 103, "name": "Bob Johnson", "email": "bob@example.com", "address": "789 Pine Blvd"}
]);
2. Cross-collection query
from pymongo import MongoClient

# Joint table query
# Connect to MongoDB
client = MongoClient("mongodb://admin:admin@localhost:27017/")

# Select the databases and collections
db_orders = client.order_db.orders
db_customers = client.customer_db.customers

# Run the join; "from" is resolved inside order_db, the database that owns db_orders
pipeline = [
    {"$lookup": {
        "from": "customers",
        "localField": "customer_id",
        "foreignField": "_id",
        "as": "customer_info",
    }},
    {"$unwind": "$customer_info"},
    {"$project": {
        "_id": 1,
        "order_number": 1,
        "product": 1,
        "customer_info.name": 1,
        "customer_info.email": 1,
    }},
]
result = list(db_orders.aggregate(pipeline))

# Print the result
for order in result:
    print(order)
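Analysis:
Running this code shows that no customer is ever matched: after the $lookup stage, customer_info is an empty array for every order, so the following $unwind drops every document and the final result is empty. $lookup is a join between collections that live in the same database. On a regular MongoDB deployment it does not reach into another database, even though the web is full of examples that claim to do cross-database queries with it.
Once both collections are placed in the same database, the result is as expected.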
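As a rough illustration of that fix, the sketch below assumes both collections have been inserted into one database. The database name `shop_db` and the helpers `build_order_customer_pipeline` and `joined_orders` are hypothetical names introduced here for illustration; only the pipeline stages themselves come from the example above.

```python
def build_order_customer_pipeline(from_collection="customers"):
    """Return the same $lookup pipeline as above.

    `from_collection` is looked up in the SAME database as the collection
    on which aggregate() is called.
    """
    return [
        {"$lookup": {
            "from": from_collection,
            "localField": "customer_id",
            "foreignField": "_id",
            "as": "customer_info",
        }},
        # Turn the one-element customer_info array into a plain sub-document.
        {"$unwind": "$customer_info"},
        {"$project": {
            "_id": 1,
            "order_number": 1,
            "product": 1,
            "customer_info.name": 1,
            "customer_info.email": 1,
        }},
    ]


def joined_orders(db):
    """Run the join on a database object that holds BOTH collections."""
    return list(db["orders"].aggregate(build_order_customer_pipeline()))


# Usage (assumes a running MongoDB, pymongo installed, and a hypothetical
# database "shop_db" that contains both orders and customers):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://admin:admin@localhost:27017/")
#   for doc in joined_orders(client["shop_db"]):
#       print(doc)
```

Because the pipeline is built separately from the connection, it is easy to print and inspect which collection name `$lookup` will try to resolve before running it.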
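3. Cross-database query
What should we do?
Think about how real systems are built. They are usually modular: modules talk to each other through service-layer or application-layer interfaces, and each module sits on top of its own database. The familiar order system and user system are an example. Because centralized management (a single database) easily becomes a performance bottleneck, the data is split along business lines, which also makes it easier to reuse and extend.
So a so-called cross-database query is really the same kind of problem as communication between services. It is not a primary-key/foreign-key lookup inside one database, but a question of exchanging data and querying it consistently across several services backed by several databases. The usual approach is to move such joins into the business layer. Replicating the same collection into each database would also work, but that clearly goes against the original purpose of splitting by business domain.
Below is a simple sample; the sample data stays the same: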
from pymongo import MongoClient

# Cross-database query
# Connect to MongoDB (one client per database, as if each belonged to its own service)
client_db1 = MongoClient("mongodb://admin:admin@localhost:27017/")
client_db2 = MongoClient("mongodb://admin:admin@localhost:27017/")

# Select the databases and collections
customer_db = client_db1.customer_db
order_db = client_db2.order_db
customers_collection = customer_db.customers
orders_collection = order_db.orders

# Fetch the orders
orders_data = list(orders_collection.find())

# Fetch the customers, indexed by _id
customers_data_dict = {customer["_id"]: customer for customer in customers_collection.find()}

# Join the data manually
result = []
for order in orders_data:
    customer_id = order.get("customer_id")
    # Look up the customer whose _id matches this order's customer_id
    matching_customer = customers_data_dict.get(customer_id)
    if matching_customer:
        # Merge the order with its customer
        merged_data = {**order, "customer_info": matching_customer}
        result.append(merged_data)

# Print the result
for item in result:
    print(item)
The result is as expected: each order is printed together with the customer_info of its customer.
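The sample above loads the whole customers collection into memory, which is fine for three documents but may not be for a real user database. One possible refinement, sketched here under the assumption that only the customers referenced by the fetched orders are needed, is to collect the customer IDs first and fetch them with a single `$in` query. The function name `join_orders_with_customers` is a hypothetical helper, not part of pymongo; it only relies on the collections' `find()` method.

```python
def join_orders_with_customers(orders_collection, customers_collection):
    """Application-layer join between collections in different databases.

    Orders without a matching customer are dropped, like in the sample above.
    """
    orders = list(orders_collection.find())

    # Collect the distinct customer IDs referenced by the orders.
    customer_ids = list({o["customer_id"] for o in orders if "customer_id" in o})

    # Fetch only those customers in one round trip and index them by _id.
    customers_by_id = {
        c["_id"]: c
        for c in customers_collection.find({"_id": {"$in": customer_ids}})
    }

    # Attach the customer document to each order that has a match.
    return [
        {**o, "customer_info": customers_by_id[o["customer_id"]]}
        for o in orders
        if o.get("customer_id") in customers_by_id
    ]


# Usage with the collections from the sample above:
#   for item in join_orders_with_customers(orders_collection, customers_collection):
#       print(item)
```

The join itself stays in the application, so the two collections can live in different databases or even on different servers. The trade-off is that consistency between the two reads is no longer guaranteed by a single query.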