文章目录
- 使用gRPC基于Protobuf传输大文件或数据流
- 1. 背景和技术选择
- 1.1 gRPC的优势
- 1.2 Protocol Buffers的优势
- 2. 项目配置与环境搭建
- 2.1 安装gRPC和Protocol Buffers
- 2.1.1 安装Cmake
- 2.1.2 设置环境变量
- 2.1.3 安装必要的依赖
- 2.1.4 下载gRPC源码
- 2.1.5 编译gRPC和 [Protocol Buffers](https://grpc.io/docs/languages/cpp/quickstart/#build-and-install-grpc-and-protocol-buffers)
- 2.2 CMake配置详解
- 2.1.1 通用配置
- 2.2.2 项目配置
- 2.3 服务定义与proto文件
- 3. 客户端和服务端的实现
- 3.1 gRPC客户端实现
- 3.2 gRPC服务端实现
- 3.3 TCP客户端实现
- 3.4 TCP服务端实现
- 4. 性能测试与分析
- 4.1 测试方法
- 4.1.1 gPRC测试结果
- 4.1.2 TCP测试结果
- 4.1.3 多次测试结果
- 4.2 性能比较结果
- 4.2.1 CPU 使用
- 4.2.2 内存复用
- 4.2.3 传输时间和速率
- 5. 结论
使用gRPC基于Protobuf传输大文件或数据流
在现代软件开发中,性能通常是关键的考虑因素之一,尤其是在进行大文件传输时。高效的协议和工具可以显著提升传输速度和可靠性。本文详细介绍如何使用gRPC和Protobuf进行大文件传输,并与传统TCP传输进行性能比较。
1. 背景和技术选择
在过去,大文件传输常常使用传统的TCP/IP协议,虽然简单但在处理大规模数据传输时往往速度较慢,尤其在网络条件不佳的环境下更是如此。在最近一个项目中,就有传输大数据文件的需求,用传统方式进行测试发现传输延时无法达到要求,于是在网上查阅资料发现了一个较优的解决方案。
相对于TCP,gRPC提供了一种现代的、高性能的解决方案。gRPC是一个高性能的远程过程调用(RPC)框架,由Google主导开发,使用HTTP/2作为传输层协议,支持多种开发语言,如C++, Java, Python和Go等。Protobuf(Protocol Buffers)则是一种轻量级的数据交换格式,可以高效地序列化结构化数据。
1.1 gRPC的优势
- 高性能: 利用HTTP/2协议,支持多路复用、服务器推送等现代网络技术。
- 跨语言支持: 支持多种编程语言,便于在不同的系统间交互。
- 接口定义: 使用.proto文件定义服务,自动生成服务端和客户端代码,减少重复工作量。
- 流控制: 支持流式传输数据,适合大文件传输和实时数据处理。
1.2 Protocol Buffers的优势
- 高效: 编码和解码迅速,且生成的数据包比XML小3到10倍。
- 灵活: 支持向后兼容性,新旧数据格式可以无缝转换。
- 简洁: 简化了复杂数据结构的处理,易于开发者使用。
2. 项目配置与环境搭建
为了使用gRPC进行项目开发,首先需要在开发环境中安装gRPC及其依赖的库。以下是gRPC安装的步骤,适用于多种操作系统,包括Windows、Linux和macOS。
2.1 安装gRPC和Protocol Buffers
gRPC的安装可以通过多种方式进行,包括使用包管理器或从源代码编译。以下介绍Ubuntu下安装C++版本的gRPC(捆绑了Protocol Buffers)
注:如果gRPC和Protocol Buffers的版本不匹配会有问题,无法正常使用
2.1.1 安装Cmake
sudo apt install -y cmake
2.1.2 设置环境变量
export MY_INSTALL_DIR=$HOME/.local
mkdir -p $MY_INSTALL_DIR
export PATH="$MY_INSTALL_DIR/bin:$PATH" # 为了永久生效可以将该命令写入~/.bashrc文件中
2.1.3 安装必要的依赖
sudo apt install -y build-essential autoconf libtool pkg-config
2.1.4 下载gRPC源码
git clone --recurse-submodules -b v1.62.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc
2.1.5 编译gRPC和 Protocol Buffers
cd grpc
mkdir -p cmake/build
pushd cmake/build
cmake -DgRPC_INSTALL=ON \-DgRPC_BUILD_TESTS=OFF \-DCMAKE_INSTALL_PREFIX=$MY_INSTALL_DIR \../..
make -j 4
make install
popd
2.2 CMake配置详解
2.1.1 通用配置
common.cmake
是一个辅助性的 CMake 模块文件,通常用于存放项目中共用的 CMake 配置,以简化和集中管理 CMakeLists.txt
文件中的代码。这种做法有助于提升项目的可维护性和可读性。
在 gRPC 项目中,示例代码中的common.cmake
包括以下内容:
- 变量设置:定义项目中使用的常见路径和变量,例如 gRPC 和 protobuf 的安装路径,以便在整个项目中重用。
- 库查找:使用
find_package()
或find_library()
命令来查找和配置项目所需的依赖库,如 gRPC、protobuf、SSL 等。 - 编译器选项:统一设置编译器标志,例如 C++ 版本标准、优化级别、警告处理等。
- 宏定义:创建复用的 CMake 宏或函数,例如用于处理 proto 文件生成相关命令的宏,这有助于避免在
CMakeLists.txt
文件中重复相同的代码块。
# Copyright 2018 gRPC authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# cmake build file for C++ route_guide example.
# Assumes protobuf and gRPC have been installed using cmake.
# See cmake_externalproject/CMakeLists.txt for all-in-one cmake build
# that automatically builds all the dependencies before building route_guide.cmake_minimum_required(VERSION 3.8)if(MSVC)add_definitions(-D_WIN32_WINNT=0x600)
endif()find_package(Threads REQUIRED)if(GRPC_AS_SUBMODULE)# One way to build a projects that uses gRPC is to just include the# entire gRPC project tree via "add_subdirectory".# This approach is very simple to use, but the are some potential# disadvantages:# * it includes gRPC's CMakeLists.txt directly into your build script# without and that can make gRPC's internal setting interfere with your# own build.# * depending on what's installed on your system, the contents of submodules# in gRPC's third_party/* might need to be available (and there might be# additional prerequisites required to build them). Consider using# the gRPC_*_PROVIDER options to fine-tune the expected behavior.## A more robust approach to add dependency on gRPC is using# cmake's ExternalProject_Add (see cmake_externalproject/CMakeLists.txt).# Include the gRPC's cmake build (normally grpc source code would live# in a git submodule called "third_party/grpc", but this example lives in# the same repository as gRPC sources, so we just look a few directories up)add_subdirectory(../../.. ${CMAKE_CURRENT_BINARY_DIR}/grpc EXCLUDE_FROM_ALL)message(STATUS "Using gRPC via add_subdirectory.")# After using add_subdirectory, we can now use the grpc targets directly from# this build.set(_PROTOBUF_LIBPROTOBUF libprotobuf)set(_REFLECTION grpc++_reflection)set(_ORCA_SERVICE grpcpp_orca_service)if(CMAKE_CROSSCOMPILING)find_program(_PROTOBUF_PROTOC protoc)else()set(_PROTOBUF_PROTOC $<TARGET_FILE:protobuf::protoc>)endif()set(_GRPC_GRPCPP grpc++)if(CMAKE_CROSSCOMPILING)find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)else()set(_GRPC_CPP_PLUGIN_EXECUTABLE $<TARGET_FILE:grpc_cpp_plugin>)endif()
elseif(GRPC_FETCHCONTENT)# Another way is to use CMake's FetchContent module to clone gRPC at# configure time. This makes gRPC's source code available to your project,# similar to a git submodule.message(STATUS "Using gRPC via add_subdirectory (FetchContent).")include(FetchContent)FetchContent_Declare(grpcGIT_REPOSITORY https://github.com/grpc/grpc.git# when using gRPC, you will actually set this to an existing tag, such as# v1.25.0, v1.26.0 etc..# For the purpose of testing, we override the tag used to the commit# that's currently under test.GIT_TAG vGRPC_TAG_VERSION_OF_YOUR_CHOICE)FetchContent_MakeAvailable(grpc)# Since FetchContent uses add_subdirectory under the hood, we can use# the grpc targets directly from this build.set(_PROTOBUF_LIBPROTOBUF libprotobuf)set(_REFLECTION grpc++_reflection)set(_PROTOBUF_PROTOC $<TARGET_FILE:protoc>)set(_GRPC_GRPCPP grpc++)if(CMAKE_CROSSCOMPILING)find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)else()set(_GRPC_CPP_PLUGIN_EXECUTABLE $<TARGET_FILE:grpc_cpp_plugin>)endif()
else()# This branch assumes that gRPC and all its dependencies are already installed# on this system, so they can be located by find_package().# Find Protobuf installation# Looks for protobuf-config.cmake file installed by Protobuf's cmake installation.option(protobuf_MODULE_COMPATIBLE TRUE)find_package(Protobuf CONFIG REQUIRED)message(STATUS "Using protobuf ${Protobuf_VERSION}")set(_PROTOBUF_LIBPROTOBUF protobuf::libprotobuf)set(_REFLECTION gRPC::grpc++_reflection)if(CMAKE_CROSSCOMPILING)find_program(_PROTOBUF_PROTOC protoc)else()set(_PROTOBUF_PROTOC $<TARGET_FILE:protobuf::protoc>)endif()# Find gRPC installation# Looks for gRPCConfig.cmake file installed by gRPC's cmake installation.find_package(gRPC CONFIG REQUIRED)message(STATUS "Using gRPC ${gRPC_VERSION}")set(_GRPC_GRPCPP gRPC::grpc++)if(CMAKE_CROSSCOMPILING)find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)else()set(_GRPC_CPP_PLUGIN_EXECUTABLE $<TARGET_FILE:gRPC::grpc_cpp_plugin>)endif()
endif()
2.2.2 项目配置
这个配置文件包括了从proto文件生成C++代码的命令,以及编译这些生成的源代码文件为库和可执行文件的命令。利用CMake,我们能够确保项目在不同环境中具有可重复构建的能力。
cmake_minimum_required(VERSION 3.15)project(grpcDemo)set(CMAKE_CXX_STANDARD 17)include(common.cmake)# Proto file
get_filename_component(transfer_proto "transferfile.proto" ABSOLUTE)
get_filename_component(transfer_proto_path "${transfer_proto}" PATH)# Generated sources
set(transfer_proto_srcs "${CMAKE_CURRENT_BINARY_DIR}/transferfile.pb.cc")
set(transfer_proto_hdrs "${CMAKE_CURRENT_BINARY_DIR}/transferfile.pb.h")
set(transfer_grpc_srcs "${CMAKE_CURRENT_BINARY_DIR}/transferfile.grpc.pb.cc")
set(transfer_grpc_hdrs "${CMAKE_CURRENT_BINARY_DIR}/transferfile.grpc.pb.h")
add_custom_command(OUTPUT "${transfer_proto_srcs}" "${transfer_proto_hdrs}" "${transfer_grpc_srcs}" "${transfer_grpc_hdrs}"COMMAND ${_PROTOBUF_PROTOC}ARGS --grpc_out "${CMAKE_CURRENT_BINARY_DIR}"--cpp_out "${CMAKE_CURRENT_BINARY_DIR}"-I "${transfer_proto_path}"--plugin=protoc-gen-grpc="${_GRPC_CPP_PLUGIN_EXECUTABLE}""${transfer_proto}"DEPENDS "${transfer_proto}")# Include generated *.pb.h files
include_directories("${CMAKE_CURRENT_BINARY_DIR}")# transfer_grpc_proto
add_library(transfer_grpc_proto${transfer_grpc_srcs}${transfer_grpc_hdrs}${transfer_proto_srcs}${transfer_proto_hdrs})
target_link_libraries(transfer_grpc_proto${_REFLECTION}${_GRPC_GRPCPP}${_PROTOBUF_LIBPROTOBUF})# Targets greeter_[async_](client|server)
foreach(_targettransfer_client transfer_server)add_executable(${_target} "${_target}.cpp")target_link_libraries(${_target}transfer_grpc_proto${_REFLECTION}${_GRPC_GRPCPP}${_PROTOBUF_LIBPROTOBUF})
endforeach()
2.3 服务定义与proto文件
在gRPC中,服务和消息的定义是通过.proto文件进行的。例如,定义一个文件传输服务,可以在transferfile.proto
中如下定义:
syntax = "proto3";package filetransfer;service FileTransferService {rpc Upload(stream FileChunk) returns (UploadStatus) {}
}message FileChunk {bytes content = 1;
}message UploadStatus {bool success = 1;string message = 2;
}
这里定义了一个FileTransferService
服务,包含了一个Upload
方法,该方法接受一个FileChunk
类型的流,并返回一个UploadStatus
状态。
3. 客户端和服务端的实现
客户端和服务端的实现是通过gRPC框架生成的接口进行的,这些接口基于前面定义的.proto文件。
3.1 gRPC客户端实现
客户端的主要职责是打开文件,读取数据,然后以流的形式发送到服务端。实现代码如下:
#include <iostream>
#include <string>
#include <fstream>
#include <chrono>
#include <grpcpp/grpcpp.h>
#include <grpcpp/channel.h>
#include <grpcpp/client_context.h>
#include <grpcpp/create_channel.h>
#include <grpcpp/security/credentials.h>
#include "transferfile.grpc.pb.h"using grpc::Channel;
using grpc::ClientContext;
using grpc::ClientWriter;
using grpc::Status;
using transferfile::FileChunk;
using transferfile::FileUploadStatus;
using transferfile::TransferFile;#define CHUNK_SIZE 3 * 1024 * 1024 // 3MBclass TransferClient
{
public:TransferClient(std::shared_ptr<Channel> channel) : stub_(TransferFile::NewStub(channel)) {}void uploadFile(const std::string &filename);private:std::unique_ptr<TransferFile::Stub> stub_;
};void TransferClient::uploadFile(const std::string &filename)
{FileChunk chunk;char *buffer = new char[CHUNK_SIZE];FileUploadStatus status;ClientContext context;std::ifstream infile;unsigned long len = 0;auto start = std::chrono::steady_clock::now();infile.open(filename, std::ios::binary | std::ios::in);if (!infile.is_open()){std::cerr << "Error: File not found" << std::endl;return;}std::unique_ptr<ClientWriter<FileChunk>> writer(stub_->UploadFile(&context, &status)); // Create a writer to send chunks of filewhile (!infile.eof()){infile.read(buffer, CHUNK_SIZE);chunk.set_buffer(buffer, infile.gcount());if (!writer->Write(chunk)){std::cerr << "Error: Failed to write chunk" << std::endl;break;}len += infile.gcount();}infile.close();delete[] buffer;writer->WritesDone();Status status1 = writer->Finish();if (status1.ok() && len != status.length()){auto end = std::chrono::steady_clock::now();std::cout << "File uploaded successfully" << std::endl;std::cout << "Time taken: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;std::cout << "File size: " << len << " bytes" << std::endl;auto speed = (len / std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()) * 1000 / 1024 / 1024;std::cout << "Speed: " << speed << " MB/s" << std::endl;}else{std::cerr << "Error: " << status1.error_message() << "len: " << len << " status.length(): " << status.length()<< std::endl;}
}int main(int argc, char *argv[])
{if (argc < 2){std::cerr << "Usage: " << argv[0] << " <filename> [server_ip]" << std::endl;return 1;}std::string filename(argv[1]);std::string server_ip = "localhost";if (argc == 3){server_ip = argv[2];}TransferClient client(grpc::CreateChannel(server_ip + ":50051", grpc::InsecureChannelCredentials()));client.uploadFile(filename);return 0;
}
客户端代码展示了如何创建一个gRPC客户端,如何打开文件,如何将文件切割成块,并且如何将这些块通过网络发送到服务端。
3.2 gRPC服务端实现
服务端的实现则负责接收来自客户端的数据块,并将其写入到服务器上的文件中。服务端代码如下:
#include <iostream>
#include <string>
#include <fstream>
#include <grpcpp/grpcpp.h>
#include <grpcpp/server.h>
#include <grpcpp/server_builder.h>
#include <grpcpp/server_context.h>
#include <grpcpp/security/server_credentials.h>
#include "transferfile.grpc.pb.h"
#include <sys/resource.h>
#include <unistd.h>
#include <chrono>using grpc::Server;
using grpc::ServerBuilder;
using grpc::ServerContext;
using grpc::ServerReader;
using grpc::Status;
using transferfile::FileChunk;
using transferfile::FileUploadStatus;
using transferfile::TransferFile;
using namespace std::chrono;#define CHUNK_SIZE 3 * 1024 * 1024 // 3MBclass TransferServiceImpl final : public TransferFile::Service
{
public:Status UploadFile(ServerContext *context, ServerReader<FileChunk> *reader, FileUploadStatus *status) override;protected:void printUsage(const struct rusage &start, const struct rusage &end){std::cout << "CPU Usage: User time: " << (end.ru_utime.tv_sec - start.ru_utime.tv_sec) + (end.ru_utime.tv_usec - start.ru_utime.tv_usec) / 1000000.0<< "s, System time: " << (end.ru_stime.tv_sec - start.ru_stime.tv_sec) + (end.ru_stime.tv_usec - start.ru_stime.tv_usec) / 1000000.0 << "s" << std::endl;std::cout << "Max resident set size: " << (end.ru_maxrss - start.ru_maxrss) << " KB" << std::endl;}
};Status TransferServiceImpl::UploadFile(ServerContext *context, ServerReader<FileChunk> *reader, FileUploadStatus *status)
{FileChunk chunk;std::ofstream outfile;const char *data;struct rusage usage_start, usage_end;getrusage(RUSAGE_SELF, &usage_start);auto start = high_resolution_clock::now();outfile.open("output.bin", std::ios::binary | std::ios::out | std::ios::trunc);if (!outfile.is_open()){std::cerr << "Error: Failed to open file" << std::endl;return Status::CANCELLED;}while (reader->Read(&chunk)){data = chunk.buffer().c_str();outfile.write(data, chunk.buffer().length());}long pos = outfile.tellp();status->set_length(pos);outfile.close();auto end = high_resolution_clock::now();getrusage(RUSAGE_SELF, &usage_end);auto duration = duration_cast<milliseconds>(end - start);printUsage(usage_start, usage_end);std::cout << "Total transmission time: " << duration.count() << " ms" << std::endl;std::cout << "Total data transmitted: " << pos << " bytes" << std::endl;std::cout << "Transmission rate: " << (pos * 1000.0 / duration.count()) / 1024 / 1024 << " MB/s" << std::endl;return Status::OK;
}void RunServer()
{std::string server_address("0.0.0.0:50051");TransferServiceImpl service;ServerBuilder builder;builder.AddListeningPort(server_address, grpc::InsecureServerCredentials());builder.RegisterService(&service);std::unique_ptr<Server> server(builder.BuildAndStart());std::cout << "Server listening on " << server_address << std::endl;server->Wait();
}int main(int argc, char **argv)
{RunServer();return 0;
}
服务端代码展示了如何创建一个gRPC服务端,如何接收客户端发送的数据块,以及如何将这些数据块写入到磁盘文件中。
3.3 TCP客户端实现
功能同gRPC客户端
#include <iostream>
#include <fstream>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string>
#include <cstring>int main(int argc, char *argv[])
{const char *server_ip = "127.0.0.1";int server_port = 8888;std::string file_path = argv[1];// Create socketint sock = socket(AF_INET, SOCK_STREAM, 0);if (sock < 0){std::cerr << "Socket creation failed." << std::endl;return 1;}// Define the server addressstruct sockaddr_in server_address;memset(&server_address, 0, sizeof(server_address));server_address.sin_family = AF_INET;server_address.sin_port = htons(server_port);inet_pton(AF_INET, server_ip, &server_address.sin_addr);// Connect to the serverif (connect(sock, (struct sockaddr *)&server_address, sizeof(server_address)) < 0){std::cerr << "Connection to server failed." << std::endl;close(sock);return 1;}std::ifstream file(file_path, std::ios::binary);if (!file.is_open()){std::cerr << "Failed to open file: " << file_path << std::endl;close(sock);return 1;}// Send file contentschar *buffer = new char[1024];while (file.read(buffer, sizeof(buffer)) || file.gcount()){if (send(sock, buffer, file.gcount(), 0) < 0){std::cerr << "Failed to send data." << std::endl;break;}}delete[] buffer;std::cout << "File sent successfully." << std::endl;file.close();close(sock);return 0;
}
3.4 TCP服务端实现
功能同gRPC服务端
#include <iostream>
#include <fstream>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <cstring>
#include <string>
#include <sys/resource.h>
#include <unistd.h>
#include <chrono>#define CHUNK_SIZE 3 * 1024 * 1024 // 3MBusing namespace std::chrono;void printUsage(const struct rusage &start, const struct rusage &end)
{std::cout << "CPU Usage: User time: " << (end.ru_utime.tv_sec - start.ru_utime.tv_sec) + (end.ru_utime.tv_usec - start.ru_utime.tv_usec) / 1000000.0<< "s, System time: " << (end.ru_stime.tv_sec - start.ru_stime.tv_sec) + (end.ru_stime.tv_usec - start.ru_stime.tv_usec) / 1000000.0 << "s" << std::endl;std::cout << "Max resident set size: " << (end.ru_maxrss - start.ru_maxrss) << " KB" << std::endl;
}int main()
{int server_port = 8888;const char *output_file = "output_received_file";// Create socketint server_sock = socket(AF_INET, SOCK_STREAM, 0);if (server_sock < 0){std::cerr << "Socket creation failed." << std::endl;return 1;}// Bind socket to IP / portstruct sockaddr_in server_address;memset(&server_address, 0, sizeof(server_address));server_address.sin_family = AF_INET;server_address.sin_port = htons(server_port);server_address.sin_addr.s_addr = INADDR_ANY;if (bind(server_sock, (struct sockaddr *)&server_address, sizeof(server_address)) < 0){std::cerr << "Bind failed." << std::endl;close(server_sock);return 1;}// Listenif (listen(server_sock, 10) < 0){std::cerr << "Listen failed." << std::endl;close(server_sock);return 1;}std::cout << "Server is listening on port " << server_port << std::endl;// Accept connectionstruct sockaddr_in client_address;socklen_t client_len = sizeof(client_address);int client_sock = accept(server_sock, (struct sockaddr *)&client_address, &client_len);if (client_sock < 0){std::cerr << "Accept failed." << std::endl;close(server_sock);return 1;}struct rusage usage_start, usage_end;getrusage(RUSAGE_SELF, &usage_start);auto start = high_resolution_clock::now();std::ofstream file(output_file, std::ios::binary);if (!file.is_open()){std::cerr << "Failed to open file for writing." << std::endl;close(client_sock);close(server_sock);return 1;}// Receive datachar *buffer = new char[CHUNK_SIZE];int bytes_received;while ((bytes_received = recv(client_sock, buffer, sizeof(buffer), 0)) > 0){file.write(buffer, bytes_received);}if (bytes_received < 0){std::cerr << "Error in recv()." << std::endl;}else{std::cout << "File received successfully." << std::endl;long pos = file.tellp();auto end = high_resolution_clock::now();getrusage(RUSAGE_SELF, &usage_end);auto duration = duration_cast<milliseconds>(end - start);printUsage(usage_start, usage_end);std::cout << "Total transmission time: " << duration.count() << " ms" << std::endl;std::cout << "Total data transmitted: " << pos << " bytes" << std::endl;std::cout << "Transmission rate: " << (pos * 1000.0 / duration.count()) / 1024 / 1024 << " MB/s" << std::endl;}delete[] buffer;file.close();close(client_sock);close(server_sock);return 0;
}
4. 性能测试与分析
为了验证gRPC与Protobuf的效率,我设置了一个基准测试,比较使用gRPC和传统TCP直接传输大文件的性能差异。
4.1 测试方法
测试方法包括:
- 准备一定大小的测试文件,这里随机生成了2GB的文件。(
fallocate -l 2G 2GBfile.txt
) - 分别使用gRPC和TCP传输此文件,记录所需的总时间和CPU、内存等资源的使用情况。
- 重复测试,确保数据的准确性。
4.1.1 gPRC测试结果
4.1.2 TCP测试结果
4.1.3 多次测试结果
User time | System time | Max resident set size | Transmission time | Transmission rate | |
---|---|---|---|---|---|
gRPC | 0.837348 s | 3.61289 s | 118196 KB | 3.313 s | 618.171 MB/s |
TCP | 15.0924 s | 64.0648 s | 0 KB | 79.672 s | 25.7054 MB/s |
gRPC | 0.782648 s | 8.72327 s | 42992 KB | 8.792 s | 232.939 MB/s |
TCP | 14.2556 s | 61.0553 s | 0 KB | 75.441 s | 27.147 MB/s |
gRPC | 0.921911 s | 2.25723 s | 44548 KB | 2.340 s | 875.214 MB/s |
TCP | 14.1333 s | 61.6136 s | 0 KB | 76.090 s | 26.9155 MB/s |
gRPC | 0.8371 s | 3.73357 s | 130056 KB | 3.565 s | 574.474 MB/s |
TCP | 14.6751 s | 65.7286 s | 0 KB | 80.590 s | 25.4126 MB/s |
gRPC | 0.942197 s | 2.74693 s | 36912 KB | 2.601 s | 787.389 MB/s |
TCP | 18.8037 s | 73.1375 s | 0 KB | 92.526 s | 22.1343 MB/s |
4.2 性能比较结果
4.2.1 CPU 使用
gRPC在CPU使用上明显低于传统的TCP socket方法。这主要因为gRPC内部使用了更现代的HTTP/2协议,它支持多路复用、服务器推送等高效的数据传输机制,而不需要像TCP那样对每个文件传输任务建立单独的连接。此外,gRPC的实现中可能包含了更优化的数据处理路径,减少了上下文切换和系统调用的开销。
4.2.2 内存复用
Max resident set size(最大常驻集大小)表示进程在内存中占用的最大空间。TCP方式的最大常驻集大小一直是0KB,而gRPC分配的比较多,这是由于gRPC框架本身的内存需求,以及可能的内存缓冲机制,这有助于提高数据处理的速率和效率。
4.2.3 传输时间和速率
gRPC在传输速度上极大超过了TCP socket。这种巨大的差异主要来自于gRPC使用HTTP/2的优势,如头部压缩、二进制帧传输和连接复用。HTTP/2的二进制帧结构使得传输更加高效,并减少了因为文本解析带来的开销。此外,连接复用允许在单一连接上并行交换消息,从而显著提升了数据传输效率,减少了因建立和关闭多个TCP连接所产生的延迟和资源消耗。
测试结果显示,使用gRPC和Protobuf传输大文件在多个方面均优于传统TCP方法:
- 传输速度: gRPC利用HTTP/2的多路复用功能,可以在一个连接中并行传输多个文件,显著提升了传输效率。
- 资源利用率: gRPC的传输过程中CPU利用率较低,内存复用率较高,更适合长时间运行的应用场景。
- 错误处理: gRPC内置的错误处理机制能有效地管理网络问题和数据传输错误,保证数据的完整性。
- 高效的数据序列化: Protobuf非常高效,生成的数据包体积小,通常比相等的XML小3到10倍。这意味着在网络上传输相同的数据量时,Protobuf需要的带宽更少。
- 快速序列化与反序列化: Protobuf提供了非常快的数据序列化和反序列化能力,这对于性能要求高的应用尤其重要,可以显著减少数据处理时间。
- 明确的结构定义: 使用Protobuf可以让数据结构更加清晰和严格,有助于团队内部的沟通和后期的维护。
- 避免手动解析:与自定义的二进制格式相比,Protobuf避免了手动解析数据的错误和复杂性,因为解析工作是自动化的,由工具链支持。
5. 结论
使用gRPC和Protobuf传输大文件,不仅提高了传输速度,而且确保了更高的可靠性和更低的资源消耗。这使得gRPC成为大规模数据处理和分布式系统中的理想选择。未来,随着技术的进一步成熟和优化,预计gRPC在更多场景中将显示出其优越性。
希望本文对于那些寻求改进大文件传输性能的开发者有所帮助,并能够启发更多的技术创新和应用。