6.聊天室环境安装 - Ubuntu22.04 - elasticsearch(es)的安装和使用

介绍

Elasticsearch，简称 ES，它是个开源分布式搜索引擎，它的特点有：分布式，零配置，自动发现，索引自动分片，索引副本机制，restful 风格接口，多数据源，自动搜索负载等。它可以近乎实时的存储、检索数据；本身扩展性很好，可以扩展到上百台服务器，处理 PB 级别的数据。es 也使用 Java 开发并使用 Lucene 作为其核心来实现所有索引和搜索的功能，但是它的目的是通过简单的 RESTful API 来隐藏 Lucene 的复杂性，从而让全文搜索变得简单。
Elasticsearch 是面向文档(document oriented)的，这意味着它可以存储整个对象或文档(document)。然而它不仅仅是存储，还会索引(index)每个文档的内容使之可以被搜索。在 Elasticsearch 中，你可以对文档（而非成行成列的数据）进行索引、搜索、排序、过滤。

安装

1.添加仓库密钥
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
2.添加镜像源仓库
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elasticsearch.list
3.更新软件包列表
sudo apt update
4.安装es
sudo apt-get install elasticsearch=7.17.21
5.安装ik分词器插件
sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install https://get.infini.cloud/elasticsearch/analysis-ik/7.17.21
6.启动es
sudo systemctl start elasticsearch
6.5如果启动es失败
调整 ES 虚拟内存，虚拟内存默认最大映射数为 65530，无法满足 ES 系统要求，需要调整为 262144 以上
sysctl -w vm.max_map_count=262144
增加虚拟机内存配置
vim /etc/elasticsearch/jvm.options
新增如下内容:
-Xms512m
-Xmx512m
在这里插入图片描述

设置完后重启ubuntu
7.查看es服务的状态
sudo systemctl status elasticsearch.service
在这里插入图片描述
8.验证es是否安装成功
curl -X GET "http://localhost:9200/"

9.设置能够外部访问: 如果新配置完成默认只能在本机进行访问
vim /etc/elasticsearch/elasticsearch.yml
新增配置

network.host: 0.0.0.0
http.port: 9200
cluster.initial_master_nodes: ["node-1"]

在这里插入图片描述
重启后sudo systemctl restart elasticsearch.service
浏览器访问http://自己的IP:9200/

安装kibana

kibana可以支持通过网页对ES进行访问(增删查改), 这可以让我们的测试更加直观一些
1.安装kibana
sudo apt install kibana
2.配置kibana
sudo vim /etc/kibana/kibana.yml
在7行修改为server.host: "0.0.0.0"
在这里插入图片描述

在32行修改为elasticsearch.hosts: ["http://0.0.0.0:9200"]
在这里插入图片描述
3.启动kibana服务
sudo systemctl start kibana
4.验证安装
sudo systemctl status kibana
5.访问kibana
在浏览器访问kibana, http://你的ip:5601

安装ES客户端

1.克隆代码
git clone https://github.com/seznam/elasticlient
注意: 这里不能从github下载源码然后拖拽进来, 因为内部有git子模块, 需要去更新子模块之后, 才能去编译
如果无法始终clone不下来, 也有对应的解决方案:https://blog.csdn.net/eyuyanniniu/article/details/145807381
2.切换目录
cd elasticlient
3.安装 MicroHTTPD 库
sudo apt-get install libmicrohttpd-dev
4.更新子模块
git submodule update --init --recursive
5.编译安装代码
mkdir build
cd build
cmake ..
make
make install

6.配置环境变量
因为我们的库默认安装路径是/usr/local/lib, 编译器可能找不到这个库目录的位置
所以我们需要配置(这些文件最好都进行配置):
全局设置: /etc/profile 当前用户设置: .bash_profil或.bashrc
在文件末尾加上 export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

使用

封装icsearch.hpp文件

#include <elasticlient/client.h>
#include <cpr/cpr.h>
#include <json/json.h>
#include <iostream>
#include <memory>
#include "logger.hpp"//ES的二次封装, 原因: 为了简化ES的使用操作, 我们可以看到, 请求的时候, 正文很长, 我们希望只设置我们关心的参数即可, 而且能自动的构造完成
//封装四个操作: 索引创建, 数据新增, 数据查询, 数据删除namespace wufan_im{
bool UnSerialize(const std::string& src, Json::Value& val)
{// 同样的Read类, 需要先构造出工厂类Json::CharReaderBuilder crb;std::unique_ptr<Json::CharReader> cr(crb.newCharReader());std::string err;bool ret = cr->parse(src.c_str(), src.c_str() + src.size(), &val, &err);if (ret == false) {std::cout << "json反序列化失败: " << err << std::endl;return false;}return true;
}bool Serialize(const Json::Value& val, std::string& dst)
{// Writer(StreamWriter)类, 这个类就是用来序列化的, 但是这个类不能直接构造, 因为使用了工厂模式// 先定义Json::SreamWriter 工厂类 Json::StreamWriterBuilderJson::StreamWriterBuilder swb;  //构造出工厂类std::unique_ptr<Json::StreamWriter> sw(swb.newStreamWriter());// 通过Json::StreamWriter中的write接口进行序列化std::stringstream ss;int ret = sw->write(val, &ss);  //将其序列化到字符流里面if (ret != 0) {std::cout << "Json反序列化失败!\n";return false;}dst = ss.str();return true;
}// 索引创建:
// 传两个参数, 索引名称 和 索引类型 就可以创建出索引
// 能够添加字段, 并设置字段类型, 设置分词器类型, 是否构造索引
class ESIndex{public:ESIndex(std::shared_ptr<elasticlient::Client>& client, const std::string& name, const std::string& type = "_doc"):_name(name), _type(type), _client(client){Json::Value analysis;   //可以把Value当做Json里的{ }Json::Value analyzer;Json::Value ik;Json::Value tokenizer;tokenizer["tokenizer"] = "ik_max_word";ik["ik"] = tokenizer;analyzer["analyzer"] = ik;analysis["analysis"] = analyzer;_index["settings"] = analysis;}// 创建索引, 就相当于在设置表结构 - ai说的// 添加字段, 就相当于设置表的字段属性ESIndex& append(const std::string& key, const std::string& type = "text", const std::string& analyzer = "ik_max_word", bool enabled = true) {Json::Value fields;fields["type"] = type;fields["analyzer"] = analyzer;if (enabled == false) fields["enabled"] = enabled;_properties[key] = fields;return *this;}bool create(const std::string& index_id = "default_index_id") {Json::Value mappings;mappings["dynamic"] = true;mappings["properties"] = _properties;_index["mappings"] = mappings;std::string body;bool ret = Serialize(_index, body);if (ret == false) {LOG_ERROR("索引序列化失败! ");return false;}LOG_DEBUG("{}", body);// 2. 发起搜索请求try{    //因为请求失败就可能会抛异常, 异常你不接住, 程序就会崩溃auto rsp = _client->index(_name, _type, index_id, body);if (rsp.status_code < 200 || rsp.status_code >= 300) {LOG_ERROR("创建ES索引 {} 失败, 响应状态码异常: {}", _name, rsp.status_code);return false;}} catch(std::exception& e) {LOG_ERROR("创建ES索引 {} 失败: {}", _name, e.what());return false;}return true;}private:std::string _name;std::string _type;Json::Value _properties;Json::Value _index;std::shared_ptr<elasticlient::Client> _client;
};// 数据新增
class ESInsert{public:ESInsert(std::shared_ptr<elasticlient::Client>& client, const std::string& name,const std::string& type = "_doc"):_name(name), _type(type), _client(client){}ESInsert& append(const std::string& key, const std::string& val){_item[key] = val;return *this;}// 插入到哪个id里面 -  这个ID就相当于是每一次插入时数据的唯一标识bool insert(const std::string id = ""){std::string body;bool ret = Serialize(_item, body);if (ret == false) {LOG_ERROR("索引序列化失败! ");return false;}LOG_DEBUG("{}", body);// 2. 发起搜索请求try{    //因为请求失败就可能会抛异常, 异常你不接住, 程序就会崩溃auto rsp = _client->index(_name, _type, id, body);if (rsp.status_code < 200 || rsp.status_code >= 300) {LOG_ERROR("新增数据 {} 失败, 响应状态码异常: {}", body, rsp.status_code);return false;}} catch(std::exception& e) {LOG_ERROR("新增数据 {} 失败: {}", body, e.what());return false;}return true;}private:std::string _name;std::string _type;Json::Value _item;std::shared_ptr<elasticlient::Client> _client;
};// 数据删除
class ESRemove{public:ESRemove(std::shared_ptr<elasticlient::Client>& client, const std::string& name, const std::string& type):_name(name), _type(type), _client(client){}bool remove(const std::string& id) {try{    //因为请求失败就可能会抛异常, 异常你不接住, 程序就会崩溃auto rsp = _client->remove(_name, _type, id);if (rsp.status_code < 200 || rsp.status_code >= 300) {LOG_ERROR("删除数据 {} 失败, 响应状态码异常: {}", id, rsp.status_code);return false;}} catch(std::exception& e) {LOG_ERROR("删除数据 {} 失败: {}", id, e.what());return false;}return true;}private:std::string _name;std::string _type;std::shared_ptr<elasticlient::Client> _client;
};//数据查询
class ESSearch{public: //用户还会设置过滤条件，以及应该包含的字段ESSearch(std::shared_ptr<elasticlient::Client>& client, const std::string& name, const std::string& type = "_doc"):_name(name), _type(type), _client(client){}ESSearch& append_must_not_terms(const std::string& key, const std::vector<std::string>& vals){Json::Value fields;for (const auto& val : vals) {fields[key].append(val);}Json::Value terms;terms["terms"] = fields;_must_not.append(terms);return *this;}ESSearch& append_should_match(const std::string& key, const std::string& val) {Json::Value field;field[key] = val;Json::Value match;match["match"] = field;_should.append(match);return *this;}Json::Value search() {Json::Value cond;if (_must_not.empty() == false) cond["must_not"] = _must_not;if (_should.empty() == false) cond["should"] = _should;Json::Value query;query["bool"] = cond;Json::Value root;root["query"] = query;std::string body;bool ret = Serialize(root, body);if (ret == false) {LOG_ERROR("索引序列化失败! ");return Json::Value();}LOG_DEBUG("{}", body);// 2. 发起搜索请求cpr::Response rsp;try{    //因为请求失败就可能会抛异常, 异常你不接住, 程序就会崩溃rsp = _client->search(_name, _type, body);if (rsp.status_code < 200 || rsp.status_code >= 300) {LOG_ERROR("检索数据 {} 失败, 响应状态码异常: {}", body, rsp.status_code);return Json::Value();}} catch(std::exception& e) {LOG_ERROR("检索数据 {} 失败: {}", body, e.what());return Json::Value();}//3. 需要对响应正文进行反序列化LOG_DEBUG("检索响应正文: [{}]", rsp.text);Json::Value json_res;ret = UnSerialize(rsp.text, json_res);if (ret == false) {LOG_ERROR("检索数据 {} 结果反序列化失败", rsp.text);return Json::Value();}return json_res["hits"]["hits"];}private:std::string _name;std::string _type;//用户还会设置过滤条件，以及应该包含的字段Json::Value _must_not;  //必须不包含的Json::Value _should;    //必须包含的, 多选一即可std::shared_ptr<elasticlient::Client> _client;
};
}

main.cc文件

#include "../../common/icsearch.hpp"
#include <gflags/gflags.h>DEFINE_bool(run_mode, false, "程序的运行模式, false-调试; true-发布;");
DEFINE_string(log_file, "", "发布模式下, 用于指定日志的输出文件");
DEFINE_int32(log_level, 0, "发布模式下, 用于指定日志输出等级");int main(int argc, char* argv[])
{google::ParseCommandLineFlags(&argc, &argv, true);wufan_im::init_logger(FLAGS_run_mode, FLAGS_log_file, FLAGS_log_level);std::shared_ptr<elasticlient::Client> client(new elasticlient::Client({"http://127.0.0.1:9200/"}));bool ret = wufan_im::ESIndex(client, "test_user")// 创建索引, 就相当于在设置表结构 - ai说的// 添加字段, 就相当于设置表的字段属性.append("nickname").append("phone", "keyword", "standard", true) //手机号是不能进行分词的, 是一个关键字, 分词器用标准分词器, 需要构造索引.create();if (ret == false) {LOG_INFO("索引创建失败!");return -1;}LOG_INFO("索引创建成功");// 新增数据ret = wufan_im::ESInsert(client, "test_user").append("nickname", "张三").append("phone", "155666777").insert("00001");   // 这个ID就相当于是每一次插入时数据的唯一标识if (ret == false) {LOG_ERROR("数据插入失败!");return -1;}// 数据的修改ret = wufan_im::ESInsert(client, "test_user").append("nickname", "张三").append("phone", "1334444555").insert("00001");if (ret == false) {LOG_ERROR("数据更新失败!");return -1;}LOG_INFO("数据新增成功");Json::Value user = wufan_im::ESSearch(client, "test_user").append_should_match("phone.keyword", "1334444555")     //检索的时候, 告诉ES, 这个关键词不要进行分词// .append_must_not_terms("nickname.keyword", {"张三"}).search();if (user.empty() || user.isArray() == false) {LOG_ERROR("结果为空, 或者结果不是数组类型");return -1;}LOG_INFO("数据检索成功");int sz = user.size();LOG_DEBUG("检索结果条目数量: {}", sz);for (int i = 0; i < sz; ++i) {LOG_INFO("nickname: {}", user[i]["_source"]["nickname"].asString());}ret = wufan_im::ESRemove(client, "test_user", "_doc").remove("00001");if (ret == false) {LOG_ERROR("删除数据失败");return -1;}LOG_INFO("数据删除成功");return 0;
}