Elasticsearch
We need to install both Elasticsearch and Kibana, because Kibana provides a Dev Tools console that makes it easy to send requests to Elasticsearch.
Installing Elasticsearch
Install it with the following docker command:
docker run -d \
  --name es \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  -e "discovery.type=single-node" \
  -v es-data:/usr/share/elasticsearch/data \
  -v es-plugins:/usr/share/elasticsearch/plugins \
  --privileged \
  --network hm-net \
  -p 9200:9200 \
  -p 9300:9300 \
  elasticsearch:7.12.1
Notes:
- ES_JAVA_OPTS sets the JVM heap; do not set it below 512m, or the container may fail to run.
- discovery.type=single-node runs Elasticsearch as a single node rather than a cluster.
- Port 9200 serves the HTTP API; port 9300 is used for inter-node cluster communication.
Installing Kibana
docker run -d \
--name kibana \
-e ELASTICSEARCH_HOSTS=http://es:9200 \
--network=hm-net \
-p 5601:5601 \
kibana:7.12.1
You can then send HTTP requests to Elasticsearch through Kibana's Dev Tools console.
Inverted Index
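The core mechanism of an inverted index is simple enough to sketch in a few lines of Java. The helper below is a toy illustration, not Elasticsearch's actual implementation: a naive whitespace tokenizer stands in for a real analyzer such as ik_smart, and each term is mapped to the ids of the documents that contain it, so a term query becomes a map lookup.

```java
import java.util.*;

public class InvertedIndexDemo {
    // build term -> sorted list of ids of documents containing the term
    static Map<String, List<Integer>> buildIndex(List<String> docs) {
        Map<String, List<Integer>> index = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            // naive whitespace tokenization stands in for a real analyzer
            for (String term : docs.get(docId).toLowerCase().split("\\s+")) {
                List<Integer> postings = index.computeIfAbsent(term, k -> new ArrayList<>());
                // avoid duplicate ids when a term occurs twice in one document
                if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
                    postings.add(docId);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("java is great", "learning java", "great courses");
        Map<String, List<Integer>> index = buildIndex(docs);
        System.out.println(index.get("java"));  // [0, 1]
        System.out.println(index.get("great")); // [0, 2]
    }
}
```

Searching for a term returns its posting list of document ids; intersecting the posting lists of several terms yields the documents that match all of them.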
IK Analyzer
Installation
Online installation:
docker exec -it es ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.12.1/elasticsearch-analysis-ik-7.12.1.zip
Then restart the container:
docker restart es
Offline installation:
First, inspect the plugins data volume of the Elasticsearch container installed earlier:
docker volume inspect es-plugins
The result is as follows:
[
    {
        "CreatedAt": "2024-11-06T10:06:34+08:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/es-plugins/_data",
        "Name": "es-plugins",
        "Options": null,
        "Scope": "local"
    }
]
As you can see, the Elasticsearch plugins directory is mounted at /var/lib/docker/volumes/es-plugins/_data. We need to upload the IK analyzer to this directory.
Find the IK analyzer plugin in the course materials (a zip of the 7.12.1 version), unzip it, and then upload the extracted folder to /var/lib/docker/volumes/es-plugins/_data on the virtual machine.
Restart the es container:
docker restart es
The IK analyzer provides two modes:
- ik_smart: intelligent, semantics-aware segmentation (coarser)
- ik_max_word: finest-grained segmentation
POST /_analyze
{
  "analyzer": "ik_smart",
  "text": "黑马程序员学习java太棒了"
}
Result:
{
  "tokens" : [
    {
      "token" : "黑马",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "程序员",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "学习",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "java",
      "start_offset" : 7,
      "end_offset" : 11,
      "type" : "ENGLISH",
      "position" : 3
    },
    {
      "token" : "太棒了",
      "start_offset" : 11,
      "end_offset" : 14,
      "type" : "CN_WORD",
      "position" : 4
    }
  ]
}
Internet vocabulary keeps growing, so the dictionary needs to be extended. For the IK analyzer to tokenize new words correctly, its dictionary must be kept up to date, and IK provides a mechanism for extending its vocabulary.
1) Open the IK analyzer's config directory.
Note: if you used the online installation, there is no config directory by default; upload the config directory from the ik folder in the course materials to the corresponding location.
2) Add the following to the IKAnalyzer.cfg.xml configuration file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- configure your own extension dictionary here -->
    <entry key="ext_dict">ext.dic</entry>
</properties>
3) Create a file named ext.dic in the IK analyzer's config directory (you can copy an existing file in the config directory and modify it), then add the new words, one per line:
传智播客
泰裤辣
4) Restart Elasticsearch:
docker restart es
Test again; 传智播客 and 泰裤辣 are now tokenized correctly:
{
  "tokens" : [
    {
      "token" : "传智播客",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "开设",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "大学",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "真的",
      "start_offset" : 9,
      "end_offset" : 11,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "泰裤辣",
      "start_offset" : 11,
      "end_offset" : 14,
      "type" : "CN_WORD",
      "position" : 4
    }
  ]
}
Summary
What does an analyzer do?
- Tokenizes documents when the inverted index is created
- Tokenizes the user's input when searching
How many modes does the IK analyzer have?
- ik_smart: intelligent segmentation, coarse-grained
- ik_max_word: finest segmentation, fine-grained
How do you add words to the IK dictionary, and how do you add stopwords?
- Configure an extension dictionary and a stopword dictionary in the IKAnalyzer.cfg.xml file in the config directory
- Add the extension words or stopwords to those dictionary files
Basic Concepts
Mapping Properties
RESTful Conventions
Index Operations
Creating an index:
Basic syntax:
- Request method: PUT
- Request path: /index-name (you choose the name)
- Request body: the mapping
Format:
PUT /index-name
{
  "mappings": {
    "properties": {
      "field1": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "field2": {
        "type": "keyword",
        "index": "false"
      },
      "field3": {
        "properties": {
          "subfield": {
            "type": "keyword"
          }
        }
      }
      // ...
    }
  }
}
Example:
PUT /hmall
{
  "mappings": {
    "properties": {
      "info": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "email": {
        "type": "keyword",
        "index": false
      },
      "name": {
        "properties": {
          "firstName": {
            "type": "keyword"
          },
          "lastName": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
Response:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "hmall"
}
Querying an index
Basic syntax:
- Request method: GET
- Request path: /index-name
- Request body: none
Format:
GET /index-name
Example:
GET /hmall
Response:
{
  "hmall" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "email" : {
          "type" : "keyword",
          "index" : false
        },
        "info" : {
          "type" : "text",
          "analyzer" : "ik_smart"
        },
        "name" : {
          "properties" : {
            "firstName" : {
              "type" : "keyword"
            },
            "lastName" : {
              "type" : "keyword"
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_shards" : "1",
        "provided_name" : "hmall",
        "creation_date" : "1731489076840",
        "number_of_replicas" : "1",
        "uuid" : "KJqNTGY9RhiqFHkliRdGcw",
        "version" : {
          "created" : "7120199"
        }
      }
    }
  }
}
Modifying an index
The inverted index structure itself is not complex, but once it changes (for example, a field's analyzer changes), every document would have to be re-indexed, which would be a disaster. For this reason, once an index is created, its existing mapping cannot be modified.
However, adding new fields to the mapping is allowed, because new fields do not affect the existing inverted index. So "modifying" an index really means adding new fields, or updating the index's basic properties.
Syntax:
PUT /index-name/_mapping
{
  "properties": {
    "newField": {
      "type": "integer"
    }
  }
}
Adding a field:
PUT /hmall/_mapping
{
  "properties": {
    "age": {
      "type": "integer"
    }
  }
}
Response:
{
  "acknowledged" : true
}
Deleting an index
Syntax:
- Request method: DELETE
- Request path: /index-name
- Request body: none
Format:
DELETE /index-name
Example:
DELETE /hmall
Response:
{
  "acknowledged" : true
}
Index operations summary:
What index operations are there?
- Create an index: PUT /index-name
- Query an index: GET /index-name
- Delete an index: DELETE /index-name
- Modify an index (add fields): PUT /index-name/_mapping
As you can see, index operations follow the RESTful style, so the API is very uniform and easy to remember.
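That uniformity can be made concrete with a few lines of Java. The helper below is purely illustrative (its names are not an Elasticsearch API); it builds the request line for each of the four operations:

```java
public class IndexRequestLines {
    // request lines for the four index operations; method names are illustrative only
    static String create(String index)   { return "PUT /" + index; }
    static String query(String index)    { return "GET /" + index; }
    static String remove(String index)   { return "DELETE /" + index; }
    static String addField(String index) { return "PUT /" + index + "/_mapping"; }

    public static void main(String[] args) {
        System.out.println(create("hmall"));   // PUT /hmall
        System.out.println(query("hmall"));    // GET /hmall
        System.out.println(remove("hmall"));   // DELETE /hmall
        System.out.println(addField("hmall")); // PUT /hmall/_mapping
    }
}
```

Note that for create, query, and delete only the HTTP verb changes; the path is simply the index name in every case.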
Document operations
Creating a document
Syntax:
POST /index-name/_doc/document-id
{
  "field1": "value1",
  "field2": "value2",
  "field3": {
    "sub1": "value3",
    "sub2": "value4"
  }
}
Example:
POST /hmall/_doc/1
{
  "info": "黑马程序员Java讲师",
  "email": "zy@itcast.cn",
  "name": {
    "firstName": "云",
    "lastName": "赵"
  }
}
Response:
{
  "_index" : "hmall",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 8,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 8,
  "_primary_term" : 1
}
Retrieving a document
Syntax:
GET /{index-name}/_doc/{id}
Example:
GET /hmall/_doc/1
Response:
{
  "_index" : "hmall",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 8,
  "_seq_no" : 8,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "info" : "黑马程序员Java讲师",
    "email" : "zy@itcast.cn",
    "name" : {
      "firstName" : "云",
      "lastName" : "赵"
    }
  }
}
Deleting a document
Syntax:
DELETE /{index-name}/_doc/{id}
Example:
DELETE /hmall/_doc/1
Response:
{
  "_index" : "hmall",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 9,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 9,
  "_primary_term" : 1
}
Updating a document
Full update
Syntax:
PUT /{index-name}/_doc/{document-id}
{
  "field1": "value1",
  "field2": "value2"
  // ...
}
Example:
# full update
PUT /hmall/_doc/1
{
  "info": "黑马程序员Python讲师",
  "email": "ls@itcast.cn",
  "name": {
    "firstName": "四",
    "lastName": "李"
  }
}
Response:
On the first run, because the document was deleted earlier, the result is created:
{
  "_index" : "hmall",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 10,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 10,
  "_primary_term" : 1
}
On the second run, the document already exists, so the result is updated:
{
  "_index" : "hmall",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 11,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 11,
  "_primary_term" : 1
}
Partial update
Syntax:
POST /{index-name}/_update/{document-id}
{
  "doc": {
    "fieldName": "new value"
  }
}
Example:
# partial update
POST /hmall/_update/1
{
  "doc": {
    "email": "ZhaoYuen@itcast.com"
  }
}
Response:
{
  "_index" : "hmall",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 12,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 12,
  "_primary_term" : 1
}
The document now contains:
{
  "_index" : "hmall",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 12,
  "_seq_no" : 12,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "info" : "黑马程序员Python讲师",
    "email" : "ZhaoYuen@itcast.com",
    "name" : {
      "firstName" : "四",
      "lastName" : "李"
    }
  }
}
Basic document operations summary
Bulk document processing
Each operation in a _bulk request begins with an action line describing what to do:

- index: create (or fully replace) a document
  - _index: the target index name
  - _id: the id of the document to operate on
  - { "field1" : "value1" }: the document content, supplied on the following line
- delete: delete a document
  - _index: the target index name
  - _id: the id of the document to operate on
- update: update a document
  - _index: the target index name
  - _id: the id of the document to operate on
  - { "doc" : {"field2" : "value2"} }: the fields to update, supplied on the following line
# bulk create
POST /_bulk
{"index": {"_index":"hmall", "_id": "3"}}
{"info": "黑马程序员C++讲师", "email": "ww@itcast.cn", "name":{"firstName": "五", "lastName":"王"}}
{"index": {"_index":"hmall", "_id": "4"}}
{"info": "黑马程序员前端讲师", "email": "zhangsan@itcast.cn", "name":{"firstName": "三", "lastName":"张"}}

# bulk delete
POST /_bulk
{"delete":{"_index":"hmall", "_id": "3"}}
{"delete":{"_index":"hmall", "_id": "4"}}
JavaRestClient
Purpose: sends requests to Elasticsearch's RESTful API from Java code.
Client initialization
1) Add the Elasticsearch RestHighLevelClient dependency to the item-service module:
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency>
2) Because the Elasticsearch version managed by Spring Boot defaults to 7.17.10, we need to override it to match our server:
<properties>
    <maven.compiler.source>11</maven.compiler.source>
    <maven.compiler.target>11</maven.compiler.target>
    <elasticsearch.version>7.12.1</elasticsearch.version>
</properties>
3) Initialize the RestHighLevelClient:
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(
        HttpHost.create("http://192.168.150.101:9200")
));
4) Create a test class to verify the setup:
package com.hmall.item.es;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;

import java.io.IOException;

public class Test {
    private RestHighLevelClient client;

    // initialize the client before each test
    @BeforeEach
    void setUp() {
        this.client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://192.168.21.101:9200")
        ));
    }

    // test code
    @org.junit.jupiter.api.Test
    public void test() {
        System.out.println(client);
    }

    // release resources after each test
    @AfterEach
    void tearDown() throws IOException {
        client.close();
    }
}
Product mapping
Analyze which fields need to be included in the documents, then write the mapping accordingly:
PUT /hmall
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "name": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "price": {
        "type": "integer"
      },
      "image": {
        "type": "keyword",
        "index": false
      },
      "category": {
        "type": "keyword"
      },
      "brand": {
        "type": "keyword"
      },
      "sold": {
        "type": "integer"
      },
      "commentCount": {
        "type": "integer",
        "index": false
      },
      "isAD": {
        "type": "boolean"
      },
      "updateTime": {
        "type": "date"
      }
    }
  }
}
Index operations
Creating an index:
// create the index
@org.junit.jupiter.api.Test
public void testCreateIndex() throws IOException {
    // create the Request object
    CreateIndexRequest request = new CreateIndexRequest("items");
    // prepare the request body: the mapping JSON; XContentType.JSON declares its format
    request.source(MAPPING_JSON, XContentType.JSON);
    // send the request; RequestOptions.DEFAULT means no custom options
    client.indices().create(request, RequestOptions.DEFAULT);
}

private final String MAPPING_JSON = "{\n" +
        "  \"mappings\": {\n" +
        "    \"properties\": {\n" +
        "      \"id\": {\"type\": \"keyword\"},\n" +
        "      \"name\": {\"type\": \"text\", \"analyzer\": \"ik_smart\"},\n" +
        "      \"price\": {\"type\": \"integer\"},\n" +
        "      \"image\": {\"type\": \"keyword\", \"index\": false},\n" +
        "      \"category\": {\"type\": \"keyword\"},\n" +
        "      \"brand\": {\"type\": \"keyword\"},\n" +
        "      \"sold\": {\"type\": \"integer\"},\n" +
        "      \"commentCount\": {\"type\": \"integer\", \"index\": false},\n" +
        "      \"isAD\": {\"type\": \"boolean\"},\n" +
        "      \"updateTime\": {\"type\": \"date\"}\n" +
        "    }\n" +
        "  }\n" +
        "}";
Deleting an index
// delete the index
@org.junit.jupiter.api.Test
public void testDeleteIndex() throws IOException {
    // create the Request object
    DeleteIndexRequest request = new DeleteIndexRequest("items");
    // send the request
    client.indices().delete(request, RequestOptions.DEFAULT);
}
Querying an index
// query the index
@org.junit.jupiter.api.Test
public void testGetIndex() throws IOException {
    // create the Request object
    GetIndexRequest request = new GetIndexRequest("items");
    // send the request
    // client.indices().get(request, RequestOptions.DEFAULT);
    // a GET returns a lot of data; if you only need to check existence, use exists instead
    boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println("exists = " + exists);
}
Document operations
Creating a document
Because the index structure differs somewhat from the database table structure, we define an entity class that matches the index structure:
package com.hmall.item.domain.po;

import io.swagger.annotations.ApiModel;
import io.swagger.annotations.ApiModelProperty;
import lombok.Data;

import java.time.LocalDateTime;

@Data
@ApiModel(description = "索引库实体")
public class ItemDoc {
    @ApiModelProperty("商品id")
    private String id;
    @ApiModelProperty("商品名称")
    private String name;
    @ApiModelProperty("价格(分)")
    private Integer price;
    @ApiModelProperty("商品图片")
    private String image;
    @ApiModelProperty("类目名称")
    private String category;
    @ApiModelProperty("品牌名称")
    private String brand;
    @ApiModelProperty("销量")
    private Integer sold;
    @ApiModelProperty("评论数")
    private Integer commentCount;
    @ApiModelProperty("是否是推广广告,true/false")
    private Boolean isAD;
    @ApiModelProperty("更新时间")
    private LocalDateTime updateTime;
}
API syntax
POST /{index-name}/_doc/1
{
  "name": "Jack",
  "age": 21
}
As you can see, this is very similar to the index API, with the same three steps:
- 1) Create a Request object, here an IndexRequest, because adding a document is what builds the inverted index
- 2) Prepare the request parameters, here the JSON document
- 3) Send the request
The difference is that we now call client.xxx() directly; client.indices() is no longer needed.
// create a document
@org.junit.jupiter.api.Test
public void test() throws IOException {
    // read the product from the database
    Item item = service.getById(317578);
    // convert the entity to the document type
    ItemDoc itemDoc = BeanUtil.copyProperties(item, ItemDoc.class);
    // serialize to JSON
    String jsonStr = JSONUtil.toJsonStr(itemDoc);
    // create the Request object
    IndexRequest request = new IndexRequest("items").id(itemDoc.getId());
    request.source(jsonStr, XContentType.JSON);
    // send the request
    client.index(request, RequestOptions.DEFAULT);
}
package com.hmall.item.es;

import cn.hutool.core.bean.BeanUtil;
import cn.hutool.json.JSONUtil;
import com.hmall.item.domain.po.Item;
import com.hmall.item.domain.po.ItemDoc;
import com.hmall.item.service.impl.ItemServiceImpl;
import org.apache.http.HttpHost;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.io.IOException;

@SpringBootTest(properties = {"spring.profiles.active=local"})
public class DocTest {
    private RestHighLevelClient client;

    @Autowired
    private ItemServiceImpl service;

    // initialize the client before each test
    @BeforeEach
    void setUp() {
        this.client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://192.168.21.101:9200")
        ));
    }

    // create a document (also acts as a full update when the id already exists)
    @org.junit.jupiter.api.Test
    public void test() throws IOException {
        // read the product from the database
        Item item = service.getById(317578);
        // convert the entity to the document type
        ItemDoc itemDoc = BeanUtil.copyProperties(item, ItemDoc.class);
        // serialize to JSON
        String jsonStr = JSONUtil.toJsonStr(itemDoc);
        // create the Request object
        IndexRequest request = new IndexRequest("items").id(itemDoc.getId());
        request.source(jsonStr, XContentType.JSON);
        // send the request
        client.index(request, RequestOptions.DEFAULT);
    }

    // retrieve a document
    @org.junit.jupiter.api.Test
    public void testGet() throws IOException {
        // create the Request object
        GetRequest request = new GetRequest("items", "317578");
        // send the request
        GetResponse response = client.get(request, RequestOptions.DEFAULT);
        String json = response.getSourceAsString();
        ItemDoc bean = JSONUtil.toBean(json, ItemDoc.class);
        System.out.println("bean = " + bean);
    }

    // delete a document
    @org.junit.jupiter.api.Test
    public void testDelete() throws IOException {
        // create the Request object
        DeleteRequest request = new DeleteRequest("items", "317578");
        // send the request
        client.delete(request, RequestOptions.DEFAULT);
    }

    // partial update of a document
    @org.junit.jupiter.api.Test
    public void testUpdate() throws IOException {
        // create the Request object
        UpdateRequest request = new UpdateRequest("items", "317578");
        // doc() takes alternating field names and values, not paired objects
        request.doc("price", 10000);
        // send the request
        client.update(request, RequestOptions.DEFAULT);
    }

    // release resources after each test
    @AfterEach
    void tearDown() throws IOException {
        client.close();
    }
}
Bulk document operations
// bulk insert, page by page
@org.junit.jupiter.api.Test
public void testBulk() throws IOException {
    // paging parameters
    int size = 1000, pageNo = 1;
    // keep writing until a page comes back empty
    while (true) {
        // query one page of products that are on sale (status == 1)
        Page<Item> page = service.lambdaQuery()
                .eq(Item::getStatus, 1)
                .page(new Page<Item>(pageNo, size));
        List<Item> records = page.getRecords();
        // stop when there is no more data (check for null before calling isEmpty)
        if (records == null || records.isEmpty()) {
            return;
        }
        // create the BulkRequest
        BulkRequest request = new BulkRequest("items");
        for (Item item : records) {
            // convert to the document type ItemDoc
            ItemDoc itemDoc = BeanUtil.copyProperties(item, ItemDoc.class);
            // add an IndexRequest for each document
            request.add(new IndexRequest()
                    .id(itemDoc.getId())
                    .source(JSONUtil.toJsonStr(itemDoc), XContentType.JSON));
        }
        // send the bulk request
        client.bulk(request, RequestOptions.DEFAULT);
        // move to the next page
        pageNo++;
    }
}
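The BulkRequest built here is ultimately serialized into the same newline-delimited format shown in the Kibana _bulk example earlier: one action line, then (for index actions) one source line. A minimal sketch of that wire format (a hypothetical helper, not the Java client's actual serializer):

```java
import java.util.*;

public class BulkBodyDemo {
    // build a newline-delimited _bulk body: an action line, then the document source line
    static String bulkIndexBody(String index, Map<String, String> docsById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : docsById.entrySet()) {
            body.append("{\"index\": {\"_index\": \"").append(index)
                .append("\", \"_id\": \"").append(e.getKey()).append("\"}}\n");
            body.append(e.getValue()).append('\n'); // document source as one JSON line
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("3", "{\"info\": \"C++ instructor\"}");
        System.out.print(bulkIndexBody("hmall", docs));
    }
}
```

Batching many documents into one request like this is what makes the paged bulk import above far faster than indexing the documents one HTTP call at a time.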