由于es高亮显示机制的问题。当全文内容过多,且搜索中标又少时,就会出现高亮结果无法覆盖全文。因此需要根据需求手动替换。
1.根据es的ik分词器获取搜索词的分词结果。
es部分:
//中文分词解析
post /_analyze
{"analyzer":"ik_smart","text":"谷歌浏览器"
}//结果
{"tokens": [{"token": "谷歌","start_offset": 0,"end_offset": 2,"type": "CN_WORD","position": 0},{"token": "浏览器","start_offset": 2,"end_offset": 5,"type": "CN_WORD","position": 1}]
}
注意:ik_smart 是最粗颗粒度,不会有重复分词。ik_max_word 是最细颗粒度,会有重复分词。高亮显示只需要最粗即可。
ik_smart:
ik_max_word:
将es的语句转为Java语句:
//主要使用的包
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestHighLevelClient;@Resourceprivate RestHighLevelClient restHighLevelClient;/*** 获取到es的分词结果** @param searchContent 查询关键字* @return 分词结果*/private List<String> getAnalyze(String searchContent) {List<String> tokens = new ArrayList<>();if (StringUtils.isNotEmpty(searchContent)) {String endpoint = "/_analyze";String body = "{\n" +" \"analyzer\": \"ik_smart\",\n" +" \"text\": \"" + searchContent + "\"\n" +"}";try {Request request = new Request("POST", endpoint);request.setJsonEntity(body);Response response = restHighLevelClient.getLowLevelClient().performRequest(request);InputStream content = response.getEntity().getContent();JsonNode jsonNode = objectMapper.readTree(content);if (jsonNode.has("tokens")) {for (JsonNode token : jsonNode.get("tokens")) {tokens.add(token.get("token").asText());}}} catch (IOException | UnsupportedOperationException e) {log.error("ES查询分词异常", e);}}return tokens;}
2.根据获取到的多个分词数据。替换全文内容。
/*** 根据多个需要替换的字符,高效替换全文数据* @param replaceStrList 替换字符* @param content 全文* @return 高亮显示的全文*/private String replaceHighlight(List<String> replaceStrList, String content) {StringBuffer result = new StringBuffer();try {Map<String, String> replacements = new HashMap<>();for (String replaceStr : replaceStrList) {replacements.put(replaceStr, "<font class='eslight'>" + replaceStr + "</font>");}Pattern pattern = Pattern.compile(String.join("|", replacements.keySet()));Matcher matcher = pattern.matcher(content);while (matcher.find()) {matcher.appendReplacement(result, replacements.get(matcher.group(0)));}matcher.appendTail(result);} catch (Exception e) {log.error("替换高亮显示异常", e);}return result.toString();}
此时就能将全文关键词以分词的效果高亮显示了。