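This article records some examples of running complex queries with the API in the spring-boot-starter-data-elasticsearch package. Each example pairs a plain-language description of the requirement with the code that implements it.
Add the dependency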
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
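Configure the PO class
When the application loads, the index is created in Elasticsearch automatically from this class. The attributes set on @Field determine the mapping of each field in the index.
Notes on the @Document and @Field annotations
Here english is one of Elasticsearch's built-in language analyzers.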
@AllArgsConstructor
@NoArgsConstructor
@Data
@Document(indexName = "person",type = "man", shards = 1,replicas = 1)
public class ES_Person {
@Id
private Integer id;
@Field(type = FieldType.keyword)
private String name;
private Integer age;
@Field(type = FieldType.keyword)
private String power;
@Field(type = FieldType.text, analyzer = "english")
private String remark;
@Field(type = FieldType.text, analyzer = "english")
private String remark2;
@Field(type = FieldType.text, analyzer = "english")
private String remark3;
}
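This is similar to creating an index in Kibana. Once the index exists, the mapping of an existing field cannot be changed. New fields can still be added, and Elasticsearch will usually create the mapping for a new field automatically based on the type of its value.
#### Creating the object above is roughly equivalent to the following RESTful API request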
PUT person
{
"settings": {
"number_of_shards": 1
, "number_of_replicas": 1
},
"mappings": {
"properties": {
"name":{
"type": "keyword",
"ignore_above": 256
},
"age":{
"type": "long"
},
"power":{
"type": "keyword",
"ignore_above": 256
},
"remark":{
"type": "text",
"analyzer": "english",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"remark2":{
"type": "text",
"analyzer": "english",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"remark3":{
"type": "text",
"analyzer": "english",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}


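The field definitions above differ because I set name to type="keyword", which means the whole value is indexed as a single term and is not analyzed. remark uses the english language analyzer, which splits a sentence into terms (tokenizing on word boundaries, then lowercasing, removing stop words and stemming) and builds the inverted and forward indexes.
Create the Repository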
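To make the difference between an analyzed text field and a keyword field concrete, here is a minimal plain-Java sketch of an inverted index. It is only an illustration: `analyze` is a hypothetical stand-in that lowercases and splits on non-alphanumeric characters, whereas the real english analyzer also removes stop words and stems.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class KeywordVsTextSketch {
    // Hypothetical, heavily simplified analyzer: lowercase, then split on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Inverted index: term -> ids of the documents whose text contains that term.
    static Map<String, Set<Integer>> invertedIndex(Map<Integer, String> docs) {
        Map<String, Set<Integer>> index = new TreeMap<>();
        for (Map.Entry<Integer, String> doc : docs.entrySet()) {
            for (String token : analyze(doc.getValue())) {
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<Integer, String> remarks = new TreeMap<>();
        remarks.put(2, "come from HK");
        remarks.put(3, "come from HongKong too");
        Map<String, Set<Integer>> index = invertedIndex(remarks);
        System.out.println(index.get("come"));             // ids of both documents
        System.out.println(index.get("hongkong"));         // id of document 3 only
        System.out.println(index.containsKey("HongKong")); // terms were lowercased at index time
    }
}
```

A keyword field skips the `analyze` step entirely: the whole value is the only term, so only a lookup with the exact value finds the document.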
/**
* @program: espractice
* @description:
* @author: ZengGuangfu
* @create 2019-08-27 19:30
*/
public interface ES_PersonRepository extends ElasticsearchRepository<ES_Person,Integer>{
/** As with Spring Data JPA, many query methods can be derived directly from the method name, guided by the IDE's completion hints */
List<ES_Person> findAllByNameEqualsAndAgeAfter(String name,Integer age);
}

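The derived query methods look like the above, but they are not the main topic of this article.
Custom complex queries
The NativeSearchQueryBuilder object is typically used to assemble more complex search requests. It can hold a query together with other conditions such as paging and sorting, as the later examples show.
The examples below cover paged queries, fuzzy queries, exact matches, phrase queries, weighting and more.
/** Success message returned for testing */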
public static final String SUCCESS = "success";
@Autowired
private ES_PersonRepository es_personRepository;
- Example 1: Bulk-insert Documents into Elasticsearch
@PutMapping
public String createRepository(HttpServletRequest request){
ArrayList<ES_Person> esPeopleList = new ArrayList<>();
esPeopleList.add(new ES_Person(1,"zengguangfu",26,"m","handsome qiangwudi, a good boy","a good boy","handsome, Shame on you"));
esPeopleList.add(new ES_Person(2,"wuyanzu",45,"m","come from HK","come from HongKong","fly high"));
esPeopleList.add(new ES_Person(3,"pengyuyan",37,"h","come from HongKong too","like bool","nice man"));
esPeopleList.add(new ES_Person(4,"linzhiyin",40,"h","wan wan","come from TaiWan of China","handsome"));
es_personRepository.saveAll(esPeopleList);
return SUCCESS;
}
- Example 2: Paged query
public Page<ES_Person> pageSearch(Integer pageNum,Integer pageSize){
Pageable pageable = PageRequest.of(pageNum, pageSize);
// Option 1: query directly with a Pageable object
//Page<ES_Person> resultPage = es_personRepository.findAll(PageRequest.of(pageNum, pageSize));
// Option 2: assemble a more complex query with the general-purpose NativeSearchQueryBuilder
NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
nativeSearchQueryBuilder.withPageable(pageable);
Page<ES_Person> resultPage = es_personRepository.search(nativeSearchQueryBuilder.build());
// Return the repository's Page as-is so the total hit count and page metadata are preserved
return resultPage;
}
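As a rough illustration of what the Pageable in the example translates to, the sketch below computes the offset and page count by hand. PageRequest.of takes a zero-based page number; the helper names `from` and `totalPages` are hypothetical and only mirror the arithmetic.

```java
class PageOffsetSketch {
    // Offset of the first hit on a zero-based page: page 0 starts at hit 0.
    static long from(int pageNum, int pageSize) {
        return (long) pageNum * pageSize;
    }

    // Number of pages needed for totalHits documents (ceiling division).
    static long totalPages(long totalHits, int pageSize) {
        return (totalHits + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 3;
        long totalHits = 4; // the four sample persons from Example 1
        for (int pageNum = 0; pageNum < totalPages(totalHits, pageSize); pageNum++) {
            System.out.println("page " + pageNum + " starts at hit " + from(pageNum, pageSize));
        }
    }
}
```

Passing pageNum = 1 therefore returns the second page, which is a common source of off-by-one surprises when the front end counts pages from 1.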
- Example 3: Paged query sorted by a given field
/**
* Paged query sorted by a given field
* field the field to sort by
* sort the sort direction, either asc or desc
*/
public Page<ES_Person> pageAndSortSearch(Integer pageNum,Integer pageSize,String field, String sort){
Pageable pageable = PageRequest.of(pageNum, pageSize);
NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
nativeSearchQueryBuilder.withPageable(pageable);
// Sort the paged results by the given field
if (StringUtils.equals(sort.toUpperCase(),SortOrder.ASC.toString().toUpperCase())){
nativeSearchQueryBuilder.withSort(SortBuilders.fieldSort(field).order(SortOrder.ASC));
}else{
nativeSearchQueryBuilder.withSort(SortBuilders.fieldSort(field).order(SortOrder.DESC));
}
Page<ES_Person> result = es_personRepository.search(nativeSearchQueryBuilder.build());
// Return the repository's Page as-is so the total hit count and page metadata are preserved
return result;
}
- Example 4: Find persons aged between 20 and 40
/**
* Find persons aged between 20 and 40
* A BoolQuery can also be wrapped in a NativeSearchQueryBuilder
*/
public List<ES_Person> ageLess(){
ArrayList<ES_Person> list = new ArrayList<>();
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
//.filter(QueryBuilders.rangeQuery("age").gte(20).lte(40));
// Alternatively use from and to; gte/gt/lte/lt call from and to internally
.filter(QueryBuilders.rangeQuery("age").from(20,true).to(40,true));
Iterable<ES_Person> result = es_personRepository.search(queryBuilder);
result.forEach(person->{ list.add(person); });
return list;
}
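- Example 5: match queries
match: match query; for fields whose _mapping has an analyzer, the query text is analyzed automatically before searching
match_all: usually just returns everything, with no conditions
match_phrase: phrase match query; the input is not broken apart. A query for "hello world" returns nothing if no document contains that phrase
multi_match: matches the query text against multiple fields
termQuery: exact term lookup; it is worth studying the difference between term query and match query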
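The difference between match and match_phrase can be sketched in a few lines of plain Java. This is a simplified model, assuming whitespace tokenization, the default OR operator for match, and a slop of 0 for match_phrase; `match` and `matchPhrase` are hypothetical helper names, not Elasticsearch API calls.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MatchVsPhraseSketch {
    // Simplified tokenization: lowercase and split on whitespace.
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    // match (operator OR): a hit as soon as any query token occurs in the document.
    static boolean match(String doc, String query) {
        List<String> docTokens = tokens(doc);
        for (String queryToken : tokens(query)) {
            if (docTokens.contains(queryToken)) {
                return true;
            }
        }
        return false;
    }

    // match_phrase (slop 0): all query tokens must occur adjacent and in order.
    static boolean matchPhrase(String doc, String query) {
        return Collections.indexOfSubList(tokens(doc), tokens(query)) >= 0;
    }

    public static void main(String[] args) {
        String doc = "hello big world";
        System.out.println(match(doc, "hello world"));       // true: "hello" alone is enough
        System.out.println(matchPhrase(doc, "hello world")); // false: "big" sits in between
    }
}
```

A term query, by contrast, skips tokenization of the query text altogether and looks up the input as one exact term.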
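/**
* Exact match, analyzed match and phrase queries
* The name and power fields are type="keyword", so they are not analyzed and only match on the whole value
* remark, remark2 and remark3 are all searched as analyzed text
*/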
public void matchAndMustSearch(String name,String power,String cont){
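NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
// withQuery() replaces any query set earlier, so all conditions are combined in one bool query
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.termQuery("name",name))
.must(QueryBuilders.termQuery("power",power))
// The first argument is the text to match; the remaining arguments are the fields to match against
.must(QueryBuilders.multiMatchQuery(cont,"remark","remark2","remark3"))
// Phrase query: the words must stay together, e.g. the phrase "love you" must appear unbroken in remark
.must(QueryBuilders.matchPhraseQuery("remark","love you"));
nativeSearchQueryBuilder.withQuery(boolQueryBuilder);
es_personRepository.search(nativeSearchQueryBuilder.build());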
}
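- Example 6: ConstantScore Query
Reference: https://www.cnblogs.com/asker009/p/10201051.html
/**
* Relevance weighting: give every matching document the same fixed score
* Usually combined with a filter, because filter context ignores the score
* In the results, _score equals the boost value that was set
*
*/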
public void constantScoreQuery(String keyword){
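// constant_score is used as the top-level query: nested inside bool.filter its boost would be
// discarded, because filter clauses do not contribute to _score
ConstantScoreQueryBuilder constantScoreQueryBuilder = QueryBuilders
.constantScoreQuery(QueryBuilders.rangeQuery("age").gte(30)).boost(3.0f);
Iterable<ES_Person> result = es_personRepository.search(constantScoreQueryBuilder);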
}
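- Example 7: disMax query, which as the name suggests takes the maximum _score
/**
* Of the three matches on remark, remark2 and remark3, the highest score is used as the document's score.
* If remark = 1.3, remark2 = 1.1 and remark3 = 0.3, the result is 1.3
*/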
public void disMaxQuery(String keyword){
DisMaxQueryBuilder disMaxQueryBuilder = QueryBuilders.disMaxQuery()
.add(QueryBuilders.matchQuery("remark", keyword))
.add(QueryBuilders.matchQuery("remark2", keyword))
.add(QueryBuilders.matchQuery("remark3", keyword));
es_personRepository.search(disMaxQueryBuilder);
}
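The scoring rule of dis_max is simple enough to check by hand. The sketch below is plain arithmetic, not Elasticsearch code: it takes the best clause score and adds tie_breaker times the scores of the other matching clauses. With the default tie_breaker of 0, only the maximum counts. The method name `disMax` is hypothetical.

```java
class DisMaxSketch {
    // dis_max score = best clause score + tieBreaker * (sum of the other clause scores)
    static double disMax(double[] clauseScores, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (double score : clauseScores) {
            max = Math.max(max, score);
            sum += score;
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        double[] scores = {1.3, 1.1, 0.3}; // remark, remark2, remark3
        System.out.println(disMax(scores, 0.0)); // only the best field counts
        System.out.println(disMax(scores, 0.5)); // the other fields contribute half their score
    }
}
```

A non-zero tie_breaker is useful when a document that matches in several fields should rank slightly above one that matches in a single field only.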
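- Example 8: boosting query
For example, when we search for Amazon we may want the company rather than the rainforest. We usually do not want to exclude tree-related documents entirely, since must_not is too heavy-handed, so instead we lower their relevance.
/**
* boosting query
* negative_boost must be less than 1
* The first argument is the positive query, the second is the negative query
*/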
public void boosting(){
QueryBuilders.boostingQuery(QueryBuilders.termQuery("name", "company"),
QueryBuilders.matchQuery("remark", "tree")).negativeBoost(0.5f);
}
This is similar to the following request:
GET /_search
{
"query": {
"boosting": {
"positive": {
"match": {
"remark": "Amazon company"
}
},
"negative": {
"match": {
"text": "tree"
}
},
"negative_boost": 0.5
}
}
}
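The effect of negative_boost on the final score can be sketched as plain arithmetic. This is an illustration only, with a hypothetical `score` helper: a document must match the positive query to be returned at all, and if it also matches the negative query its score is multiplied by negative_boost.

```java
class BoostingSketch {
    // Documents matching the negative query are kept, but their score is scaled down.
    static double score(double positiveScore, boolean matchesNegative, double negativeBoost) {
        return matchesNegative ? positiveScore * negativeBoost : positiveScore;
    }

    public static void main(String[] args) {
        double negativeBoost = 0.5;
        // Two documents with the same positive score; only the second mentions "tree".
        System.out.println(score(2.0, false, negativeBoost)); // unchanged
        System.out.println(score(2.0, true, negativeBoost));  // demoted, not removed
    }
}
```

Compared with must_not, the demoted document still appears in the results, just further down.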
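- Example 9: Fuzzy and wildcard queries
/**
* Typo-tolerant query: the content contains "first", and searching for "firsa" still finds it
*
* The wildcard query needs little explanation
*/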
public void fuzzyQuery(String keyword){
QueryBuilders.fuzzyQuery("remark", keyword);
QueryBuilders.wildcardQuery("remark","fir*");
}
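Why "firsa" still finds "first" comes down to edit distance. The sketch below is a plain-Java approximation under two assumptions: it uses the classic Levenshtein distance (Elasticsearch by default also counts a transposition of two adjacent characters as a single edit), and it models the AUTO fuzziness thresholds from the Elasticsearch documentation. The helper names are hypothetical.

```java
class EditDistanceSketch {
    // Classic Levenshtein distance: minimum number of single-character
    // insertions, deletions and substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // AUTO fuzziness: terms of length 0-2 must match exactly, 3-5 allow one edit, longer allow two.
    static int autoFuzziness(String term) {
        int length = term.length();
        return length <= 2 ? 0 : (length <= 5 ? 1 : 2);
    }

    static boolean fuzzyMatches(String indexedTerm, String queryTerm) {
        return levenshtein(indexedTerm, queryTerm) <= autoFuzziness(queryTerm);
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("first", "firsa"));  // one substitution: t -> a
        System.out.println(fuzzyMatches("first", "firsa")); // within the allowed distance
        System.out.println(fuzzyMatches("first", "fxxxt")); // three substitutions, too far
    }
}
```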
- Example 10: Complex weighted query with sorting and paging
Function Score Query
/**
* Search, sort and page
* keyword is the search term applied to remark
* Related link: https://blog.csdn.net/dm_vincent/article/details/42201721
*/
public Page<ES_Person> getESPersons2(String name, Integer age, String keyword,
Integer pageNum, Integer pageSize, Integer sort){
NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
// Paging
Pageable pageAble = PageRequest.of(pageNum, pageSize);
nativeSearchQueryBuilder.withPageable(pageAble);
// Query conditions: exact matches on name and age.
// They are applied with withFilter(), because withQuery() is called again below and would replace them.
if (!StringUtils.isEmpty(name) || age != null){
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
if (!StringUtils.isEmpty(name)){
boolQueryBuilder.must(QueryBuilders.termQuery("name",name));
}
if (age != null){
boolQueryBuilder.must(QueryBuilders.termQuery("age",age));
}
nativeSearchQueryBuilder.withFilter(boolQueryBuilder);
}
/**
* keyword is the search term; remark, remark2 and remark3 are the searched fields, given different weights: remark > remark2 > remark3
* Use new WeightBuilder().setWeight(2) or
* ScoreFunctionBuilders.weightFactorFunction(2)
*/
if (StringUtils.isEmpty(keyword)){
nativeSearchQueryBuilder.withQuery(QueryBuilders.matchAllQuery());
}else{
List<FunctionScoreQueryBuilder.FilterFunctionBuilder> filterFunctionBuilders = new ArrayList<>();
filterFunctionBuilders.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(QueryBuilders.matchQuery("remark", keyword)
, new WeightBuilder().setWeight(2)));
filterFunctionBuilders.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(QueryBuilders.matchQuery("remark2", keyword)
, ScoreFunctionBuilders.weightFactorFunction(1.5f)));
filterFunctionBuilders.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(QueryBuilders.matchQuery("remark3", keyword)
, ScoreFunctionBuilders.weightFactorFunction(1.0f)));
FunctionScoreQueryBuilder.FilterFunctionBuilder[] filterFunctionBuildersArray = filterFunctionBuilders
.toArray(new FunctionScoreQueryBuilder.FilterFunctionBuilder[filterFunctionBuilders.size()]);
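// score_mode SUM adds up the weights of all matching functions; the sum is then combined
// with the query's _score according to boost_mode (multiply by default)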
FunctionScoreQueryBuilder functionScoreQueryBuilder = QueryBuilders.functionScoreQuery(filterFunctionBuildersArray)
.scoreMode(FiltersFunctionScoreQuery.ScoreMode.SUM)
.setMinScore(2);
nativeSearchQueryBuilder.withQuery(functionScoreQueryBuilder);
}
// sort == 1 sorts by age ascending; the default is by relevance score descending
if (sort == 1){
nativeSearchQueryBuilder.withSort(SortBuilders.fieldSort("age").order(SortOrder.ASC));
}else{
nativeSearchQueryBuilder.withSort(SortBuilders.scoreSort().order(SortOrder.DESC));
}
// Return the repository's Page as-is so the total hit count and page metadata are preserved
return es_personRepository.search(nativeSearchQueryBuilder.build());
}
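How the weights, score_mode SUM and min_score interact can be worked through with plain arithmetic. The sketch below is a simplified model, not Elasticsearch code. It assumes the base query score is 1.0 (no base query is passed to the function score query in the example, so it behaves like match_all), the default boost_mode of multiply, and a neutral factor of 1 when no function matches. The helper names are hypothetical.

```java
class FunctionScoreSketch {
    // score_mode = sum: add the weights of every function whose filter matched,
    // then multiply the result with the query score (boost_mode = multiply).
    static double functionScore(double queryScore, boolean[] filterMatches, double[] weights) {
        double sum = 0.0;
        boolean anyMatch = false;
        for (int i = 0; i < weights.length; i++) {
            if (filterMatches[i]) {
                sum += weights[i];
                anyMatch = true;
            }
        }
        return queryScore * (anyMatch ? sum : 1.0);
    }

    // min_score drops every document whose final score is below the threshold.
    static boolean passesMinScore(double score, double minScore) {
        return score >= minScore;
    }

    public static void main(String[] args) {
        double[] weights = {2.0, 1.5, 1.0}; // remark, remark2, remark3
        double onlyRemark = functionScore(1.0, new boolean[]{true, false, false}, weights);
        double onlyRemark2 = functionScore(1.0, new boolean[]{false, true, false}, weights);
        double remark2And3 = functionScore(1.0, new boolean[]{false, true, true}, weights);
        System.out.println(onlyRemark + " kept: " + passesMinScore(onlyRemark, 2));
        System.out.println(onlyRemark2 + " kept: " + passesMinScore(onlyRemark2, 2));
        System.out.println(remark2And3 + " kept: " + passesMinScore(remark2And3, 2));
    }
}
```

Under these assumptions a document that matches the keyword only in remark2 or only in remark3 falls below the min_score of 2 and is dropped, while a match in remark alone is enough to be returned.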