Java Elasticsearch

Author: 茧铭 | Published: 2019-08-29 15:32

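This article records some examples of running complex queries through the API in the spring-boot-starter-data-elasticsearch package. Each example pairs a short description of the requirement with the code that implements it.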

Adding the dependency
<dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
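Configuring the PO class

When the application starts, Spring Data automatically creates the index in Elasticsearch from the annotations on this class. The attributes set on @Field determine the mapping of each field in the index.
[Figure: description of the @Document and @Field annotations]
Here english is one of Elasticsearch's built-in language analyzers.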

@AllArgsConstructor
@NoArgsConstructor
@Data
@Document(indexName = "person",type = "man", shards = 1,replicas = 1)
public class ES_Person {

    @Id
    private Integer id;

    @Field(type = FieldType.keyword)
    private String name;

    private Integer age;

    @Field(type = FieldType.keyword)
    private String power;

    @Field(type = FieldType.text, analyzer = "english")
    private String remark;

    @Field(type = FieldType.text, analyzer = "english")
    private String remark2;

    @Field(type = FieldType.text, analyzer = "english")
    private String remark3;
}

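This is similar to creating an index from the Kibana console. Once the index has been created, the mapping of an existing field cannot be changed, but new fields can still be added; Elasticsearch normally infers the mapping of a new field from the type of its value.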

#### Creating the class above is roughly equivalent to the following REST-style request

PUT person
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "name":{
        "type": "keyword",
        "ignore_above": 256
      },
      "age":{
        "type": "long"
      },
      "power":{
        "type": "keyword",
        "ignore_above": 256
      },
      "remark":{
        "type": "text",
        "analyzer": "english",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      },
      "remark2":{
        "type": "text",
        "analyzer": "english",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      },
      "remark3":{
        "type": "text",
        "analyzer": "english",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      }
    }
  }
}
[Figure: analyzer test on the remark field]
[Figure: analyzer test on the name field]

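The two results differ because name is mapped as type keyword, so the whole value is indexed as a single term and is never analyzed. remark uses the english language analyzer, which breaks the sentence into individual words (and also lowercases and stems them) and builds the inverted index from those terms.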
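To make the difference concrete, here is a minimal plain-Java sketch of what ends up in the index for each kind of field. It does not call Elasticsearch; `AnalyzerSketch`, `keywordTerms`, and `analyzedTerms` are names made up for this illustration, and the "analysis" is only a rough approximation (split and lowercase) of what the real english analyzer does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Simplified stand-in for how a keyword field and an analyzed text field produce index terms. */
final class AnalyzerSketch {

    /** A keyword field indexes the whole value as one term, unchanged. */
    static List<String> keywordTerms(String value) {
        return Collections.singletonList(value);
    }

    /** Rough approximation of analysis: split on non-alphanumerics and lowercase (no stemming, no stop words). */
    static List<String> analyzedTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(keywordTerms("come from HongKong too"));
        System.out.println(analyzedTerms("come from HongKong too"));
    }
}
```

The keyword version yields one term containing the full string, while the analyzed version yields one lowercase term per word, which is why the two analyzer tests return different results.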

建立Repository
/**
 * @program: espractice
 * @description:
 * @author: ZengGuangfu
 * @create 2019-08-27 19:30
 */

public interface ES_PersonRepository extends ElasticsearchRepository<ES_Person,Integer>{
   
    /** As with Spring Data JPA, many query methods can be derived from the method name alone, guided by the IDE's suggestions */
    List<ES_Person> findAllByNameEqualsAndAgeAfter(String name,Integer age);
}
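[Figure: table of the keywords supported in derived query method names, borrowed from another source]

The supported keywords are listed above, but derived query methods are not the main subject of this article.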

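Custom complex queries

NativeSearchQueryBuilder is typically used to assemble more complex search requests and can combine many kinds of query conditions in one statement; the later examples show more of this.
The examples below cover pagination, fuzzy queries, exact matches, phrase queries, weighting, and more.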

    /** Success message returned by the test endpoints */
    public static final String SUCCESS = "success";    

    @Autowired
    private ES_PersonRepository es_personRepository;
  • Example 1: bulk-indexing documents into Elasticsearch

    @PutMapping
    public String createRepository(HttpServletRequest request){
        ArrayList<ES_Person> esPeopleList = new ArrayList<>();
        esPeopleList.add(new ES_Person(1,"zengguangfu",26,"m","handsome qiangwudi, a good boy","a good boy","handsome, Shame on you"));
        esPeopleList.add(new ES_Person(2,"wuyanzu",45,"m","come from HK","come from HongKong","fly high"));
        esPeopleList.add(new ES_Person(3,"pengyuyan",37,"h","come from HongKong too","like bool","nice man"));
        esPeopleList.add(new ES_Person(4,"linzhiyin",40,"h","wan wan","come from TaiWan of China","handsome"));
      
        es_personRepository.saveAll(esPeopleList);
        return SUCCESS;
    }
  • Example 2: paginated query
    public Page<ES_Person> pageSearch(Integer pageNum,Integer pageSize){
        // pageNum is zero-based in PageRequest.of
        Pageable pageable = PageRequest.of(pageNum, pageSize);
        // Option 1: query directly with a Pageable
        //Page<ES_Person> resultPage = es_personRepository.findAll(pageable);

        // Option 2: use the general-purpose NativeSearchQueryBuilder, which can be extended into more complex queries
        NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
        nativeSearchQueryBuilder.withPageable(pageable);
        // Return the Page as-is so the total count and paging metadata are preserved
        return es_personRepository.search(nativeSearchQueryBuilder.build());
    }
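Because PageRequest.of takes a zero-based page number, it can help to see how a page number and page size translate into the from/size window of a search request. The sketch below is plain Java arithmetic, not Spring Data's implementation; `PageWindowSketch` and its methods are hypothetical names used only for illustration.

```java
/** Sketch of how a zero-based page number and page size map to a from/size window. */
final class PageWindowSketch {

    /** Offset of the first hit on the page; corresponds to the "from" parameter of a search request. */
    static long from(int pageNum, int pageSize) {
        if (pageNum < 0 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum must be >= 0 and pageSize >= 1");
        }
        return (long) pageNum * pageSize;
    }

    /** Number of pages needed to hold totalHits results. */
    static long totalPages(long totalHits, int pageSize) {
        return (totalHits + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        // With the four sample documents and a page size of 3
        System.out.println("page 1 starts at hit " + from(1, 3));
        System.out.println("pages needed: " + totalPages(4, 3));
    }
}
```

So the first page is page 0, and a caller passing 1 for the first page would silently skip the first pageSize hits.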
  • Example 3: paginated query sorted by a field
    /**
     * Paginated query sorted by a field
     * field  the field to sort on
     * sort   the sort direction, either asc or desc
     */
    public Page<ES_Person> pageAndSortSearch(Integer pageNum,Integer pageSize,String field, String sort){
        Pageable pageable = PageRequest.of(pageNum, pageSize);
        NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
        nativeSearchQueryBuilder.withPageable(pageable);
        // Sort on the given field; anything other than "asc" (case-insensitive) falls back to descending
        SortOrder order = "asc".equalsIgnoreCase(sort) ? SortOrder.ASC : SortOrder.DESC;
        nativeSearchQueryBuilder.withSort(SortBuilders.fieldSort(field).order(order));

        // Return the Page as-is so the total count and paging metadata are preserved
        return es_personRepository.search(nativeSearchQueryBuilder.build());
    }
  • Example 4: find people aged between 20 and 40
    /**
     * Find people whose age is between 20 and 40 (inclusive)
     * A BoolQuery can also be wrapped in a NativeSearchQueryBuilder
     */
    public List<ES_Person> ageLess(){
        ArrayList<ES_Person> list = new ArrayList<>();
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery()
                //.filter(QueryBuilders.rangeQuery("age").gte(20).lte(40));
        // Equivalent form with from/to: gte/gt/lte/lt delegate to from and to internally
                .filter(QueryBuilders.rangeQuery("age").from(20,true).to(40,true));
        Iterable<ES_Person> result = es_personRepository.search(queryBuilder);
        result.forEach(list::add);
        return list;
    }
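The boolean flags passed to from and to decide whether each bound is inclusive, which is the only difference between gte and gt (or lte and lt). A minimal plain-Java sketch of that check follows; `RangeSketch.inRange` is a hypothetical helper written for this note, not part of the Elasticsearch API.

```java
/** Sketch of the inclusive/exclusive bound check behind a range query. */
final class RangeSketch {

    /** Mirrors from(lower, includeLower).to(upper, includeUpper). */
    static boolean inRange(int value, int lower, boolean includeLower, int upper, boolean includeUpper) {
        boolean aboveLower = includeLower ? value >= lower : value > lower;
        boolean belowUpper = includeUpper ? value <= upper : value < upper;
        return aboveLower && belowUpper;
    }

    public static void main(String[] args) {
        int[] ages = {26, 45, 37, 40}; // ages of the four sample documents
        for (int age : ages) {
            System.out.println(age + " in [20, 40]: " + inRange(age, 20, true, 40, true));
        }
    }
}
```

With both bounds inclusive, the sample person aged 40 is returned; switching the upper flag to false (the lt form) would drop that document.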
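  • Example 5: match queries
    match: analyzed match; for fields whose mapping has an analyzer, the query text is analyzed into terms before searching
    match_all: matches every document, i.e. no condition at all
    match_phrase: phrase match; the words of the input must appear together and in order.
         A search for "hello world" finds nothing if no document contains that phrase
    multi_match: matches one query text against several fields
    termQuery: exact term lookup without analysis; it is worth studying the difference between term query and match query
    /**
     * Exact match, analyzed match, and phrase queries
     * name and power are mapped as type keyword, so they are not analyzed and only match on the exact value
     * remark, remark2, and remark3 are analyzed text fields
     */
    public void matchAndMustSearch(String name,String power,String cont){
       // Each withQuery call replaces the previous query, so the conditions are combined in one bool query
       BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
                                .must(QueryBuilders.termQuery("name",name))
                                .must(QueryBuilders.termQuery("power",power))
                                    // The first argument is the text to match; the rest are the fields to search
                                .must(QueryBuilders.multiMatchQuery(cont,"remark","remark2","remark3"))
                                    // Phrase query: the words must stay adjacent and in order, e.g. "love you" must not be split apart within remark
                                .must(QueryBuilders.matchPhraseQuery("remark","love you"));
       NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
       nativeSearchQueryBuilder.withQuery(boolQueryBuilder);
       es_personRepository.search(nativeSearchQueryBuilder.build());
    }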
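The difference between term, match, and match_phrase is easier to see on a toy model. The sketch below is plain Java and does not touch Elasticsearch; `TermVsMatchSketch` and its methods are invented for illustration, and `analyze` is a simplified stand-in (split and lowercase only) for a real analyzer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Toy model of term, match, and match_phrase against the terms of an analyzed text field. */
final class TermVsMatchSketch {

    /** Simplified analysis: split on non-alphanumerics and lowercase. */
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    /** term: the query string is looked up as-is, without analysis. */
    static boolean termMatches(List<String> indexedTerms, String query) {
        return indexedTerms.contains(query);
    }

    /** match: the query string is analyzed first; any one resulting term is enough (OR semantics). */
    static boolean matchMatches(List<String> indexedTerms, String query) {
        for (String term : analyze(query)) {
            if (indexedTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    /** match_phrase: the analyzed query terms must occur consecutively and in the same order. */
    static boolean phraseMatches(List<String> indexedTerms, String query) {
        List<String> queryTerms = analyze(query);
        return !queryTerms.isEmpty() && Collections.indexOfSubList(indexedTerms, queryTerms) >= 0;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("come from HongKong too");
        System.out.println("term  HongKong       -> " + termMatches(indexed, "HongKong"));
        System.out.println("match HongKong       -> " + matchMatches(indexed, "HongKong"));
        System.out.println("phrase from HongKong -> " + phraseMatches(indexed, "from HongKong"));
        System.out.println("phrase HongKong from -> " + phraseMatches(indexed, "HongKong from"));
    }
}
```

In this model a term lookup for "HongKong" misses because the indexed term was lowercased, while match analyzes the query the same way and finds it. This is the usual reason to reserve term queries for keyword fields such as name and power.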
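  • Example 6: constant_score query
    /**
     * Constant score: every document matching the wrapped query gets the same fixed score
     * constant_score runs its inner query in filter context, which skips relevance scoring
     * In the results, _score equals the boost value that was set
     */
    public void constantScoreQuery(String keyword){
        // Used as the top-level query; wrapped inside bool.filter the score would be discarded again
        ConstantScoreQueryBuilder constantScoreQueryBuilder = QueryBuilders
                .constantScoreQuery(QueryBuilders.rangeQuery("age").gte(30)).boost(3.0f);
        Iterable<ES_Person> result = es_personRepository.search(constantScoreQueryBuilder);
    }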
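As a rough model of what constant_score does to scoring, the plain-Java sketch below assigns the boost to every document that passes the filter and returns nothing for the rest. `ConstantScoreSketch.score` is a hypothetical helper for illustration only; the age threshold and boost mirror the values used in the method above.

```java
import java.util.OptionalDouble;

/** Toy model of constant_score over an "age >= minAge" filter. */
final class ConstantScoreSketch {

    /** Matching documents all get exactly the boost; non-matching documents are not returned at all. */
    static OptionalDouble score(int age, int minAge, double boost) {
        return age >= minAge ? OptionalDouble.of(boost) : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        int[] ages = {26, 45, 37, 40}; // ages of the four sample documents
        for (int age : ages) {
            System.out.println(age + " -> " + score(age, 30, 3.0));
        }
    }
}
```

How well a document matches makes no difference here: ages 37 and 45 both come back with a score of 3.0.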
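  • Example 7: dis_max query; as the name suggests, it takes the maximum _score
    /**
     * Of the three matches on remark, remark2, and remark3, the highest score becomes the document's score. For example, with
     * remark = 1.3, remark2 = 1.1, remark3 = 0.3, the result is 1.3
     */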
    public void disMaxQuery(String keyword){
        DisMaxQueryBuilder disMaxQueryBuilder = QueryBuilders.disMaxQuery()
                .add(QueryBuilders.matchQuery("remark", keyword))
                .add(QueryBuilders.matchQuery("remark2", keyword))
                .add(QueryBuilders.matchQuery("remark3", keyword));

        es_personRepository.search(disMaxQueryBuilder);
    }
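The scoring rule can be written out in a few lines. This is a plain-Java sketch of the arithmetic, not Elasticsearch's implementation; `DisMaxSketch.disMaxScore` is a made-up name, and the tieBreaker parameter models the optional tie_breaker setting, which is 0 unless set and therefore leaves only the best clause score.

```java
/** Sketch of dis_max scoring: best clause score plus tie_breaker times the other clause scores. */
final class DisMaxSketch {

    static double disMaxScore(double[] clauseScores, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (double score : clauseScores) {
            max = Math.max(max, score);
            sum += score;
        }
        // With tieBreaker == 0 only the best-matching clause counts
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        double[] scores = {1.3, 1.1, 0.3}; // remark, remark2, remark3
        System.out.println("tie_breaker 0.0 -> " + disMaxScore(scores, 0.0));
        System.out.println("tie_breaker 0.5 -> " + disMaxScore(scores, 0.5));
    }
}
```

A non-zero tie_breaker lets documents that match in several fields rank slightly above documents that match in only one, without reverting to a plain sum.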
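  • Example 8: boosting query

When we search for Amazon, we sometimes want the company rather than the rainforest. We usually do not want to exclude tree-related documents entirely, since must_not is too heavy-handed, so instead we lower their relevance.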

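    /**
     * boosting query
     * negative_boost must be between 0 and 1
     * The first argument is the positive query, the second is the negative query
     */
    public void boosting(){
        BoostingQueryBuilder boostingQueryBuilder = QueryBuilders.boostingQuery(QueryBuilders.termQuery("name", "company"),
                QueryBuilders.matchQuery("remark", "tree")).negativeBoost(0.5f);
        // Run the query; the original snippet only built it
        es_personRepository.search(boostingQueryBuilder);
    }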

This is roughly equivalent to the following request:

GET /_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "remark": "Amazon company"
        }
      },
      "negative": {
        "match": {
          "text": "tree"
        }
      },
      "negative_boost": 0.5
    }
  }
}
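The effect on the score is simple enough to sketch in plain Java. `BoostingSketch.boostedScore` is a hypothetical helper for illustration; it models only the final adjustment, taking the positive query's score as given.

```java
/** Sketch of how a boosting query adjusts a document's score. */
final class BoostingSketch {

    /**
     * Documents must match the positive query to be returned at all; if they also match
     * the negative query, their score is multiplied by negativeBoost instead of being excluded.
     */
    static double boostedScore(double positiveScore, boolean matchesNegative, double negativeBoost) {
        if (negativeBoost < 0.0 || negativeBoost > 1.0) {
            throw new IllegalArgumentException("negativeBoost must be between 0 and 1");
        }
        return matchesNegative ? positiveScore * negativeBoost : positiveScore;
    }

    public static void main(String[] args) {
        System.out.println("company page: " + boostedScore(2.0, false, 0.5));
        System.out.println("page mentioning tree: " + boostedScore(2.0, true, 0.5));
    }
}
```

Unlike must_not, the document that mentions "tree" still appears in the results, just further down.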

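  • Example 9: fuzzy and wildcard queries

Fuzzy query
    /**
     * Typo-tolerant search: the content contains "first", and searching for "firsa" still finds it
     *
     * The wildcard query needs little explanation
     */
    public void fuzzyQuery(String keyword){
        // Fuzzy match on remark, tolerating small spelling differences
        es_personRepository.search(QueryBuilders.fuzzyQuery("remark", keyword));

        // Wildcard match: any term in remark that starts with "fir"
        es_personRepository.search(QueryBuilders.wildcardQuery("remark","fir*"));
    }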
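Fuzzy matching is based on edit distance, so "firsa" matches "first" because the two differ by a single character. The sketch below computes plain Levenshtein distance in Java as a simplification; Elasticsearch's fuzzy query also counts a swap of two adjacent characters as one edit, which this version does not model. `EditDistanceSketch` is a name invented for this example.

```java
/** Plain Levenshtein distance, as a simplified model of the edit distance behind fuzzy queries. */
final class EditDistanceSketch {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println("first vs firsa: " + levenshtein("first", "firsa"));
        System.out.println("first vs fxrsa: " + levenshtein("first", "fxrsa"));
    }
}
```

A term is considered a fuzzy match only when its distance from the query stays within the allowed fuzziness, so a query with two typos in a five-letter word would typically no longer match.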
  • Example 10: search, sort, and paginate together
    /**
     * Search, sort, and paginate
     * keyword is the search text applied to the remark fields
     * Related link: https://blog.csdn.net/dm_vincent/article/details/42201721
     */
    public Page<ES_Person> getESPersons2(String name, Integer age, String keyword,
                                         Integer pageNum, Integer pageSize, Integer sort){
        NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder();
        // Pagination
        Pageable pageAble = PageRequest.of(pageNum, pageSize);
        nativeSearchQueryBuilder.withPageable(pageAble);
        // Query conditions. withQuery keeps only the last query it is given,
        // so every condition is collected in a single bool query
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        if (!StringUtils.isEmpty(name)){
            boolQueryBuilder.must(QueryBuilders.termQuery("name",name));
        }
        if (age != null){
            boolQueryBuilder.must(QueryBuilders.termQuery("age",age));
        }

        /**
         * keyword is the search text; remark, remark2, and remark3 are searched with different weights: remark > remark2 > remark3
         * A weight can be given as new WeightBuilder().setWeight(2) or
         * ScoreFunctionBuilders.weightFactorFunction(2)
         */
        if (StringUtils.isEmpty(keyword)){
            boolQueryBuilder.must(QueryBuilders.matchAllQuery());
        }else{
            List<FunctionScoreQueryBuilder.FilterFunctionBuilder> filterFunctionBuilders = new ArrayList<>();
            filterFunctionBuilders.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(QueryBuilders.matchQuery("remark", keyword)
                                                                                        , new WeightBuilder().setWeight(2)));
            filterFunctionBuilders.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(QueryBuilders.matchQuery("remark2", keyword)
                                                                                        , ScoreFunctionBuilders.weightFactorFunction(1.5f)));
            filterFunctionBuilders.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(QueryBuilders.matchQuery("remark3", keyword)
                                                                                        , ScoreFunctionBuilders.weightFactorFunction(1.0f)));

            FunctionScoreQueryBuilder.FilterFunctionBuilder[] filterFunctionBuildersArray = filterFunctionBuilders
                    .toArray(new FunctionScoreQueryBuilder.FilterFunctionBuilder[filterFunctionBuilders.size()]);

            // score_mode SUM adds up the weights of every function whose filter matches; that sum is then
            // combined with the query's _score (multiplied, by default), and hits scoring below 2 are dropped
            FunctionScoreQueryBuilder functionScoreQueryBuilder = QueryBuilders.functionScoreQuery(filterFunctionBuildersArray)
                                                                .scoreMode(FiltersFunctionScoreQuery.ScoreMode.SUM)
                                                                .setMinScore(2);

            boolQueryBuilder.must(functionScoreQueryBuilder);
        }
        nativeSearchQueryBuilder.withQuery(boolQueryBuilder);

        // sort == 1 sorts by age ascending; otherwise by relevance score descending, which is the default
        if (sort != null && sort == 1){
            nativeSearchQueryBuilder.withSort(SortBuilders.fieldSort("age").order(SortOrder.ASC));
        }else{
            nativeSearchQueryBuilder.withSort(SortBuilders.scoreSort().order(SortOrder.DESC));
        }
        // Return the Page as-is so the total count and paging metadata are preserved
        return es_personRepository.search(nativeSearchQueryBuilder.build());
    }
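To see what the weights and the minimum score do to a single document, here is a plain-Java sketch of the arithmetic for the function_score part alone. It is a simplified model rather than Elasticsearch's code: `FunctionScoreSketch` and its methods are hypothetical, the weights 2, 1.5, and 1.0 are the ones used in the example above, and treating a document that matches none of the filters as having a neutral factor of 1 is my assumption about the default behaviour.

```java
/** Simplified model of function_score with weight functions, score_mode = sum, boost_mode = multiply. */
final class FunctionScoreSketch {

    // Weights for remark, remark2, remark3
    static final double[] WEIGHTS = {2.0, 1.5, 1.0};

    /** Sum of the weights whose filter matched; assumed to be a neutral 1.0 when nothing matched. */
    static double functionFactor(boolean[] matched) {
        double sum = 0.0;
        boolean any = false;
        for (int i = 0; i < WEIGHTS.length; i++) {
            if (matched[i]) {
                sum += WEIGHTS[i];
                any = true;
            }
        }
        return any ? sum : 1.0;
    }

    /** The function factor is multiplied with the score of the wrapped query. */
    static double finalScore(double queryScore, boolean[] matched) {
        return queryScore * functionFactor(matched);
    }

    /** min_score drops hits whose final score falls below the threshold. */
    static boolean passesMinScore(double score, double minScore) {
        return score >= minScore;
    }

    public static void main(String[] args) {
        boolean[] onlyRemark = {true, false, false};
        boolean[] onlyRemark2 = {false, true, false};
        System.out.println("remark only:  " + finalScore(1.0, onlyRemark)
                + " kept=" + passesMinScore(finalScore(1.0, onlyRemark), 2.0));
        System.out.println("remark2 only: " + finalScore(1.0, onlyRemark2)
                + " kept=" + passesMinScore(finalScore(1.0, onlyRemark2), 2.0));
    }
}
```

Under this model, a hit in remark alone reaches the minimum score of 2, while a hit in remark2 alone or remark3 alone does not; such documents only survive when the keyword appears in more than one field.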


      Article link: https://www.haomeiwen.com/subject/fxlmectx.html