Elasticsearch Basics - Exploring Data

Author: SlowGO | Published 2019-07-04 14:29

The ES version used here is 7.2.

1. Prepare test data

Download accounts.json and import it into ES:

curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"

List the indices to verify that bank exists:

curl "localhost:9200/_cat/indices?v"

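2. Search API

There are two ways to search:

Send the search parameters in the URI: simple, but limited in what it can express
Send the search parameters in the request body: the search is defined in JSON, which is easier to read

URI example: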

curl -X GET "localhost:9200/bank/_search?q=*&sort=account_number:asc&pretty"

Request parameters:

q=* matches all documents
sort=account_number:asc sorts by account_number in ascending order
pretty pretty-prints the returned JSON

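Response fields:

took time the search took, in milliseconds
timed_out whether the search timed out
_shards how many shards were searched
hits the search results
hits.total.value number of matching documents
hits.total.relation how hits.total.value relates to the true count: eq means it is exact, gte means it is a lower bound
hits.hits the actual result documents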
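
On 7.x, hits.total.value is exact only up to 10,000 matches by default, and beyond that hits.total.relation reports gte. A minimal sketch, assuming an exact count is wanted: set track_total_hits to true in the request body. The helper name count_body is hypothetical and exists only for illustration.

```shell
# count_body (hypothetical helper): print a search body that asks
# Elasticsearch to count every match instead of stopping at the default cap.
# "size": 0 skips the hits themselves, since only the total is of interest.
count_body() {
  printf '%s\n' '{"query":{"match_all":{}},"size":0,"track_total_hits":true}'
}

# Print the body. To run it against a live cluster, pipe it to curl:
#   count_body | curl -s -X GET "localhost:9200/bank/_search?pretty" \
#     -H 'Content-Type: application/json' --data-binary @-
count_body
```

With this body the response should report relation eq, at the cost of visiting every matching document.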

Request body example:

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
'

3. Query Language

The request body approach defines the query in JSON using Elasticsearch's query DSL, for example:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} }
}
'

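Here query holds the query definition, and match_all is the query type, which in this case matches every document.

Besides query, other parameters can be set to shape the results, for example: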

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}
'

from the zero-based offset of the first hit to return, default 0
size the number of documents to return, default 10
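
Paging with from and size comes down to one multiplication. The sketch below assumes 1-based page numbers; page_body is a hypothetical helper, not part of any Elasticsearch tooling.

```shell
# page_body (hypothetical helper): print the search body for a 1-based page.
# $1 = page number, $2 = page size
page_body() {
  page=$1
  size=$2
  from=$(( (page - 1) * size ))   # number of hits to skip
  printf '{"query":{"match_all":{}},"from":%d,"size":%d}\n' "$from" "$size"
}

# Second page of 10 hits, i.e. hits 11 through 20.
page_body 2 10
```

The printed body can be passed to curl with --data-binary @- in the same way as the hand-written bodies in this article.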

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } }
}
'

This sorts the results by balance in descending order.

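4. Running searches

By default a search returns every field of each document. You can restrict the response to specific fields, much like naming the columns in SQL SELECT xxx FROM.

For example, to return only the account_number and balance fields: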

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance"]
}
'

So far every example has used match_all to match all documents. The match query allows more targeted searches.

Example - find the document whose account_number is 20:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "account_number": 20 } }
}
'

Example - find documents whose address contains "mill":

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "address": "mill" } }
}
'

Example - find documents whose address contains "mill" or "lane":

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "address": "mill lane" }}
}
'

Example - find documents whose address contains the phrase "mill lane":

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_phrase": { "address": "mill lane" } }
}
'

match_phrase treats the query text as a single phrase whose terms must appear together and in order, whereas match combines the terms with OR.
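
Between "any term" (match) and "exact phrase" (match_phrase) there is a middle ground: the match query accepts an operator option, and setting it to and requires every term to be present in any order. A minimal sketch, with match_all_terms_body as a hypothetical helper name:

```shell
# match_all_terms_body (hypothetical helper): print a match query that
# requires every term to be present. $1 = field, $2 = query text.
# The values are pasted in unescaped, so keep them free of quotes
# and backslashes.
match_all_terms_body() {
  printf '{"query":{"match":{"%s":{"query":"%s","operator":"and"}}}}\n' "$1" "$2"
}

# Addresses that contain both "mill" and "lane", in any order.
match_all_terms_body address "mill lane"
```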

Example - bool query:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
'

The bool must clause above requires both match queries to be true.

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
'

The bool should clause above requires at least one of the two match queries to be true.

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
'

The bool must_not clause above returns only documents that match neither of the two match queries.

must, should, and must_not can be combined, for example:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}
'

5. Running filters

Example - find accounts whose balance is between 20000 and 30000, inclusive:

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
'
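
Clauses under filter decide only whether a document is included; they do not contribute to the relevance score. When the bounds vary, the body can be generated instead of edited by hand. A minimal sketch, with balance_filter_body as a hypothetical helper name and integer bounds assumed:

```shell
# balance_filter_body (hypothetical helper): print a bool query whose filter
# keeps accounts with min <= balance <= max.
# $1 = min balance, $2 = max balance (integers)
balance_filter_body() {
  printf '{"query":{"bool":{"must":{"match_all":{}},"filter":{"range":{"balance":{"gte":%d,"lte":%d}}}}}}\n' "$1" "$2"
}

# Same range as the hand-written request above.
balance_filter_body 20000 30000
```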

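6. Running aggregations

Aggregations group documents and compute statistics over them, similar to SQL GROUP BY and aggregate functions.

ES can return search hits and aggregation results in a single response, which avoids extra network round trips.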

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      }
    }
  }
}
'

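This example groups all documents by state and returns the top 10 groups (the default), sorted by document count in descending order (also the default).

size is set to 0 because only the aggregation results are needed, not the search hits.

Written as SQL, the query would look roughly like this: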

SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC LIMIT 10;
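
The limit of 10 groups is only a default: the terms aggregation takes its own size option, separate from the top-level size that controls search hits. A minimal sketch, with top_states_body as a hypothetical helper name:

```shell
# top_states_body (hypothetical helper): print a terms aggregation that
# returns the $1 most frequent states instead of the default 10.
# The outer "size": 0 still suppresses the search hits.
top_states_body() {
  printf '{"size":0,"aggs":{"group_by_state":{"terms":{"field":"state.keyword","size":%d}}}}\n' "$1"
}

# Roughly the SQL above with LIMIT 3.
top_states_body 3
```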

Building on the group_by_state example, add a nested aggregation that computes the average balance within each group:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
'

Going one step further, sort the groups by their average balance in descending order:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
'

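A more complex aggregation: first group by age range into three buckets (20-30, 30-40, 40-50), then group by gender within each bucket, then compute the average balance for each gender: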

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender.keyword"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}
'
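
In a range aggregation each bucket includes its from value and excludes its to value, so an account with age 30 lands in the 30-40 bucket only. For many equal-width buckets the ranges array can be generated. A minimal sketch, with age_ranges as a hypothetical helper name and integer bounds assumed:

```shell
# age_ranges (hypothetical helper): print the "ranges" array for consecutive
# buckets of equal width.
# $1 = first lower bound, $2 = last upper bound, $3 = bucket width
age_ranges() {
  lo=$1
  sep=''
  printf '['
  while [ "$lo" -lt "$2" ]; do
    # Each bucket covers from <= age < to.
    printf '%s{"from":%d,"to":%d}' "$sep" "$lo" "$(( lo + $3 ))"
    sep=','
    lo=$(( lo + $3 ))
  done
  printf ']\n'
}

# The three buckets used in the request above.
age_ranges 20 50 10
```

If the width does not divide the span evenly, the last bucket extends past the upper bound.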

Article link: https://www.haomeiwen.com/subject/wizlhctx.html
