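Elasticsearch is an open-source, distributed, RESTful search and analytics engine built on top of the open-source library Apache Lucene.
Lucene is arguably the most advanced, high-performance, full-featured search engine library available today, open source or proprietary, but it is only a library. To take full advantage of it you have to work in Java and integrate Lucene directly into your application. Worse, Lucene is so complex that you may need a degree in information retrieval to understand how it works.
Elasticsearch was created to hide that complexity. It is written in Java and uses Lucene internally for indexing and searching, but its goal is to make full-text search easy: in short, it wraps Lucene and exposes a simple, consistent RESTful API for storing and retrieving data.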
1. Basic index operations
1.1 Creating an index
```
# Create an index
PUT /person
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}
```
1.2 Viewing index information
1.3 Deleting an index
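Viewing and deleting an index both address the index by name and differ only in the HTTP verb (GET /person and DELETE /person). The sketch below builds the two requests, without sending them, using the JDK's java.net.http API (Java 11+). The IndexRequests class is hypothetical, and the base URL is an assumption taken from the host and port used by the ESClient class in section 2.1.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) REST requests for one index.
final class IndexRequests {
    // Assumed local node, same host and port as the ESClient example in section 2.1.
    static final String BASE = "http://127.0.0.1:9200";

    // GET /<index> returns the settings and mappings of the index.
    static HttpRequest view(String index) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index)).GET().build();
    }

    // DELETE /<index> removes the index together with all of its documents.
    static HttpRequest delete(String index) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index)).DELETE().build();
    }

    public static void main(String[] args) {
        HttpRequest get = view("person");
        HttpRequest del = delete("person");
        System.out.println(get.method() + " " + get.uri().getPath());
        System.out.println(del.method() + " " + del.uri().getPath());
    }
}
```

In the Kibana console the same two operations are simply the request lines printed by main.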
1.4 Field types available in ES
```
# text:    used for full-text search; the field value is analyzed (tokenized)
# keyword: the field value is not tokenized
# integer
# long
# ...
```
1.5 Creating an index with an explicit mapping
Example: an index for novels.
```
# settings: number_of_replicas is the replica count, number_of_shards the shard count
# mappings: "novel" is the type; "properties" lists the fields stored in each document
# "index": true means the field can be used as a query condition
PUT /book
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 5
  },
  "mappings": {
    "novel": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "ik_max_word",
          "index": true
        },
        "authoor": {
          "type": "keyword"
        },
        "onsale": {
          "type": "date",
          "format": "yyyy-MM-dd"
        }
      }
    }
  }
}
```
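1.6 Document operations
- A document is uniquely identified in ES by the combination of _index, _type and _id. Together they locate the document you want to read or modify.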
1.6.1 Creating a document
```
# No id given: ES generates one (this form requires POST rather than PUT)
POST /book/novel
{
  "name": "西游记",
  "authoor": "刘明",
  "onsale": "2020-12-11"
}
```
```
# Explicit id
PUT /book/novel/1
{
  "name": "三国演义",
  "authoor": "小明",
  "onsale": "2020-12-11"
}
```
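1.6.2 Updating a document
Overwriting update
```
# The whole document is replaced by the request body
POST /book/novel/1
{
  "name": "三国演义",
  "authoor": "小明",
  "onsale": "2020-12-11"
}
```
Partial update with doc (==recommended==)
```
# Locate the document first; _update then changes only the fields listed under "doc"
POST /book/novel/1/_update
{
  "doc": {
    "name": "极品家丁"
  }
}
```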
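The difference between the two update styles can be sketched with plain maps: an overwrite replaces the stored source with the request body, while a doc update merges the given fields into it. This is only an in-memory illustration under that assumption, not Elasticsearch code; the UpdateSemantics class is hypothetical, the titles are placeholders, and the merge handles top-level fields only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of the two update styles (top-level fields only).
final class UpdateSemantics {
    // Overwrite: the stored source is replaced entirely by the request body.
    static Map<String, Object> overwrite(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // Partial update: fields under "doc" are merged into a copy of the stored source.
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "Book A");
        stored.put("onsale", "2020-12-11");

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("name", "Book B");

        // The overwrite loses "onsale"; the partial update keeps it.
        System.out.println(overwrite(stored, change));
        System.out.println(partialUpdate(stored, change));
    }
}
```

This is why the doc style is the safer default: fields that are not mentioned in the request survive the update.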
1.6.3 Deleting a document
2. Working with Elasticsearch from Java
2.1 Connecting to ES from Java
1. Create a Maven project and add the following four dependencies:
```xml
<!-- 1.1 elasticsearch -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>6.5.4</version>
</dependency>

<!-- 1.2 elasticsearch high-level REST client -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>6.5.4</version>
</dependency>

<!-- 1.3 junit -->
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
    <scope>test</scope>
</dependency>

<!-- 1.4 lombok -->
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.12</version>
    <scope>provided</scope>
</dependency>
```
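2.1.2 Creating a client class that connects to ES
```java
package com.dengzhou.utils;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;

public class ESClient {
    public static RestHighLevelClient getClient() {
        // Host and port of the ES node
        HttpHost httpHost = new HttpHost("127.0.0.1", 9200);
        // Build the low-level client, then wrap it in the high-level client
        RestClientBuilder builder = RestClient.builder(httpHost);
        RestHighLevelClient client = new RestHighLevelClient(builder);
        return client;
    }
}
```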
2.2 Creating an index from Java
```java
import com.dengzhou.utils.ESClient;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.junit.Test; // JUnit 4, matching the junit 4.12 dependency

import java.io.IOException;

public class Create_ES_Index {
    String index = "person";
    String type = "man";

    @Test
    public void createIndex() throws IOException {
        // Index settings
        Settings.Builder settings = Settings.builder()
                .put("number_of_shards", 3)
                .put("number_of_replicas", 1);

        // Mapping
        XContentBuilder mappings = JsonXContent.contentBuilder()
                .startObject()
                    .startObject("properties")
                        .startObject("name")
                            .field("type", "text")
                        .endObject()
                        .startObject("age")
                            .field("type", "integer")
                        .endObject()
                        .startObject("birthday")
                            .field("type", "date")
                            .field("format", "yyyy-MM-dd")
                        .endObject()
                    .endObject()
                .endObject();

        // Bundle settings and mapping into one request
        CreateIndexRequest request = new CreateIndexRequest(index)
                .settings(settings)
                .mapping(type, mappings);

        // Send the request through the client
        RestHighLevelClient client = ESClient.getClient();
        CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
        System.out.println("response" + response.toString());
    }
}
```
2.3 Checking whether an index exists, deleting an index
```java
@Test
public void exists() throws IOException {
    // Name the index to check
    GetIndexRequest request = new GetIndexRequest();
    request.indices(index);

    // Ask the cluster through the client
    RestHighLevelClient client = ESClient.getClient();
    boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println(exists);
}
```
2.4 Modifying documents
Adding a document
```java
@Test
public void createDoc() throws IOException {
    // ObjectMapper comes from jackson-databind, which must be on the classpath
    ObjectMapper mapper = new ObjectMapper();

    // 1. Prepare the data as JSON.
    // Person is not shown; it is assumed to serialize birthday as yyyy-MM-dd.
    Person person = new Person(1, "张三", 23, new Date());
    String json = mapper.writeValueAsString(person);
    System.out.println(json);

    // 2. Build the request; the document id is set explicitly
    IndexRequest indexRequest = new IndexRequest(index, type, person.getId().toString());
    indexRequest.source(json, XContentType.JSON);

    // 3. Send it through the client
    RestHighLevelClient client = ESClient.getClient();
    IndexResponse resp = client.index(indexRequest, RequestOptions.DEFAULT);

    // 4. Print the result
    System.out.println(resp.getResult().toString());
}
```
Updating a document
```java
@Test
public void updateDoc() throws IOException {
    // 1. Fields to change
    Map<String, Object> map = new HashMap<String, Object>();
    map.put("name", "李四");
    String docId = "1";

    // 2. Build a doc-style (partial) update request
    UpdateRequest updateRequest = new UpdateRequest(index, type, docId);
    updateRequest.doc(map);

    // 3. Send it through the client
    RestHighLevelClient client = ESClient.getClient();
    UpdateResponse update = client.update(updateRequest, RequestOptions.DEFAULT);

    // 4. Print the result
    System.out.println(update.getResult().toString());
}
```
2.5 Deleting a document
2.6 Batch document operations in Java
```java
public void create_index() throws IOException {
    Settings.Builder settings = Settings.builder()
            .put("number_of_shards", 3)
            .put("number_of_replicas", 1);

    // Field names follow the sample documents and queries used in the rest of
    // this article (smsContext, relyTotal).
    XContentBuilder mappings = JsonXContent.contentBuilder()
            .startObject()
                .startObject("properties")
                    .startObject("createDate")
                        .field("type", "text")
                    .endObject()
                    .startObject("sendDate")
                        .field("type", "date")
                        .field("format", "yyyy-MM-dd")
                    .endObject()
                    .startObject("longCode")
                        .field("type", "text")
                    .endObject()
                    .startObject("mobile")
                        .field("type", "text")
                    .endObject()
                    .startObject("corpName")
                        .field("type", "text")
                        .field("analyzer", "ik_max_word")
                    .endObject()
                    .startObject("smsContext")
                        .field("type", "text")
                        .field("analyzer", "ik_max_word")
                    .endObject()
                    .startObject("state")
                        .field("type", "integer")
                    .endObject()
                    .startObject("operatorid")
                        .field("type", "integer")
                    .endObject()
                    .startObject("province")
                        .field("type", "text")
                    .endObject()
                    .startObject("ipAddr")
                        .field("type", "text")
                    .endObject()
                    .startObject("relyTotal")
                        .field("type", "integer")
                    .endObject()
                    .startObject("fee")
                        .field("type", "integer")
                    .endObject()
                .endObject()
            .endObject();

    CreateIndexRequest request = new CreateIndexRequest(index)
            .settings(settings)
            .mapping(type, mappings);

    RestHighLevelClient client = ESClient.getClient();
    CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.println(response.toString());
}
```
- Data import
```
PUT /sms_logs_index/sms_logs_type/1
{
  "corpName": "途虎养车",
  "createDate": "2020-01-22",
  "fee": 3,
  "ipAddr": "10.123.98.0",
  "longCode": 106900000009,
  "mobile": "1738989222222",
  "operatorid": 1,
  "province": "河北",
  "relyTotal": 10,
  "sendDate": "2020-02-22",
  "smsContext": "【途虎养车】亲爱的灯先生,您的爱车已经购买",
  "state": 0
}
```
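4. Query types in ES
4.1 term and terms queries
4.1.1 term query
```
# term query
# from and size work like LIMIT from, size in SQL
POST /sms_logs_index/sms_logs_type/_search
{
  "from": 0,
  "size": 5,
  "query": {
    "term": {
      "province": {
        "value": "河北"
      }
    }
  }
}
# The value given to term is not analyzed (tokenized) before the lookup.
# On an analyzed text field the indexed tokens can differ from the original
# string, so term queries are most predictable on keyword fields.
```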
```java
@Test
public void testQuery() throws IOException {
    // 1. Create the request
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    // 2. Query conditions
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.from(0);
    builder.size(5);
    builder.query(QueryBuilders.termQuery("province", "河北"));
    request.source(builder);

    // 3. Execute the query
    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    // 4. Print the _source of every hit
    for (SearchHit hit : response.getHits().getHits()) {
        Map<String, Object> result = hit.getSourceAsMap();
        System.out.println(result);
    }
}
```
4.1.2 terms query
terms matches one field against several values.
- terms: like where province = 河北 or province = ? or province = ? in SQL
```
# terms query
POST /sms_logs_index/sms_logs_type/_search
{
  "from": 0,
  "size": 5,
  "query": {
    "terms": {
      "province": [
        "河北",
        "河南"
      ]
    }
  }
}
```
```java
@Test
public void test_terms() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.termsQuery("province", "河北", "河南"));

    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse resp = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : resp.getHits().getHits()) {
        System.out.println(hit);
    }
}
```
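4.2 match queries
match is a high-level query: it picks a different strategy depending on the type of the field being queried.
Under the hood a match query is executed as one or more term queries whose results are combined.
- If the field is a date or a number, the query string is converted to a date or a number before it is compared.
- If the field cannot be tokenized (keyword), match does not analyze the query text.
- If the field can be tokenized (text), match analyzes the query text with the field's analyzer and searches for the resulting terms.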
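The relationship between term and match described above can be sketched with a tiny in-memory inverted index. This is an illustration only: the TermVsMatch class is hypothetical, and its analyzer (lowercase, split on whitespace) is an assumed stand-in for a real analyzer such as ik_max_word.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index contrasting term (no analysis) with match (analysis + OR).
final class TermVsMatch {
    // token -> ids of the documents whose analyzed text contains that token
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Stand-in analyzer (assumption): lowercase, then split on whitespace.
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    void add(int id, String text) {
        for (String token : analyze(text)) {
            index.computeIfAbsent(token, t -> new TreeSet<>()).add(id);
        }
    }

    // term: the value is looked up as a single token, without analysis.
    Set<Integer> term(String value) {
        return index.getOrDefault(value, new TreeSet<>());
    }

    // match: the query text is analyzed, then the per-token results are OR-ed.
    Set<Integer> match(String text) {
        Set<Integer> hits = new TreeSet<>();
        for (String token : analyze(text)) {
            hits.addAll(term(token));
        }
        return hits;
    }

    public static void main(String[] args) {
        TermVsMatch idx = new TermVsMatch();
        idx.add(1, "Taxi booked");
        idx.add(2, "Taxi cancelled");
        // No indexed token equals the whole phrase, so term finds nothing.
        System.out.println(idx.term("Taxi booked"));
        // match looks up "taxi" and "booked" separately and merges the hits.
        System.out.println(idx.match("Taxi booked"));
    }
}
```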
4.2.1 match_all query
Returns every document; no query condition is specified.
```
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "match_all": {}
  }
}
```
```java
@Test
public void test_match_all() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.size(20); // ES returns only 10 hits by default
    builder.query(QueryBuilders.matchAllQuery());
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit);
    }
}
```
4.2.2 match query on a single field
```
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "match": {
      "smsContext": "打车"
    }
  }
}
```
```java
@Test
public void test_match_field() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.matchQuery("smsContext", "打车"));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit);
    }
}
```
4.2.3 Boolean match query
Matches several terms against one field and joins them with and or or.
```
# Boolean match query; "operator" may be "and" or "or"
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "match": {
      "smsContext": {
        "query": "打车 女士",
        "operator": "and"
      }
    }
  }
}
```
```java
@Test
public void test_match_boolean() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.matchQuery("smsContext", "打车 女士").operator(Operator.AND));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit);
    }
}
```
4.2.4 multi_match query
match searches one field; multi_match searches several fields with the same query text.
```
# "query" is the text to search for, "fields" lists the fields to search in
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "multi_match": {
      "query": "河北",
      "fields": ["province", "smsContext"]
    }
  }
}
```
```java
@Test
public void test_multi_match() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.multiMatchQuery("河北", "province", "smsContext"));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit);
    }
}
```
4.3 Other ES queries
4.3.1 Query by id
```
# Query by id: GET /<index>/<type>/<id>
GET /sms_logs_index/sms_logs_type/1
```
```java
@Test
public void test_get_by_id() throws IOException {
    GetRequest request = new GetRequest(index, type, "1");
    RestHighLevelClient client = ESClient.getClient();
    GetResponse resp = client.get(request, RequestOptions.DEFAULT);
    System.out.println(resp.getSourceAsMap());
}
```
4.3.2 ids query
Looks up several ids at once, like where id in (id1, id2, id3, ...) in MySQL.
```
# "values" holds the document ids
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "ids": {
      "values": [1, 2, 3]
    }
  }
}
```
```java
@Test
public void test_query_ids() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.idsQuery().addIds("1", "2", "3"));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }
}
```
4.3.3 prefix query
A prefix query finds documents in which the given field contains a term that starts with the given value.
```
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "prefix": {
      "smsContext": {
        "value": "河"
      }
    }
  }
}
# Unlike match, which compares whole terms, prefix works like a LIKE 'x%' lookup in MySQL.
# It is intended for values that are not tokenized.
```
```java
@Test
public void test_query_prefix() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.prefixQuery("smsContext", "河"));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }
}
```
4.3.4 fuzzy query
A fuzzy query tolerates small differences between the query text and the indexed term, so ES can still find results when the input contains, for example, a typo.
```
# fuzzy query; prefix_length is the number of leading characters that must match exactly
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "fuzzy": {
      "corpName": {
        "value": "盒马生鲜",
        "prefix_length": 2
      }
    }
  }
}
# Results can be unpredictable: if the query differs too much from the stored value, nothing is found.
```
```java
@Test
public void test_query_fuzzy() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    // prefixLength(n): the first n characters must match exactly
    builder.query(QueryBuilders.fuzzyQuery("corpName", "盒马生鲜").prefixLength(2));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }
}
```
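The effect of prefix_length can be sketched with an edit-distance check: a term is a candidate only if its leading characters equal those of the query and the rest is within a small number of edits. The FuzzySketch class and the maxEdits parameter are hypothetical, and the sketch uses plain Levenshtein distance, in which swapping two adjacent characters costs two edits; Elasticsearch by default counts such a transposition as one edit.

```java
// Hypothetical sketch of fuzzy matching with a required exact prefix.
final class FuzzySketch {
    // Classic Levenshtein distance (insert, delete, substitute), two-row version.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True if the first prefixLength characters are identical and the whole
    // term is within maxEdits edits of the query.
    static boolean fuzzyMatches(String term, String query, int prefixLength, int maxEdits) {
        int p = Math.min(prefixLength, query.length());
        if (!term.regionMatches(0, query, 0, p)) {
            return false;
        }
        return editDistance(term, query) <= maxEdits;
    }

    public static void main(String[] args) {
        // Typo after the prefix: still a match.
        System.out.println(fuzzyMatches("elasticsearch", "elasticsaerch", 2, 2));
        // Typo inside the two-character prefix: rejected.
        System.out.println(fuzzyMatches("elasticsearch", "xlasticsearch", 2, 2));
    }
}
```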
4.3.5 wildcard query
A wildcard query is similar to LIKE in MySQL: the search string may contain the wildcard * (any sequence of characters) and the placeholder ? (exactly one character).
```
# wildcard query
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "wildcard": {
      "corpName": {
        "value": "*车"
      }
    }
  }
}
# ? stands for exactly one character, ?? for exactly two
```
```java
@Test
public void test_query_wildcard() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.wildcardQuery("corpName", "*车"));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }
}
```
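The meaning of the two wildcard symbols can be made concrete by translating a wildcard pattern into an anchored regular expression. This is a sketch of the matching rule only, not of how Elasticsearch evaluates the query internally; the WildcardSketch class is hypothetical and the sample terms are ASCII placeholders.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: wildcard pattern -> regex that must match the whole term.
final class WildcardSketch {
    // '*' becomes "any sequence (possibly empty)", '?' becomes "exactly one character";
    // every other character is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("*car", "tuhucar"));    // any prefix, then "car"
        System.out.println(matches("????car", "tuhucar")); // exactly four characters, then "car"
        System.out.println(matches("?car", "tuhucar"));    // only one character allowed before "car"
    }
}
```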
4.3.6 range query
A range query restricts a field to values above and/or below given bounds. It is typically used on numeric fields and also works on dates.
```
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "range": {
      "relyTotal": {
        "gte": 0,
        "lte": 3
      }
    }
  }
}
# gte and lte are inclusive bounds: [gte, lte]
# gt and lt are exclusive bounds:  (gt, lt)
```
```java
@Test
public void test_query_range() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    // Exclusive bounds: 2 < fee < 5
    builder.query(QueryBuilders.rangeQuery("fee").lt(5).gt(2));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }
}
```
4.3.7 regexp query
A regexp query matches terms against a regular expression that you write.
PS: prefix, fuzzy, wildcard and regexp queries are comparatively slow; avoid them where query performance matters.
```
# The value is the regular expression to match
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "regexp": {
      "mobile": "109[0-8]{7}"
    }
  }
}
```
```java
@Test
public void test_query_regexp() throws IOException {
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.query(QueryBuilders.regexpQuery("mobile", "106[0-9]{8}"));
    request.source(builder);

    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }
}
```
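One detail that is easy to miss: a regexp query has to match the entire term, not just part of it. The sketch below mimics that rule with java.util.regex, whose matches() method is also anchored at both ends. It is an approximation, since Lucene's regular-expression syntax is not identical to Java's, although the pattern used here is valid in both; the RegexpSketch class and the sample numbers are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of whole-term regular-expression matching.
final class RegexpSketch {
    // True only if the pattern covers the term from the first to the last character.
    static boolean wholeTermMatches(String regex, String term) {
        return Pattern.compile(regex).matcher(term).matches();
    }

    public static void main(String[] args) {
        String regex = "106[0-9]{8}"; // "106" followed by exactly eight digits
        System.out.println(wholeTermMatches(regex, "10690000000"));  // 11 digits
        System.out.println(wholeTermMatches(regex, "106900000009")); // 12 digits: one too many
    }
}
```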
4.4 Deep paging with scroll
ES limits from + size: by default the sum of from and size cannot exceed 10,000.
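How it works:
How ES runs a from + size query:
1. Analyze the user's keywords into terms.
2. Look the terms up in the term dictionary to get the ids of the matching documents.
3. Fetch the corresponding data from every shard (slow).
4. Sort the data by score (slow).
5. Discard the leading part of the result according to from.
6. Return the result.
How ES runs a scroll + size query:
1. Analyze the user's keywords into terms.
2. Look the terms up in the term dictionary to get the ids of the matching documents.
3. Store the document ids in a context kept in ES memory.
4. Fetch size documents; the ids that have been fetched are removed from the context.
5. For the next page, continue directly from the context.
6. Repeat steps 4 and 5.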
==Drawback: scroll reads from a context held in memory, so it is not suitable for real-time queries; the data it returns may not be the latest.==
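The scroll steps and the staleness drawback can be illustrated with a small in-memory model: the matching ids are copied once when the scroll is opened, pages are taken from that copy, and documents indexed afterwards never show up. The ScrollSketch class is hypothetical and greatly simplified; it only models the context of document ids.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a scroll context: a snapshot of ids consumed page by page.
final class ScrollSketch {
    private final Deque<Integer> context;

    // Opening the scroll copies the matching ids; later index changes are not seen.
    ScrollSketch(List<Integer> matchingIds) {
        this.context = new ArrayDeque<>(matchingIds);
    }

    // Returns the next page and removes its ids from the context.
    List<Integer> nextPage(int size) {
        List<Integer> page = new ArrayList<>();
        while (page.size() < size && !context.isEmpty()) {
            page.add(context.poll());
        }
        return page;
    }

    public static void main(String[] args) {
        List<Integer> liveIndex = new ArrayList<>(List.of(1, 2, 3));
        ScrollSketch scroll = new ScrollSketch(liveIndex);
        liveIndex.add(4); // indexed after the scroll was opened

        System.out.println(scroll.nextPage(2)); // [1, 2]
        System.out.println(scroll.nextPage(2)); // [3]  (document 4 is not visible)
        System.out.println(scroll.nextPage(2)); // []   (context exhausted)
    }
}
```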
```
# Run the scroll query: returns the first page and stores the document ids
# in an ES context with the given time to live (1 minute)
POST /sms_logs_index/sms_logs_type/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "size": 2,
  "sort": [
    {
      "fee": {
        "order": "desc"
      }
    }
  ]
}

# Fetch the next page; "scroll" renews the time to live of the context
POST /_search/scroll
{
  "scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAACSPFnJjV1pHbENVVGZHMmlQbHVZX1JGdmcAAAAAAAAkkBZyY1daR2xDVVRmRzJpUGx1WV9SRnZnAAAAAAAAJJEWcmNXWkdsQ1VUZkcyaVBsdVlfUkZ2Zw==",
  "scroll": "1m"
}

# Delete the scroll context from ES
DELETE /_search/scroll/<scroll_id>
```
```java
@Test
public void test_query_scroll() throws IOException {
    // 1. Create the request
    SearchRequest request = new SearchRequest(index);
    request.types(type);
    // 2. Keep the scroll context alive for one minute
    request.scroll(TimeValue.timeValueMinutes(1L));
    // 3. Query conditions
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.size(2);
    builder.sort("fee", SortOrder.DESC);
    builder.query(QueryBuilders.matchAllQuery());
    request.source(builder);

    // 4. First page: remember the scroll id and print the hits
    RestHighLevelClient client = ESClient.getClient();
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    String scrollId = response.getScrollId();
    System.out.println(scrollId);
    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }

    // 5. Following pages
    while (true) {
        SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
        scrollRequest.scroll(TimeValue.timeValueMinutes(1L));
        SearchResponse scrollResp = client.scroll(scrollRequest, RequestOptions.DEFAULT);
        scrollId = scrollResp.getScrollId(); // the id may change between calls

        SearchHit[] hits = scrollResp.getHits().getHits();
        if (hits != null && hits.length > 0) {
            System.out.println("======= next page ========");
            for (SearchHit hit : hits) {
                System.out.println(hit.getSourceAsMap());
            }
        } else {
            System.out.println("No more data");
            break;
        }
    }

    // 6. Release the scroll context
    ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
    clearScrollRequest.addScrollId(scrollId);
    client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
}
```
==In practice, use from + size (shallow paging) for the first 10,000 hits and scroll (deep paging) beyond that.==
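That rule of thumb can be written down as a small decision helper. The sketch assumes the default limit of 10,000 (the index.max_result_window setting) and 1-based page numbers; the PagingSketch class and its method names are hypothetical.

```java
// Hypothetical helper that picks a paging strategy for a requested page.
final class PagingSketch {
    // Default value of index.max_result_window (assumed unchanged).
    static final int MAX_RESULT_WINDOW = 10_000;

    // from + size paging works only while from + size stays within the window.
    static boolean fitsFromSize(long from, long size) {
        return from + size <= MAX_RESULT_WINDOW;
    }

    // page is 1-based; returns which paging style can serve that page.
    static String strategy(int page, int size) {
        long from = (long) (page - 1) * size;
        return fitsFromSize(from, size) ? "from+size" : "scroll";
    }

    public static void main(String[] args) {
        System.out.println(strategy(1, 10));    // from = 0
        System.out.println(strategy(1000, 10)); // from = 9990, exactly at the limit
        System.out.println(strategy(1001, 10)); // from = 10000, beyond the limit
    }
}
```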
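4.5 delete-by-query
Deletes a large number of documents selected by a term, match or other query.
If the documents to delete make up most of the index, delete-by-query is not recommended. Do the reverse instead: create a new index and add only the data you want to keep.
```
# Put your query in the body: whatever it matches is deleted
POST /sms_logs_index/sms_logs_type/_delete_by_query
{
  "query": {
    "range": {
      "relyTotal": {
        "gte": 2,
        "lte": 3
      }
    }
  }
}
```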
```java
public class test_sms_search2 {
    String index = "sms_logs_index";
    String type = "sms_logs_type";

    @Test
    public void test_delete_by_query() throws IOException {
        DeleteByQueryRequest request = new DeleteByQueryRequest(index);
        request.types(type);

        // Same condition as the REST example: 2 <= relyTotal <= 3
        request.setQuery(QueryBuilders.rangeQuery("relyTotal").gte(2).lte(3));

        RestHighLevelClient client = ESClient.getClient();
        BulkByScrollResponse response = client.deleteByQuery(request, RequestOptions.DEFAULT);
        System.out.println(response.toString());
    }
}
```
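4.6 Compound queries
4.6.1 bool query
A compound query that combines several query clauses with boolean logic (AND, OR, NOT):
- must: every clause under must has to match (AND).
- must_not: no clause under must_not may match (NOT).
- should: the clauses are combined with OR. When the bool query has no must or filter clause, at least one should clause has to match; otherwise the should clauses are optional unless minimum_should_match is set.
- filter: the document must match the clause; the only difference from must is that filter does not affect the score.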
```
# Province is 河北 or 河南,
# and the company name is not 河马生鲜,
# and smsContext contains 软件.
# minimum_should_match: 1 makes the province condition mandatory even though a must clause is present.
POST /sms_logs_index/sms_logs_type/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "province": {
              "value": "河北"
            }
          }
        },
        {
          "term": {
            "province": {
              "value": "河南"
            }
          }
        }
      ],
      "minimum_should_match": 1,
      "must_not": [
        {
          "term": {
            "corpName": {
              "value": "河马生鲜"
            }
          }
        }
      ],
      "must": [
        {
          "match": {
            "smsContext": "软件"
          }
        }
      ]
    }
  }
}
```
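How the clauses of a bool query interact can be sketched as predicates over an in-memory document. The sketch mirrors the shape of the query above with ASCII placeholder values (Hebei, Henan, Hema, software) and shows that the should clauses only become mandatory when minimum_should_match is at least 1. The BoolSketch class and its helpers are hypothetical; scoring is ignored.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory model of must / should / must_not evaluation.
final class BoolSketch {
    // Exact comparison, standing in for a term clause.
    static Predicate<Map<String, String>> term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    // Substring test, standing in (very roughly) for a match clause.
    static Predicate<Map<String, String>> contains(String field, String text) {
        return doc -> doc.getOrDefault(field, "").contains(text);
    }

    static boolean matches(Map<String, String> doc,
                           List<Predicate<Map<String, String>>> must,
                           List<Predicate<Map<String, String>>> should,
                           List<Predicate<Map<String, String>>> mustNot,
                           int minimumShouldMatch) {
        boolean allMust = must.stream().allMatch(p -> p.test(doc));
        boolean noneMustNot = mustNot.stream().noneMatch(p -> p.test(doc));
        long shouldHits = should.stream().filter(p -> p.test(doc)).count();
        return allMust && noneMustNot && shouldHits >= minimumShouldMatch;
    }

    // Same shape as the query above, with placeholder values.
    static boolean sampleQuery(Map<String, String> doc, int minimumShouldMatch) {
        return matches(doc,
                List.of(contains("smsContext", "software")),
                List.of(term("province", "Hebei"), term("province", "Henan")),
                List.of(term("corpName", "Hema")),
                minimumShouldMatch);
    }

    public static void main(String[] args) {
        Map<String, String> doc = Map.of(
                "province", "Shanxi", "corpName", "Acme", "smsContext", "new software release");
        System.out.println(sampleQuery(doc, 0)); // should clauses optional: matches
        System.out.println(sampleQuery(doc, 1)); // province condition enforced: no match
    }
}
```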