This repository was archived by the owner on Sep 21, 2021. It is now read-only.
File tree: 15 files changed, +388 -10 lines changed
snippets/120_Proximity_Matching
Original file line number Diff line number Diff line change
@@ -16,6 +16,7 @@ GET /my_index/my_type/_search
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/05_Match_phrase_query.json
 
 Like the `match` query, the `match_phrase` query first analyzes the query
 string to produce a list of terms. It then searches for all the terms, but
@@ -38,6 +39,7 @@ The `match_phrase` query can also be written as a `match` query with type
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/05_Match_phrase_query.json
 
 ****
 
@@ -51,6 +53,7 @@ also the _position_ or order of each term in the original string:
 GET /_analyze?analyzer=standard
 Quick brown fox
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/05_Term_positions.json
 
 This returns:
 
Original file line number Diff line number Diff line change
@@ -20,6 +20,7 @@ GET /my_index/my_type/_search
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/10_Slop.json
 
 The `slop` parameter tells the `match_phrase` query how far apart terms are
 allowed to be while still considering the document a match. By ``how far
Original file line number Diff line number Diff line change
@@ -10,6 +10,7 @@ PUT /my_index/groups/1
   "names": [ "John Abraham", "Lincoln Smith"]
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/15_Multi_value_fields.json
 
 Then run a phrase query for `"Abraham Lincoln"`:
 
@@ -24,6 +25,7 @@ GET /my_index/groups/_search
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/15_Multi_value_fields.json
 
 Surprisingly our document matches, even though `"Abraham"` and `"Lincoln"`
 belong to two different people in the `names` array. The reason for this comes
@@ -61,6 +63,8 @@ PUT /my_index/_mapping/groups <2>
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/15_Multi_value_fields.json
+
 <1> First delete the `group` mapping and documents of that type.
 <2> Then create a new `group` mapping with the correct values.
 
Original file line number Diff line number Diff line change
@@ -25,6 +25,8 @@ POST /my_index/my_type/_search
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/20_Scoring.json
+
 <1> Note the high `slop` value
 
 [source,js]
@@ -33,19 +35,20 @@ POST /my_index/my_type/_search
   "hits": [
     {
       "_id": "3",
-      "_score": 0.75,
+      "_score": 0.75, <1>
       "_source": {
         "title": "The quick brown fox jumps over the quick dog"
       }
     },
     {
       "_id": "2",
-      "_score": 0.28347334,
+      "_score": 0.28347334, <2>
       "_source": {
         "title": "The quick brown fox jumps over the lazy dog"
       }
     }
   ]
 }
 --------------------------------------------------
-
+<1> Higher score because `quick` and `dog` are close together.
+<2> Lower score because `quick` and `dog` are further apart.
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@ that we should combine them using the `bool` query.
 
 We can use a simple `match` query as a `must` clause. This is the query that
 will determine which documents are included in our resultset -- we can trim
-the long tail with the `minimum_must_match` parameter. Then we can add other
+the long tail with the `minimum_should_match` parameter. Then we can add other
 more specific queries as `should` clauses -- every one that matches will
 increase the relevance of the matching docs.
 
@@ -29,13 +29,13 @@ GET /my_index/my_type/_search
       "must": {
         "match": { <1>
           "title": {
-            "query":              "quick brown fox",
-            "minimum_must_match": "30%"
+            "query":                "quick brown fox",
+            "minimum_should_match": "30%"
           }
         }
       },
       "should": {
-        "match_phrase": <2>
+        "match_phrase": { <2>
           "title": {
             "query": "quick brown fox",
             "slop": 50
@@ -46,6 +46,8 @@ GET /my_index/my_type/_search
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/25_Relevance.json
+
 <1> The `must` clause includes or excludes documents from the resultset.
 <2> The `should` clause increases the relevance score of those documents that
     match.
Original file line number Diff line number Diff line change
@@ -58,16 +58,16 @@ GET /my_index/my_type/_search
   "query": {
     "match": { <1>
       "title": {
-        "query":              "quick brown fox",
-        "minimum_must_match": "30%"
+        "query":                "quick brown fox",
+        "minimum_should_match": "30%"
       }
     }
   },
   "rescore": {
     "window_size": 50, <2>
     "query": { <3>
       "rescore_query": {
-        "match_phrase":
+        "match_phrase": {
           "title": {
             "query": "quick brown fox",
             "slop": 50
@@ -78,6 +78,8 @@ GET /my_index/my_type/_search
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/30_Performance.json
+
 <1> The `match` query decides which results will be included in the final
     result set and ranks results according to TF/IDF.
 <2> The `window_size` is the number of top results to rescore, per shard.
Original file line number Diff line number Diff line change
@@ -92,6 +92,7 @@ PUT /my_index
   }
 }
 --------------------------------------------------
+// SENSE: 120_Proximity_Matching/35_Shingles.json
 
 <1> See <<relevance-is-broken>>.
 <2> The default min/max shingle size is `2` so we don't really need to set
Original file line number Diff line number Diff line change
+# Delete the `my_index` index
+DELETE /my_index
+
+# Create `my_index` with a single primary shard
+PUT /my_index
+{ "settings": { "number_of_shards": 1 }}
+
+# Index some example docs
+POST /my_index/my_type/_bulk
+{ "index": { "_id": 1 }}
+{ "title": "The quick brown fox" }
+{ "index": { "_id": 2 }}
+{ "title": "The quick brown fox jumps over the lazy dog" }
+{ "index": { "_id": 3 }}
+{ "title": "The quick brown fox jumps over the quick dog" }
+{ "index": { "_id": 4 }}
+{ "title": "Brown fox brown dog" }
+
+# match_phrase query
+GET /my_index/my_type/_search
+{
+  "query": {
+    "match_phrase": {
+      "title": "quick brown fox"
+    }
+  }
+}
+
+# match query, type phrase
+GET /my_index/my_type/_search
+{
+  "query": {
+    "match": {
+      "title": {
+        "type": "phrase",
+        "query": "quick brown fox"
+      }
+    }
+  }
+}
Original file line number Diff line number Diff line change
+# Term positions
+GET /_analyze?text=Quick brown fox
+
Original file line number Diff line number Diff line change
+# Delete the `my_index` index
+DELETE /my_index
+
+# Create `my_index` with a single primary shard
+PUT /my_index
+{ "settings": { "number_of_shards": 1 }}
+
+# Index some example docs
+POST /my_index/my_type/_bulk
+{ "index": { "_id": 1 }}
+{ "title": "The quick brown fox" }
+{ "index": { "_id": 2 }}
+{ "title": "The quick brown fox jumps over the lazy dog" }
+{ "index": { "_id": 3 }}
+{ "title": "The quick brown fox jumps over the quick dog" }
+{ "index": { "_id": 4 }}
+{ "title": "Brown fox brown dog" }
+
+
+# Phrase query - doesn't match
+GET /my_index/my_type/_search
+{
+  "query": {
+    "match_phrase": {
+      "title": {
+        "query": "quick fox"
+      }
+    }
+  }
+}
+
+
+# Proximity query with slop - matches
+GET /my_index/my_type/_search
+{
+  "query": {
+    "match_phrase": {
+      "title": {
+        "query": "quick fox",
+        "slop": 1
+      }
+    }
+  }
+}
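
Note on the slop snippet: the two queries differ only in `slop`, so it may help to see why `quick fox` misses at slop 0 and matches at slop 1. The following is a minimal sketch of positional matching, not Lucene's actual scorer. It assumes whitespace tokenization with lowercasing, and the helper names (`positions`, `phrase_distance`, `matches`) are hypothetical. Each term's position is shifted back by its offset in the query, so an exact phrase collapses to one value and the remaining spread is the slop required.

```python
from itertools import product


def positions(text):
    """Map each lowercased, whitespace-split term to the positions where it occurs."""
    index = {}
    for pos, term in enumerate(text.lower().split()):
        index.setdefault(term, []).append(pos)
    return index


def phrase_distance(text, phrase):
    """Smallest spread between offset-adjusted term positions.

    Returns None when any query term is missing from the text.
    Simplification: repeated query terms are not handled specially.
    """
    index = positions(text)
    query_terms = phrase.lower().split()
    if not query_terms or any(term not in index for term in query_terms):
        return None
    best = None
    # Try every combination of occurrences, one per query term.
    for combo in product(*(index[term] for term in query_terms)):
        # Subtract each term's offset in the query; an exact phrase
        # then yields identical values and a spread of 0.
        adjusted = [pos - offset for offset, pos in enumerate(combo)]
        spread = max(adjusted) - min(adjusted)
        if best is None or spread < best:
            best = spread
    return best


def matches(text, phrase, slop=0):
    """True when the phrase can be lined up within `slop` position moves."""
    distance = phrase_distance(text, phrase)
    return distance is not None and distance <= slop
```

Under these assumptions, `matches("The quick brown fox", "quick fox")` is false while the same call with `slop=1` is true, which mirrors the two queries in the snippet. Reversing the terms (`fox quick`) needs a larger slop in this model because the terms have to pass each other.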
Original file line number Diff line number Diff line change
+# Delete the `my_index` index
+DELETE /my_index
+
+# Create `my_index` with a single primary shard
+PUT /my_index
+{ "settings": { "number_of_shards": 1 }}
+
+# Index an example doc
+PUT /my_index/groups/1
+{
+  "names": [
+    "John Abraham",
+    "Lincoln Smith"
+  ]
+}
+
+# Phrase "Abraham Lincoln" matches!
+GET /my_index/groups/_search
+{
+  "query": {
+    "match_phrase": {
+      "names": "Abraham Lincoln"
+    }
+  }
+}
+
+# Delete `groups` mapping and data
+DELETE /my_index/groups/
+
+# Map `names` to use position_offset_gap
+PUT /my_index/_mapping/groups
+{
+  "properties": {
+    "names": {
+      "type": "string",
+      "position_offset_gap": 100
+    }
+  }
+}
+
+# Reindex document
+PUT /my_index/groups/1
+{
+  "names": [
+    "John Abraham",
+    "Lincoln Smith"
+  ]
+}
+
+# Phrase "Abraham Lincoln" no longer matches
+GET /my_index/groups/_search
+{
+  "query": {
+    "match_phrase": {
+      "names": "Abraham Lincoln"
+    }
+  }
+}
+
+# But phrase "John Abraham" does
+GET /my_index/groups/_search
+{
+  "query": {
+    "match_phrase": {
+      "names": "John Abraham"
+    }
+  }
+}
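
Note on the multi-value snippet: the before/after behavior follows from how positions are assigned across array values. The sketch below is a simplified model rather than Elasticsearch's analyzer. It assumes whitespace tokenization with lowercasing and 0-based positions, and `index_positions` and `phrase_matches` are hypothetical helper names. The `position_offset_gap` parameter is taken from the mapping in the snippet.

```python
def index_positions(values, position_offset_gap=0):
    """Assign a position to every term of a multi-value field.

    Each value after the first starts `position_offset_gap` positions
    later than it otherwise would.
    """
    postings = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += position_offset_gap
        for term in value.lower().split():
            postings.setdefault(term, []).append(pos)
            pos += 1
    return postings


def phrase_matches(postings, phrase):
    """True when the phrase terms occur at consecutive positions (slop 0)."""
    terms = phrase.lower().split()
    if not terms or any(term not in postings for term in terms):
        return False
    # Anchor on each occurrence of the first term and check the rest follow it.
    return any(
        all(start + offset in postings[term] for offset, term in enumerate(terms))
        for start in postings[terms[0]]
    )


names = ["John Abraham", "Lincoln Smith"]
no_gap = index_positions(names)
gapped = index_positions(names, position_offset_gap=100)
```

With no gap, `abraham` and `lincoln` sit at adjacent positions, so the phrase matches across the two values. With a gap of 100 they are far apart, so only phrases inside a single value, such as `John Abraham`, still match at slop 0.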
Original file line number Diff line number Diff line change
+# Delete the `my_index` index
+DELETE /my_index
+
+# Create `my_index` with a single primary shard
+PUT /my_index
+{ "settings": { "number_of_shards": 1 }}
+
+# Index some example docs
+POST /my_index/my_type/_bulk
+{ "index": { "_id": 1 }}
+{ "title": "The quick brown fox" }
+{ "index": { "_id": 2 }}
+{ "title": "The quick brown fox jumps over the lazy dog" }
+{ "index": { "_id": 3 }}
+{ "title": "The quick brown fox jumps over the quick dog" }
+{ "index": { "_id": 4 }}
+{ "title": "Brown fox brown dog" }
+
+# High slop value
+POST /my_index/my_type/_search
+{
+  "query": {
+    "match_phrase": {
+      "title": {
+        "query": "quick dog",
+        "slop": 50
+      }
+    }
+  }
+}
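
Note on the scoring snippet: with a high `slop`, both documents containing `quick` and `dog` match, and proximity only affects ranking. The sketch below illustrates that ordering for a two-term phrase. It does not reproduce the `_score` values shown in the docs, which also depend on field norms and other factors. The `1 / (distance + 1)` weighting and the helper names are assumptions for illustration, and only the closest pair of occurrences is considered.

```python
def closest_distance(title, first, second):
    """Fewest positions `second` must move to sit right after `first`.

    Returns None when either term is missing. Tokenization is a plain
    lowercase whitespace split (an assumption, not the standard analyzer).
    """
    tokens = title.lower().split()
    firsts = [i for i, token in enumerate(tokens) if token == first]
    seconds = [i for i, token in enumerate(tokens) if token == second]
    if not firsts or not seconds:
        return None
    return min(abs((s - 1) - f) for f in firsts for s in seconds)


def proximity_weight(distance):
    """Assumed weighting: closer matches count for more."""
    return 1.0 / (distance + 1)


doc_2 = "The quick brown fox jumps over the lazy dog"
doc_3 = "The quick brown fox jumps over the quick dog"

weight_2 = proximity_weight(closest_distance(doc_2, "quick", "dog"))
weight_3 = proximity_weight(closest_distance(doc_3, "quick", "dog"))
```

In this model document 3 gets the larger weight because its second `quick` is directly followed by `dog`, while in document 2 the terms are six positions further apart than the phrase requires. That agrees with the ranking in the callouts, where document 3 scores higher than document 2.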
Original file line number Diff line number Diff line change
+# Delete the `my_index` index
+DELETE /my_index
+
+# Create `my_index` with a single primary shard
+PUT /my_index
+{ "settings": { "number_of_shards": 1 }}
+
+# Index some example docs
+POST /my_index/my_type/_bulk
+{ "index": { "_id": 1 }}
+{ "title": "The quick brown fox" }
+{ "index": { "_id": 2 }}
+{ "title": "The quick brown fox jumps over the lazy dog" }
+{ "index": { "_id": 3 }}
+{ "title": "The quick brown fox jumps over the quick dog" }
+{ "index": { "_id": 4 }}
+{ "title": "Brown fox brown dog" }
+
+# Combine phrase with match query to boost relevance
+GET /my_index/my_type/_search
+{
+  "query": {
+    "bool": {
+      "must": {
+        "match": {
+          "title": {
+            "query": "quick brown fox",
+            "minimum_should_match": "30%"
+          }
+        }
+      },
+      "should": {
+        "match_phrase": {
+          "title": {
+            "query": "quick brown fox",
+            "slop": 50
+          }
+        }
+      }
+    }
+  }
+}