|
| 1 | +--- |
| 2 | +navigation_title: "Rank Vectors" |
| 3 | +mapped_pages: |
| 4 | + - https://www.elastic.co/guide/en/elasticsearch/reference/master/rank-vectors.html |
| 5 | +--- |
| 6 | + |
| 7 | +# Rank Vectors [rank-vectors] |
| 8 | + |
| 9 | + |
| 10 | +::::{warning} |
| 11 | +This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. |
| 12 | +:::: |
| 13 | + |
| 14 | + |
| 15 | +The `rank_vectors` field type enables late-interaction dense vector scoring in Elasticsearch. The number of vectors per field can vary, but they must all share the same number of dimensions and element type. |
| 16 | + |
| 17 | +Vectors stored in this field are intended for second-order ranking of documents using max-sim similarity. |
| 18 | + |
| 19 | +Here is a simple example of using this field with `float` elements. |
| 20 | + |
| 21 | +```console |
| 22 | +PUT my-rank-vectors-float |
| 23 | +{ |
| 24 | + "mappings": { |
| 25 | + "properties": { |
| 26 | + "my_vector": { |
| 27 | + "type": "rank_vectors" |
| 28 | + } |
| 29 | + } |
| 30 | + } |
| 31 | +} |
| 32 | + |
| 33 | +PUT my-rank-vectors-float/_doc/1 |
| 34 | +{ |
| 35 | + "my_vector" : [[0.5, 10, 6], [-0.5, 10, 10]] |
| 36 | +} |
| 37 | +``` |
| 38 | + |
| 39 | +In addition to the `float` element type, `byte` and `bit` element types are also supported. |
| 40 | + |
| 41 | +Here is an example of using this field with `byte` elements. |
| 42 | + |
| 43 | +```console |
| 44 | +PUT my-rank-vectors-byte |
| 45 | +{ |
| 46 | + "mappings": { |
| 47 | + "properties": { |
| 48 | + "my_vector": { |
| 49 | + "type": "rank_vectors", |
| 50 | + "element_type": "byte" |
| 51 | + } |
| 52 | + } |
| 53 | + } |
| 54 | +} |
| 55 | + |
| 56 | +PUT my-rank-vectors-byte/_doc/1 |
| 57 | +{ |
| 58 | + "my_vector" : [[1, 2, 3], [4, 5, 6]] |
| 59 | +} |
| 60 | +``` |
| 61 | + |
| 62 | +Here is an example of using this field with `bit` elements. |
| 63 | + |
| 64 | +```console |
| 65 | +PUT my-rank-vectors-bit |
| 66 | +{ |
| 67 | + "mappings": { |
| 68 | + "properties": { |
| 69 | + "my_vector": { |
| 70 | + "type": "rank_vectors", |
| 71 | + "element_type": "bit" |
| 72 | + } |
| 73 | + } |
| 74 | + } |
| 75 | +} |
| 76 | + |
| 77 | +POST /my-rank-vectors-bit/_bulk?refresh |
| 78 | +{"index": {"_id" : "1"}} |
| 79 | +{"my_vector": [127, -127, 0, 1, 42]} |
| 80 | +{"index": {"_id" : "2"}} |
| 81 | +{"my_vector": "8100012a7f"} |
| 82 | +``` |
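
The two documents in the `bit` example use different input forms: an array of signed bytes and a hex string. The following Python sketch shows how the two forms relate and why five bytes give 40 dimensions. It assumes the hex string encodes the same packed bytes as the array form; the helper names are hypothetical and are not part of any Elasticsearch API.

```python
def signed_bytes_to_hex(values):
    """Encode signed bytes (-128..127) as a lowercase hex string."""
    return bytes(v & 0xFF for v in values).hex()


def hex_to_signed_bytes(text):
    """Decode a hex string into a list of signed bytes (-128..127)."""
    return [b - 256 if b > 127 else b for b in bytes.fromhex(text)]


def bit_dims(num_bytes):
    """Each packed byte contributes 8 bit dimensions."""
    return num_bytes * 8


doc_1 = [127, -127, 0, 1, 42]  # array form, as in document 1
doc_2 = "8100012a7f"           # hex form, as in document 2

print(signed_bytes_to_hex(doc_1))  # 7f8100012a
print(hex_to_signed_bytes(doc_2))  # [-127, 0, 1, 42, 127]
print(bit_dims(len(doc_1)))        # 40
```

Both documents therefore hold 40-bit vectors, which satisfies the multiple-of-8 requirement for `bit` dimensions.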
| 83 | + |
| 84 | +## Parameters for rank vectors fields [rank-vectors-params] |
| 85 | + |
| 86 | +The `rank_vectors` field type supports the following parameters: |
| 87 | + |
| 88 | +$$$rank-vectors-element-type$$$ |
| 89 | + |
| 90 | +`element_type` |
| 91 | +: (Optional, string) The data type used to encode vectors. The supported data types are `float` (default), `byte`, and `bit`. |
| 92 | + |
| 93 | +::::{dropdown} Valid values for `element_type` |
| 94 | +`float` |
| 95 | +: indexes a 4-byte floating-point value per dimension. This is the default value. |
| 96 | + |
| 97 | +`byte` |
| 98 | +: indexes a 1-byte integer value per dimension. |
| 99 | + |
| 100 | +`bit` |
| 101 | +: indexes a single bit per dimension. Useful for very high-dimensional vectors or models that specifically support bit vectors. NOTE: when using `bit`, the number of dimensions must be a multiple of 8 and must represent the number of bits. |
| 102 | + |
| 103 | +:::: |
| 104 | + |
| 105 | + |
| 106 | +`dims` |
| 107 | +: (Optional, integer) Number of vector dimensions. Can’t exceed `4096`. If `dims` is not specified, it will be set to the length of the first vector added to the field. |
| 108 | + |
| 109 | + |
| 110 | +## Synthetic `_source` [rank-vectors-synthetic-source] |
| 111 | + |
| 112 | +::::{important} |
| 113 | +Synthetic `_source` is Generally Available only for TSDB indices (indices that have `index.mode` set to `time_series`). For other indices synthetic `_source` is in technical preview. Features in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. |
| 114 | +:::: |
| 115 | + |
| 116 | + |
| 117 | +`rank_vectors` fields support [synthetic `_source`](mapping-source-field.md#synthetic-source). |
| 118 | + |
| 119 | + |
| 120 | +## Scoring with rank vectors [rank-vectors-scoring] |
| 121 | + |
| 122 | +Rank vectors can be accessed and used in [`script_score` queries](/reference/query-languages/query-dsl-script-score-query.md). |
| 123 | + |
| 124 | +For example, the following query scores documents based on the maxSim similarity between the query vector and the vectors stored in the `my_vector` field: |
| 125 | + |
| 126 | +```console |
| 127 | +GET my-rank-vectors-float/_search |
| 128 | +{ |
| 129 | + "query": { |
| 130 | + "script_score": { |
| 131 | + "query": { |
| 132 | + "match_all": {} |
| 133 | + }, |
| 134 | + "script": { |
| 135 | + "source": "maxSimDotProduct(params.query_vector, 'my_vector')", |
| 136 | + "params": { |
| 137 | + "query_vector": [[0.5, 10, 6], [-0.5, 10, 10]] |
| 138 | + } |
| 139 | + } |
| 140 | + } |
| 141 | + } |
| 142 | +} |
| 143 | +``` |
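
To make the scoring concrete, here is a minimal Python sketch of a max-sim dot product, assuming the usual late-interaction definition: for each query vector, take the highest dot product against any stored vector, then sum those maxima. This page does not spell out the formula, so treat the definition as an assumption; the function names are hypothetical and only illustrate the arithmetic.

```python
def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must share the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def max_sim_dot_product(query_vectors, doc_vectors):
    """Sum, over the query vectors, of the best dot product against any document vector."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# Query vectors from the search request above.
query_vectors = [[0.5, 10, 6], [-0.5, 10, 10]]
# Vectors stored in document 1 of my-rank-vectors-float.
doc_vectors = [[0.5, 10, 6], [-0.5, 10, 10]]

score = max_sim_dot_product(query_vectors, doc_vectors)
print(score)  # 159.75 + 200.25 = 360.0
```

Under this definition the first query vector matches best against the second stored vector (159.75) and the second query vector matches best against itself (200.25).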
| 144 | + |
| 145 | +Additionally, asymmetric similarity functions can be used to score against `bit` vectors. For example, the following query scores documents based on the `maxSimDotProduct` similarity between a floating-point query vector and the bit vectors stored in the `my_vector` field: |
| 146 | + |
| 147 | +```console |
| 148 | +PUT my-rank-vectors-bit |
| 149 | +{ |
| 150 | + "mappings": { |
| 151 | + "properties": { |
| 152 | + "my_vector": { |
| 153 | + "type": "rank_vectors", |
| 154 | + "element_type": "bit" |
| 155 | + } |
| 156 | + } |
| 157 | + } |
| 158 | +} |
| 159 | + |
| 160 | +POST /my-rank-vectors-bit/_bulk?refresh |
| 161 | +{"index": {"_id" : "1"}} |
| 162 | +{"my_vector": [127, -127, 0, 1, 42]} |
| 163 | +{"index": {"_id" : "2"}} |
| 164 | +{"my_vector": "8100012a7f"} |
| 165 | + |
| 166 | +GET my-rank-vectors-bit/_search |
| 167 | +{ |
| 168 | + "query": { |
| 169 | + "script_score": { |
| 170 | + "query": { |
| 171 | + "match_all": {} |
| 172 | + }, |
| 173 | + "script": { |
| 174 | + "source": "maxSimDotProduct(params.query_vector, 'my_vector')", |
| 175 | + "params": { |
| 176 | + "query_vector": [ |
| 177 | + [0.35, 0.77, 0.95, 0.15, 0.11, 0.08, 0.58, 0.06, 0.44, 0.52, 0.21, |
| 178 | + 0.62, 0.65, 0.16, 0.64, 0.39, 0.93, 0.06, 0.93, 0.31, 0.92, 0.0, |
| 179 | + 0.66, 0.86, 0.92, 0.03, 0.81, 0.31, 0.2 , 0.92, 0.95, 0.64, 0.19, |
| 180 | + 0.26, 0.77, 0.64, 0.78, 0.32, 0.97, 0.84] |
| 181 | + ] <1> |
| 182 | + } |
| 183 | + } |
| 184 | + } |
| 185 | + } |
| 186 | +} |
| 187 | +``` |
| 188 | + |
| 189 | +1. The query vector has 40 elements, matching the number of bits in the stored bit vectors (5 bytes × 8 bits). |
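
One plausible reading of the asymmetric case is sketched below in Python: each float in the query lines up with one bit of the stored vector, and the dot product adds up the query values whose bit is set. The bit order (most significant bit first within each byte) is an assumption that this page does not confirm, and `float_bit_dot_product` is a hypothetical helper, not an Elasticsearch function.

```python
def float_bit_dot_product(query, packed_bits):
    """Sum the query values at positions where the packed bit vector has a 1.

    Assumes bits are read most-significant first within each signed byte.
    """
    if len(query) != 8 * len(packed_bits):
        raise ValueError("query needs one element per bit")
    total = 0.0
    for byte_index, value in enumerate(packed_bits):
        byte = value & 0xFF  # view the signed byte as unsigned
        for bit_index in range(8):
            if byte & (0x80 >> bit_index):
                total += query[byte_index * 8 + bit_index]
    return total


# The 40-element query vector from the request above.
query_vector = [
    0.35, 0.77, 0.95, 0.15, 0.11, 0.08, 0.58, 0.06, 0.44, 0.52, 0.21,
    0.62, 0.65, 0.16, 0.64, 0.39, 0.93, 0.06, 0.93, 0.31, 0.92, 0.0,
    0.66, 0.86, 0.92, 0.03, 0.81, 0.31, 0.2, 0.92, 0.95, 0.64, 0.19,
    0.26, 0.77, 0.64, 0.78, 0.32, 0.97, 0.84,
]
doc_1 = [127, -127, 0, 1, 42]  # 5 packed bytes = 40 bits

print(float_bit_dot_product(query_vector, doc_1))
```

A query with the wrong number of elements is rejected in this sketch, which mirrors the requirement that the query length match the number of bits.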
| 190 | + |
| 191 | + |
| 192 | + |