=== Indexing documents

- Before we can _index_ (store) our user document in Elasticsearch, we need
- to decide what the document represents, and where to store it.
+ Before we can _index_ (store and make searchable) our user document in
+ Elasticsearch, we need to decide what the document represents, and where to
+ store it.

In Elasticsearch, a document belongs to a _type_, and those types live inside
an _index_. You can draw some (rough) parallels to a traditional relational database:

- - RDBM => Databases => Tables => Columns/Rows
- - Elasticsearch => Indices => Types => Documents with Fields

- An Elasticsearch cluster can contain multiple Indices (databases), which in
- turn contain multiple Types (tables). These types hold multiple Documents (rows),
- and each document has Fields (columns).
+ Relational DB ⇒ Databases ⇒ Tables ⇒ Rows ⇒ Columns
+ Elasticsearch ⇒ Indices ⇒ Types ⇒ Documents ⇒ Fields
+
+ An Elasticsearch cluster can contain multiple _indices_ (databases), which in
+ turn contain multiple _types_ (tables). These types hold multiple _documents_
+ (rows), and each document has multiple _fields_ (columns).

==== An example
We are going to store a document in the `blogs` index, as type `user`, and we
@@ -21,8 +23,9 @@ than in the document itself:

[source,js]
--------------------------------------------------
- PUT /blogs/user/johnsmith?pretty
- {
+ <1> <2> <3>
+ PUT /blogs/user/johnsmith
+ { <4>
  "email": "john@smith.com",
  "name": {
    "first": "John",
@@ -34,19 +37,22 @@ PUT /blogs/user/johnsmith?pretty
  "interests": ["dolphins", "whales"]
}
--------------------------------------------------
-
+ <1> Index: `blogs`
+ <2> Type: `user`
+ <3> ID: `johnsmith`
+ <4> Document body

And we receive the following response, which confirms that our document
has been indexed correctly:

[source,js]
--------------------------------------------------
{
- "ok" : true,
- "_index" : "blogs",
- "_type" : "user",
- "_id" : "johnsmith",
- "_version" : 1
+ "_index": "blogs",
+ "_type" : "user",
+ "_id" : "johnsmith",
+ "_version": 1,
+ "created": true
}
--------------------------------------------------

@@ -55,8 +61,8 @@ Congratulations! You just indexed your first document! How easy was that?

=== Real-time GET

- Elasticsearch has _real-time GET_. In other words, as soon as the document
- has been indexed, it can be retrieved from any node in the cluster.
+ Elasticsearch has _real-time GET_. In other words, as soon as a document
+ has been indexed it can be retrieved from any node in the cluster.

Not only that, but changes to documents are _persistent_: if the whole cluster
were to suffer a power failure immediately after indexing a document, the
@@ -67,10 +73,9 @@ that we specified when indexing it:

[source,js]
--------------------------------------------------
- GET /blogs/user/johnsmith?pretty
+ GET /blogs/user/johnsmith
--------------------------------------------------

-
The response contains the exact same JSON document that we indexed, as the
`_source` field, plus some extra metadata:

@@ -81,7 +86,7 @@ The response contains the exact same JSON document that we indexed, as the
  "_type" : "user",
  "_id" : "johnsmith",
  "_version" : 1,
- "exists" : true,
+ "found" : true,
  "_source" : {
    "email": "john@smith.com",
    "name": {
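
As a rough illustration of the addressing scheme described in the changed text (index, type and ID together forming the request path), the following Python sketch builds the `PUT` and `GET` requests for the `johnsmith` document without sending them. The base URL `http://localhost:9200` and the helper names `document_url`, `build_index_request` and `build_get_request` are assumptions made for this example; they are not part of the change itself, and the `/index/type/id` path layout is taken only from the requests shown in the diff.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def document_url(base_url, index, doc_type, doc_id):
    """Join index, type and ID into the /index/type/id path used in the diff."""
    # Each segment is percent-encoded so that an ID containing "/" stays one segment.
    segments = [quote(part, safe="") for part in (index, doc_type, doc_id)]
    return base_url.rstrip("/") + "/" + "/".join(segments)


def build_index_request(base_url, index, doc_type, doc_id, body):
    """Build (but do not send) a PUT request carrying the document body as JSON."""
    data = json.dumps(body).encode("utf-8")
    return Request(
        document_url(base_url, index, doc_type, doc_id),
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_get_request(base_url, index, doc_type, doc_id):
    """Build (but do not send) a GET request for the same document."""
    return Request(document_url(base_url, index, doc_type, doc_id), method="GET")


# Hypothetical base URL; adjust to wherever the cluster is reachable.
BASE_URL = "http://localhost:9200"

put_req = build_index_request(
    BASE_URL, "blogs", "user", "johnsmith",
    {"email": "john@smith.com", "interests": ["dolphins", "whales"]},
)
get_req = build_get_request(BASE_URL, "blogs", "user", "johnsmith")

print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Passing either request object to `urllib.request.urlopen` would send it to a running cluster; the sketch stops short of that so it can be run without one.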