webcast slides

Cloudant Querying Options
Agenda
• Indexing Options (Review)
  – Primary
  – Views/MapReduce
  – Search
  – Geospatial
• Cloudant Query
  – What’s New
  – Live Tutorial
5/20/15
Reading & Writing Basics
POST
• /<database>
• /<database>/_bulk_docs
GET
• /<database>/<doc_id>
• /<database>/_all_docs
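As a minimal sketch of the write and read endpoints above, the following Python builds the HTTP requests without sending them. The account URL, database name, and document contents are placeholders chosen for illustration; they are not part of the original slides.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE = "https://ACCOUNT.cloudant.com"  # hypothetical account URL
DB = "animals"                          # hypothetical database name


def bulk_insert(docs):
    """POST /<database>/_bulk_docs: write many documents in one request."""
    body = json.dumps({"docs": docs}).encode("utf-8")
    return Request(
        f"{BASE}/{DB}/_bulk_docs",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def read_doc(doc_id):
    """GET /<database>/<doc_id>: fetch a single document by its _id."""
    # Percent-encode the _id so characters such as "/" stay in one path segment.
    return Request(f"{BASE}/{DB}/{quote(doc_id, safe='')}", method="GET")


req = bulk_insert([
    {"_id": "llama", "class": "mammal"},
    {"_id": "kookaburra", "class": "bird"},
])
print(req.get_method(), req.full_url)
```

Passing a request to `urllib.request.urlopen` would actually send it; authentication is left out of this sketch.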
Indexes & Queries
Options
• Primary
• Secondary/Views (MapReduce)
• Cloudant Query
• Search (Lucene)
• Geospatial
Cloudant Indexes
Primary Index
– Exists out-of-the-box
Views (MapReduce), Search & Geospatial Indexes
– Define access patterns, not just for speed-ups
– Built incrementally
– Index functions are written in JavaScript
– Stored in _design documents
Primary Index
/_all_docs?startkey="a"&endkey="d"&include_docs=true
Notes
• Primary key = doc._id
• Exists out of the box (OOTB)
• Stored in a b-tree
Use Cases
• Use when you can find documents based on their _id
• Pull back a range of keys (_id)
• Retrieve either only _ids and _revs, or full doc bodies
• Data exports
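The range query above can be sketched in Python as follows. The point worth showing is that `startkey` and `endkey` are JSON values, so string keys keep their double quotes before being URL-encoded. The database name is a placeholder.

```python
import json
from urllib.parse import urlencode


def all_docs_range(db, startkey, endkey, include_docs=False):
    """Build a primary-index range query path for /<database>/_all_docs."""
    params = {
        "startkey": json.dumps(startkey),          # "a" becomes '"a"'
        "endkey": json.dumps(endkey),
        "include_docs": json.dumps(include_docs),  # True becomes 'true'
    }
    return f"/{db}/_all_docs?{urlencode(params)}"


path = all_docs_range("animals", "a", "d", include_docs=True)
print(path)
# /animals/_all_docs?startkey=%22a%22&endkey=%22d%22&include_docs=true
```

Leaving `include_docs` at its default returns only the `_id` and `_rev` of each row, which is the lighter option for listing keys.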
Views (MapReduce)
/_design/app/_view/count_by_user?group=true
Notes
• Built using MapReduce
• Stored in a b-tree
• Key = user-defined field(s)
Use Cases
• Use when you need to analyze data or get a range of secondary keys
• Time series analytics
• Examples: count data fields, sum/average numeric results, advanced stats, group by date, etc.
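A view such as `count_by_user` lives in a design document. The sketch below shows one plausible shape for it, written as a Python dictionary ready to be saved as JSON. The `user` field is an assumption made for illustration; the slides do not show the actual map function.

```python
import json

# Map function (JavaScript, stored as a string). It emits one row per
# document, keyed by the hypothetical `user` field.
MAP_FN = "function (doc) { if (doc.user) { emit(doc.user, 1); } }"

design_doc = {
    "_id": "_design/app",
    "views": {
        "count_by_user": {
            "map": MAP_FN,
            "reduce": "_count",  # built-in reducer: counts rows per key
        }
    },
}

# With ?group=true the reduce result is returned once per distinct key.
QUERY_PATH = "/<database>/_design/app/_view/count_by_user?group=true"

print(json.dumps(design_doc, indent=2))
```

Saving this document to the database is what triggers the incremental index build; the query path then returns one count per user.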
Search (Lucene)
/_design/app/_search/animals?q=l* AND class:mammal
Notes
• Built using Lucene
• Full-text index (FTI): any or all fields can be indexed
Use Cases
• Ad hoc queries
• Lucene syntax (wildcards, fuzzy, ranges, etc.)
• Groups/facets on fields
• Basic geo: bbox & sort by distance
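A minimal sketch of how the search index and query above might be set up. The index function and the `name` and `class` document fields are assumptions for illustration; only the index name `animals` and the query string come from the slide. The query needs URL-encoding because it contains spaces, `*`, and `:`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Search index function (JavaScript, stored as a string). Unqualified terms
# such as `l*` are matched against the "default" field.
INDEX_FN = (
    "function (doc) {"
    " if (doc.name) { index('default', doc.name); }"
    " if (doc['class']) { index('class', doc['class']); }"
    " }"
)

search_ddoc = {
    "_id": "_design/app",
    "indexes": {"animals": {"index": INDEX_FN}},
}


def search_path(db, query):
    """Build the search path with a URL-encoded Lucene query string."""
    return f"/{db}/_design/app/_search/animals?{urlencode({'q': query})}"


path = search_path("animaldb", "l* AND class:mammal")
print(path)
```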
Geospatial
/_design/app/_geo/geoidx?lat=-42&lon=-71&radius=1000
Notes
• Stored in R* tree
• TPR/MVR trees for temporal
• Lat/long coordinates stored in GeoJSON
Use Cases
• Complex geometries (polygon, circularstring, etc.)
• Advanced relations (intersect, overlaps, etc.)
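The sketch below pairs a GeoJSON point document with the radius query shown above. One detail that is easy to get wrong: GeoJSON stores coordinates as [longitude, latitude], while the query parameters name them separately. The document id and database name are placeholders, and the radius unit is assumed to be meters.

```python
from urllib.parse import urlencode


def point_feature(doc_id, lat, lon):
    """Build a GeoJSON Feature document for a single point."""
    return {
        "_id": doc_id,
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},  # lon first
        "properties": {},
    }


def radius_query(db, lat, lon, radius):
    """Build a radius query path against the `geoidx` index."""
    params = {"lat": lat, "lon": lon, "radius": radius}
    return f"/{db}/_design/app/_geo/geoidx?{urlencode(params)}"


doc = point_feature("place-1", lat=-42, lon=-71)
path = radius_query("places", lat=-42, lon=-71, radius=1000)
print(path)
```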
Cloudant Query
Cloudant Query is the place to start
Cloudant Query is designed to be the logical starting point for developers new to Cloudant and to CouchDB.
In fact, we’re contributing it back to the Apache CouchDB™ project.
© "Apache", "CouchDB", "Apache CouchDB" and the CouchDB logo are trademarks or registered trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.
Cloudant Query
[Diagram: _index, _find, Primary, Map, Adhoc Search]
Developer Familiarity
• SQL-like
• Mongo-like
• JSON everywhere
Intuitive
• _design docs
• JavaScript functions
• Consistent API
Powerful
• Operators & field filtering
• Auto-index all
• Natively compiled
Cloudant Query
_index
• POST /<database>/_index – create index
  – "type": "json" for fast lookups on secondary keys
  – "type": "text" for full adhoc querying capability
• GET /<database>/_index – list indexes
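As a rough sketch, the two index types could be requested with JSON bodies like the ones built below. The field name `Person_name` and the index name are assumptions for illustration. An empty `index` object on a text index is used here to request indexing of all fields.

```python
import json


def json_index(fields, name=None):
    """Body for POST /<database>/_index creating a "json" index."""
    body = {"index": {"fields": list(fields)}, "type": "json"}
    if name is not None:
        body["name"] = name  # optional, human-readable index name
    return body


def text_index_all_fields():
    """Body for POST /<database>/_index creating a "text" index on everything."""
    return {"index": {}, "type": "text"}


by_person = json_index(["Person_name"], name="person-name-index")
everything = text_index_all_fields()

print(json.dumps(by_person))
print(json.dumps(everything))
```

The json index suits a small, fixed set of lookup keys; the text index trades build cost for ad hoc querying over any field.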
Cloudant Query
_find
• POST /<database>/_find – query your database

Condition operators

Operator     Usage
$lt          Less than
$lte         Less than or equal to
$eq          Equal to
$ne          Not equal to
$gt          Greater than
$gte         Greater than or equal to
$text        Matches any field using the default analyzer ("text" indexes only)
$exists      Boolean: the field exists or it does not
$type        Checks the document field’s type
$in          Field value must be in the provided array of values
$nin         Field value must not be in the provided array of values
$size        Length of the array field must match this value
$mod         [Divisor, Remainder]. Matches when the field value divided by the divisor leaves the given remainder
$regex       Matches the provided regular expression

Combination operators

Operator     Usage
$and         Matches if all selectors in the array match
$or          Matches if any selector in the array matches
$not         Matches if the given selector does not match
$nor         Matches if none of the selectors in the array match
$all         Matches an array value if it contains all elements of the argument array
$elemMatch   Matches an array field if at least one element matches the given selector
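To make the operator semantics concrete, here is a toy in-memory evaluator for a small subset of them. It is an illustration only, not how Cloudant evaluates selectors: it ignores indexes, nested field paths, and cross-type ordering, and it leaves out `$ne`, `$nin`, `$not`, `$type`, `$text`, `$regex`, and `$elemMatch`. The sample document is made up.

```python
import operator

_MISSING = object()  # sentinel for "field not present in the document"

# Comparison operators that map directly onto Python comparisons.
_COMPARE = {
    "$lt": operator.lt, "$lte": operator.le, "$eq": operator.eq,
    "$gt": operator.gt, "$gte": operator.ge,
}


def field_matches(value, condition):
    """Check one field value against a condition such as {"$gt": 2000}."""
    if not isinstance(condition, dict):
        condition = {"$eq": condition}  # {"year": 2010} is shorthand for $eq
    for op, arg in condition.items():
        if op == "$exists":
            ok = (value is not _MISSING) == arg
        elif value is _MISSING:
            ok = False  # every other operator here needs a value
        elif op in _COMPARE:
            ok = _COMPARE[op](value, arg)
        elif op == "$in":
            ok = value in arg
        elif op == "$size":
            ok = isinstance(value, list) and len(value) == arg
        elif op == "$all":
            ok = isinstance(value, list) and all(item in value for item in arg)
        elif op == "$mod":
            divisor, remainder = arg
            ok = isinstance(value, int) and value % divisor == remainder
        else:
            raise ValueError(f"operator not covered by this sketch: {op}")
        if not ok:
            return False
    return True


def matches(doc, selector):
    """Return True if every clause of the selector holds for the document."""
    for key, cond in selector.items():
        if key == "$and":
            ok = all(matches(doc, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(doc, sub) for sub in cond)
        elif key == "$nor":
            ok = not any(matches(doc, sub) for sub in cond)
        else:
            ok = field_matches(doc.get(key, _MISSING), cond)
        if not ok:
            return False
    return True


doc = {"title": "Example", "year": 2010, "cast": ["A", "B"]}
print(matches(doc, {"year": {"$gt": 2000, "$lte": 2010}}))  # True
print(matches(doc, {"year": {"$mod": [2, 1]}}))             # False
```

Multiple top-level fields in one selector behave like an implicit `$and`, which is why `matches` returns as soon as any clause fails.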
SQL vs. Cloudant Query
• selector - which subset of the data to return; the equivalent of the ‘WHERE’ part of an SQL statement
• fields - the fields to be returned; the equivalent of the ‘SELECT’ part of an SQL statement
• sort - how the result set is to be ordered; the equivalent of the ‘ORDER BY’ part of an SQL statement
• limit - how many results to return
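The four parts above map onto a single JSON request body for `_find`. The sketch below builds one next to a roughly equivalent SQL statement; the field names `Movie_name` and `Movie_year` are assumptions about the sample data, used here only for illustration.

```python
import json


def find_body(selector, fields=None, sort=None, limit=None):
    """Assemble a _find request body; only `selector` is mandatory here."""
    body = {"selector": selector}        # WHERE
    if fields is not None:
        body["fields"] = fields          # SELECT
    if sort is not None:
        body["sort"] = sort              # ORDER BY
    if limit is not None:
        body["limit"] = limit            # LIMIT
    return body


# Roughly: SELECT Movie_name, Movie_year FROM movies
#          WHERE Movie_year > 2000
#          ORDER BY Movie_year DESC LIMIT 10
body = find_body(
    selector={"Movie_year": {"$gt": 2000}},
    fields=["Movie_name", "Movie_year"],
    sort=[{"Movie_year": "desc"}],
    limit=10,
)
print(json.dumps(body, indent=2))
```

Sorting generally depends on a suitable index covering the sort field, so a query like this would usually be paired with an index on `Movie_year`.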
Let’s go to the movies
Replicate me!
The dataset we’re using in the following example is a small subset of IMDb data that the service makes available for non-commercial and educational purposes. Here, we’ve denormalized the separate tables for Actor, Movie, and Person to fit within Cloudant’s JSON document-oriented model.
https://examples.cloudant.com/query-movies
In accordance with IMDb’s Conditions of Use statement, we’d like to add:
Information courtesy of IMDb (http://www.imdb.com). Used with permission.
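Replicating the sample database into your own account comes down to one request to the `_replicate` endpoint. A minimal sketch of the request body follows; the target account URL is a placeholder, and credentials for the target are left out.

```python
import json

SOURCE = "https://examples.cloudant.com/query-movies"


def replication_body(target_url, create_target=True):
    """Body for POST /_replicate copying the sample database to `target_url`."""
    return {
        "source": SOURCE,
        "target": target_url,
        "create_target": create_target,  # create the target db if it is missing
    }


body = replication_body("https://ACCOUNT.cloudant.com/query-movies")
print(json.dumps(body, indent=2))
```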
Best Practices on Querying
• Start with Cloudant Query!
• Find by _id out of the box (OOTB)
• Use CQ json for fast lookups on fixed secondary keys
• Use CQ text for full adhoc querying capability
• Use MapReduce/Views for online analytical use cases, group-level capabilities, or map-side joins
• Use Search for Lucene goodies (wildcards, facets, fuzzy, etc.) or basic geo bbox
• Use Geospatial for advanced spatial queries, polygons, 4D
Resources
Get an account and try it out:
▪ http://docs.cloudant.com/guides/cloudant-query.html
▪ Create an account at https://cloudant.com/sign-up/
▪ Sample db: https://examples.cloudant.com/query-movies
Check out the full docs for Cloudant Query:
▪ https://docs.cloudant.com/api.html#query
Watch videos, read the docs, and try tutorials in the IBM Cloudant Learning Center:
▪ https://cloudant.com/learning-center/
Thank you!
Appendix
Cloudant Index/Query Options
Unique to Cloudant (not in CouchDB)
Check out Index and Query intro video!

CRUD – Document
• Direct document lookup by _id
• Use when you want a single document and can find by its _id

Primary Index
• Exists “OOTB”
• Stored in a b-tree
• Primary key > doc._id
• Use when you can find documents based on their _id
• Pull back a range of keys

Secondary Index (view)
• Built by using MapReduce
• Stored in a b-tree
• Key > user-defined fields
• Use when you need to analyze data or get a range of keys
• Examples: count data fields, sum/average numeric results, advanced stats, group by date, and so on

Search Index
• Built by using Lucene
• FTI: Any or all fields can be indexed
• Ad hoc queries
• Find documents based on their contents
• Can do groups, facets, and basic geo queries (bbox and sort by distance)

Geospatial Index
• Stored in R* tree
• Lat/Long coordinates in GeoJSON
• Complex geometries (polygon, circularstring, etc.)
• Advanced relations (intersect, overlaps, etc.)

Cloudant Query
• “Mongo-style” querying
• Built natively in Erlang
• Wraps Primary, views, and Search
• Ad hoc queries
• Many operators (>, <, IN, OR, AND, and so on)
• Intuitive for people who come from Mongo or SQL backgrounds
Cloudant Index Cheat Sheet
Columns: CRUD – Document, Primary Index, View (MapReduce), Search Index, Geospatial Index, Cloudant Query
[Comparison table: one row per feature below, with an X under each index type that supports it; the per-cell X marks are not recoverable here]

Selection
• Primary key lookup
• Primary key range
• List of keys
• Secondary key lookup
• Secondary key range
• Complex (ex: array) keys
• Adhoc lookup/range on multiple keys
• Boolean operations (AND, OR, NOT, etc.)
• Lucene: wildcards, fuzzy, boosting terms, proximity, facets, analyzers, etc.
• Geo: bounding box
• Geo: polygon, geometries, 4D, radius, relations

Grouping
• Joins (via map-side “linked” keys)
• Group by (count)
• Group by (analytics)
• Group by (hierarchical analytics → group_level)
• Group by range/facet

Results
• 200 results max
• Filter on fields at query time
• limit
• skip
• bookmark
• stale=ok

Sorting
• Sort by key(s)
• Adhoc sort on multiple keys
• Sort by distance