Yandex Rich Content API. Developer's guide

Yandex Rich Content API. Developer's guide. Version 0.2
Document build date: 11.06.2015.
This volume is a part of Yandex technical documentation.
Yandex helpdesk site: http://help.yandex.ru
© 2008—2015 Yandex LLC. All rights reserved.
Copyright Disclaimer
Yandex (and its applicable licensor) has exclusive rights to all results of intellectual activity, and the equivalent means of individualization, used for the development, support,
and usage of the Yandex Rich Content API service. These may include, but are not limited to, computer programs (software), databases, images, texts, other works and inventions, utility
models, trademarks, service marks, and commercial denominations. Copyright is protected under Part 4 of the Russian Civil Code and international laws.
You may use the Yandex Rich Content API or its components only within the permissions granted by the Terms of Use of Yandex Rich Content API or by an appropriate Agreement.
Any infringement of the exclusive rights of the copyright owner is punishable under Russian civil, administrative, or criminal law.
Contact information
Yandex LLC
http://www.yandex.com
Phone: +7 495 739 7000
Email: [email protected]
Headquarters: 16 L'va Tolstogo St., Moscow, Russia 119021
Contents
Overview of the API
Overview of the API
Request format
To get data about a web page, send an HTTP GET request of the following type:
GET http://rca.yandex.com/?key=yourKEY&url=targetURL[&callback=yourFUNCTION][&full=1]
Input data:
Parameter   Value

Mandatory
key         Unique API key. To get your free API key, fill out a simple form.
url         The URL that data is being requested for. The URL can be passed in any format
            (as a shortened link, as a URL with parameters, or others). The service
            automatically expands and canonicalizes it if necessary.

Optional
callback    The name of your callback function.
img         The mode for detecting images on the page:
            • best (default): detects one or several of the main images on the page
              (no more than four images).
            • no: images are not detected on the page.
content     The mode for detecting the page's text description:
            • short (default): detects a short page summary (snippet).
            • full: detects the full page text.
            • no: page content is not detected.
full        Add full=1 if you want to get the full text of the page with links to images.
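The request described above can be assembled with standard URL encoding. The following is a minimal sketch in Python, not an official client: the helper name build_rca_url and its keyword arguments are assumptions made for illustration, while the endpoint and parameter names come from the request format and the table above.

```python
from urllib.parse import urlencode

# Endpoint taken from the request format above.
RCA_ENDPOINT = "http://rca.yandex.com/"


def build_rca_url(key, url, callback=None, img=None, content=None, full=False):
    """Build the GET URL for a page lookup.

    build_rca_url is a hypothetical helper name; only the query
    parameter names (key, url, callback, img, content, full) come
    from the guide.
    """
    # Mandatory parameters first, in the order shown in the request format.
    params = {"key": key, "url": url}
    # Optional parameters are added only when the caller sets them.
    if callback is not None:
        params["callback"] = callback
    if img is not None:
        params["img"] = img
    if content is not None:
        params["content"] = content
    if full:
        params["full"] = "1"
    # urlencode percent-encodes the target URL so it survives as one value.
    return RCA_ENDPOINT + "?" + urlencode(params)


request_url = build_rca_url("yourKEY", "http://vimeo.com/58443905", content="short")
print(request_url)
```

Percent-encoding the target URL keeps any query string it carries from being confused with the parameters of the API request itself.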
Response format
The service responds in JSON format. For example:
{
"url": "http://blogs.wsj.com/tech-europe/2012/10/22/yandex-throws-open-global-search/",
"finalurl": "http://blogs.wsj.com/tech-europe/2012/10/22/yandex-throws-open-global-search/",
"confidence": {
"img": "high",
"content": "high"
},
"title": "Yandex Throws Open Global Search",
"img": ["http://s.wsj.net/public/resources/images/OBVA973_volz10_A_20121022081329.jpg"],
"mime": "text/html",
"content": "Yandex, Russia’s leading search engine, has thrown open its internal global index to international start-ups, CEO Arkady Volozh said.\nYandex, often described as Russia’s Google, built the index for Russian users searching in Russia but looking for non-Russian content, Mr. Volozh said at the recent F.ounders event in Dublin. It was a huge effort for a very small section of the firm’s market.\n“We decided to open it up to see what people can do with it,” Mr. Volozh said."
}
Sample response with information about a page containing images and video:
{
"video": [
{
"duration":2179,
"url":"http://vimeo.com/moogaloop.swf?clip_id=58443905"
}
],
"url":"http://vimeo.com/58443905",
"title":"TES 2013 - Visions of Search",
"finalurl":"http://vimeo.com/58443905",
"img": [
"http://b.vimeocdn.com/ts/405/627/405627351_1280.jpg"
]
}
Note:
If a callback function is specified, the service responds in JSONP format.
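Because the body is either plain JSON or JSON wrapped in a callback call, a client that reads responses outside a browser may need to handle both. The sketch below is one possible approach in Python; the function name parse_rca_body and the regular expression are assumptions, and the guide does not specify the exact JSONP wrapper beyond the callback name.

```python
import json
import re

# Matches "callbackName( ... )" with an optional trailing semicolon.
_JSONP_RE = re.compile(r"^\s*([A-Za-z_$][\w$.]*)\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_rca_body(body):
    """Return (callback_name_or_None, decoded_data) for a JSON or JSONP body.

    parse_rca_body is a hypothetical helper name.
    """
    match = _JSONP_RE.match(body)
    if match:
        # JSONP: decode only the text between the outer parentheses.
        return match.group(1), json.loads(match.group(2))
    # Plain JSON: no callback name to report.
    return None, json.loads(body)


name, data = parse_rca_body('yourFUNCTION({"url":"http://vimeo.com/58443905"})')
print(name, data["url"])
```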
Field       Value

Mandatory
url         Page address, extracted from the request.
finalurl    Page address in canonical format.

Optional
title       Page title.
content     Text summary (snippet) or full page text (if the full parameter is specified).
img         List of links to the main images displayed on the page, or links to all the
            images (if the full parameter is specified). The list of images is returned
            as an array of URLs.
video       Paired list (url, duration) for the main video clips on the page.
mime        The page's MIME type.
confidence  The degree of confidence in the quality of the selected page summary (content)
            and images (img). The following values are used: high, medium, low.
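Since only url and finalurl are mandatory, client code should not assume that the other fields are present. The following Python sketch shows one way to read a response with that in mind; the RichContent class and the rich_content_from_json function are hypothetical names, and the sample string is the video response shown earlier.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RichContent:
    """Hypothetical container mirroring the response fields listed above."""
    url: str
    finalurl: str
    title: Optional[str] = None
    content: Optional[str] = None
    img: List[str] = field(default_factory=list)
    video: List[Dict] = field(default_factory=list)
    mime: Optional[str] = None
    confidence: Dict[str, str] = field(default_factory=dict)


def rich_content_from_json(text):
    """Decode a successful JSON response into a RichContent instance."""
    data = json.loads(text)
    return RichContent(
        url=data["url"],            # mandatory
        finalurl=data["finalurl"],  # mandatory
        title=data.get("title"),
        content=data.get("content"),
        img=data.get("img", []),
        video=data.get("video", []),
        mime=data.get("mime"),
        confidence=data.get("confidence", {}),
    )


sample = (
    '{"video":[{"duration":2179,'
    '"url":"http://vimeo.com/moogaloop.swf?clip_id=58443905"}],'
    '"url":"http://vimeo.com/58443905",'
    '"title":"TES 2013 - Visions of Search",'
    '"finalurl":"http://vimeo.com/58443905",'
    '"img":["http://b.vimeocdn.com/ts/405/627/405627351_1280.jpg"]}'
)
page = rich_content_from_json(sample)
print(page.title, page.video[0]["duration"], page.mime)
```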
Since data is constantly being updated for pages that are stored in the Yandex content system, we do
not recommend client-side caching of the service's responses.
Error codes
JSON/JSONP error code: 400 Bad request
Reason: The mandatory url parameter is missing, the URL format is invalid, or the callback parameter is invalid.
HTTP response body:
{"error_type":"external","url":"requested url","error_message":"'url' parameter is missing"}
or
{"error_type":"internal","url":"requested url","error_message":"Invalid 'callback' parameter"}

JSON/JSONP error code: 401 Unauthorized
Reason: Invalid key.

JSON/JSONP error code: 403 Forbidden
Reason: The mandatory key parameter is missing.

JSON/JSONP error code: 500 Internal server error / 200 OK
Reason: Internal error.
HTTP response body:
{"error_type":"internal","url":"requested url","error_message":"Couldn't build preview"}
or
jsonpCallback({"error_type":"internal","url":"requested url","error_message":"Couldn't build preview"})

JSON/JSONP error code: 502 Bad gateway / 200 OK
Reason: External server responds with an error or is not reachable.
HTTP response body:
{"error_type":"external","error_code":"error code","url":"url with error","error_message":"error description"}
or
jsonpCallback({"error_type":"external","error_code":"error code","url":"url with error","error_message":"error description"})
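The error bodies above share the error_type and error_message fields, and some rows list 200 OK next to the error status, so a client may prefer to inspect the decoded body in addition to the HTTP status. This is an interpretation of the table, not a documented rule. A minimal Python sketch follows; RcaError and raise_for_rca_error are hypothetical names.

```python
import json


class RcaError(Exception):
    """Hypothetical exception type carrying the fields of an error body."""

    def __init__(self, error_type, message, url=None, error_code=None):
        super().__init__(message)
        self.error_type = error_type  # "internal" or "external" in the samples
        self.url = url
        self.error_code = error_code  # present only in some error bodies


def raise_for_rca_error(data):
    """Raise RcaError if a decoded body carries error fields; else return it."""
    if isinstance(data, dict) and "error_type" in data:
        raise RcaError(
            data["error_type"],
            data.get("error_message", ""),
            url=data.get("url"),
            error_code=data.get("error_code"),
        )
    return data


body = '{"error_type":"internal","url":"requested url","error_message":"Couldn\'t build preview"}'
try:
    raise_for_rca_error(json.loads(body))
except RcaError as err:
    print(err.error_type, err)
```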
Example of an error when the requested URL was not found:
# URL not found HTTP Response
HTTP/1.1 502 Bad gateway
Server: nginx/1.2.1
Date: Thu, 05 Sep 2013 12:40:57 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive

{"error_type":"external","error_code":"404","url":"http://yandex.com/rca","error_message":"Not found"}