new development version
- Search() and Search_uri() gain a new parameter ignore_unavailable to determine what happens if an index name does not exist (#273)
- connect() gains a new parameter ignore_version. Internally, elastic sometimes checks the version of Elasticsearch that the user is connected to in order to determine what to do. This parameter may be useful when it's not possible to check the Elasticsearch version, e.g., when it's not possible to ping the root route of the API (#275)
- digits: passed down to the jsonlite::toJSON() call used internally. Thus, digits controls the number of decimal digits used in the JSON the package creates to be bulk loaded into Elasticsearch (#279)
- index_shrink() for index shrinking (#192)
- docs_bulk(): to allow pipeline attachments to work, all docs_bulk methods that do HTTP requests (i.e., not prep functions) gain the parameter query to pass through query parameters to the HTTP request, including for example pipeline, _source, etc. (#253)
- Search() and Search_uri() gain the parameter track_total_hits (default: TRUE) (#262) thanks @orenov
- The warn parameter in connect() was not being used across the entire package; now all methods should capture any warnings returned in the Elasticsearch HTTP API headers (#261)
- connect() does not create a DBI-like connection object (#265)
- index_analyze(): the as-is method I() should only be applied if the input parameter is not NULL, to avoid a warning (#269)
- docs_bulk_update(): subsetting data.frames was not working correctly when data.frames had only 1 column; fixed (#260)
- es_ver() in the Elasticsearch class: more flexible in capturing the Elasticsearch version (#268)
- crul version: helps fix a problem with passing along authentication details (#267)

(#87) The connect() function is essentially the same, with some changes, but now you pass the connection object to each function call. This will indeed break code. That's why this is a major version bump.
There is one very big downside to this: it breaks existing code. I do apologize for this, but I believe it is outweighed by the upsides: passing the connection object matches behavior in similar R packages (e.g., all the SQL database clients); you can now manage as many different connection objects as you like in the same R session; and having the connection object as an R6 class allows us to have some simple methods on that object to ping the server, etc. In addition, all functions will error with an informative message if you don't pass the connection object as the first argument.
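
A minimal sketch of the connection-object pattern, assuming the elastic package (version 1.0 or later) is installed. The hosts, ports, the helper first_hit(), and the index name "shakespeare" are placeholders for illustration. Creating the connection objects should not contact the server, so the Search() call is wrapped in a function that you would only run against a live cluster.

```r
library(elastic)

# Two independent connection objects in the same R session.
# Hosts and ports are placeholders, not real servers.
local_es   <- connect(host = "127.0.0.1", port = 9200)
staging_es <- connect(host = "127.0.0.1", port = 9201)

# Hypothetical helper: the connection object is always the first argument
# passed to package functions such as Search().
first_hit <- function(conn, index) {
  res <- Search(conn, index = index, size = 1)
  res$hits$hits
}

# With a server running, methods on the connection object and the helper
# could be used like this:
# local_es$ping()
# local_es$info()
# first_hit(local_es, "shakespeare")
```

Keeping each connection in its own object is what makes it possible to query two clusters side by side without resetting any global state.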
- pipeline_create, pipeline_delete, pipeline_get, pipeline_simulate, and pipeline_attachment() (#191) (#226)
- docs_delete_by_query() and docs_update_by_query() to delete or update multiple documents at once, respectively; and new function reindex() to reindex all documents from one index to another (#237) (#195)
- crul for HTTP requests. This should only matter with respect to passing in curl options (#168)
- connect() (#241)
- docs_bulk_create(), docs_bulk_delete(), docs_bulk_index(), each of which is tailored to the operation in its name: creating docs, deleting docs, or indexing docs (#183)
- type_remover() as a utility function to help users remove types from their files to use for bulk loading; could be used on example files in this package or user-supplied files (#180)
- alias_rename() to rename aliases
- scroll() example that wasn't working (#228)
- alias_create() (#230)
- docs_get gains new parameters source_includes and source_excludes to include or exclude certain fields in the returned document (#246) thanks @Jensxy
- index_create() (#211)
- Search() and Search_uri() docs of how to use profiles (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-profile.html) (#194)
- docs_bulk_prep() for doing a mix of actions (i.e., delete, create, etc.)
- include_type_name parameter in mappings functions (#250)
- docs_bulk_update() was not handling boolean values correctly; now fixed (#239) (#240) thanks to @dpmccabe
- info() method has been moved inside of the connection object. After calling x = connect() you can call x$info()
- ping() method has been marked as deprecated; instead, call ping() on the connection object created by a call to connect()
- docs_bulk_update() to do bulk updates to documents (#169)
- id is now optional in docs_create(): if you don't pass a document identifier, Elasticsearch generates one for you (#216) thanks @jbrant
- docs_bulk() gains new parameter quiet to optionally turn off the progress bar (#202)
- docs_bulk() for encoding in different locales (#223) (#224) thanks @Lchiffon
- index_get(): you can now only pass in one value to the features parameter (one of settings, mappings, or aliases) (#218) thanks @happyshows
- index_create() to handle a list body, in addition to a JSON body (#214) thanks @emillykkejensen
- docs_bulk() for document IDs as factors (#212) thanks @AMR-KELEG
- docs_bulk() (and taking up disk space) are cleaned up now (deleted), though if you pass in your own file paths you have to clean them up (#208) thanks @emillykkejensen
- character and
  list.
- scroll() and scroll_clear(): the first parameter is now x; this should only matter if you specified the parameter name for the first parameter
- scroll parameter in scroll() function is now time_scroll
- asdf (for "as data.frame") to scroll() to give back a data.frame (#163)
- scroll(), see parameter stream_opts in the docs and examples (#160)
- tasks and tasks_cancel for the tasks API (#145)
- Search(), see parameter stream_opts in the docs and examples. scroll parameter in Search() is now time_scroll (#160)
- field_caps (for field capabilities), in ES v5.4 and greater
- reindex for the reindex ES API (#134)
- index_template_get, index_template_put, index_template_exists, and index_template_delete for the indices templates ES API (#133)
- index_forcemerge for the ES index _forcemerge route (#176)
- Search and Search_uri for how to show progress bar (#162)
- docs_bulk to clarify what's allowed as first parameter input (#173)
- docs_bulk change to internal JSON preparation to use na = "null" and auto_unbox = TRUE in the jsonlite::toJSON call. This means that NAs in R become null in the JSON and atomic vectors are unboxed (#174) thanks @pieterprovoost
- mapping_create gains update_all_types parameter; and new man file to explain how to enable fielddata if sorting needed (#164)
- suggest is used through query DSL instead of a route, added example to Search (#102)
- ping() calls, so that after the first one we use the cached version if called again within the same R session. Should help speed up some code with respect to HTTP calls (#184)
  thanks @henfiber
- content-type headers, for the most part application/json (#197), though functions that work with the bulk API use application/x-ndjson (#186)
- mapping_create examples (#199)
- type_exists to work on ES versions less than and greater than v5 (#189)
- field_stats to indicate that it's no longer available in ES v5.4 and above, and that the fields parameter in ES >= v5 is gone (#190)
- docs_update() to do partial document updates (#152)
- docs_bulk_prep() to prepare bulk format files that you can use to load into Elasticsearch with this package, on the command line, or in any other context (Python, Ruby, etc.) (#154)
- elastic works with Elasticsearch v5. Note that not all v5 features are included here yet. (#153)
- docs_bulk() was not working on single column data.frames; now it is working. (#151) thanks @gustavobio
- docs_* functions now support ids with whitespace in them. (#155)
- docs_mget() to fix requesting certain fields back.
- es_base parameter in connect(): now, instead of stop() on es_base usage, we use its value for es_host. Only pass in one or the other of es_base and es_host, not both.
  (#146) thanks @MarcinKosinski
- Search_template(), Search_template_register(), Search_template_get(), Search_template_delete(), and Search_template_render() (#101)
- docs_delete, docs_get and docs_create to list correctly that numeric and character values are accepted for the id parameter; before, stated that only numeric values were allowed (#144) thanks @dominoFire
- Search and related functions where wildcards in indices didn't work. Turned out we URL escaped twice unintentionally. Fixed now, and more tests added for wildcards. (#143) thanks @martijnvanbeers
- docs_bulk() to always return a list, whether it's given a file, data.frame, or list. For a file, a named list is returned, while for a data.frame or list an unnamed list is returned, as many chunks can be processed and we don't attempt to wrangle the list output. Inputs of data.frame and list used to return NULL as we didn't return anything from the internal for loop. You can wrap docs_bulk in invisible() if you don't want the list printed
  (#142)
- docs_bulk() and msearch() in which base URL construction was not done correctly (#141) thanks @steeled !
- scroll_clear() to clear search contexts created when using scroll() (#140)
- ping() to ping an Elasticsearch server to see if it is up (#138)
- connect() gains new parameter es_path to specify a context path, e.g., the bar in http://foo.com/bar (#137)
- httr::content() calls to parse to plain text and UTF-8 encoding (#118)
- scroll() all scores are zero because scores are not calculated/tracked (#127)
- connect() no longer pings the ES server when run; pinging can now be done separately with ping() (#139)
- connect() (#129)
- transport_schema parameter to connect() to specify http or https (#130)
- docs_bulk() (#125)
- docs_bulk() function so that user-supplied doc_ids are not changed at all now (#123)

Compatibility for many Elasticsearch versions has improved. We've tested on ES versions from the current (v2.1.1) back to v1.0.0, and elastic works with all versions. Some functions stop with a message on some ES versions, simply because older versions may not have had particular ES features. Please do let us know if you have problems with older versions of ES, so we can improve compatibility.
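
To illustrate what "stop with a message" on an older ES version can look like, here is a minimal base R sketch. The helper require_es_version() and the version numbers passed to it are hypothetical and are not taken from the package's internals; the sketch only shows the general idea of comparing the connected version against a minimum.

```r
# Hypothetical helper, not part of elastic: stop with an informative message
# when the connected Elasticsearch version is older than a feature requires.
require_es_version <- function(current, minimum, feature) {
  if (numeric_version(current) < numeric_version(minimum)) {
    stop(
      sprintf("%s requires Elasticsearch >= %s (connected to %s)",
              feature, minimum, current),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Passes silently: the connected version is new enough
ok <- require_es_version("2.1.1", "1.0.0", "some feature")

# Stops with a message: capture it instead of aborting the script
msg <- tryCatch(
  require_es_version("1.0.0", "1.3.0", "some feature"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Using numeric_version() rather than string comparison keeps multi-digit components such as 1.10.0 ordered correctly.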
- index_settings_update() function to allow updating index settings (#66)
- JSON. Error parsing has thus changed in elastic. We now have two levels of error behavior: 'simple' and 'complete'. These can be set in connect() with the errors parameter. Simple errors often give back just that there was an error; sometimes a message with an explanation is supplied. Complete errors give more explanation and even the ES stack trace if supplied in the ES error response (#92) (#93)
- msearch() to do multi-searches. This works by defining queries in a file, much like is done for a file to be used in bulk loading. (#103)
- validate() to validate a search. (#105)
- percolate_count(), percolate_delete(), percolate_list(), percolate_match(), percolate_register(). The percolator works by first storing queries into an index and then you define documents in order to retrieve these queries. (#106)
- field_stats() to find statistical properties of a field without executing a search (#107)
- cat_nodeattrs()
- index_recreate() as a convenience function that detects if an index exists, and if so, deletes it first, then creates it again.
- docs_bulk() now supports passing in document ids (to the _id field) via the parameter doc_ids for each input data.frame or list, and supports using ids already in data.frames or lists (#83)
- cat_*() functions cleaned up. Previously, some functions had parameters that were essentially silently ignored. Those parameters are now dropped from the functions. (#96)
- /_search/exists), but have removed that in favor of using regular _search with size=0 and terminate_after=1 instead. (#104)
- lenient in Search() and Search_uri to allow format based
  failures to be ignored, or not ignored.
- docs_get() when the document isn't found
- docs_bulk() in the use case where users use the function in a for loop, for example, and indexing started over, replacing documents with the same id (#83)
- cat_() functions in which they sometimes failed when parse=TRUE (#88)
- docs_bulk() in which user-supplied document IDs weren't being passed correctly internally (#90)
- Search() and Search_uri() where multiple indices weren't supported, whereas they should have been; supported now (#115)
- mlt(), nodes_shutdown(), index_status(), and mapping_delete() (#94) (#98) (#99) (#110)
- index_settings_update() function to allow updating index settings (#66)
- RCurl::curlEscape() with curl::curl_escape() (#81)
- v1 of httr
- Search_uri() where the search is defined entirely in the URL itself. Especially useful for cases in which POST requests are forbidden, e.g., on a server that prevents POST requests (which the function Search() uses). (#58)
- nodes_shutdown() (#23)
- docs_bulk() gains ability to push data into Elasticsearch via the bulk HTTP API from data.frame or list objects. Previously, this function would only accept a correctly formatted file. In addition, it gains new parameters: index (the index name to use), type (the type name to use), and chunk_size (size of each chunk). (#60) (#67) (#68)
- cat_*() functions gain new parameters: h to specify what fields to return; help to output available columns, and their meanings; bytes to give numbers back machine friendly; parse to parse to a data.frame or not
- cat_*() functions can now optionally capture data returned into a data.frame (#64)
- Search() gains new parameter search_path to set the path that is used for searching. The default is _search, but sometimes your configuration is set up so that you don't need that path, or it's a different path. (023d28762e7e1028fcb0ad17867f08b5e2c92f93)
- docs_mget() added internal checker to make sure user passes in the right combination of index, type, and id parameters, or index and type_id, or just index_type_id (#42)
- index, type, and id parameters required in the function docs_get() (#43)
- scroll() to allow long scroll_ids by passing scroll ids in the body instead of as a query parameter (#44)
- Search() function, in the error_parser() error parser function, check to see if an error element is returned in the response body from Elasticsearch, and if so, parse the error; if not, pass on the body (likely empty) (#45)
- Search() function, added helper function to check size and from parameter values passed in to make sure they are numbers. (#46)
- index and type parameters used, now using RCurl::curlEscape() to URL escape. Other parameters passed in go through httr CRUD methods, which do URL escaping for us.
for us. (#49)First version to go to CRAN.
scroll() and a scroll parameter to the Search() function (#36)explain() to easily get at explanation of search results.?units-time and
?units=distance?searchapistokenizer_set() to set tokenizersconnect() run on package load to set default base url of localhost and port of 9200 -
you can override this by running that fxn yourself, or storing es_base, es_port, etc.
in your .Rprofile file.es_search() changed to Search().\dontrun instead of \donttest so they don't fail on CRAN checks.es_search_body() removed - body based queries using the query DSL moved to the Search()
function, passed into the body parameter.elastic more in line with the official Elasticsearch
Python client (http://elasticsearch-py.readthedocs.org/en/master/).index manual
page, and all functions prefixed with index_(). Thematic manual files are: index, cat,
cluster, alias, cdbriver, connect, documents, mapping, nodes, and search.es_cat() was changed to cat_() - we avoided cat() because as
you know there is already a widely used function in base R, see base::cat().cat functions to separate functions for each command, instead of passing
the command in as an argument. For example, cat('aliases') becomes cat_aliases().es_ prefix remains only for es_search(), as we have to avoid conflict with
base::search().assertthat package import, using stopifnot() instead (#14)
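
As an illustration of the stopifnot() style of input checking mentioned in (#14), and of the size and from checks mentioned in (#46), here is a minimal base R sketch. The helper check_paging() is hypothetical and is not the package's internal code.

```r
# Hypothetical helper, not the package's internal code: make sure the
# size and from values are single numbers when they are supplied.
check_paging <- function(size = NULL, from = NULL) {
  if (!is.null(size)) stopifnot(is.numeric(size), length(size) == 1)
  if (!is.null(from)) stopifnot(is.numeric(from), length(from) == 1)
  invisible(TRUE)
}

# Valid input passes silently
check_paging(size = 10, from = 0)

# Invalid input raises an error; try() captures it so the script continues
failed <- inherits(try(check_paging(size = "ten"), silent = TRUE), "try-error")
```

stopifnot() is part of base R, so checks written this way need no extra package import.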