QueryStringSearch

class eodag.plugins.search.qssearch.QueryStringSearch(provider, config)

A plugin that helps implement any kind of search protocol that relies on query strings (e.g. OpenSearch). Most of the other search plugins inherit from this plugin.

Parameters:
  • provider (str) – provider name

  • config (PluginConfig) –

    Search plugin configuration:

    • result_type (str): One of json or xml, depending on the representation of the provider’s search results. The default is json.

    • results_entry (str) (mandatory): The name of the key in the provider search result that gives access to the result entries

    • api_endpoint (str) (mandatory): The endpoint of the provider’s search interface

    • need_auth (bool): if authentication is needed for the search request; default: False

    • auth_error_code (int): which error code is returned in case of an authentication error; only used if need_auth=true

    • ssl_verify (bool): if the ssl certificates should be verified in requests; default: True

    • asset_key_from_href (bool): guess assets keys using their href. Use their original key if False; default: True

    • dont_quote (list[str]): characters that should not be quoted in the url params

    • timeout (int): time to wait until request timeout in seconds; default: 5

    • retry_total (int): urllib3.util.Retry total parameter, total number of retries to allow; default: 3

    • retry_backoff_factor (int): urllib3.util.Retry backoff_factor parameter, backoff factor to apply between attempts after the second try; default: 2

    • retry_status_forcelist (list[int]): urllib3.util.Retry status_forcelist parameter, list of integer HTTP status codes that we should force a retry on; default: [401, 429, 500, 502, 503, 504]

    • literal_search_params (dict[str, str]): A mapping of (search_param => search_value) pairs giving search parameters to be passed as is in the search url query string. This is useful for example in situations where the user wants to add a fixed search query parameter exactly as it is done on the provider interface.

    • pagination (Pagination) (mandatory): The configuration of how the pagination is done on the provider. It is a tree with the following nodes:

      • next_page_url_tpl (str) (mandatory): The template for pagination requests. This is a simple Python format string which will be resolved using the following keywords: url (the base url of the search endpoint), search (the query string corresponding to the search request), items_per_page (the number of items to return per page), skip (the number of items to skip) or skip_base_1 (the number of items to skip, starting from 1) and page (which page to return).

      • total_items_nb_key_path (str): An XPath or JsonPath leading to the total number of results satisfying a request. This is used for providers that return the total results metadata along with the result of the query and don’t have an endpoint for querying the number of items satisfying a request, or for providers whose count endpoint returns a json or xml document

      • count_endpoint (str): The endpoint for counting the number of items satisfying a request

      • count_tpl (str): template for the count parameter that should be added to the search request

      • next_page_url_key_path (str): A JsonPath expression used to retrieve the URL of the next page in the response of the current page.

      • max_items_per_page (int): The maximum number of items per page that the provider can handle; default: 50

      • start_page (int): number of the first page; default: 1

    • discover_product_types (DiscoverProductTypes): configuration for product type discovery based on information from the provider; It contains the keys:

    • sort (Sort): configuration for sorting the results. It contains the keys:

      • sort_by_default (list[Tuple(str, Literal["ASC", "DESC"])]): parameter and sort order by which the result will be sorted by default (if the user does not enter a sort_by parameter); if not given the result will use the default sorting of the provider; Attention: for some providers sorting might cause a timeout if no filters are used. In that case no default sort parameters should be given. The format is:

        sort_by_default:
            - !!python/tuple [<param>, <sort order> (ASC or DESC)]
        
      • sort_by_tpl (str): template for the sort parameter that is added to the request; It contains the parameters sort_param and sort_order which will be replaced by user input or default value. If the parameters are added as query params to a GET request, the string should start with &, otherwise it should be a valid json string surrounded by {{ }}.

      • sort_param_mapping (dict[str, str]): mapping for the parameters available for sorting

      • sort_order_mapping (dict[Literal["ascending", "descending"], str]): mapping for the sort order

      • max_sort_params (int): maximum number of sort parameters supported by the provider; used to validate the user input to avoid failed requests or unexpected behaviour (not all parameters are used in the request)

    • metadata_mapping (dict[str, Any]): Search plugins of this kind can detect when a metadata mapping is “query-able” and derive from it how to format the query string parameter used to query the corresponding metadata. To make a metadata query-able, configure it in the metadata mapping as a list of 2 items, the first one being the specification of the query string search formatting. This specification is a string following the Python string formatting specification, with a special behaviour added to it. For example, an entry in the metadata mapping of this kind:

      completionTimeFromAscendingNode:
          - 'f=acquisition.endViewingDate:lte:{completionTimeFromAscendingNode#timestamp}'
          - '$.properties.acquisition.endViewingDate'
      

      means that the search url will have a query string parameter named f with a value of acquisition.endViewingDate:lte:1543922280.0 if the search was done with the value of completionTimeFromAscendingNode being 2018-12-04T12:18:00. Here, {completionTimeFromAscendingNode#timestamp} was replaced with the timestamp of the value of completionTimeFromAscendingNode. This example shows all there is to know about the semantics of the query string formatting introduced by this plugin: any eodag search parameter can be referenced in the query string, optionally followed by a conversion function separated from it by a # (see format_metadata() for further details on the available converters). Note that the values in the free_text_search_operations configuration parameter follow the same rule. If the metadata_mapping entry is not a list but only a string, the parameter is not queryable but is included in the result obtained from the provider. The string indicates how the provider result should be mapped to the eodag parameter.

    • discover_metadata (DiscoverMetadata): configuration for the auto-discovery of queryable parameters as well as parameters returned by the provider which are not in the metadata mapping. It has the attributes:

      • auto_discovery (bool): if the automatic discovery of metadata is activated; default: False; if false, the other parameters are not used;

      • metadata_pattern (str): regex string that a parameter in the result must match in order to be used

      • search_param (Union[str, dict[str, Any]]): format to add a query param given by the user and not in the metadata mapping to the requests, ‘metadata’ will be replaced by the search param; can be a string or a dict containing free_text_search_operations (see ODataV4Search)

      • metadata_path (str): path where the queryable properties can be found in the provider result

    • discover_queryables (DiscoverQueryables): configuration to fetch the queryables from a provider queryables endpoint; It has the following keys:

      • fetch_url (str): url to fetch the queryables valid for all product types

      • product_type_fetch_url (str): url to fetch the queryables for a specific product type

      • result_type (str): type of the result (currently only json is used)

      • results_entry (str): json path to retrieve the queryables from the provider result

    • constraints_file_url (str): url to fetch the constraints for a specific product type, can be an http url or a path to a file; the constraints are used to build queryables

    • constraints_entry (str): key in the json result where the constraints can be found; if not given, it is assumed that the constraints are on top level of the result, i.e. the result is an array of constraints
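
The pagination keywords listed under next_page_url_tpl can be illustrated with plain Python string formatting. The following is a minimal sketch, not eodag's implementation: the template strings, the example URL, the helper name build_page_url and the way skip is derived from page and start_page are all assumptions made for illustration.

```python
# Hypothetical templates; real providers define their own query parameter names.
PAGE_TPL = "{url}?{search}&maxRecords={items_per_page}&page={page}"
SKIP_TPL = "{url}?{search}&top={items_per_page}&skip={skip}"


def build_page_url(tpl, url, search, items_per_page, page, start_page=1):
    """Resolve a pagination template for the requested page (illustrative only)."""
    # Assumption: skip counts the items located on the pages before `page`.
    skip = (page - start_page) * items_per_page
    # str.format ignores keyword arguments that the template does not use,
    # so every documented keyword can be passed unconditionally.
    return tpl.format(
        url=url,
        search=search,
        items_per_page=items_per_page,
        page=page,
        skip=skip,
        skip_base_1=skip + 1,
    )


page_url = build_page_url(PAGE_TPL, "https://example.com/search", "productType=S2", 20, 3)
skip_url = build_page_url(SKIP_TPL, "https://example.com/search", "productType=S2", 20, 3)
print(page_url)
print(skip_url)
```

Passing every keyword regardless of the template keeps the helper independent of whether a provider paginates by page number or by offset.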

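The `{parameter#converter}` syntax described for metadata_mapping can be approximated in a few lines. This is a hedged sketch of the idea only: the CONVERTERS table and the render_query_param helper are hypothetical, and eodag's actual converters are the ones provided by format_metadata(). The input below carries an explicit +01:00 offset, which is an assumption of this sketch; with it, the result equals the value quoted in the metadata_mapping description.

```python
import re
from datetime import datetime

# Hypothetical converter table standing in for the converters of format_metadata().
CONVERTERS = {
    "timestamp": lambda value: datetime.fromisoformat(value).timestamp(),
}

# Matches "{name}" or "{name#converter}".
TOKEN = re.compile(r"\{(?P<name>\w+)(?:#(?P<conv>\w+))?\}")


def render_query_param(template, params):
    """Replace each token with the (optionally converted) search parameter value."""

    def substitute(match):
        value = params[match.group("name")]
        conv = match.group("conv")
        if conv is not None:
            value = CONVERTERS[conv](value)
        return str(value)

    return TOKEN.sub(substitute, template)


rendered = render_query_param(
    "f=acquisition.endViewingDate:lte:{completionTimeFromAscendingNode#timestamp}",
    {"completionTimeFromAscendingNode": "2018-12-04T12:18:00+01:00"},
)
print(rendered)
```
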
__init__(provider, config)

Methods

__init__(provider, config)

build_query_string(product_type, query_dict)

Build the query string using the search parameters

build_sort_by(sort_by_arg)

Build the sorting part of the query string or body by transforming the sort_by argument into a provider-specific string or dictionary

clear()

Clear search context

collect_search_urls([prep])

Build paginated urls

count_hits(count_url[, result_type])

Count the number of results satisfying some criteria

discover_product_types(**kwargs)

Fetch product types list from provider using discover_product_types conf

discover_product_types_per_page(**kwargs)

Fetch product types list from provider using discover_product_types conf using paginated kwargs["fetch_url"]

discover_queryables(**kwargs)

Fetch queryables list from provider using discover_queryables conf

do_search([prep])

Perform the actual search request.

get_assets_from_mapping(provider_item)

Create assets based on the assets_mapping in the provider's config and an item returned by the provider

get_collections(prep, **kwargs)

Get the collection to which the product belongs

get_metadata_mapping([product_type])

Get the plugin metadata mapping configuration (product type specific if exists)

get_product_type_cfg_value(key[, default])

Get the value of a configuration option specific to the current product type.

get_product_type_def_params(product_type[, ...])

Get the provider product type definition parameters and specific settings

get_sort_by_arg(kwargs)

Extract the sort_by argument from the kwargs or the provider default sort configuration

list_queryables(filters, ...[, ...])

Get queryables

map_product_type(product_type, **kwargs)

Get the provider product type from eodag product type

normalize_results(results, **kwargs)

Build EOProducts from provider results

query([prep])

Perform a search on an OpenSearch-like interface

queryables_from_metadata_mapping([...])

Extract queryable parameters from product type metadata mapping.

validate(search_params, auth)

Validate a search request.
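
To make the role of the sort configuration keys (sort_by_tpl, sort_param_mapping, sort_order_mapping, max_sort_params) more concrete, here is a minimal sketch of how a sort_by argument could be turned into a query string fragment for a GET request. It is not the code of build_sort_by(): the configuration values, the helper name build_sort_query and the error handling are assumptions made for illustration.

```python
# Hypothetical provider configuration values (not taken from any real provider).
sort_by_tpl = "&sortParam={sort_param}&sortOrder={sort_order}"
sort_param_mapping = {"startTimeFromAscendingNode": "startDate"}
sort_order_mapping = {"ascending": "asc", "descending": "desc"}
max_sort_params = 1


def build_sort_query(sort_by):
    """Turn a list of (eodag_param, "ASC" | "DESC") tuples into a query string fragment."""
    if len(sort_by) > max_sort_params:
        raise ValueError(f"at most {max_sort_params} sort parameter(s) supported")
    parts = []
    for param, order in sort_by:
        if param not in sort_param_mapping:
            raise ValueError(f"{param!r} is not sortable on this provider")
        # Translate the eodag order keyword into the provider's own vocabulary.
        order_key = "ascending" if order == "ASC" else "descending"
        parts.append(
            sort_by_tpl.format(
                sort_param=sort_param_mapping[param],
                sort_order=sort_order_mapping[order_key],
            )
        )
    return "".join(parts)


sort_query = build_sort_query([("startTimeFromAscendingNode", "DESC")])
print(sort_query)
```

Validating against max_sort_params before building the string mirrors the stated purpose of that key, which is to reject user input that would lead to failed requests.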

Attributes

extract_properties

plugins

auth

next_page_url

next_page_query_obj

total_items_nb

need_count