LogsProtoIndexingCrawlerIdCrawlerIdProto

GoogleApi.ContentWarehouse.V1.Model.LogsProtoIndexingCrawlerIdCrawlerIdProto


Description

Proto-representation of the crawler-ID in Web-Search (Alexandria-scope). The string-representation (covered in //indexing/crawler_id/scope/alexandria/crawler_id.h) and the proto-representation are identical in meaning. For more information about the crawler_id, see //depot/google3/indexing/crawler_id.

Used within the following components:

- WebMirror: to understand the parsed crawler-ID and apply attributes within their own tables.
- Serving: to identify the crawler-ID within the GenericSearchResponse, which implies being stored in the MDU and returned by ascorer to Superroot.
- QSessions: to store the crawler-ID in all logged events for analysis.

The default values represent the 'empty string' crawler-ID for the Alexandria-scope.

Attributes List

This module has the following attributes (case-insensitive ascending order):

Attributes

  1. country (type: String.t, default: nil)
    - The country to crawl the document from. Defaults to the default non-specified crawling node (which most web-servers interpret as USA). When specified, the crawler fetches the document from a node in that country instead.
  2. deviceType (type: String.t, default: nil)
    - The device type, which maps into the useragent to be set when initiating the fetch-request, e.g. desktop-googlebot vs. smartphone-googlebot.
  3. indexGrowthExptType (type: String.t, default: nil)
    - Specifies whether the document is a duplicated document from the index growth experiment, detailed at go/indexsize_exp, defaults to not in any experiment.
  4. language (type: String.t, default: nil)
    - The language set by the crawler. Defaults to UNKNOWN_LANGUAGE, which means no accept-language header is applied to the FetchRequest. When a language is specified, it is converted into an accept-language header at crawl time (e.g. GERMAN -> "Accept-language: de"). Script variations, e.g. ZH-HANS vs. ZH-HANT, are handled as different enum values (e.g. CHINESE vs. CHINESE_T).
  5. languageCode (type: String.t, default: nil)
    - Language-code used for identifying the locale of the document. 'language' and 'country' above are used for web-based documents, representing the detected language of the document and the country it was crawled from. The language code here, by contrast, represents an artificial language_code applied to manually translated webpages (e.g. feeds), for instance for the pidgin-usecase. These codes are limited to the set of III-codes supported by the client, yet go beyond the enum in 'language', e.g. to support variants of English across different countries.
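
The conversion from the `language` attribute to an accept-language header, as described in item 4, might look roughly like the sketch below. The module `CrawlerLanguageSketch`, its `headers/1` function, and the lookup table are all hypothetical and not part of the library. Only the GERMAN -> "de" pair comes from the description; the two Chinese entries are assumed values chosen to illustrate the script-variant enum names.

```elixir
defmodule CrawlerLanguageSketch do
  # Hypothetical lookup from `language` enum names to Accept-Language values.
  # Only "GERMAN" => "de" is taken from the attribute description;
  # the Chinese entries are assumptions for illustration.
  @accept_language %{
    "GERMAN" => "de",
    "CHINESE" => "zh-Hans",
    "CHINESE_T" => "zh-Hant"
  }

  # nil (the struct default) and "UNKNOWN_LANGUAGE" both mean:
  # do not add an Accept-Language header to the fetch request.
  def headers(nil), do: []
  def headers("UNKNOWN_LANGUAGE"), do: []

  def headers(language) when is_binary(language) do
    case Map.fetch(@accept_language, language) do
      {:ok, tag} -> [{"Accept-Language", tag}]
      # Languages missing from this small table also produce no header.
      :error -> []
    end
  end
end

IO.inspect(CrawlerLanguageSketch.headers("GERMAN"))
# prints [{"Accept-Language", "de"}]
IO.inspect(CrawlerLanguageSketch.headers("UNKNOWN_LANGUAGE"))
# prints []
```

Returning a list of header tuples keeps the "no header" case explicit as an empty list, so a caller can append the result to other request headers without special-casing the default.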

Type

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.LogsProtoIndexingCrawlerIdCrawlerIdProto{
country: String.t() | nil,
deviceType: String.t() | nil,
indexGrowthExptType: String.t() | nil,
language: String.t() | nil,
languageCode: String.t() | nil
}
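
As a minimal sketch of how the all-`nil` defaults correspond to the "empty string" crawler-ID, the example below uses a local stand-in struct with the same five fields. `CrawlerIdSketch` and `empty?/1` are hypothetical names introduced here so the snippet runs without the `google_api_content_warehouse` package; the real model module is not used.

```elixir
defmodule CrawlerIdSketch do
  # Local stand-in mirroring the five documented fields, all defaulting to nil.
  defstruct country: nil,
            deviceType: nil,
            indexGrowthExptType: nil,
            language: nil,
            languageCode: nil

  # Hypothetical helper: true when every field is still at its default,
  # i.e. the struct stands for the 'empty string' crawler-ID.
  def empty?(%__MODULE__{} = id) do
    id
    |> Map.from_struct()
    |> Map.values()
    |> Enum.all?(&is_nil/1)
  end
end

# struct/2 builds the struct at runtime, which keeps this runnable
# as a single script alongside the module definition.
default_id = struct(CrawlerIdSketch)
german_id = struct(CrawlerIdSketch, language: "GERMAN")

IO.inspect(CrawlerIdSketch.empty?(default_id))
# prints true
IO.inspect(CrawlerIdSketch.empty?(german_id))
# prints false
```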

Function

@spec decode(struct(), keyword()) :: struct()

Data sourced from HexDocs : GoogleApi.ContentWarehouse.V1.Model.LogsProtoIndexingCrawlerIdCrawlerIdProto