DocProperties

GoogleApi.ContentWarehouse.V1.Model.DocProperties


Description

NOTE: In segindexer, the docproperties of a document may be reused from a previous cycle if its content has not changed. If you add a new field to DocProperties, make sure it is taken care of (i.e., gets copied from a previous cycle to the current document) in CDocProperties::EndDocument().

Attributes List

This module has the following attributes (case-insensitive ascending order):


  1. avgTermWeight (type: integer(), default: nil)
    - The average weighted font size of a term in the doc body
  2. badTitle (type: boolean(), default: nil)
    - Missing or meaningless title
  3. badtitleinfo (type: list(GoogleApi.ContentWarehouse.V1.Model.DocPropertiesBadTitleInfo), default: nil)
    -
  4. languages (type: list(integer()), default: nil)
    - A Language enum value. See: go/language-enum
  5. leadingtext (type: GoogleApi.ContentWarehouse.V1.Model.SnippetsLeadingtextLeadingTextInfo, default: nil)
    - Leading text information generated by google3/quality/snippets/leadingtext/leadingtext-detector.cc
  6. numPunctuations (type: integer(), default: nil)
    -
  7. numTags (type: integer(), default: nil)
    -
  8. numTokens (type: integer(), default: nil)
    - The number of tokens, tags and punctuations in the tokenized contents. This is an approximation of the number of tokens, tags and punctuations we end up with in mustang, but is inexact since we drop some tokens in mustang and also truncate docs at a max cap.
  9. proseRestrict (type: list(String.t), default: nil)
    - The restricts for CSE structured search.
  10. restricts (type: list(String.t), default: nil)
    -
  11. timestamp (type: String.t, default: nil)
    - The time CDocProperties::StartDocument() is called, encoded as seconds past the epoch (Jan 1, 1970). This value is always refreshed and not reused.
  12. title (type: String.t, default: nil)
    - Extracted from the title tag of the content. This is typically extracted by TitleMetaCollector defined at google3/segindexer/title-meta-collector.h. Please see its documentation for the format and other caveats.
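
The timestamp attribute is typed as String.t even though it encodes seconds past the epoch. A minimal sketch of turning it into a DateTime follows, assuming the string holds a base-10 integer (the source does not specify the exact string format). The module DocPropertiesTimestamp and the function to_datetime/1 are hypothetical helpers for illustration, not part of the GoogleApi library.

```elixir
defmodule DocPropertiesTimestamp do
  @moduledoc false

  # Hypothetical helper, not part of GoogleApi.ContentWarehouse.
  # Assumption: the timestamp string is a base-10 integer count of
  # seconds past the Unix epoch, with no trailing characters.
  @spec to_datetime(String.t() | nil) :: {:ok, DateTime.t()} | :error
  def to_datetime(nil), do: :error

  def to_datetime(timestamp) when is_binary(timestamp) do
    # Integer.parse/1 returns {int, rest}; require an empty rest so that
    # partially numeric strings are rejected rather than truncated.
    with {seconds, ""} <- Integer.parse(timestamp),
         {:ok, datetime} <- DateTime.from_unix(seconds) do
      {:ok, datetime}
    else
      _ -> :error
    end
  end
end
```

Returning :error for nil keeps the caller's handling uniform, since the attribute defaults to nil when the field is absent.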

Type

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.DocProperties{
  avgTermWeight: integer() | nil,
  badTitle: boolean() | nil,
  badtitleinfo: [GoogleApi.ContentWarehouse.V1.Model.DocPropertiesBadTitleInfo.t()] | nil,
  languages: [integer()] | nil,
  leadingtext: GoogleApi.ContentWarehouse.V1.Model.SnippetsLeadingtextLeadingTextInfo.t() | nil,
  numPunctuations: integer() | nil,
  numTags: integer() | nil,
  numTokens: integer() | nil,
  proseRestrict: [String.t()] | nil,
  restricts: [String.t()] | nil,
  timestamp: String.t() | nil,
  title: String.t() | nil
}
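
Because every field in the type is nullable, code that reads a DocProperties value has to handle nil for scalars and lists alike. The sketch below illustrates one way to do that with pattern matching. DocPropertiesSketch is a hypothetical stand-in struct that only mirrors the documented field names; it is not the GoogleApi module. Falling back when badTitle is true is an illustrative policy, not behavior described by the source.

```elixir
defmodule DocPropertiesSketch do
  @moduledoc false

  # Hypothetical stand-in mirroring the documented field names.
  # Fields listed as bare atoms default to nil, matching the attribute list.
  defstruct [
    :avgTermWeight,
    :badTitle,
    :badtitleinfo,
    :languages,
    :leadingtext,
    :numPunctuations,
    :numTags,
    :numTokens,
    :proseRestrict,
    :restricts,
    :timestamp,
    :title
  ]

  # Returns the fallback when the title is flagged bad or is absent
  # (illustrative policy only), otherwise returns the title itself.
  def display_title(%__MODULE__{badTitle: true}, fallback), do: fallback
  def display_title(%__MODULE__{title: nil}, fallback), do: fallback
  def display_title(%__MODULE__{title: title}, _fallback), do: title

  # Treats an absent list field as empty so callers can enumerate safely.
  def restricts(%__MODULE__{restricts: nil}), do: []
  def restricts(%__MODULE__{restricts: list}) when is_list(list), do: list
end
```

The same clause-per-case pattern applies to the other list fields (languages, proseRestrict, badtitleinfo), which also default to nil rather than to an empty list.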

Function

@spec decode(struct(), keyword()) :: struct()

Data sourced from HexDocs: GoogleApi.ContentWarehouse.V1.Model.DocProperties