IndexingDocjoinerAnchorStatistics

GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics


Description

Statistics of the anchors in a docjoin. Next available tag ID: 63.

Attributes List

This module has the following attributes:

Attributes

  1. penguinLastUpdate (type: integer(), default: nil)
    - Timestamp when penguin scores were last updated, measured in days since Jan. 1, 1995. This field begins the Penguin-related fields in the source definition.
  2. anchorCount (type: integer(), default: nil)
    -
  3. badbacklinksPenalized (type: boolean(), default: nil)
    - Whether this doc is penalized by BadBackLinks, in which case we should not use improvanchor score in mustang ascorer.
  4. penguinPenalty (type: number(), default: nil)
    - Page-level penguin penalty (0 = good, 1 = bad).
  5. minHostHomePageLocalOutdegree (type: integer(), default: nil)
    - Minimum local outdegree of all anchor sources that are host home pages as well as on the same host as the current target URL.
  6. droppedRedundantAnchorCount (type: integer(), default: nil)
    - Sum of anchors_dropped in the repeated group RedundantAnchorInfo, but can go higher if the latter reaches the cap of kMaxRecordsToKeep (indexing/docjoiner/anchors/anchor-loader.cc), currently 10,000.
  7. nonLocalAnchorCount (type: integer(), default: nil)
    -
  8. mediumCorpusAnchorCount (type: integer(), default: nil)
    -
  9. penguinEarlyAnchorProtected (type: boolean(), default: nil)
    - Doc is protected by goodness of early anchors.
  10. droppedHomepageAnchorCount (type: integer(), default: nil)
    -
  11. redundantanchorinfoforphrasecap (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap), default: nil)
    -
  12. forwardedOffdomainAnchorCount (type: integer(), default: nil)
    -
  13. droppedNonLocalAnchorCount (type: integer(), default: nil)
    -
  14. perdupstats (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats), default: nil)
    -
  15. onsiteAnchorCount (type: integer(), default: nil)
    -
  16. droppedLocalAnchorCount (type: integer(), default: nil)
    -
  17. penguinTooManySources (type: boolean(), default: nil)
    - Doc not scored because it has too many anchor sources. This field ends the Penguin-related fields in the source definition.
  18. forwardedAnchorCount (type: integer(), default: nil)
    -
  19. anchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo, default: nil)
    - This structure contains signals and penalties of AnchorSpamPenalizer. It replaces phrase_anchor_spam_info (phraseAnchorSpamInfo in this list), which is deprecated.
  20. lowCorpusAnchorCount (type: integer(), default: nil)
    -
  21. lowCorpusOffdomainAnchorCount (type: integer(), default: nil)
    -
  22. baseAnchorCount (type: integer(), default: nil)
    -
  23. minDomainHomePageLocalOutdegree (type: integer(), default: nil)
    - Minimum local outdegree of all anchor sources that are domain home pages as well as on the same domain as the current target URL.
  24. skippedAccumulate (type: integer(), default: nil)
    - A count of the number of times anchor accumulation has been skipped for this document. Note: Only used when canonical.
  25. topPrOnsiteAnchorCount (type: integer(), default: nil)
    - According to the anchor quality bucket, an anchor with pagerank > 51000 is the best anchor; anchors with pagerank < 47000 are all treated the same.
  26. pageMismatchTaggedAnchors (type: integer(), default: nil)
    -
  27. spamLog10Odds (type: number(), default: nil)
    - The log base 10 odds that this set of anchors exhibits spammy behavior. Computed in the AnchorLocalizer.
  28. redundantanchorinfo (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo), default: nil)
    -
  29. pageFromExpiredTaggedAnchors (type: integer(), default: nil)
    - Set in SignalPenalizer::FillInAnchorStatistics.
  30. baseOffdomainAnchorCount (type: integer(), default: nil)
    -
  31. phraseAnchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo, default: nil)
    - The following signals identify a spike of spammy anchor phrases. Anchors created during the spike are tagged with LINK_SPAM_PHRASE_SPIKE.
  32. anchorPhraseCount (type: integer(), default: nil)
    - The number of unique anchor phrases. Capped by the constant kMaxAnchorPhraseCountInStats (=5000) defined in indexing/docjoiner/anchors/anchor-manager.cc.
  33. ondomainAnchorCount (type: integer(), default: nil)
    -
  34. totalDomainsAbovePhraseCap (type: integer(), default: nil)
    - Number of domains above the per-domain phrase cap, i.e. domains from which too many phrases were seen.
  35. totalDomainsSeen (type: integer(), default: nil)
    - Number of domains seen in total.
  36. topPrOffdomainAnchorCount (type: integer(), default: nil)
    -
  37. scannedAnchorCount (type: integer(), default: nil)
    - The total number of anchors being scanned from storage.
  38. localAnchorCount (type: integer(), default: nil)
    -
  39. linkBeforeSitechangeTaggedAnchors (type: integer(), default: nil)
    -
  40. globalAnchorDelta (type: integer(), default: nil)
    - Metric of the number of changed global anchors, computed as size(union(previous, new) - intersection(previous, new)).
  41. topPrOndomainAnchorCount (type: integer(), default: nil)
    -
  42. mediumCorpusOffdomainAnchorCount (type: integer(), default: nil)
    -
  43. offdomainAnchorCount (type: integer(), default: nil)
    -
  44. totalDomainPhrasePairsSeenApprox (type: integer(), default: nil)
    - Number of domain/phrase pairs in total -- i.e. how many anchors we would have if the domain/phrase cutoff was set to 1 instead of 200. This is "approx" for large anchor clusters because there can be double counting when the LRU cache forgets about rare domain/phrase pairs.
  45. skippedOrReusedReason (type: String.t, default: nil)
    - Reason for skipping accumulation when skipped, or reason for reprocessing when not skipped.
  46. anchorsWithDedupedImprovanchors (type: integer(), default: nil)
    - The number of anchors for which some ImprovAnchors phrases have been removed due to duplication within source org.
  47. fakeAnchorCount (type: integer(), default: nil)
    -
  48. redundantAnchorForPhraseCapCount (type: integer(), default: nil)
    - Total anchors dropped for exceeding the per-domain phrase cap. Equals the sum of anchors_dropped in the repeated group RedundantAnchorInfoForPhraseCap, but can go higher if the latter reaches the cap of kMaxDomainsToKeepForPhraseCap (indexing/docjoiner/anchors/anchor-loader.h), currently 1000.
  49. totalDomainPhrasePairsAboveLimit (type: integer(), default: nil)
    - Should equal the size of the repeated group that follows this field in the source definition, except that it can go higher than 10,000.
  50. timestamp (type: integer(), default: nil)
    - Walltime of when anchors were accumulated last.
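
A few of the attributes above encode values that need a small conversion before they are readable. The sketch below illustrates three of them: penguinLastUpdate (days since Jan. 1, 1995), spamLog10Odds (log base 10 odds), and the set formula given for globalAnchorDelta. The module name `AnchorStatsSketch` and its function names are hypothetical and are not part of the GoogleApi library. It also assumes that "odds" has the usual meaning p / (1 - p), and that global anchors can be compared as plain Elixir terms; neither is confirmed by the description.

```elixir
defmodule AnchorStatsSketch do
  # Epoch stated in the penguinLastUpdate description.
  @penguin_epoch ~D[1995-01-01]

  # Converts a day count since Jan. 1, 1995 into a Date. nil stays nil,
  # because every attribute defaults to nil.
  def penguin_last_update_to_date(nil), do: nil

  def penguin_last_update_to_date(days) when is_integer(days) do
    Date.add(@penguin_epoch, days)
  end

  # Converts log base 10 odds into a probability, assuming odds = p / (1 - p):
  # odds = 10^x, p = odds / (1 + odds).
  def spam_probability(nil), do: nil

  def spam_probability(log10_odds) when is_number(log10_odds) do
    odds = :math.pow(10, log10_odds)
    odds / (1 + odds)
  end

  # Implements size(union(previous, new) - intersection(previous, new)),
  # i.e. the number of anchors present in exactly one of the two sets.
  def global_anchor_delta(previous, new) do
    prev = MapSet.new(previous)
    curr = MapSet.new(new)

    MapSet.union(prev, curr)
    |> MapSet.difference(MapSet.intersection(prev, curr))
    |> MapSet.size()
  end
end
```

For example, `AnchorStatsSketch.global_anchor_delta(["a", "b", "c"], ["b", "c", "d"])` counts "a" and "d", the two anchors that appear on only one side.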

Type

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics{
anchorCount: integer() | nil,
anchorPhraseCount: integer() | nil,
anchorSpamInfo: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo.t() | nil,
anchorsWithDedupedImprovanchors: integer() | nil,
badbacklinksPenalized: boolean() | nil,
baseAnchorCount: integer() | nil,
baseOffdomainAnchorCount: integer() | nil,
droppedHomepageAnchorCount: integer() | nil,
droppedLocalAnchorCount: integer() | nil,
droppedNonLocalAnchorCount: integer() | nil,
droppedRedundantAnchorCount: integer() | nil,
fakeAnchorCount: integer() | nil,
forwardedAnchorCount: integer() | nil,
forwardedOffdomainAnchorCount: integer() | nil,
globalAnchorDelta: integer() | nil,
linkBeforeSitechangeTaggedAnchors: integer() | nil,
localAnchorCount: integer() | nil,
lowCorpusAnchorCount: integer() | nil,
lowCorpusOffdomainAnchorCount: integer() | nil,
mediumCorpusAnchorCount: integer() | nil,
mediumCorpusOffdomainAnchorCount: integer() | nil,
minDomainHomePageLocalOutdegree: integer() | nil,
minHostHomePageLocalOutdegree: integer() | nil,
nonLocalAnchorCount: integer() | nil,
offdomainAnchorCount: integer() | nil,
ondomainAnchorCount: integer() | nil,
onsiteAnchorCount: integer() | nil,
pageFromExpiredTaggedAnchors: integer() | nil,
pageMismatchTaggedAnchors: integer() | nil,
penguinEarlyAnchorProtected: boolean() | nil,
penguinLastUpdate: integer() | nil,
penguinPenalty: number() | nil,
penguinTooManySources: boolean() | nil,
perdupstats: [ GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats.t() ] | nil,
phraseAnchorSpamInfo: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo.t() | nil,
redundantAnchorForPhraseCapCount: integer() | nil,
redundantanchorinfo: [ GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo.t() ] | nil,
redundantanchorinfoforphrasecap: [ GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap.t() ] | nil,
scannedAnchorCount: integer() | nil,
skippedAccumulate: integer() | nil,
skippedOrReusedReason: String.t() | nil,
spamLog10Odds: number() | nil,
timestamp: integer() | nil,
topPrOffdomainAnchorCount: integer() | nil,
topPrOndomainAnchorCount: integer() | nil,
topPrOnsiteAnchorCount: integer() | nil,
totalDomainPhrasePairsAboveLimit: integer() | nil,
totalDomainPhrasePairsSeenApprox: integer() | nil,
totalDomainsAbovePhraseCap: integer() | nil,
totalDomainsSeen: integer() | nil
}
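
Because every field in the type above is `| nil`, code that reads this struct usually has to distinguish "not populated" from a real value such as `0` or `false`. The following is a minimal sketch of that handling. The module `AnchorStatsFlags` and its functions are hypothetical helpers, and a plain map stands in for the struct so the example runs without the library installed; `Map.get/2` behaves the same on a struct.

```elixir
defmodule AnchorStatsFlags do
  # The three boolean() | nil fields listed in the type.
  @boolean_fields [:badbacklinksPenalized, :penguinEarlyAnchorProtected, :penguinTooManySources]

  # Returns the boolean fields that are explicitly true.
  # Both nil (unset) and false are skipped.
  def set_flags(stats) when is_map(stats) do
    Enum.filter(@boolean_fields, fn field -> Map.get(stats, field) == true end)
  end

  # Reads an integer() | nil field, reporting nil (or a missing key)
  # as :unset instead of silently treating it as 0.
  def fetch_count(stats, field) when is_map(stats) and is_atom(field) do
    case Map.get(stats, field) do
      nil -> :unset
      n when is_integer(n) -> {:ok, n}
    end
  end
end

# Usage with a plain map standing in for the struct:
stats = %{badbacklinksPenalized: true, penguinTooManySources: nil, anchorCount: 12}
AnchorStatsFlags.set_flags(stats)
AnchorStatsFlags.fetch_count(stats, :anchorCount)
```

Returning `:unset` rather than defaulting to `0` is a design choice for this sketch: several counts in this module have no description, so silently substituting zero could hide the difference between "no anchors" and "not computed".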

Function

@spec decode(struct(), keyword()) :: struct()

Data sourced from HexDocs: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics