GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics
Description
Statistics of the anchors in a docjoin. Next available tag ID: 63.
Attributes List
This module has the following attributes (case-insensitive ascending order):
- anchorCount
- anchorPhraseCount
- anchorSpamInfo
- anchorsWithDedupedImprovanchors
- badbacklinksPenalized
- baseAnchorCount
- baseOffdomainAnchorCount
- droppedHomepageAnchorCount
- droppedLocalAnchorCount
- droppedNonLocalAnchorCount
- droppedRedundantAnchorCount
- fakeAnchorCount
- forwardedAnchorCount
- forwardedOffdomainAnchorCount
- globalAnchorDelta
- linkBeforeSitechangeTaggedAnchors
- localAnchorCount
- lowCorpusAnchorCount
- lowCorpusOffdomainAnchorCount
- mediumCorpusAnchorCount
- mediumCorpusOffdomainAnchorCount
- minDomainHomePageLocalOutdegree
- minHostHomePageLocalOutdegree
- nonLocalAnchorCount
- offdomainAnchorCount
- ondomainAnchorCount
- onsiteAnchorCount
- pageFromExpiredTaggedAnchors
- pageMismatchTaggedAnchors
- penguinEarlyAnchorProtected
- penguinLastUpdate
- penguinPenalty
- penguinTooManySources
- perdupstats
- phraseAnchorSpamInfo
- redundantAnchorForPhraseCapCount
- redundantanchorinfo
- redundantanchorinfoforphrasecap
- scannedAnchorCount
- skippedAccumulate
- skippedOrReusedReason
- spamLog10Odds
- timestamp
- topPrOffdomainAnchorCount
- topPrOndomainAnchorCount
- topPrOnsiteAnchorCount
- totalDomainPhrasePairsAboveLimit
- totalDomainPhrasePairsSeenApprox
- totalDomainsAbovePhraseCap
- totalDomainsSeen
Attributes
- penguinLastUpdate (type: integer(), default: nil) - BEGIN: Penguin related fields. Timestamp when penguin scores were last updated. Measured in days since Jan. 1st 1995.
- anchorCount (type: integer(), default: nil)
- badbacklinksPenalized (type: boolean(), default: nil) - Whether this doc is penalized by BadBackLinks, in which case we should not use improvanchor score in mustang ascorer.
- penguinPenalty (type: number(), default: nil) - Page-level penguin penalty (0 = good, 1 = bad).
- minHostHomePageLocalOutdegree (type: integer(), default: nil) - Minimum local outdegree of all anchor sources that are host home pages as well as on the same host as the current target URL.
- droppedRedundantAnchorCount (type: integer(), default: nil) - Sum of anchors_dropped in the repeated group RedundantAnchorInfo, but can go higher if the latter reaches the cap of kMaxRecordsToKeep (indexing/docjoiner/anchors/anchor-loader.cc), currently 10,000.
- nonLocalAnchorCount (type: integer(), default: nil)
- mediumCorpusAnchorCount (type: integer(), default: nil)
- penguinEarlyAnchorProtected (type: boolean(), default: nil) - Doc is protected by goodness of early anchors.
- droppedHomepageAnchorCount (type: integer(), default: nil)
- redundantanchorinfoforphrasecap (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap), default: nil)
- forwardedOffdomainAnchorCount (type: integer(), default: nil)
- droppedNonLocalAnchorCount (type: integer(), default: nil)
- perdupstats (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats), default: nil)
- onsiteAnchorCount (type: integer(), default: nil)
- droppedLocalAnchorCount (type: integer(), default: nil)
- penguinTooManySources (type: boolean(), default: nil) - Doc not scored because it has too many anchor sources. END: Penguin related fields.
- forwardedAnchorCount (type: integer(), default: nil)
- anchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo, default: nil) - This structure contains signals and penalties of AnchorSpamPenalizer. It replaces phrase_anchor_spam_info, which is deprecated.
- lowCorpusAnchorCount (type: integer(), default: nil)
- lowCorpusOffdomainAnchorCount (type: integer(), default: nil)
- baseAnchorCount (type: integer(), default: nil)
- minDomainHomePageLocalOutdegree (type: integer(), default: nil) - Minimum local outdegree of all anchor sources that are domain home pages as well as on the same domain as the current target URL.
- skippedAccumulate (type: integer(), default: nil) - A count of the number of times anchor accumulation has been skipped for this document. Note: Only used when canonical.
- topPrOnsiteAnchorCount (type: integer(), default: nil) - According to the anchor quality bucket, an anchor with pagerank > 51000 is the best anchor; anchors with pagerank < 47000 are all treated the same.
- pageMismatchTaggedAnchors (type: integer(), default: nil)
- spamLog10Odds (type: number(), default: nil) - The log base 10 odds that this set of anchors exhibits spammy behavior. Computed in the AnchorLocalizer.
- redundantanchorinfo (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo), default: nil)
- pageFromExpiredTaggedAnchors (type: integer(), default: nil) - Set in SignalPenalizer::FillInAnchorStatistics.
- baseOffdomainAnchorCount (type: integer(), default: nil)
- phraseAnchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo, default: nil) - The following signals identify a spike of spammy anchor phrases. Anchors created during the spike are tagged with LINK_SPAM_PHRASE_SPIKE.
- anchorPhraseCount (type: integer(), default: nil) - The number of unique anchor phrases. Capped by the constant kMaxAnchorPhraseCountInStats (=5000) defined in indexing/docjoiner/anchors/anchor-manager.cc.
- ondomainAnchorCount (type: integer(), default: nil)
- totalDomainsAbovePhraseCap (type: integer(), default: nil) - Number of domains above the per-domain phrase cap, i.e. domains in which we see too many phrases.
- totalDomainsSeen (type: integer(), default: nil) - Number of domains seen in total.
- topPrOffdomainAnchorCount (type: integer(), default: nil)
- scannedAnchorCount (type: integer(), default: nil) - The total number of anchors being scanned from storage.
- localAnchorCount (type: integer(), default: nil)
- linkBeforeSitechangeTaggedAnchors (type: integer(), default: nil)
- globalAnchorDelta (type: integer(), default: nil) - Metric of the number of changed global anchors, computed as size(union(previous, new) - intersection(previous, new)).
- topPrOndomainAnchorCount (type: integer(), default: nil)
- mediumCorpusOffdomainAnchorCount (type: integer(), default: nil)
- offdomainAnchorCount (type: integer(), default: nil)
- totalDomainPhrasePairsSeenApprox (type: integer(), default: nil) - Number of domain/phrase pairs in total, i.e. how many anchors we would have if the domain/phrase cutoff was set to 1 instead of 200. This is "approx" for large anchor clusters because there can be double counting when the LRU cache forgets about rare domain/phrase pairs.
- skippedOrReusedReason (type: String.t, default: nil) - Reason for skipping accumulation when skipped, or reason for reprocessing when not skipped.
- anchorsWithDedupedImprovanchors (type: integer(), default: nil) - The number of anchors for which some ImprovAnchors phrases have been removed due to duplication within source org.
- fakeAnchorCount (type: integer(), default: nil)
- redundantAnchorForPhraseCapCount (type: integer(), default: nil) - Total anchors dropped for exceeding the per-domain phrase cap. Equal to the sum of anchors_dropped in the repeated group RedundantAnchorInfoForPhraseCap, but can go higher if the latter reaches the cap of kMaxDomainsToKeepForPhraseCap (indexing/docjoiner/anchors/anchor-loader.h), currently 1000.
- totalDomainPhrasePairsAboveLimit (type: integer(), default: nil) - Should equal the size of the following repeated group, except that it can go higher than 10,000.
- timestamp (type: integer(), default: nil) - Walltime of when anchors were accumulated last.
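Three of the descriptions above state a unit or a formula: penguinLastUpdate is measured in days since Jan. 1st 1995, spamLog10Odds is a log base 10 odds value, and globalAnchorDelta is the size of the union minus the intersection of the previous and new anchor sets. The sketch below shows one way to work with those definitions. The module AnchorStatsUnits and its function names are hypothetical and not part of the GoogleApi package, and treating day 0 as Jan. 1st 1995 itself is an assumption.

```elixir
defmodule AnchorStatsUnits do
  # Hypothetical helper module, written only to illustrate the field
  # descriptions; it is not part of the GoogleApi package.

  # Epoch named in the penguinLastUpdate description.
  # Assumption: a value of 0 means Jan. 1st 1995 itself.
  @penguin_epoch ~D[1995-01-01]

  # Converts a day count since the 1995 epoch into a calendar date.
  # Returns nil when the field is unset.
  def penguin_last_update_date(nil), do: nil

  def penguin_last_update_date(days) when is_integer(days) do
    Date.add(@penguin_epoch, days)
  end

  # Converts log base 10 odds into a probability:
  # odds = 10^x, probability = odds / (1 + odds).
  # No clamping is done, so extreme inputs can overflow in :math.pow/2.
  def spam_probability(nil), do: nil

  def spam_probability(log10_odds) when is_number(log10_odds) do
    odds = :math.pow(10, log10_odds)
    odds / (1 + odds)
  end

  # size(union(previous, new) - intersection(previous, new)):
  # the number of anchors present in exactly one of the two sets.
  def global_anchor_delta(previous, current) do
    prev = MapSet.new(previous)
    curr = MapSet.new(current)
    union = MapSet.union(prev, curr)
    common = MapSet.intersection(prev, curr)
    MapSet.size(MapSet.difference(union, common))
  end
end
```

For example, AnchorStatsUnits.penguin_last_update_date(365) gives ~D[1996-01-01], AnchorStatsUnits.spam_probability(0) gives 0.5 (even odds), and AnchorStatsUnits.global_anchor_delta(["a", "b", "c"], ["b", "c", "d"]) gives 2, since only "a" and "d" changed.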
Type
@type t() :: %GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics{
  anchorCount: integer() | nil,
  anchorPhraseCount: integer() | nil,
  anchorSpamInfo: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo.t() | nil,
  anchorsWithDedupedImprovanchors: integer() | nil,
  badbacklinksPenalized: boolean() | nil,
  baseAnchorCount: integer() | nil,
  baseOffdomainAnchorCount: integer() | nil,
  droppedHomepageAnchorCount: integer() | nil,
  droppedLocalAnchorCount: integer() | nil,
  droppedNonLocalAnchorCount: integer() | nil,
  droppedRedundantAnchorCount: integer() | nil,
  fakeAnchorCount: integer() | nil,
  forwardedAnchorCount: integer() | nil,
  forwardedOffdomainAnchorCount: integer() | nil,
  globalAnchorDelta: integer() | nil,
  linkBeforeSitechangeTaggedAnchors: integer() | nil,
  localAnchorCount: integer() | nil,
  lowCorpusAnchorCount: integer() | nil,
  lowCorpusOffdomainAnchorCount: integer() | nil,
  mediumCorpusAnchorCount: integer() | nil,
  mediumCorpusOffdomainAnchorCount: integer() | nil,
  minDomainHomePageLocalOutdegree: integer() | nil,
  minHostHomePageLocalOutdegree: integer() | nil,
  nonLocalAnchorCount: integer() | nil,
  offdomainAnchorCount: integer() | nil,
  ondomainAnchorCount: integer() | nil,
  onsiteAnchorCount: integer() | nil,
  pageFromExpiredTaggedAnchors: integer() | nil,
  pageMismatchTaggedAnchors: integer() | nil,
  penguinEarlyAnchorProtected: boolean() | nil,
  penguinLastUpdate: integer() | nil,
  penguinPenalty: number() | nil,
  penguinTooManySources: boolean() | nil,
  perdupstats: [GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats.t()] | nil,
  phraseAnchorSpamInfo: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo.t() | nil,
  redundantAnchorForPhraseCapCount: integer() | nil,
  redundantanchorinfo: [GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo.t()] | nil,
  redundantanchorinfoforphrasecap: [GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap.t()] | nil,
  scannedAnchorCount: integer() | nil,
  skippedAccumulate: integer() | nil,
  skippedOrReusedReason: String.t() | nil,
  spamLog10Odds: number() | nil,
  timestamp: integer() | nil,
  topPrOffdomainAnchorCount: integer() | nil,
  topPrOndomainAnchorCount: integer() | nil,
  topPrOnsiteAnchorCount: integer() | nil,
  totalDomainPhrasePairsAboveLimit: integer() | nil,
  totalDomainPhrasePairsSeenApprox: integer() | nil,
  totalDomainsAbovePhraseCap: integer() | nil,
  totalDomainsSeen: integer() | nil
}
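Every field in the type above is declared as `| nil`, so code that reads these statistics has to handle unset values. The following is a minimal sketch of nil-aware access, assuming a plain map (or any struct) keyed by the attribute names. The module AnchorStatsNilSafe and its functions are hypothetical, and falling back to 0 for an unset count is an illustrative choice, not documented behavior.

```elixir
defmodule AnchorStatsNilSafe do
  # Hypothetical helper module; not part of the GoogleApi package.

  # Returns the integer stored under `field`, or `default` when the
  # field is nil or missing. The default of 0 is an assumption.
  def count(stats, field, default \\ 0) do
    case Map.get(stats, field) do
      nil -> default
      value when is_integer(value) -> value
    end
  end

  # Lists the names of all fields that carry a non-nil value, sorted.
  def set_fields(stats) do
    stats
    |> to_plain_map()
    |> Enum.reject(fn {_field, value} -> is_nil(value) end)
    |> Enum.map(fn {field, _value} -> field end)
    |> Enum.sort()
  end

  # Structs are turned into plain maps so :__struct__ is not reported.
  defp to_plain_map(%_{} = s), do: Map.from_struct(s)
  defp to_plain_map(map) when is_map(map), do: map
end

# Usage with a plain map standing in for the real struct:
stats = %{anchorCount: 12, penguinPenalty: nil, timestamp: 100}
AnchorStatsNilSafe.count(stats, :anchorCount)     # 12
AnchorStatsNilSafe.count(stats, :fakeAnchorCount) # 0, the field is absent
AnchorStatsNilSafe.set_fields(stats)              # [:anchorCount, :timestamp]
```

Whether an unset count should be read as 0 or kept as "unknown" depends on the consumer; passing an explicit default to count/3 makes that decision visible at the call site.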
Function
@spec decode(struct(), keyword()) :: struct()
Data sourced from HexDocs: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics
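The decode/2 spec only gives the function's shape (a struct and options in, a struct out), and this page does not describe its behavior. As a rough illustration of the general idea of filling a struct with nil defaults from JSON-style data, here is a self-contained sketch. AnchorStatsSketch, its three-field subset, and from_json_map/1 are hypothetical stand-ins and do not reproduce the library's decode/2.

```elixir
defmodule AnchorStatsSketch do
  # Hypothetical stand-in struct with a small subset of the attributes.
  # It is not the GoogleApi model and does not call its decode/2.
  @fields [:anchorCount, :penguinPenalty, :skippedOrReusedReason]

  # Every field defaults to nil, matching the documented defaults.
  defstruct @fields

  # Builds the struct from a map with string keys, as produced by a
  # typical JSON parser. Unknown keys are ignored and missing keys
  # stay nil.
  def from_json_map(map) when is_map(map) do
    values = for field <- @fields, do: {field, Map.get(map, Atom.to_string(field))}
    struct(__MODULE__, values)
  end
end

# Usage: known keys are copied, the unknown key is dropped.
parsed = %{"anchorCount" => 42, "penguinPenalty" => 0.25, "unknownField" => 1}
sketch = AnchorStatsSketch.from_json_map(parsed)
sketch.anchorCount            # 42
sketch.skippedOrReusedReason  # nil
```

Looking keys up from a fixed field list, rather than converting incoming strings to atoms, avoids creating atoms from untrusted input.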