TrawlerFetchReplyDataPartialResponse

PartialResponse is used with streaming responses in LargeFileFetchAdapter. Rather than fitting entirely in a single FetchReply, the response arrives as a series of FetchReplies that continues until IsFinalResponse. All replies in one series share a FetchID, unique to that series, which links them together.
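As an illustration only, the grouping described above could be sketched as follows. This is not Trawler code: `PartialReply`, its `chunk` payload field, and `PartialResponseAssembler` are hypothetical stand-ins, and the sketch assumes the replies of a series arrive in order.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PartialReply:
    """Hypothetical stand-in for one FetchReply in a streamed series."""
    fetch_id: int            # mirrors FetchID, shared by every reply in a series
    is_final_response: bool  # mirrors IsFinalResponse
    chunk: bytes             # hypothetical payload field, not from the source


class PartialResponseAssembler:
    """Buffers chunks per FetchID until the final reply of a series arrives."""

    def __init__(self):
        self._pending = defaultdict(list)

    def add(self, reply):
        """Return the joined body once the series is complete, else None."""
        self._pending[reply.fetch_id].append(reply.chunk)
        if not reply.is_final_response:
            return None
        # Final reply seen: drop the buffer and hand back the whole body.
        return b"".join(self._pending.pop(reply.fetch_id))
```

Keying the buffer on the FetchID lets replies from several concurrent fetches interleave without mixing their chunks.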

TrawlerFetchReplyDataRedirects

The sequence of redirects fetched, if applicable. It includes the URL plus stats for each hop after the first hop. NOTE: The sequence can be one redirect longer than the chain of redirects actually followed, in the case where the fetcher detected a redirect at the end of the chain but did not follow it.

TrawlerPolicyData

Trawler can add a policy label to a FetchReply. The two main cases are:
– "spam" label added for specific spammer IPs listed in trawler_site_info, which most crawls auto-reject.
– "roboted:useragent" (e.g. "roboted:googlebot") if the InfoOnlyUserAgents field is set in FetchParams.
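A consumer of these labels might separate the label kind from the optional user agent along the following lines. This is a minimal sketch based only on the two label shapes listed above; `parse_policy_label` and `should_auto_reject` are hypothetical helper names, not part of Trawler.

```python
def parse_policy_label(label):
    """Split a policy label into (kind, user_agent).

    user_agent is None when the label carries no ":" suffix, as with "spam".
    """
    kind, sep, agent = label.partition(":")
    return kind, (agent if sep else None)


def should_auto_reject(label):
    """Hypothetical crawl-side check: reject replies labeled "spam"."""
    return parse_policy_label(label)[0] == "spam"
```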

TrawlerTCPIPInfo

Keeps track of fetch connection endpoints. Note: Instead of accessing the packed strings directly, you can use trawler::SourceIP(info) or trawler::DestinationIP(info) (as well as HasSourceIP/HasDestinationIP) in basictypes.h. These return a proper IPAddress. Never use the fixed32-based Source/Destination-IP in new code: they support only IPv4 and will go away.
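To show why a packed representation covers both address families where a fixed32 cannot, here is a small sketch. It assumes the packed strings hold raw network-byte-order address bytes (4 for IPv4, 16 for IPv6); the source does not confirm that layout, and `unpack_ip` is a hypothetical helper, not the trawler::SourceIP accessor.

```python
import ipaddress


def unpack_ip(packed):
    """Decode packed address bytes into an IPv4Address or IPv6Address.

    Assumes network byte order: 4 bytes for IPv4, 16 bytes for IPv6.
    """
    if len(packed) not in (4, 16):
        raise ValueError(f"unexpected packed address length: {len(packed)}")
    return ipaddress.ip_address(packed)
```

A 32-bit integer has room for an IPv4 address only, which is consistent with the note that the fixed32-based values are limited to IPv4.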