Russell Ballestrini authored
Ported from Discord bot's async web fetcher improvements.

Add comprehensive logging to understand crawl behavior:

- Log crawl parameters at start (max_depth, URLs, keywords)
- Debug log for crawl queue state during processing
- Detailed link extraction stats with skip reasons:
  - Total links found
  - Links added to crawl queue
  - Links skipped by robots.txt
  - Links skipped (wrong domain)
  - Links skipped (max depth reached)

Applied to both:

- Fresh page fetching and link extraction
- Cached page link extraction for depth traversal

This diagnostic logging helps identify why crawlers find fewer pages than expected (e.g., robots.txt blocking, domain filtering, depth limits).

No crawl logic changes - purely diagnostic visibility.
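
For illustration only, the per-page link stats described above might look roughly like the sketch below. This is not the code from this commit: `filter_links`, the stats keys, the log wording, and the order of the skip checks are all assumptions, and only the Python standard library is used.

```python
import logging
from collections import Counter
from urllib.parse import urljoin, urlparse
from urllib.robotparser import RobotFileParser

logger = logging.getLogger("crawler")  # hypothetical logger name


def filter_links(page_url, hrefs, depth, max_depth, robots, queue, seen):
    """Queue crawlable links from one page and log why the rest were skipped.

    Hypothetical helper: the check order (domain, depth, robots.txt) is an
    assumption, not taken from the commit.
    """
    stats = Counter(found=0, queued=0, robots=0, wrong_domain=0, max_depth=0)
    domain = urlparse(page_url).netloc
    for href in hrefs:
        url = urljoin(page_url, href)  # resolve relative links
        stats["found"] += 1
        if urlparse(url).netloc != domain:
            stats["wrong_domain"] += 1
        elif depth + 1 > max_depth:
            stats["max_depth"] += 1
        elif not robots.can_fetch("*", url):
            stats["robots"] += 1
        elif url not in seen:
            # Already-seen links count as found but fall into no other bucket.
            seen.add(url)
            queue.append((url, depth + 1))
            stats["queued"] += 1
    logger.info(
        "Links from %s: %d found, %d queued, %d skipped by robots.txt, "
        "%d skipped (wrong domain), %d skipped (max depth reached)",
        page_url, stats["found"], stats["queued"], stats["robots"],
        stats["wrong_domain"], stats["max_depth"],
    )
    logger.debug("Crawl queue size: %d", len(queue))
    return stats
```

Because the function only counts and logs alongside the existing decisions, the same helper could be called for both freshly fetched pages and cached pages, which matches the "no crawl logic changes" intent.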
05891383