Crawler Detection

Search and recommendation features are automatically disabled for crawlers so that bots cannot trigger expensive database queries.

Why Block Crawlers?

Storefront search and recommendations require complex queries:

- Full-text search across issues, taxonomies, and metadata
- Recommendation algorithms based on user behavior and content relationships

When search engine crawlers (Googlebot, Bingbot, etc.) index a storefront, they follow every link, including search forms and recommendation widgets. Without protection, a single crawl session could trigger thousands of expensive queries.

Affected Features

Feature	Why Blocked
`storefront_search`	Prevents bots from executing full-text search queries
`storefront_search_terms`	Blocks taxonomy term searches
`storefront_recommended_issues`	Avoids recommendation algorithm execution

How It Works

The system uses LaravelCrawlerDetect to identify bot User-Agents during feature resolution:

Request → Feature Check → Crawler Detection → Allow/Block

1. Request comes in for a protected feature
2. Feature resolver checks platform-level status first (global kill switch)
3. If platform-enabled, checks for crawler via LaravelCrawlerDetect
4. If crawler detected → feature returns disabled
5. Otherwise → normal feature resolution continues

The crawler check runs in the feature access trait (`app/Traits/CanAccessFeature.php`), after the platform-level check but before tenant-specific resolution.
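
The resolution order described above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the project's PHP code: `FeatureContext`, `can_access_feature`, and the `tenant_enabled` flag are hypothetical names introduced only for this example, and tenant-specific resolution is reduced to a single boolean.

```python
from dataclasses import dataclass

# Features listed under "Affected Features".
CRAWLER_PROTECTED = frozenset({
    "storefront_search",
    "storefront_search_terms",
    "storefront_recommended_issues",
})


@dataclass
class FeatureContext:
    """Hypothetical container for the inputs to feature resolution."""
    platform_enabled: bool  # platform-level status (global kill switch)
    is_crawler: bool        # result of crawler detection for this request
    tenant_enabled: bool    # stand-in for tenant-specific resolution


def can_access_feature(feature: str, ctx: FeatureContext) -> bool:
    # 1. Platform-level check comes first.
    if not ctx.platform_enabled:
        return False
    # 2. Crawler check applies only to the protected features.
    if feature in CRAWLER_PROTECTED and ctx.is_crawler:
        return False
    # 3. Otherwise, normal (tenant-specific) resolution continues.
    return ctx.tenant_enabled
```

Under these assumptions, a crawler requesting `storefront_search` gets a disabled feature even when both the platform and the tenant have it enabled, while features outside the protected set are unaffected by the crawler check.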

Bypass Mechanism

Some legitimate tools are detected as crawlers but still need access to these features. They identify themselves with one of the following request headers:

Header	Purpose
`x-farfalla-bypass-crawler-detection`	Monitoring tools (PageSpeed Insights, Oh Dear)
`X-CustomFenice-Tenant-Id`	Fenice desktop app (uses custom User-Agent)

These headers are checked in `RequestMacrosProvider`, which adds the `isCrawler()` macro to the Request class.
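
The bypass logic can be illustrated with a short sketch. It is written in Python for illustration only and is not the actual `isCrawler()` macro: the `BOT_PATTERN` regex is a simplified stand-in for the User-Agent patterns that a crawler detection library maintains, and the sketch assumes that the mere presence of a bypass header (regardless of its value) is enough to skip detection.

```python
import re
from typing import Mapping

# Bypass headers from the table above, lowercased because HTTP header
# names are case-insensitive.
BYPASS_HEADERS = (
    "x-farfalla-bypass-crawler-detection",
    "x-customfenice-tenant-id",
)

# Simplified stand-in for a real crawler User-Agent pattern list (assumption).
BOT_PATTERN = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)


def is_crawler(headers: Mapping[str, str]) -> bool:
    normalized = {name.lower(): value for name, value in headers.items()}
    # Assumption: any bypass header present means "do not treat as crawler".
    if any(header in normalized for header in BYPASS_HEADERS):
        return False
    # Otherwise fall back to User-Agent matching.
    return bool(BOT_PATTERN.search(normalized.get("user-agent", "")))
```

Checking the bypass headers before the User-Agent match means a monitoring tool with a bot-like User-Agent is still allowed through, which matches the purpose given for each header in the table.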

SEO Impact

None. Crawlers don't need search functionality to index content:

- Content pages are directly accessible via URLs and internal links
- Search results are dynamic and not meant for indexing anyway
- Recommendations are personalized and would vary per-request

Google and other search engines discover content by following links, not by using site search. The actual content (issues, collections, etc.) remains fully indexable.

Related Code

File	Purpose
`TenantResolverUnifiedQueryService`	Resolves feature availability when tenant context is established
`app/Traits/CanAccessFeature.php`	Feature resolution with crawler check
`app/Providers/RequestMacrosProvider.php`	`isCrawler()` macro and bypass logic
