Constraint: firecrawl_crawl starts a server-side crawl and polls until a terminal result before returning.
Rejected: Describing crawl as only returning an operation ID | that misleads agents into unnecessary status polling.
Confidence: high
Scope-risk: narrow
Directive: Keep tool descriptions synchronized with execution semantics.
Tested: npm run build.
Not-tested: pnpm run build is blocked by existing pnpm-workspace.yaml missing packages field in this checkout.
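The execution semantics named in the Constraint line (start a job, then poll until a terminal result) can be sketched roughly as below. This is an illustration only, not the server's actual code: `pollUntilTerminal`, `StatusFetcher`, the `CrawlStatus` shape, the set of terminal state names, and the interval and attempt defaults are all assumptions made for this example.

```typescript
// Hypothetical shapes for illustration; not the server's actual types.
type CrawlStatus = { id: string; status: string; completed: number; total: number };
type StatusFetcher = (id: string) => Promise<CrawlStatus>;

// Assumed terminal state names; the real service may use different ones.
const TERMINAL_STATES = new Set(["completed", "failed", "cancelled"]);

async function pollUntilTerminal(
  id: string,
  fetchStatus: StatusFetcher,
  intervalMs = 2000,
  maxAttempts = 150,
): Promise<CrawlStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus(id);
    if (TERMINAL_STATES.has(status.status)) {
      return status; // Terminal: hand the final result back to the caller.
    }
    // Not finished yet: wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Crawl ${id} did not reach a terminal state after ${maxAttempts} polls`);
}
```

Because the loop only returns on a terminal state, a caller of a tool built this way already has the final result and has no reason to poll again, which is the point the Rejected line makes.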
@@ -680,7 +680,7 @@ and small metadata objects. Do not include raw scrape/parse outputs.

 ### 6. Crawl Tool (`firecrawl_crawl`)

-Starts an asynchronous crawl job on a website and extract content from all pages.
+Starts a crawl job, polls until it reaches a terminal state, and returns the final crawl status/data.

 **Best for:**

@@ -689,14 +689,14 @@ Starts an asynchronous crawl job on a website and extract content from all pages
 **Not recommended for:**

 - Extracting content from a single page (use scrape)
-- When token limits are a concern (use map + batch_scrape)
+- When token limits are a concern (use map + scrape for tighter control)
 - When you need fast results (crawling can be slow)

-**Warning:** Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control.
+**Warning:** Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + scrape for tighter control.

 **Common mistakes:**

-- Setting limit or maxDepth too high (causes token overflow)
+- Setting limit or maxDiscoveryDepth too high (causes token overflow)
 - Using crawl for a single page (use scrape instead)

 **Prompt Example:**
@@ -710,33 +710,22 @@ Starts an asynchronous crawl job on a website and extract content from all pages
   "name": "firecrawl_crawl",
   "arguments": {
     "url": "https://example.com/blog/*",
-    "maxDepth": 2,
+    "maxDiscoveryDepth": 2,
     "limit": 100,
     "allowExternalLinks": false,
     "deduplicateSimilarURLs": true
   }
 }
 ```

-**Returns:**

-- Response includes operation ID for status checking:
+**Returns:**

-```json
-{
-  "content": [
-    {
-      "type": "text",
-      "text": "Started crawl for: https://example.com/* with job ID: 550e8400-e29b-41d4-a716-446655440000. Use firecrawl_check_crawl_status to check progress."
-    }
-  ],
-  "isError": false
-}
-```
+- Final crawl status and data after internal polling, including `id`, `status`, `completed`, `total`, `creditsUsed`, `expiresAt`, `next`, and `data`. Use the returned `id` with `firecrawl_check_crawl_status` if you need to re-check the job later.

 ### 7. Check Crawl Status (`firecrawl_check_crawl_status`)

-Check the status of a crawl job.
+Check the status and results of an existing crawl job by ID.
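Since the new Returns text lists the fields of the final result and the Warning notes that crawl responses can exceed token limits, a caller might reduce the result to a short summary before passing it on. The sketch below is one hedged way to do that: the field names come from the README text in this diff, while the TypeScript types, the `summarizeCrawl` helper, and the reading of `next` as "more data is available" are assumptions for illustration.

```typescript
// Field names follow the README text in this diff; the types are assumptions.
interface CrawlResult {
  id: string;
  status: string;
  completed: number;
  total: number;
  creditsUsed: number;
  expiresAt: string;
  next?: string | null; // Assumed to signal that more data can be fetched.
  data: unknown[];
}

// Hypothetical helper: build a compact, token-friendly summary of a crawl result.
function summarizeCrawl(result: CrawlResult): string {
  const parts = [
    `crawl ${result.id}: ${result.status}`,
    `${result.completed}/${result.total} pages`,
    `${result.creditsUsed} credits`,
    `${result.data.length} documents returned`,
  ];
  if (result.next) {
    parts.push("more data available via next");
  }
  return parts.join(", ");
}
```

The summary keeps the `id`, so the job can still be re-checked later with `firecrawl_check_crawl_status` without holding the full `data` array in context.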
src/index.ts (6 additions, 6 deletions)
@@ -1823,17 +1823,17 @@ Do not store multi-MB outputs in feedback. Use concise notes, issue codes, URLs,
 server.addTool({
   name: 'firecrawl_crawl',
   annotations: {
-    title: 'Start a site crawl',
-    readOnlyHint: false, // Starts an asynchronous crawl job, creating a persistent server-side job.
+    title: 'Run a site crawl',
+    readOnlyHint: false, // Starts a server-side crawl job and polls until the job reaches a terminal state.
     openWorldHint: true, // Crawls user-specified URLs across the public web.
     destructiveHint: false, // Reads pages from target sites; does not delete or alter external websites.
   },
   description: `
-Starts a crawl job on a website and extracts content from all pages.
+Starts a crawl job on a website, polls until it reaches a terminal state, and returns the final crawl status/data.

 **Best for:** Extracting content from multiple related pages, when you need comprehensive coverage.
-**Not recommended for:** Extracting content from a single page (use scrape); when token limits are a concern (use map + batch_scrape); when you need fast results (crawling can be slow).
-**Warning:** Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control.
+**Not recommended for:** Extracting content from a single page (use scrape); when token limits are a concern (use map + scrape for tighter control); when you need fast results (crawling can be slow).
+**Warning:** Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + scrape for tighter control.
 **Common mistakes:** Setting limit or maxDiscoveryDepth too high (causes token overflow) or too low (causes missing pages); using crawl for a single page (use scrape instead). Using a /* wildcard is not recommended.
 **Prompt Example:** "Get all blog posts from the first two levels of example.com/blog."
 **Usage Example:**
@@ -1850,7 +1850,7 @@ server.addTool({
   }
 }
 \`\`\`
-**Returns:** Operation ID for status checking; use firecrawl_check_crawl_status to check progress.
+**Returns:** Final crawl status and data after internal polling, including the crawl id. Use firecrawl_check_crawl_status only when you need to re-check an existing crawl ID later.
 ${
   SAFE_MODE
     ? '**Safe Mode:** Read-only crawling. Webhooks and interactive actions are disabled for security.'
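The description's Common mistakes line warns against setting `limit` or `maxDiscoveryDepth` too high or too low and against `/*` wildcards. A caller could guard against those before invoking the tool, for example with a small argument builder like the one below. This is a sketch under stated assumptions: `buildCrawlArgs`, the `CrawlArgs` interface, and the cap values are hypothetical and chosen only for illustration; the argument names are the ones shown in the usage example in this diff.

```typescript
// Argument names match the usage example in this diff; the interface itself is an assumption.
interface CrawlArgs {
  url: string;
  limit: number;
  maxDiscoveryDepth: number;
  allowExternalLinks: boolean;
  deduplicateSimilarURLs: boolean;
}

// Hypothetical helper: clamp crawl size to caller-chosen caps and drop a trailing /* wildcard.
function buildCrawlArgs(
  url: string,
  limit: number,
  maxDiscoveryDepth: number,
  caps = { limit: 100, maxDiscoveryDepth: 3 }, // Illustrative caps, not recommended values.
): CrawlArgs {
  return {
    url: url.replace(/\/\*$/, ""), // The description advises against /* wildcards.
    limit: Math.max(1, Math.min(limit, caps.limit)),
    maxDiscoveryDepth: Math.max(0, Math.min(maxDiscoveryDepth, caps.maxDiscoveryDepth)),
    allowExternalLinks: false,
    deduplicateSimilarURLs: true,
  };
}
```

Clamping bounds the worst-case response size, but it cannot tell whether a cap is too low for a given site; a `completed` count that reaches `limit` in the returned result is one hint that pages may have been missed.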