
Commit bdef511

Merge pull request #34 from ScrapeGraphAI/removed-render_heavy_ks
feat: remove render_heavy_js
2 parents: 680069a + d8ccd72

11 files changed: 13 additions & 259 deletions

Files changed

api-reference/endpoint/smartcrawler/start.mdx

Lines changed: 0 additions & 2 deletions
````diff
@@ -36,7 +36,6 @@ Content-Type: `application/json`
     "same_domain": "boolean"
   },
   "sitemap": "boolean",
-  "render_heavy_js": "boolean",
   "stealth": "boolean"
   "webhook_url": str
 }
@@ -58,7 +57,6 @@ Content-Type: `application/json`
 | schema | object | No | - | JSON Schema object for structured output |
 | rules | object | No | - | Crawl rules for filtering URLs. Object with optional fields: `exclude` (array of regex URL patterns), `include_paths` (array of path patterns to include, supports wildcards `*` and `**`), `exclude_paths` (array of path patterns to exclude, takes precedence over `include_paths`), `same_domain` (boolean, default: true). See Rules section below for details. |
 | sitemap | boolean | No | false | Use sitemap.xml for discovery |
-| render_heavy_js | boolean | No | false | Enable heavy JavaScript rendering |
 | stealth | boolean | No | false | Enable stealth mode to bypass bot protection using advanced anti-detection techniques. Adds +4 credits to the request cost |
 | webhook_url | str | No | None | Webhook URL to send the job result to. When provided, a signed webhook notification will be sent upon job completion. See [Webhook Signature Verification](#webhook-signature-verification) below.
 
````
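With the parameter gone, a SmartCrawler start request body reduces to the remaining documented fields. A minimal Python sketch of assembling the updated payload — the URL and all values are hypothetical, and `url` is an assumed field name; only the key set follows the parameter table above:

```python
import json

# Hypothetical values; the key set mirrors the parameter table after this change.
payload = {
    "url": "https://example.com",   # assumed field name for the target URL
    "rules": {"same_domain": True},
    "sitemap": False,               # default: false
    "stealth": False,               # default: false; +4 credits when true
    "webhook_url": None,            # optional webhook for the job result
}

# The removed parameter should no longer be sent.
assert "render_heavy_js" not in payload

body = json.dumps(payload, indent=2)
print(body)
```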

api-reference/endpoint/smartscraper/start.mdx

Lines changed: 0 additions & 7 deletions
````diff
@@ -54,12 +54,6 @@ SmartScraper allows you to extract specific information from any webpage using A
   Range: 0-50
 </ParamField>
 
-<ParamField body="render_heavy_js" type="boolean">
-  Optional parameter to enable enhanced JavaScript rendering for heavy JS websites (React, Vue, Angular, SPAs). Use when standard rendering doesn't capture all content.
-
-  Default: false
-</ParamField>
-
 <ParamField body="mock" type="boolean">
   Optional parameter to enable mock mode. When set to true, the request will return mock data instead of performing an actual extraction. Useful for testing and development.
 
@@ -117,7 +111,6 @@ curl -X POST 'https://api.scrapegraphai.com/v1/smartscraper' \
     "user_prompt": "Extract all the headlines from this section into a table with the date and URL of the news",
     "total_pages": 2,
     "stealth": true,
-    "render_heavy_js": true,
     "headers": {
      "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
      "Cookie": "cookie1=value1; cookie2=value2"
````
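For callers upgrading across this removal, a small guard can drop the retired keyword before building a request. This is a hypothetical helper, not part of the ScrapeGraphAI SDKs:

```python
import warnings

REMOVED_PARAMS = {"render_heavy_js"}  # parameters retired by this commit

def clean_request_kwargs(**kwargs):
    """Drop parameters the API no longer accepts, warning the caller.

    Hypothetical upgrade shim; the real SDKs simply no longer expose the
    removed keyword.
    """
    for name in REMOVED_PARAMS & kwargs.keys():
        warnings.warn(f"'{name}' was removed from the API; ignoring it.")
        kwargs.pop(name)
    return kwargs

cleaned = clean_request_kwargs(
    website_url="https://example.com",
    user_prompt="Extract headlines",
    render_heavy_js=True,  # old parameter, dropped with a warning
)
print(sorted(cleaned))
```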

integrations/x402.mdx

Lines changed: 0 additions & 3 deletions
````diff
@@ -68,7 +68,6 @@ curl -X POST 'https://x402.orth.sh/scrapegraph/v1/scrape' \
   -H 'X-Payment: {{paymentHeader}}' \
   -d '{
     "website_url": "example",
-    "render_heavy_js": true,
     "branding": true,
     "stealth": "example"
   }'
@@ -102,7 +101,6 @@ curl -X POST 'https://x402.orth.sh/scrapegraph/v1/crawl' \
     "schema": "",
     "rules": "",
     "sitemap": "example",
-    "render_heavy_js": "example",
     "stealth": "example"
   }'
 ```
@@ -133,7 +131,6 @@ curl -X POST 'https://x402.orth.sh/scrapegraph/v1/smartscraper' \
     "website_markdown": "example",
     "total_pages": 123,
     "number_of_scrolls": 123,
-    "render_heavy_js": true,
     "mock": true,
     "cookies": "",
     "steps": ""
````

sdks/javascript.mdx

Lines changed: 1 addition & 10 deletions
````diff
@@ -94,7 +94,6 @@ const response = await smartScraper(
 | websiteUrl | string | Yes | The URL of the webpage that needs to be scraped. |
 | prompt | string | Yes | A textual description of what you want to achieve. |
 | schema | object | No | The Pydantic or Zod object that describes the structure and format of the response. |
-| renderHeavyJs | boolean | No | Enable enhanced JavaScript rendering for heavy JS websites (React, Vue, Angular, etc.). Default: false |
 
 <Accordion title="Basic Schema Example" icon="code">
 Define a simple schema using Zod:
@@ -201,8 +200,7 @@ try {
     apiKey,
     'https://example-react-store.com/products/123',
     'Extract product details including name, price, description, and availability',
-    ProductSchema,
-    true // Enable render_heavy_js for JavaScript-heavy sites
+    ProductSchema
   );
 
   console.log('Product:', response.result.name);
@@ -214,13 +212,6 @@ try {
 }
 ```
 
-**When to use `renderHeavyJs`:**
-- React, Vue, or Angular applications
-- Single Page Applications (SPAs)
-- Sites with heavy client-side rendering
-- Dynamic content loaded via JavaScript
-- Interactive elements that depend on JavaScript execution
-
 </Accordion>
 
 ### SearchScraper
````

sdks/mocking.mdx

Lines changed: 1 addition & 1 deletion
````diff
@@ -422,7 +422,7 @@ async function basicMockUsage() {
 
   try {
     // Test scrape endpoint
-    const scrapeResult = await scrape(API_KEY, 'https://example.com', { renderHeavyJs: true });
+    const scrapeResult = await scrape(API_KEY, 'https://example.com');
     console.log('Scrape result:', scrapeResult);
 
     // Test smartScraper endpoint
````

sdks/python.mdx

Lines changed: 0 additions & 38 deletions
````diff
@@ -69,7 +69,6 @@ response = client.smartscraper(
 | website_url | string | Yes | The URL of the webpage that needs to be scraped. |
 | user_prompt | string | Yes | A textual description of what you want to achieve. |
 | output_schema | object | No | The Pydantic object that describes the structure and format of the response. |
-| render_heavy_js | boolean | No | Enable enhanced JavaScript rendering for heavy JS websites (React, Vue, Angular, etc.). Default: False |
 
 <Accordion title="Basic Schema Example" icon="code">
 Define a simple schema for basic data extraction:
@@ -142,43 +141,6 @@ for office in response.offices:
 ```
 </Accordion>
 
-<Accordion title="Enhanced JavaScript Rendering Example" icon="code">
-For modern web applications built with React, Vue, Angular, or other JavaScript frameworks:
-
-```python
-from scrapegraph_py import Client
-from pydantic import BaseModel, Field
-
-class ProductInfo(BaseModel):
-    name: str = Field(description="Product name")
-    price: str = Field(description="Product price")
-    description: str = Field(description="Product description")
-    availability: str = Field(description="Product availability status")
-
-client = Client(api_key="your-api-key")
-
-# Enable enhanced JavaScript rendering for a React-based e-commerce site
-response = client.smartscraper(
-    website_url="https://example-react-store.com/products/123",
-    user_prompt="Extract product details including name, price, description, and availability",
-    output_schema=ProductInfo,
-    render_heavy_js=True  # Enable for React/Vue/Angular sites
-)
-
-print(f"Product: {response['result']['name']}")
-print(f"Price: {response['result']['price']}")
-print(f"Available: {response['result']['availability']}")
-```
-
-**When to use `render_heavy_js`:**
-- React, Vue, or Angular applications
-- Single Page Applications (SPAs)
-- Sites with heavy client-side rendering
-- Dynamic content loaded via JavaScript
-- Interactive elements that depend on JavaScript execution
-
-</Accordion>
-
 ### SearchScraper
 
 Search and extract information from multiple web sources using AI:
````
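The deleted accordion demonstrated `render_heavy_js`; the surviving pattern in the Python SDK is schema-driven extraction via `output_schema`. A runnable sketch of that shape using a plain dataclass in place of the Pydantic model and a mock result dict in place of a live `Client.smartscraper` call:

```python
from dataclasses import dataclass, fields

@dataclass
class ProductInfo:
    name: str
    price: str
    description: str
    availability: str

# Mock of the shape a smartscraper response's `result` would take;
# real calls go through scrapegraph_py's Client.smartscraper.
mock_result = {
    "name": "Example Product",
    "price": "$19.99",
    "description": "A sample item",
    "availability": "in stock",
}

product = ProductInfo(**mock_result)
missing = [f.name for f in fields(ProductInfo) if f.name not in mock_result]
print(f"Product: {product.name}  missing fields: {missing}")
```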

services/additional-parameters/proxy.mdx

Lines changed: 1 addition & 47 deletions
````diff
@@ -122,11 +122,6 @@ The following parameters in API requests can affect proxy behavior:
 - **Default**: No specific country (uses optimal routing)
 - **Format**: ISO 3166-1 alpha-2 (e.g., `us`, `gb`, `de`)
 
-### `render_heavy_js` (optional)
-- **Type**: Boolean
-- **Description**: Whether to render JavaScript-heavy pages. This may affect which proxy provider is used.
-- **Default**: `false`
-
 ## Usage Examples
 
 ### Basic Request (Automatic Proxy Selection)
@@ -204,46 +199,6 @@ const response = await smartScraper(
 
 </CodeGroup>
 
-### Request with JavaScript Rendering and Country Code
-
-<CodeGroup>
-
-```python Python
-from scrapegraph_py import Client
-
-client = Client(api_key="your-api-key")
-
-# Combine JavaScript rendering with geotargeting
-response = client.smartscraper(
-    website_url="https://example.com",
-    user_prompt="Extract product information",
-    render_heavy_js=True,
-    country_code="uk"
-)
-```
-
-```javascript JavaScript
-import { smartScraper } from 'scrapegraph-js';
-
-const apiKey = 'your-api-key';
-
-// Combine JavaScript rendering with geotargeting
-const response = await smartScraper(
-  apiKey,
-  'https://example.com',
-  'Extract product information',
-  null, // schema
-  null, // numberOfScrolls
-  null, // totalPages
-  null, // cookies
-  { country_code: 'uk' }, // options
-  false, // plain_text
-  true // renderHeavyJs
-);
-```
-
-</CodeGroup>
-
 ### Real-World Use Cases
 
 #### Accessing Geo-Restricted Content
@@ -355,8 +310,7 @@ If your scraping request fails:
 1. **Verify the URL**: Make sure the URL is correct and accessible
 2. **Check the website**: Some websites may block automated access regardless of proxy
 3. **Retry the request**: The system uses automatic retries, but you can manually retry after a delay
-4. **Try different parameters**: Sometimes using `render_heavy_js: true` can help with JavaScript-heavy sites
-5. **Try a different country**: If geo-restriction is the issue, try a different `country_code`
+4. **Try a different country**: If geo-restriction is the issue, try a different `country_code`
 </Accordion>
 
 ### Rate Limiting
````
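The revised troubleshooting list ends with "try a different `country_code`". That fallback can be sketched as a loop over candidate countries — `fetch` here is a stub standing in for a smartscraper call, and the fallback order is purely illustrative:

```python
def try_countries(fetch, country_codes):
    """Retry a request under each country_code until one succeeds.

    `fetch` stands in for an SDK call that accepts country_code;
    re-raises the last error if every country fails.
    """
    last_error = None
    for code in country_codes:
        try:
            return code, fetch(country_code=code)
        except RuntimeError as err:  # e.g. a geo-restriction failure
            last_error = err
    raise last_error

# Stub that only succeeds from 'de', simulating a geo-restricted site.
def fake_fetch(country_code):
    if country_code != "de":
        raise RuntimeError(f"blocked from {country_code}")
    return {"status": "ok"}

code, result = try_countries(fake_fetch, ["us", "gb", "de"])
print(code, result)  # de {'status': 'ok'}
```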

services/additional-parameters/wait-ms.mdx

Lines changed: 1 addition & 13 deletions
````diff
@@ -150,25 +150,13 @@ async def scrape_with_wait():
 
 2. **Test incrementally** - If the default doesn't capture all content, try increasing in 1000ms increments (4000, 5000, etc.) rather than setting a very high value.
 
-3. **Combine with other parameters** - Use `wait_ms` together with `render_heavy_js` for JavaScript-heavy pages:
-
-```python
-response = client.smartscraper(
-    website_url="https://heavy-js-site.com",
-    user_prompt="Extract all products",
-    wait_ms=8000,
-    render_heavy_js=True
-)
-```
-
-4. **Balance speed and completeness** - Higher wait times ensure more content is captured but increase response time and resource usage.
+3. **Balance speed and completeness** - Higher wait times ensure more content is captured but increase response time and resource usage.
 
 ## Troubleshooting
 
 <Accordion title="Content still missing after increasing wait_ms" icon="exclamation-triangle">
 If increasing `wait_ms` doesn't capture all content:
 
-- Try enabling `render_heavy_js=True` for JavaScript-heavy pages
 - Check if the content requires user interaction (clicks, scrolls) - use `number_of_scrolls` for infinite scroll pages
 - Verify the content isn't behind authentication - use custom headers/cookies
 </Accordion>
````
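The surviving "test incrementally" guideline (increase `wait_ms` in 1000ms steps rather than jumping to a large value) can be expressed as a small schedule generator. The starting value of 3000ms and the 8000ms cap are assumptions for illustration, not documented defaults:

```python
def wait_ms_schedule(start_ms=3000, step_ms=1000, max_ms=8000):
    """Yield wait_ms values to try, per the 'test incrementally' guideline.

    start_ms and max_ms are illustrative; check the wait_ms docs for the
    service's actual default and limits.
    """
    ms = start_ms
    while ms <= max_ms:
        yield ms
        ms += step_ms

schedule = list(wait_ms_schedule())
print(schedule)  # [3000, 4000, 5000, 6000, 7000, 8000]
```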

services/mcp-server.mdx

Lines changed: 3 additions & 4 deletions
````diff
@@ -182,8 +182,7 @@ AI‑powered extraction with optional infinite scrolls.
 smartscraper(
     user_prompt: str,
     website_url: str,
-    number_of_scrolls: int | None = None,
-    render_heavy_js: bool | None = None
+    number_of_scrolls: int | None = None
 )
 ```
 
@@ -199,10 +198,10 @@ searchscraper(
 ```
 
 ### 4. scrape
-Fetch raw HTML with optional heavy JS rendering.
+Fetch raw HTML from a URL.
 
 ```python
-scrape(website_url: str, render_heavy_js: bool | None = None)
+scrape(website_url: str)
 ```
 
 ### 5. sitemap
````
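The updated tool signatures can be mirrored with local stubs to check call sites after the upgrade — a sketch only; the real tools execute inside the MCP server, and the returned dict here is mock data:

```python
import inspect

def scrape(website_url: str) -> dict:
    """Stub mirroring the updated MCP tool signature (render_heavy_js removed)."""
    return {"website_url": website_url, "html": "<html>...</html>"}

# Confirm the stub now takes only website_url, matching the diff above.
params = list(inspect.signature(scrape).parameters)
print(params)  # ['website_url']
```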

services/scrape.mdx

Lines changed: 5 additions & 32 deletions
````diff
@@ -34,8 +34,7 @@ sgai_client = Client(api_key="your-api-key")
 # Scrape request
 response = sgai_client.htmlify(
     website_url="https://example.com",
-    render_heavy_js=False,  # Set to True for heavy JavaScript rendering
-    branding=True  # Set to True to extract brand design and metadata
+    branding=True  # Set to True to extract brand design and metadata
 )
 
 print("HTML Content:", response.html)
@@ -53,8 +52,7 @@ const apiKey = 'your-api-key';
 const url = 'https://example.com';
 
 try {
-  // htmlify(apiKey, url, renderHeavyJs = false, options = { branding: true })
-  const response = await htmlify(apiKey, url, false, { branding: true }); // enable branding extraction
+  const response = await htmlify(apiKey, url, { branding: true }); // enable branding extraction
   console.log('HTML Content:', response.html);
   console.log('Request ID:', response.scrape_request_id);
   console.log('Status:', response.status);
@@ -72,7 +70,6 @@ curl -X POST https://api.scrapegraphai.com/v1/scrape \
   -H "SGAI-APIKEY: your-api-key" \
   -d '{
     "website_url": "https://example.com",
-    "render_heavy_js": false,
     "branding": true
   }'
 ```
@@ -85,7 +82,6 @@ curl -X POST https://api.scrapegraphai.com/v1/scrape \
 |-----------|------|----------|-------------|
 | apiKey | string | Yes | The ScrapeGraph API Key. |
 | website_url | string | Yes | The URL of the webpage to scrape. |
-| render_heavy_js | boolean | No | Set to true for heavy JavaScript rendering. Default: false |
 | branding | boolean | No | Return extracted brand design and metadata. Default: false |
 | stealth | boolean | No | Enable stealth mode for anti-bot protection. Adds additional credits. Default: false |
 
@@ -171,9 +167,6 @@ When `branding=true` is passed, the response includes a `branding` object with b
 <Card title="Raw HTML Access" icon="code">
   Get complete HTML structure including all elements
 </Card>
-<Card title="JavaScript Rendering" icon="bolt">
-  Optional support for heavy JavaScript rendering
-</Card>
 <Card title="Branding Extraction" icon="palette">
   Optionally extract brand colors, fonts, typography, UI components, images, and metadata
 </Card>
@@ -209,24 +202,6 @@ When `branding=true` is passed, the response includes a `branding` object with b
   Want to learn more about our AI-powered scraping technology? Visit our [main website](https://scrapegraphai.com) to discover how we're revolutionizing web data extraction.
 </Note>
 
-## JavaScript Rendering
-
-The `render_heavy_js` parameter controls whether JavaScript should be executed on the target page:
-
-### When to Use JavaScript Rendering
-
-- **Single Page Applications (SPAs)**: React, Vue, Angular apps
-- **Dynamic Content**: Content loaded via AJAX/fetch
-- **Interactive Elements**: Dropdowns, modals, infinite scroll
-- **Client-side Routing**: Hash-based or history API routing
-
-### When to Skip JavaScript Rendering
-
-- **Static HTML Pages**: Traditional server-rendered content
-- **Performance**: Faster processing for simple pages
-- **Cost Optimization**: Lower API usage for basic scraping
-- **Reliability**: More predictable results for static content
-
 ## Advanced Usage
 
 ### Async Support
@@ -240,8 +215,7 @@ import asyncio
 async def main():
     async with AsyncClient(api_key="your-api-key") as client:
         response = await client.htmlify(
-            website_url="https://example.com",
-            render_heavy_js=True
+            website_url="https://example.com"
         )
         print(response)
 
@@ -271,7 +245,7 @@ async def main():
         "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
     ]
 
-    tasks = [sgai_client.htmlify(website_url=url, render_heavy_js=False) for url in urls]
+    tasks = [sgai_client.htmlify(website_url=url) for url in urls]
 
     # Execute requests concurrently
     responses = await asyncio.gather(*tasks, return_exceptions=True)
@@ -304,8 +278,7 @@ if __name__ == "__main__":
 ## Best Practices
 
 ### Performance Optimization
-1. Use `render_heavy_js=false` for static content
-2. Process multiple URLs concurrently
+1. Process multiple URLs concurrently
 3. Cache results when possible
 4. Monitor API usage and costs
````