AttributeError: 'BrowserContext' object has no attribute 'contexts' when using Camoufox with persistent_context=True #1959

@vlastv

Description

When using a custom PlaywrightBrowserPlugin with Camoufox's AsyncNewBrowser and passing persistent_context=True along with user_data_dir, the crawler crashes with an AttributeError during browser initialization.

Steps to Reproduce

  1. Create a custom CamoufoxPlugin extending PlaywrightBrowserPlugin
  2. Override new_browser() to launch Camoufox with a persistent context
  3. Pass persistent_context=True and user_data_dir to AsyncNewBrowser
  4. Run the crawler

from camoufox import AsyncNewBrowser
from typing_extensions import override

from crawlee._utils.context import ensure_context
from crawlee.browsers import (
    PlaywrightBrowserPlugin,
    PlaywrightBrowserController,
    BrowserPool,
)
from crawlee.crawlers import PlaywrightCrawler
from .routes import router


class CamoufoxPlugin(PlaywrightBrowserPlugin):
    @ensure_context
    @override
    async def new_browser(self) -> PlaywrightBrowserController:
        if not self._playwright:
            raise RuntimeError("Playwright browser plugin is not initialized.")

        return PlaywrightBrowserController(
            browser=await AsyncNewBrowser(
                self._playwright,
                headless=False,
                persistent_context=self._user_data_dir is not None,
                user_data_dir=self._user_data_dir,
            ),
            max_open_pages_per_browser=1,  #  Increase, if camoufox can handle it in your usecase.
            header_generator=None,  #  This turns off the crawlee header_generation. Camoufox has its own.
        )


async def main() -> None:
    """The crawler entry point."""
    crawler = PlaywrightCrawler(
        max_requests_per_crawl=10,
        request_handler=router,
        browser_pool=BrowserPool(
            plugins=[CamoufoxPlugin(user_data_dir="./user_data_dir")]
        ),
        ignore_http_error_status_codes=[401],
        max_request_retries=0,
    )

    await crawler.run(
        [
            "https://crawlee.dev/",
        ]
    )

The crawler should launch Camoufox with a persistent context and run successfully, reusing the specified user_data_dir.

The crawler crashes with the following traceback:

Traceback (most recent call last):
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/crawlers/_basic/_context_pipeline.py", line 100, in __call__
    result = await middleware_instance.action()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/crawlers/_basic/_context_pipeline.py", line 40, in action
    self.output_context = await self.generator.__anext__()
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/crawlers/_playwright/_playwright_crawler.py", line 334, in _open_page
    crawlee_page = await self._browser_pool.new_page(proxy_info=context.proxy_info)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/_utils/context.py", line 45, in async_wrapper
    return await method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/browsers/_browser_pool.py", line 281, in new_page
    return await self._get_new_page(page_id, plugin, proxy_info)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/browsers/_browser_pool.py", line 313, in _get_new_page
    browser_controller = await asyncio.wait_for(self._launch_new_browser(page_id, plugin), timeout)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/asyncio/tasks.py", line 507, in wait_for
    return await fut
           ^^^^^^^^^
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/browsers/_browser_pool.py", line 365, in _launch_new_browser
    browser = await plugin.new_browser()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/_utils/context.py", line 45, in async_wrapper
    return await method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/crawler/example/main.py", line 21, in new_browser
    return PlaywrightBrowserController(
        browser=await AsyncNewBrowser(
    ...<6 lines>...
        header_generator=None,  #  This turns off the crawlee header_generation. Camoufox has its own.
    )
  File "/home/user/crawler/.venv/lib64/python3.13/site-packages/crawlee/browsers/_playwright_browser_controller.py", line 98, in __init__
    self._browser.contexts[0] if len(self._browser.contexts) > 0 else None
                                     ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'BrowserContext' object has no attribute 'contexts'

The issue appears to be in PlaywrightBrowserController.__init__. When Camoufox is launched with persistent_context=True, AsyncNewBrowser returns a BrowserContext object rather than a Browser object. The controller then accesses .contexts on what it assumes is a Browser instance; because the object is actually a BrowserContext, the attribute does not exist and the AttributeError is raised.

This behavior differs from launching without a persistent context, where AsyncNewBrowser returns a standard Playwright Browser object that does have a .contexts attribute.
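
To illustrate the difference, here is a minimal sketch that does not use Playwright or Crawlee at all. StubBrowser and StubContext are hypothetical stand-ins I made up for the two return types, and first_context_guarded is only one possible way the controller could tolerate both; it is not the actual Crawlee code and I am not suggesting it as the definitive fix.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StubContext:
    """Hypothetical stand-in for a BrowserContext: it has no `contexts` attribute."""

    name: str = "persistent"


@dataclass
class StubBrowser:
    """Hypothetical stand-in for a Browser, which exposes a `contexts` list."""

    contexts: list[StubContext] = field(default_factory=list)


def first_context_unguarded(browser):
    # Same shape as the expression shown in the traceback: it assumes that
    # `browser` always has a `contexts` attribute.
    return browser.contexts[0] if len(browser.contexts) > 0 else None


def first_context_guarded(browser_or_context):
    """Return the first context for either kind of object (assumed workaround)."""
    contexts = getattr(browser_or_context, "contexts", None)
    if contexts is None:
        # No `contexts` attribute: treat the object itself as the context.
        return browser_or_context
    return contexts[0] if contexts else None


if __name__ == "__main__":
    persistent = StubContext()
    try:
        first_context_unguarded(persistent)
    except AttributeError as exc:
        print(f"unguarded access fails: {exc}")
    print(first_context_guarded(persistent) is persistent)
```

With a StubBrowser the two helpers behave the same; only the context-shaped object triggers the AttributeError in the unguarded version, which matches what I see with persistent_context=True.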

Metadata

Labels: t-tooling (issues with this label are in the ownership of the tooling team)
