Skip to content

[fix] Fixed command/connection resurrection and cache interference in CI#1392

Merged
nemesifier merged 15 commits into
masterfrom
flaky-ci
Jun 4, 2026
Merged

[fix] Fixed command/connection resurrection and cache interference in CI#1392
nemesifier merged 15 commits into
masterfrom
flaky-ci

Conversation

@nemesifier
Copy link
Copy Markdown
Member

Checklist

  • I have read the OpenWISP Contributing Guidelines.
  • I have manually tested the changes proposed in this pull request.
  • I have written new test cases for new code and/or updated existing tests for changes to existing code.
  • I have updated the documentation (not applicable: test and CI-only changes, no user-facing behavior change).

Reference to Existing Issue

There is no dedicated open issue. This addresses flaky failures observed across recent CI builds (failures by org members that passed on retry, plus a couple of master builds).

Description of Changes

Fixes the most frequent flaky tests found by analyzing failed CI builds that later passed on retry:

  • VpnClient not switching on template change (subnet_division ... test_vpn_template_switch_checksum_db): manage_vpn_clients created the VpnClient only for the templates in the current m2m_changed cycle's pk_set. A single template change fires several m2m cycles (SortedManyToManyField clears then re-adds, plus the group-template remove/add in ConfigForm.save), so a missed or mis-ordered cycle could leave the old VPN client. Creation now reconciles against the full current set of VPN templates, guarded by the existing existence check, so it is idempotent and symmetric with the deletion side.

  • FOREIGN KEY constraint failed from background commands (connection ... test_update_config_hostname_changed_on_reregister): a background launch_command could run connect() against a DeviceConnection whose row had already been deleted; save() then found no row to update and fell back to an INSERT, resurrecting the connection with a dangling device foreign key. connect() now records the attempt only if the connection still exists. This also removes the noisy caught tracebacks that appeared even in green runs.

  • Redis websocket consumer timeout (connection ... test_multiple_connections_receive_updates_with_redis): the test did not mock paramiko.SSHClient.connect (unlike the sibling test_new_command_created), so eager command execution could block on a real SSH attempt and push the test past the 5s pytest-timeout. The SSH connect is now mocked.

  • Flaky 401 responses in API tests (config ... test_multiple_vpn_client_templates_same_vpn): sessions were stored in the default cache, which the whois tests flush with cache.clear() and which parallel workers share, so a clear in one place wiped the authenticated session of a test running concurrently. Sessions now use a dedicated cache.

  • Flaky assertNumQueries (geo.estimated_location ... test_estimated_location_creation_and_update): an extra OrganizationGeoSettings query appeared when the cached entry was dropped between subtests. The relevant cache is now warmed before the assertion.

Regression tests were added where the failure could be reproduced deterministically.

Screenshot

N/A

manage_vpn_clients created the VpnClient only for the templates present
in the current m2m_changed cycle's pk_set. A single template change
fires several m2m cycles: SortedManyToManyField clears then re-adds the
templates, plus the group-template remove/add performed in
ConfigForm.save. The new VpnClient was therefore created only in the one
cycle whose pk_set contained the template, so a missed or mis-ordered
cycle could leave the config with the old VPN client. This surfaced as
the flaky test_vpn_template_switch_checksum_db failure, where switching
a device template from vpn1 to vpn2 intermittently kept vpn1
(<Vpn: vpn1> != <Vpn: vpn2>).

The deletion side already reconciled against the full current template
set. Make creation symmetric by iterating the currently linked VPN
templates, guarded by the existing existence check, so the handler
becomes an idempotent, self-healing reconcile across all the m2m cycles
a single change triggers.
DeviceConnection.connect() records the connection attempt with save().
When a background command (launch_command) runs against a connection
whose row was already deleted by a concurrent deletion or test teardown,
Django's save() found no row to UPDATE and fell back to an INSERT,
resurrecting the connection with a dangling device foreign key. That
raised "FOREIGN KEY constraint failed"; launch_command caught it, but its
recovery command.save() then failed too and the second error escaped.
This surfaced as the flaky test_update_config_hostname_changed_on_reregister
error, and as the noisy caught tracebacks seen even in green runs.

Record the attempt only when the connection still exists, so a stale
in-memory object can no longer resurrect a deleted row. Added a
regression test that deletes the connection row before calling connect().
test_multiple_connections_receive_updates_with_redis triggered real
command execution without mocking paramiko.SSHClient.connect, unlike the
sibling test_new_command_created. Command execution runs eagerly when the
command is created, so an un-mocked SSH connection attempt to the test
device can block until its timeout, pushing the whole test past the 5s
pytest-timeout and producing the flaky "Timeout (>5.0s) from
pytest-timeout" failures seen in CI.

Mock paramiko.SSHClient.connect so command execution is deterministic and
fast, matching the sibling test and keeping real SSH out of the unit test.
API tests authenticate with force_login, which stores the session in the
cache because SESSION_ENGINE is the cache backend. The session used the
"default" cache, which the whois tests flush with cache.clear() and which
parallel test workers share. A cache.clear() in one place therefore wiped
the authenticated session of a test running concurrently, so a later
request returned 401 instead of the expected status. This produced the
flaky "401 != 400" failures in
test_multiple_vpn_client_templates_same_vpn.

Keep sessions in a dedicated "sessions" cache so clearing the default
cache can no longer drop them. Added a regression test that clears the
default cache between two authenticated requests.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 3, 2026

Review Change Stack

Caution

Review failed

The pull request is closed.

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0381a909-a50f-4be4-b6bd-b2ce4cee7043

📥 Commits

Reviewing files that changed from the base of the PR and between 07c1a59 and 1fcd4ba.

📒 Files selected for processing (8)
  • openwisp_controller/config/base/config.py
  • openwisp_controller/config/tests/test_api.py
  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tasks.py
  • openwisp_controller/connection/tests/pytest.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • tests/openwisp2/settings.py

📝 Walkthrough

Walkthrough

This PR prevents DeviceConnection/Command rows from being resurrected by concurrent save() calls by adding a resurrection-safe helper and using it in connect()/execute(); isolates test session storage to a dedicated Redis cache and updates tests; makes VPN client auto-cert creation reconcile against the instance’s VPN templates; and warms OrganizationGeoSettings cache in geo tests to stabilize query-count assertions.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant AbstractDeviceConnection
  participant ParamikoSSHClient
  participant Database
  Caller->>AbstractDeviceConnection: dc.connect()
  AbstractDeviceConnection->>ParamikoSSHClient: SSHClient.connect(...)
  ParamikoSSHClient-->>AbstractDeviceConnection: success or exception
  alt new instance (_state.adding)
    AbstractDeviceConnection->>Database: save() (INSERT)
  else existing instance
    AbstractDeviceConnection->>Database: save(force_update=True) (UPDATE)
    Database-->>AbstractDeviceConnection: success or DatabaseError
    alt DatabaseError and row missing
      Database-->>AbstractDeviceConnection: row not found
      AbstractDeviceConnection-->>Caller: ignore (no resurrection)
    else DatabaseError and row exists
      AbstractDeviceConnection-->>Caller: re-raise DatabaseError
    end
  end
  AbstractDeviceConnection-->>Caller: return / exception
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

enhancement


Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

  • Ignore

❌ Failed checks (1 error)

Check name Status Explanation Resolution
Bug Fixes ❌ Error VPN template switch fix lacks regression test. Other bugs (connection resurrection, API cache, query count) have proper deterministic tests, but manage_vpn_clients idempotency test is missing. Add test verifying VPN template switch idempotency - ensure manage_vpn_clients reconciles correctly across multiple m2m_changed cycles.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title clearly follows the required format [type] with 'fix' prefix and describes the main changes: fixing command/connection resurrection and cache interference causing CI failures.
Description check ✅ Passed The description is comprehensive and follows the template structure with completed checklist items, issue reference, and detailed explanation of five distinct flaky test fixes with regression tests added.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch flaky-ci

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
openwisp_controller/config/base/config.py (1)

381-385: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid N existence queries while reconciling full VPN template set.

This now scans the full VPN template queryset and runs one .exists() query per template, which is repeated DB I/O in a signal hot path. Prefetch existing template IDs once, then create only missing clients.

Proposed optimization
-            for template in instance.templates.filter(type="vpn"):
-                # Create VPN client if needed
-                if not vpn_client_model.objects.filter(
-                    config=instance, vpn=template.vpn, template=template
-                ).exists():
-                    client = vpn_client_model(
-                        config=instance,
-                        vpn=template.vpn,
-                        template=template,
-                        auto_cert=template.auto_cert,
-                    )
-                    client.full_clean()
-                    client.save()
+            vpn_templates = instance.templates.filter(type="vpn").select_related("vpn")
+            existing_template_ids = set(
+                vpn_client_model.objects.filter(config=instance).values_list(
+                    "template_id", flat=True
+                )
+            )
+            for template in vpn_templates:
+                if template.pk in existing_template_ids:
+                    continue
+                client = vpn_client_model(
+                    config=instance,
+                    vpn=template.vpn,
+                    template=template,
+                    auto_cert=template.auto_cert,
+                )
+                client.full_clean()
+                client.save()

As per coding guidelines, "Flag obvious performance regressions, such as heavy loops, repeated I/O, or unoptimized queries."

Also applies to: 386-393

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openwisp_controller/config/base/config.py` around lines 381 - 385, The loop
over instance.templates.filter(type="vpn") does an .exists() DB query per
template; instead, fetch all VPN templates and prefetch existing
vpn_client_model entries in one query: build template_ids =
list(instance.templates.filter(type="vpn").values_list('id', flat=True')), get
existing_template_ids = set(vpn_client_model.objects.filter(config=instance,
template_id__in=template_ids).values_list('template_id', flat=True')), then
iterate the templates and only create vpn_client_model objects for templates
whose id is not in existing_template_ids; apply the same pattern for the other
block (lines 386-393) to avoid N additional queries.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openwisp_controller/connection/base/models.py`:
- Around line 357-365: The current exists() pre-check is TOCTOU-prone because
Django may fall back to INSERT; replace the conditional self.save() call with a
save(force_update=True) to prevent INSERT fallback, and wrap it in a try/except
that catches Django's NotUpdated exception in addition to DatabaseError (or the
DB-specific exception your Django version raises) to handle the no-row-updated
case gracefully; adjust the block around _state.adding and
type(self)._base_manager.filter(pk=self.pk).exists() so you still skip when
adding, but when not adding call save(force_update=True) and handle
NotUpdated/DatabaseError accordingly.

In `@openwisp_controller/connection/tests/pytest.py`:
- Around line 81-84: The patched async test functions decorated with
`@mock.patch`("paramiko.SSHClient.connect") are missing the injected mock
parameter in their signatures, which causes a positional-args TypeError; update
the signatures of test_multiple_connections_receive_updates_with_redis and
test_new_command_created to accept the injected mock (e.g., add a parameter like
mock_connect as the first argument after self) so the patched mock is received
and the tests run correctly.

---

Outside diff comments:
In `@openwisp_controller/config/base/config.py`:
- Around line 381-385: The loop over instance.templates.filter(type="vpn") does
an .exists() DB query per template; instead, fetch all VPN templates and
prefetch existing vpn_client_model entries in one query: build template_ids =
list(instance.templates.filter(type="vpn").values_list('id', flat=True')), get
existing_template_ids = set(vpn_client_model.objects.filter(config=instance,
template_id__in=template_ids).values_list('template_id', flat=True')), then
iterate the templates and only create vpn_client_model objects for templates
whose id is not in existing_template_ids; apply the same pattern for the other
block (lines 386-393) to avoid N additional queries.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a905a9bb-6dab-47c9-9195-26d23f2378a5

📥 Commits

Reviewing files that changed from the base of the PR and between 07c1a59 and 2cb4ce9.

📒 Files selected for processing (7)
  • openwisp_controller/config/base/config.py
  • openwisp_controller/config/tests/test_api.py
  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • tests/openwisp2/settings.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Python==3.13 | django~=5.1.0
  • GitHub Check: Python==3.10 | django~=4.2.0
  • GitHub Check: Python==3.11 | django~=4.2.0
  • GitHub Check: Python==3.12 | django~=4.2.0
  • GitHub Check: Python==3.12 | django~=5.2.0
  • GitHub Check: Python==3.11 | django~=5.1.0
  • GitHub Check: Python==3.11 | django~=5.2.0
  • GitHub Check: Python==3.12 | django~=5.1.0
  • GitHub Check: Python==3.10 | django~=5.1.0
  • GitHub Check: Python==3.13 | django~=5.2.0
  • GitHub Check: Python==3.10 | django~=5.2.0
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp}

📄 CodeRabbit inference engine (Custom checks)

**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp}: Flag potential security vulnerabilities in code
Avoid unnecessary comments or docstrings for code that is already clear
Code formatting is compact and readable. Do not add excessive blank lines, especially inside function or method bodies
Flag unused or redundant code
Ensure variables, functions, classes, and files have descriptive and consistent names
New code must handle errors properly: log errors that cannot be resolved by the user with error level, log unusual conditions with warning level, log important background actions with info level, and provide user-facing messages for errors that the user can solve autonomously

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/config/tests/test_api.py
  • tests/openwisp2/settings.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • openwisp_controller/config/base/config.py
  • openwisp_controller/connection/tests/pytest.py
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp,sql}

📄 CodeRabbit inference engine (Custom checks)

Flag obvious performance regressions, such as heavy loops, repeated I/O, or unoptimized queries

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/config/tests/test_api.py
  • tests/openwisp2/settings.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • openwisp_controller/config/base/config.py
  • openwisp_controller/connection/tests/pytest.py
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp,sh,bash,sql}

📄 CodeRabbit inference engine (Custom checks)

Cryptic or non-obvious code (regex, complex bash commands, or hard-to-read code) must include a concise comment explaining why it is needed and why the complexity is acceptable

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/config/tests/test_api.py
  • tests/openwisp2/settings.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • openwisp_controller/config/base/config.py
  • openwisp_controller/connection/tests/pytest.py
**/*.{py,html}

📄 CodeRabbit inference engine (Custom checks)

For Django pull requests, ensure all user-facing strings are marked as translatable using the Django i18n framework

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/config/tests/test_api.py
  • tests/openwisp2/settings.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • openwisp_controller/config/base/config.py
  • openwisp_controller/connection/tests/pytest.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Mark user-facing strings for translation with Django i18n helpers in Django code
Avoid unnecessary blank lines inside function and method bodies
Be careful with authentication, authorization, queryset filtering, serializers, admin behavior, cache invalidation, signals, Celery tasks, and websocket updates in Django code
Preserve validation around templates, VPN/PKI material, SSH credentials, device commands, uploaded files, URLs, and subnet/IP data
Write comments and docstrings only when they explain why code is shaped a certain way, placing them before the relevant code block instead of scattering them inside it

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/config/tests/test_api.py
  • tests/openwisp2/settings.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • openwisp_controller/config/base/config.py
  • openwisp_controller/connection/tests/pytest.py
🧠 Learnings (4)
📚 Learning: 2026-01-15T15:05:49.557Z
Learnt from: DragnEmperor
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/config/management/commands/clear_last_ip.py:38-42
Timestamp: 2026-01-15T15:05:49.557Z
Learning: In Django projects, when using select_related() to traverse relations (for example, select_related("organization__config_settings")), the traversed relation must not be deferred. If you also use .only() in the same query, include the relation name or FK field (e.g., "organization" or "organization_id") in the .only() list to avoid the error "Field X cannot be both deferred and traversed using select_related at the same time." Apply this guideline to Django code in openwisp_controller/config/management/commands/clear_last_ip.py and similar modules by ensuring any select_related with an accompanying only() includes the related field names to prevent deferred/traversed conflicts.

Applied to files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/config/tests/test_api.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • openwisp_controller/config/base/config.py
  • openwisp_controller/connection/tests/pytest.py
📚 Learning: 2026-02-17T19:13:10.088Z
Learnt from: nemesifier
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/config/whois/commands.py:0-0
Timestamp: 2026-02-17T19:13:10.088Z
Learning: In reviews for the openwisp/openwisp-controller repository, do not propose changes based on Ruff warnings. The project does not use Ruff as its linter; ignore Ruff-related suggestions and follow the repository’s established linting and configuration rules. This guidance applies to all Python files under the openwisp_controller directory.

Applied to files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/config/tests/test_api.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • openwisp_controller/config/base/config.py
  • openwisp_controller/connection/tests/pytest.py
📚 Learning: 2026-01-15T15:07:17.354Z
Learnt from: DragnEmperor
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/geo/estimated_location/tests/tests.py:172-175
Timestamp: 2026-01-15T15:07:17.354Z
Learning: In this repository, flake8 enforces E501 (line too long) via setup.cfg (max-line-length = 88) while ruff ignores E501 via ruff.toml. Therefore, use '# noqa: E501' on lines that intentionally exceed 88 characters to satisfy flake8 without affecting ruff checks. This applies to Python files across the project (any .py) and is relevant for tests as well. Use sparingly and only where breaking lines is not feasible without hurting readability or functionality.

Applied to files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/config/tests/test_api.py
  • tests/openwisp2/settings.py
  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/geo/estimated_location/tests/tests.py
  • openwisp_controller/config/base/config.py
  • openwisp_controller/connection/tests/pytest.py
📚 Learning: 2026-03-27T20:50:26.240Z
Learnt from: nemesifier
Repo: openwisp/openwisp-controller PR: 1315
File: openwisp_controller/geo/estimated_location/service.py:70-76
Timestamp: 2026-03-27T20:50:26.240Z
Learning: In openwisp-controller’s WHOIS and estimated-location services (openwisp_controller/config/whois/ and openwisp_controller/geo/estimated_location/), these components only process public IP addresses. When reviewing logs/error/debug messages in this area, treat logging the IP address as acceptable and do not flag it as a privacy/security concern—unless the logged value can originate from non-public/private IPs in that specific code path.

Applied to files:

  • openwisp_controller/geo/estimated_location/tests/tests.py
🔇 Additional comments (4)
tests/openwisp2/settings.py (1)

169-180: LGTM!

Also applies to: 191-191

openwisp_controller/config/tests/test_api.py (1)

1345-1355: LGTM!

openwisp_controller/connection/tests/test_tasks.py (1)

185-196: LGTM!

openwisp_controller/geo/estimated_location/tests/tests.py (1)

243-251: LGTM!

Also applies to: 459-460, 484-485, 509-510

Comment thread openwisp_controller/connection/base/models.py Outdated
Comment thread openwisp_controller/connection/tests/pytest.py
…ce_update

The previous exists() pre-check was time-of-check-to-time-of-use racy:
between the check and save() the row could still be deleted, and save()
would then fall back to an INSERT and resurrect the connection. Use
save(force_update=True) so Django performs an UPDATE only and never an
INSERT, and ignore the DatabaseError raised when there is no row to update.
New connections (self._state.adding) still use a normal save().
@mock.patch injects the mock as a positional argument, which was binding
to admin_user instead of being captured explicitly. It was harmless only
because admin_user was unused, but it left the wrong value in that
parameter. Add an explicit mocked_connect parameter so the fixtures
admin_user, admin_client and settings receive their real values.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openwisp_controller/connection/base/models.py`:
- Around line 365-368: In AbstractDeviceConnection.connect(), don’t swallow all
DatabaseError around the self.save(force_update=True) call; instead catch
DatabaseError as e, detect the specific deleted/missing-row case by checking
self.__class__.objects.filter(pk=self.pk).exists(), handle that case as before
(graceful return) but for any other DatabaseError log at error level (use module
logger = logging.getLogger(__name__)) with context (e and object id) and
re-raise the exception so real DB failures surface; ensure logging import is
present and only the missing-row branch is treated as non-fatal.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 254ef9e3-2826-4d16-8c3b-0c2aded29415

📥 Commits

Reviewing files that changed from the base of the PR and between 2cb4ce9 and 8d9cfa0.

📒 Files selected for processing (2)
  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Python==3.13 | django~=5.1.0
  • GitHub Check: Python==3.13 | django~=5.2.0
  • GitHub Check: Python==3.12 | django~=5.2.0
  • GitHub Check: Python==3.11 | django~=5.2.0
  • GitHub Check: Python==3.12 | django~=4.2.0
  • GitHub Check: Python==3.12 | django~=5.1.0
  • GitHub Check: Python==3.10 | django~=5.1.0
  • GitHub Check: Python==3.10 | django~=4.2.0
  • GitHub Check: Python==3.11 | django~=4.2.0
  • GitHub Check: Python==3.10 | django~=5.2.0
  • GitHub Check: Python==3.11 | django~=5.1.0
  • GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp}

📄 CodeRabbit inference engine (Custom checks)

**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp}: Flag potential security vulnerabilities in code
Avoid unnecessary comments or docstrings for code that is already clear
Code formatting is compact and readable. Do not add excessive blank lines, especially inside function or method bodies
Flag unused or redundant code
Ensure variables, functions, classes, and files have descriptive and consistent names
New code must handle errors properly: log errors that cannot be resolved by the user with error level, log unusual conditions with warning level, log important background actions with info level, and provide user-facing messages for errors that the user can solve autonomously

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp,sql}

📄 CodeRabbit inference engine (Custom checks)

Flag obvious performance regressions, such as heavy loops, repeated I/O, or unoptimized queries

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp,sh,bash,sql}

📄 CodeRabbit inference engine (Custom checks)

Cryptic or non-obvious code (regex, complex bash commands, or hard-to-read code) must include a concise comment explaining why it is needed and why the complexity is acceptable

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
**/*.{py,html}

📄 CodeRabbit inference engine (Custom checks)

For Django pull requests, ensure all user-facing strings are marked as translatable using the Django i18n framework

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Mark user-facing strings for translation with Django i18n helpers in Django code
Avoid unnecessary blank lines inside function and method bodies
Be careful with authentication, authorization, queryset filtering, serializers, admin behavior, cache invalidation, signals, Celery tasks, and websocket updates in Django code
Preserve validation around templates, VPN/PKI material, SSH credentials, device commands, uploaded files, URLs, and subnet/IP data
Write comments and docstrings only when they explain why code is shaped a certain way, placing them before the relevant code block instead of scattering them inside it

Files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
🧠 Learnings (3)
📚 Learning: 2026-01-15T15:05:49.557Z
Learnt from: DragnEmperor
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/config/management/commands/clear_last_ip.py:38-42
Timestamp: 2026-01-15T15:05:49.557Z
Learning: In Django projects, when using select_related() to traverse relations (for example, select_related("organization__config_settings")), the traversed relation must not be deferred. If you also use .only() in the same query, include the relation name or FK field (e.g., "organization" or "organization_id") in the .only() list to avoid the error "Field X cannot be both deferred and traversed using select_related at the same time." Apply this guideline to Django code in openwisp_controller/config/management/commands/clear_last_ip.py and similar modules by ensuring any select_related with an accompanying only() includes the related field names to prevent deferred/traversed conflicts.

Applied to files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
📚 Learning: 2026-02-17T19:13:10.088Z
Learnt from: nemesifier
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/config/whois/commands.py:0-0
Timestamp: 2026-02-17T19:13:10.088Z
Learning: In reviews for the openwisp/openwisp-controller repository, do not propose changes based on Ruff warnings. The project does not use Ruff as its linter; ignore Ruff-related suggestions and follow the repository’s established linting and configuration rules. This guidance applies to all Python files under the openwisp_controller directory.

Applied to files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
📚 Learning: 2026-01-15T15:07:17.354Z
Learnt from: DragnEmperor
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/geo/estimated_location/tests/tests.py:172-175
Timestamp: 2026-01-15T15:07:17.354Z
Learning: In this repository, flake8 enforces E501 (line too long) via setup.cfg (max-line-length = 88) while ruff ignores E501 via ruff.toml. Therefore, use '# noqa: E501' on lines that intentionally exceed 88 characters to satisfy flake8 without affecting ruff checks. This applies to Python files across the project (any .py) and is relevant for tests as well. Use sparingly and only where breaking lines is not feasible without hurting readability or functionality.

Applied to files:

  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/pytest.py
🔇 Additional comments (1)
openwisp_controller/connection/tests/pytest.py (1)

71-72: LGTM!

Also applies to: 81-83

Comment thread openwisp_controller/connection/base/models.py Outdated
The previous "except DatabaseError: pass" swallowed every database error,
hiding real persistence failures while connect() still returned based on
the in-memory state. Only the deleted-row race should be ignored: re-raise
(after logging) when the connection row still exists, and ignore the error
only when the row is gone. Added a regression test asserting that a genuine
database error is propagated.
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 4, 2026
Removed an em dash from a comment in manage_vpn_clients. Moved a cache
import to the module top and dropped a blank line inside a test body, per
the project style. Corrected the cache-eviction explanation in the
estimated-location test helper: the shared cache is cleared by other
tests (e.g. whois), it is not LocMemCache culling.
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 4, 2026
The earlier connect() fix only covered DeviceConnection resurrection. A
background launch_command also resurrects a deleted Command through the
final save() in execute(), and that FOREIGN KEY IntegrityError fires inside
the selenium suite (in the live-server thread), where on SQLite it can leave
the shared test database in a broken or locked state and make the live
server hang, producing the flaky selenium timeouts that the retry mechanism
cannot recover from.

Extract the resurrection-safe save into a shared helper and use it in both
connect() and execute(), so neither path can fall back to a resurrecting
INSERT. Added a regression test for the command path.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
openwisp_controller/connection/base/models.py (1)

549-576: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Deleted commands can still be sent to the device.

launch_command() fetches the Command and then calls execute(). If that row is deleted after the fetch but before this method starts, _exec_command() still runs and can perform the remote action; the new helper only prevents the trailing save() from recreating the row. Please re-check that the command still exists before _exec_command() and bail out when it has already been deleted.

As per coding guidelines, "Be careful with authentication, authorization, queryset filtering, serializers, admin behavior, cache invalidation, signals, Celery tasks, and websocket updates in Django code".


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c258e49c-9962-4a89-880d-0c1c45750428

📥 Commits

Reviewing files that changed from the base of the PR and between e548648 and 48403b2.

📒 Files selected for processing (2)
  • openwisp_controller/connection/base/models.py
  • openwisp_controller/connection/tests/test_tasks.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Python==3.11 | django~=5.2.0
  • GitHub Check: Python==3.12 | django~=4.2.0
  • GitHub Check: Python==3.12 | django~=5.1.0
  • GitHub Check: Python==3.10 | django~=5.1.0
  • GitHub Check: Python==3.11 | django~=5.1.0
  • GitHub Check: Python==3.13 | django~=5.2.0
  • GitHub Check: Python==3.13 | django~=5.1.0
  • GitHub Check: Python==3.10 | django~=4.2.0
  • GitHub Check: Python==3.12 | django~=5.2.0
  • GitHub Check: Python==3.10 | django~=5.2.0
  • GitHub Check: Python==3.11 | django~=4.2.0
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp}

📄 CodeRabbit inference engine (Custom checks)

**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp}: Flag potential security vulnerabilities in code
Avoid unnecessary comments or docstrings for code that is already clear
Code formatting is compact and readable. Do not add excessive blank lines, especially inside function or method bodies
Flag unused or redundant code
Ensure variables, functions, classes, and files have descriptive and consistent names
New code must handle errors properly: log errors that cannot be resolved by the user with error level, log unusual conditions with warning level, log important background actions with info level, and provide user-facing messages for errors that the user can solve autonomously

Files:

  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/connection/base/models.py
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp,sql}

📄 CodeRabbit inference engine (Custom checks)

Flag obvious performance regressions, such as heavy loops, repeated I/O, or unoptimized queries

Files:

  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/connection/base/models.py
**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp,sh,bash,sql}

📄 CodeRabbit inference engine (Custom checks)

Cryptic or non-obvious code (regex, complex bash commands, or hard-to-read code) must include a concise comment explaining why it is needed and why the complexity is acceptable

Files:

  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/connection/base/models.py
**/*.{py,html}

📄 CodeRabbit inference engine (Custom checks)

For Django pull requests, ensure all user-facing strings are marked as translatable using the Django i18n framework

Files:

  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/connection/base/models.py
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Mark user-facing strings for translation with Django i18n helpers in Django code
Avoid unnecessary blank lines inside function and method bodies
Be careful with authentication, authorization, queryset filtering, serializers, admin behavior, cache invalidation, signals, Celery tasks, and websocket updates in Django code
Preserve validation around templates, VPN/PKI material, SSH credentials, device commands, uploaded files, URLs, and subnet/IP data
Write comments and docstrings only when they explain why code is shaped a certain way, placing them before the relevant code block instead of scattering them inside it

Files:

  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/connection/base/models.py
🧠 Learnings (3)
📚 Learning: 2026-01-15T15:05:49.557Z
Learnt from: DragnEmperor
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/config/management/commands/clear_last_ip.py:38-42
Timestamp: 2026-01-15T15:05:49.557Z
Learning: In Django projects, when using select_related() to traverse relations (for example, select_related("organization__config_settings")), the traversed relation must not be deferred. If you also use .only() in the same query, include the relation name or FK field (e.g., "organization" or "organization_id") in the .only() list to avoid the error "Field X cannot be both deferred and traversed using select_related at the same time." Apply this guideline to Django code in openwisp_controller/config/management/commands/clear_last_ip.py and similar modules by ensuring any select_related with an accompanying only() includes the related field names to prevent deferred/traversed conflicts.

Applied to files:

  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/connection/base/models.py
📚 Learning: 2026-02-17T19:13:10.088Z
Learnt from: nemesifier
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/config/whois/commands.py:0-0
Timestamp: 2026-02-17T19:13:10.088Z
Learning: In reviews for the openwisp/openwisp-controller repository, do not propose changes based on Ruff warnings. The project does not use Ruff as its linter; ignore Ruff-related suggestions and follow the repository’s established linting and configuration rules. This guidance applies to all Python files under the openwisp_controller directory.

Applied to files:

  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/connection/base/models.py
📚 Learning: 2026-01-15T15:07:17.354Z
Learnt from: DragnEmperor
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/geo/estimated_location/tests/tests.py:172-175
Timestamp: 2026-01-15T15:07:17.354Z
Learning: In this repository, flake8 enforces E501 (line too long) via setup.cfg (max-line-length = 88) while ruff ignores E501 via ruff.toml. Therefore, use '# noqa: E501' on lines that intentionally exceed 88 characters to satisfy flake8 without affecting ruff checks. This applies to Python files across the project (any .py) and is relevant for tests as well. Use sparingly and only where breaking lines is not feasible without hurting readability or functionality.

Applied to files:

  • openwisp_controller/connection/tests/test_tasks.py
  • openwisp_controller/connection/base/models.py

launch_command fetches the command and calls execute(); if the command was
deleted after being scheduled (e.g. by a user or a cascading delete) while
the background task was queued, _exec_command() still performed the remote
action on the device. Bail out of execute() when the command no longer
exists, before performing the remote action. The resurrection-safe save
remains as defense in depth for a deletion that races the execution.
…mand

launch_command's three exception handlers persisted the failed status with a
plain command.save(). If the command was deleted while execute() was running
(a long SSH run, a soft time limit, or a command timeout) and execute() then
raised, that save fell back to an INSERT and resurrected the deleted command:
the same FOREIGN KEY error that can corrupt the live-server DB on SQLite.
Route those saves through the _save_without_resurrecting helper (imported
lazily to avoid the models and tasks circular import) so the handlers update
the row only when it still exists. Added a regression test.
Added a _save_without_resurrecting method on DeviceConnection and Command
that delegates to the module-level helper, so launch_command can call it
without importing it (the lazy import is gone). Also used
instance._meta.model.objects instead of type(instance)._base_manager for the
existence check (more readable and consistent, and these models have no
custom manager), and converted the _warm_org_geo_settings_cache comment to a
docstring.
pandafy
pandafy previously approved these changes Jun 4, 2026
@github-project-automation github-project-automation Bot moved this from In progress to Reviewer approved in OpenWISP Priorities for next releases Jun 4, 2026
@nemesifier nemesifier changed the title [fix] Fixed several flaky tests causing intermittent CI failures [fix] Fixed command/connection resurrection and cache interference in CI Jun 4, 2026
@nemesifier
Copy link
Copy Markdown
Member Author

nemesifier commented Jun 4, 2026

@coderabbitai full review, please resolve if no new issues are found

@nemesifier nemesifier moved this from Backlog to Ready for review/testing in 26.06 Release Jun 4, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 4, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read all changed files

echo "=== config/base/config.py (around manage_vpn_clients) ==="
grep -n "manage_vpn_clients\|post_add\|pk_set\|vpn\|templates" openwisp_controller/config/base/config.py | head -80

echo ""
echo "=== Full manage_vpn_clients function ==="
grep -n "" openwisp_controller/config/base/config.py | grep -A 80 "def manage_vpn_clients"

echo ""
echo "=== connection/base/models.py - new helper and changes ==="
grep -n "" openwisp_controller/connection/base/models.py | grep -B5 -A 50 "_save_without_resurrecting\|DatabaseError"

echo ""
echo "=== connection/tests/test_tasks.py - new tests ==="
grep -n "" openwisp_controller/connection/tests/test_tasks.py | tail -80

echo ""
echo "=== connection/tests/pytest.py ==="
cat openwisp_controller/connection/tests/pytest.py

echo ""
echo "=== config/tests/test_api.py - new test ==="
grep -n "test_session_survives\|cache\|SESSION" openwisp_controller/config/tests/test_api.py | head -30

echo ""
echo "=== geo/estimated_location/tests/tests.py ==="
grep -n "_warm_org_geo_settings_cache\|assertNumQueries\|check_estimated_location_enabled" openwisp_controller/geo/estimated_location/tests/tests.py | head -30

echo ""
echo "=== tests/openwisp2/settings.py - sessions cache ==="
grep -n "CACHES\|SESSION\|sessions\|redis" tests/openwisp2/settings.py

Length of output: 33499


🏁 Script executed:

#!/bin/bash
# Read the full relevant sections of each changed file

echo "=== manage_vpn_clients full function ==="
python3 - <<'EOF'
import re

with open("openwisp_controller/config/base/config.py") as f:
    content = f.read()

# Find manage_vpn_clients
idx = content.find("def manage_vpn_clients")
if idx >= 0:
    # Print 100 lines from there
    lines = content[idx:idx+4000]
    print(lines)
EOF

echo ""
echo "=== connection/base/models.py full _save_without_resurrecting ==="
python3 - <<'EOF'
with open("openwisp_controller/connection/base/models.py") as f:
    content = f.read()

idx = content.find("_save_without_resurrecting")
if idx >= 0:
    # Find the function start
    start = content.rfind("\ndef ", 0, idx)
    if start < 0:
        start = idx
    print(content[start:start+2000])
EOF

echo ""
echo "=== connection/base/models.py connect() ==="
python3 - <<'EOF'
with open("openwisp_controller/connection/base/models.py") as f:
    content = f.read()

idx = content.find("def connect(")
if idx >= 0:
    print(content[idx:idx+2000])
EOF

echo ""
echo "=== connection/base/models.py execute() ==="
python3 - <<'EOF'
with open("openwisp_controller/connection/base/models.py") as f:
    content = f.read()

idx = content.find("def execute(")
if idx >= 0:
    print(content[idx:idx+2000])
EOF

Length of output: 11556


🏁 Script executed:

#!/bin/bash
echo "=== Full test_tasks.py new regression tests ==="
grep -n "" openwisp_controller/connection/tests/test_tasks.py | grep -A 200 "resurrect"

echo ""
echo "=== Full settings.py ==="
cat tests/openwisp2/settings.py | grep -A 50 "CACHES"

echo ""
echo "=== config/tests/test_api.py new test ==="
grep -n "" openwisp_controller/config/tests/test_api.py | grep -A 20 "test_session_survives"

echo ""
echo "=== geo tests warmup helper ==="
grep -n "" openwisp_controller/geo/estimated_location/tests/tests.py | grep -B2 -A 30 "_warm_org_geo_settings_cache"

Length of output: 12712



@nemesifier: I've completed a full review of all changed files. The implementation is solid across the board — here's a brief summary of what I checked:

config/base/config.pymanage_vpn_clients
The switch from iterating over the signal pk_set delta to instance.templates.filter(type="vpn") (full current set) is correct and idempotent, guarded by the if not vpn_client_model.objects.filter(...).exists() check. ✅

connection/base/models.py_save_without_resurrecting
The module-level helper with force_update=True, DatabaseError catch, and existence re-check correctly distinguishes a deleted-row scenario (suppress) from a genuine DB failure (log and re-raise). The two instance-method delegates (AbstractDeviceConnection and AbstractCommand) cleanly dispatch to it. ✅

One very minor observation: the existence check uses instance._meta.model.objects.filter(pk=instance.pk).exists() rather than _base_manager. If a model ever gains a custom filtering manager (e.g., soft-delete), a genuine write failure on a still-present row could be silently swallowed. For current models this is not a concern, but _base_manager would be marginally safer.

connection/tests/pytest.py@mock.patch("paramiko.SSHClient.connect") correctly added to both consumer tests, and argument order is right (outermost decorator → first parameter). ✅

connection/tests/test_tasks.py — All four regression tests (test_connect_does_not_resurrect_deleted_connection, test_connect_reraises_genuine_db_error, test_execute_skips_deleted_command, test_launch_command_handler_does_not_resurrect_deleted_command) clearly exercise the intended scenarios. ✅

config/tests/test_api.pytest_session_survives_default_cache_clear is clean and well-documented. ✅

geo/estimated_location/tests/tests.py_warm_org_geo_settings_cache is called correctly before each sensitive assertNumQueries subtest. ✅

tests/openwisp2/settings.py — Sessions on Redis DB 8 (separate from default on DB 6), SESSION_CACHE_ALIAS = "sessions" — clean isolation. ✅

No new issues found. Resolving all open comments.

[resolve]

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 4, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the PR diff and changed files
gh pr view 1392 --repo openwisp/openwisp-controller --json files,commits,body | jq '.files[].path'

Length of output: 564


🏁 Script executed:

#!/bin/bash
# Read changed files
cat openwisp_controller/config/base/config.py | grep -n "manage_vpn_clients" -A 50 | head -100

Length of output: 2669


🏁 Script executed:

#!/bin/bash
# Read the connection models file
cat openwisp_controller/connection/base/models.py | grep -n "_save_without_resurrecting\|DatabaseError\|def connect\|def execute" -A 30 | head -200

Length of output: 7481


🏁 Script executed:

#!/bin/bash
# Read test_tasks.py for new tests
cat openwisp_controller/connection/tests/test_tasks.py | grep -n "resurrect\|does_not_resurrect\|reraises" -A 30 | head -150

Length of output: 3847


🏁 Script executed:

#!/bin/bash
# Read settings.py for sessions cache changes
cat tests/openwisp2/settings.py

Length of output: 11008


🏁 Script executed:

#!/bin/bash
# Read the pytest.py changes
cat openwisp_controller/connection/tests/pytest.py | grep -n "paramiko\|mocked_connect\|test_new_command\|test_multiple_connections" -A 15 | head -100

Length of output: 2346


🏁 Script executed:

#!/bin/bash
# Read geo tests
cat openwisp_controller/geo/estimated_location/tests/tests.py | grep -n "_warm_org_geo_settings_cache\|test_estimated_location_creation_and_update" -A 20 | head -100

Length of output: 5171


🏁 Script executed:

#!/bin/bash
# Read api test
cat openwisp_controller/config/tests/test_api.py | grep -n "test_session_survives\|cache.clear\|import cache" -A 15 | head -60

Length of output: 2252


🏁 Script executed:

#!/bin/bash
# Read tasks.py - especially the launch_command function
cat openwisp_controller/connection/tasks.py

Length of output: 3851


🏁 Script executed:

#!/bin/bash
# Check the full _save_without_resurrecting implementation and nearby context
cat openwisp_controller/connection/base/models.py | sed -n '38,60p'

Length of output: 938


🏁 Script executed:

#!/bin/bash
# Check the full execute() method
cat openwisp_controller/connection/base/models.py | grep -n "def execute" -A 60 | head -70

Length of output: 2790


🏁 Script executed:

#!/bin/bash
# Check the manage_vpn_clients post_add section more fully
cat openwisp_controller/config/base/config.py | sed -n '360,400p'

Length of output: 2107



@nemesifier I've conducted a thorough review of all 8 changed files. The fixes are well-targeted and the regression tests are comprehensive. Here's a summary:

openwisp_controller/connection/base/models.py_save_without_resurrecting

  • Using force_update=True is the right mechanism to prevent the save()UPDATE → 0 rows affected → INSERT fallback.
  • The DatabaseError catch correctly distinguishes a genuinely missing row from a real DB failure: if the row still exists, it logs and re-raises; if the row is gone, it silently skips.
  • One very minor nit: logger.error(...) on line 51 does not include exc_info=True, so a genuine database error's traceback won't appear in the logs. Worth adding for debuggability, but not blocking.

openwisp_controller/connection/tasks.pylaunch_command

  • Exception handlers for SoftTimeLimitExceeded, CommandTimeoutException, and the catch-all Exception now correctly use command._save_without_resurrecting() — consistent with the rest of the fix and properly prevents resurrection from the outer layer.

openwisp_controller/config/base/config.pymanage_vpn_clients

  • Iterating over the full current VPN template set for post_add (instead of just the delta pk_set) is idempotent thanks to the if not vpn_client_model.objects.filter(config=instance, vpn=template.vpn, template=template).exists() guard. Correctly handles mis-ordered m2m_changed cycles.

tests/openwisp2/settings.py

  • DB 6 (default), DB 7 (channels), DB 8 (sessions) — no namespace conflicts. The isolation cleanly prevents session loss from cache.clear() calls in other tests.

All regression tests — well-structured, covering both the resurrection-prevention path and the genuine-error re-raise path (test_connect_reraises_genuine_db_error).

No new issues found. Resolving all comments.

✅ Action performed

Full review finished.

@nemesifier nemesifier merged commit 55c3777 into master Jun 4, 2026
17 of 18 checks passed
@nemesifier nemesifier deleted the flaky-ci branch June 4, 2026 17:06
@github-project-automation github-project-automation Bot moved this from Reviewer approved to Done in OpenWISP Priorities for next releases Jun 4, 2026
@github-project-automation github-project-automation Bot moved this from Ready for review/testing to Done in 26.06 Release Jun 4, 2026
@nemesifier
Copy link
Copy Markdown
Member Author

/backport 1.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants