Refactor repositories to support caller-owned AsyncSession flows #750

@phernandez

Summary

Basic Memory's repositories still open and manage their own SQLAlchemy sessions in most high-level methods. That makes it awkward to orchestrate one logical write transaction across multiple repositories.

This is showing up clearly in Basic Memory Cloud's accepted-write path for note materialization, where the service needs one shared tenant DB transaction to:

  • resolve Project
  • load or create Entity
  • load or upsert NoteContent
  • flush related ORM changes together
  • commit once

Because the current repository methods open their own sessions, cloud has to keep several transaction-scoped query helpers local in the service layer instead of reusing the repository classes.
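As a rough illustration of the flow listed above, here is a minimal sketch of a service function that runs every step on one caller-owned session and commits once. `FakeSession` is a hypothetical in-memory stand-in, not SQLAlchemy's `AsyncSession`, and `materialize_note`, the key layout, and the row shapes are assumptions made only for this example.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Optional

Key = tuple[str, str]  # (table name, natural key); layout is an assumption


@dataclass
class FakeSession:
    """Hypothetical in-memory stand-in for an AsyncSession (not SQLAlchemy)."""

    store: dict[Key, Any]  # committed rows
    pending: dict[Key, Any] = field(default_factory=dict)
    commits: int = 0

    def add(self, key: Key, row: Any) -> None:
        self.pending[key] = row

    async def get(self, key: Key) -> Optional[Any]:
        # Uncommitted changes are visible inside the same session.
        if key in self.pending:
            return self.pending[key]
        return self.store.get(key)

    async def flush(self) -> None:
        # A real session would emit SQL here; the stand-in has nothing to do.
        return None

    async def commit(self) -> None:
        self.store.update(self.pending)
        self.pending.clear()
        self.commits += 1

    async def rollback(self) -> None:
        self.pending.clear()


async def materialize_note(
    session: FakeSession, project_id: str, file_path: str, markdown: str
) -> None:
    """Run every step of the accepted write on one caller-owned session."""
    try:
        # 1. resolve Project
        project = await session.get(("project", project_id))
        if project is None:
            raise LookupError(f"unknown project: {project_id}")
        # 2. load or create Entity
        entity_key = ("entity", file_path)
        if await session.get(entity_key) is None:
            session.add(entity_key, {"project": project_id, "file_path": file_path})
        # 3. load or upsert NoteContent
        session.add(("note_content", file_path), {"markdown": markdown})
        # 4. flush related changes together, then 5. commit once
        await session.flush()
        await session.commit()
    except Exception:
        await session.rollback()
        raise


if __name__ == "__main__":
    rows: dict[Key, Any] = {("project", "p1"): {"name": "demo"}}
    asyncio.run(materialize_note(FakeSession(rows), "p1", "notes/a.md", "# A"))
    print(sorted(rows))
```

The point of the sketch is only the shape: the function receives the session, and nothing inside it opens another one.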

Current pain

Examples:

  • ProjectRepository.get_by_external_id() opens its own session
  • EntityRepository.get_by_external_id() opens its own session
  • EntityRepository.get_by_file_path() opens its own session
  • NoteContentRepository.get_by_entity_id() opens its own session
  • NoteContentRepository.upsert() opens its own session

That makes it hard to compose these operations safely inside an existing caller-owned transaction.
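One way the problem can surface is a partial write. The sketch below is a simplified model, not the actual repository code: `OwnSession` is a hypothetical stand-in for a repository-owned session, `EntityRepository.create` is an invented method, and the empty-content failure is simulated. Each repository commits on its own, so a failure in the second step leaves the first step already persisted.

```python
import asyncio


class OwnSession:
    """Hypothetical stand-in for a repository-owned session over a shared dict."""

    def __init__(self, store: dict) -> None:
        self._store = store
        self._pending: dict = {}

    def add(self, key: str, row: dict) -> None:
        self._pending[key] = row

    async def commit(self) -> None:
        self._store.update(self._pending)
        self._pending.clear()


class EntityRepository:
    def __init__(self, store: dict) -> None:
        self._store = store

    async def create(self, file_path: str) -> None:
        # The repository opens its own session and commits on its own.
        session = OwnSession(self._store)
        session.add(f"entity:{file_path}", {"file_path": file_path})
        await session.commit()


class NoteContentRepository:
    def __init__(self, store: dict) -> None:
        self._store = store

    async def upsert(self, file_path: str, markdown: str) -> None:
        if not markdown:
            raise ValueError("empty note content")  # simulated failure
        session = OwnSession(self._store)
        session.add(f"note_content:{file_path}", {"markdown": markdown})
        await session.commit()


async def write_note(store: dict, file_path: str, markdown: str) -> None:
    # Two independent commits: the caller cannot roll back the first one.
    await EntityRepository(store).create(file_path)
    await NoteContentRepository(store).upsert(file_path, markdown)


if __name__ == "__main__":
    rows: dict = {}
    try:
        asyncio.run(write_note(rows, "notes/a.md", ""))
    except ValueError:
        pass
    print(sorted(rows))  # the entity row is present without its note content
```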

Proposed direction

Add transaction-friendly repository methods that accept an existing AsyncSession.

For example:

  • ProjectRepository.get_by_external_id(session, external_id)
  • EntityRepository.get_by_external_id(session, external_id, *, load_relations=True)
  • EntityRepository.get_by_file_path(session, file_path, *, load_relations=True)
  • NoteContentRepository.get_by_entity_id(session, entity_id)
  • NoteContentRepository.upsert(..., session=...) or a dedicated upsert_accepted_state(session, ...)

The service layer should own transaction boundaries. Repositories should provide query/persistence helpers in two forms:

  1. methods that work with a caller-owned session and never open or commit one themselves, and
  2. convenience wrappers that open a session when the caller does not need to control the transaction.
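A minimal sketch of how the two forms could sit side by side on one repository. `StubSession` is a hypothetical stand-in for `AsyncSession`, and `find_by_external_id`, `fetch_project`, and `_scoped_session` are names invented for this example; only `get_by_external_id(session, external_id)` comes from the proposal above.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Optional


class StubSession:
    """Hypothetical stand-in for AsyncSession: reads rows from a dict."""

    def __init__(self, projects: dict[str, dict]) -> None:
        self._projects = projects

    async def fetch_project(self, external_id: str) -> Optional[dict]:
        return self._projects.get(external_id)


class ProjectRepository:
    def __init__(self, projects: dict[str, dict]) -> None:
        self._projects = projects

    @asynccontextmanager
    async def _scoped_session(self) -> AsyncIterator[StubSession]:
        # Real code would open and close an AsyncSession here.
        yield StubSession(self._projects)

    async def get_by_external_id(
        self, session: StubSession, external_id: str
    ) -> Optional[dict]:
        """Caller-owned form: uses the given session, never opens or commits one."""
        return await session.fetch_project(external_id)

    async def find_by_external_id(self, external_id: str) -> Optional[dict]:
        """Convenience form (hypothetical name): opens a session, then delegates."""
        async with self._scoped_session() as session:
            return await self.get_by_external_id(session, external_id)


if __name__ == "__main__":
    repo = ProjectRepository({"ext-1": {"name": "demo"}})
    print(asyncio.run(repo.find_by_external_id("ext-1")))
```

Having the wrapper delegate to the session-taking method keeps the query logic in one place, so the two forms cannot drift apart.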

Why this matters

This lets orchestration-heavy flows stay simple and correct:

  • accepted note writes in cloud
  • directory deletes / other multi-step content mutations
  • any future flow that needs multiple repositories to participate in one commit

It also reduces duplicated query helpers in cloud code.

Scope

Start small. We do not need to redesign every repository at once.

Suggested first pass:

  • ProjectRepository.get_by_external_id(session, ...)
  • EntityRepository.get_by_external_id(session, ...)
  • EntityRepository.get_by_file_path(session, ...)
  • NoteContentRepository.get_by_entity_id(session, ...)
  • one note-content upsert/update method that can participate in an existing session

Then update callers to use the shared-session variants where they are already orchestrating a transaction.

Out of scope

  • changing user-facing API behavior
  • changing note materialization semantics
  • broad repository cleanup unrelated to transaction ownership

Acceptance criteria

  • caller-owned transaction flows can use repository methods without the repository opening a second, independent session
  • existing convenience behavior can remain for simple call sites if useful
  • cloud can delete its local transaction-scoped lookup/upsert helpers once the new repository API is available

Metadata

Labels

cloud (Basic Memory Cloud), enhancement (New feature or request)
