[DNM] Add OpenSearch deployment #2013

Draft
jrbourbeau wants to merge 2 commits into rapidsai:main from jrbourbeau:deploy-opensearch

@jrbourbeau
Member

This PR adds a docker-compose setup for deploying OpenSearch and its new remote index build service. It's also set up to run cuvs-bench tests against the OpenSearch backend being added over in #2012.

Marking as DNM as I don't know where we want this to actually live (and I'd want to clean things up a bit). I could see this being incorporated into a CI build that runs cuvs-bench at some point. Mostly wanted to push this up for visibility and to aid in reviewing #2012

cc @cjnolet @singhmanas1 @janakivamaraju @afourniernv

Signed-off-by: James Bourbeau <jbourbeau@nvidia.com>

copy-pr-bot Bot commented Apr 10, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@aamijar aamijar added non-breaking Introduces a non-breaking change improvement Improves an existing functionality labels Apr 21, 2026
Signed-off-by: James Bourbeau <jbourbeau@nvidia.com>
@cjnolet cjnolet moved this to In Progress in Unstructured Data Processing May 13, 2026
Comment thread deploy/bench/Dockerfile
# nvcc, and cuvs_bench declares a hard dep on the `cuvs` CUDA package).
# The opensearch backend is pure Python and needs neither, so we add the
# package directly to PYTHONPATH and install only the actual runtime deps.
RUN git clone --depth=1 --filter=blob:none --sparse --branch cuvs-bench-opensearch https://github.com/jrbourbeau/cuvs.git /opt/cuvs \
Member

Is this just set to your fork while we don't yet have the OpenSearch backend for cuvs-bench merged?

Member Author

Yeah -- I needed to point to a branch that had the opensearch cuvs-bench backend present

python -m cuvs_bench.run --data-export \
--dataset "$DATASET" \
--dataset-path /data/datasets \
--algorithms opensearch_faiss_hnsw \
Member

This script seems backend-specific. Should we maybe have a top-level directory for OpenSearch (e.g., deploy/opensearch/bench)? Or do you foresee many backends using the same runner scripts and whatnot? For example, could we just make opensearch_faiss_hnsw an additional argument to keep this script agnostic of the backend?
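
A minimal sketch of that last option, under stated assumptions: the function name `bench_cmd`, its positional arguments, and the default value are hypothetical, and only the four flags visible in the hunk above are reproduced (the original command continues past `--algorithms`, and those trailing flags are not shown here). The function prints the command instead of running it, so the algorithm choice can be inspected before anything executes.

```shell
#!/bin/sh
# Sketch only: bench_cmd and its arguments are illustrative assumptions,
# not part of the script under review.

bench_cmd() {
  # $1: dataset name (required)
  # $2: algorithm name; falls back to the value currently hard-coded in the script
  dataset="${1:?dataset name required}"
  algorithms="${2:-opensearch_faiss_hnsw}"

  # Print the command so the caller can review it before running it.
  printf 'python -m cuvs_bench.run --data-export --dataset %s --dataset-path /data/datasets --algorithms %s\n' \
    "$dataset" "$algorithms"
}
```

With something like this, the script would keep its current behavior when no algorithm is passed, and another backend could reuse it by supplying a different second argument.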

Comment thread deploy/README.md
@@ -0,0 +1,216 @@
# OpenSearch kNN Benchmark
Member

This is great, but I think more importantly we'll want this info in the docs (and we should probably point to the docs for this, just to eliminate duplication). I know there are a few areas in the codebase where we could say this same thing, but we probably don't want to keep introducing more as we work to fix the others.

Member

If we need to, we can wait until we get the Fern docs merged (should be soon) and then add this to the cuvs-bench/backends docs in the new docs.
