[DNM] Add OpenSearch deployment #2013
Signed-off-by: James Bourbeau <jbourbeau@nvidia.com>
# nvcc, and cuvs_bench declares a hard dep on the `cuvs` CUDA package).
# The opensearch backend is pure Python and needs neither, so we add the
# package directly to PYTHONPATH and install only the actual runtime deps.
RUN git clone --depth=1 --filter=blob:none --sparse --branch cuvs-bench-opensearch https://github.com/jrbourbeau/cuvs.git /opt/cuvs \
Is this just set to your fork while we don't yet have the OpenSearch backend for cuvs-bench merged?
Yeah, I needed to point to a branch that has the OpenSearch cuvs-bench backend present.
python -m cuvs_bench.run --data-export \
--dataset "$DATASET" \
--dataset-path /data/datasets \
--algorithms opensearch_faiss_hnsw \
This script seems backend-specific. Should we maybe have a top-level directory for opensearch (e.g. deploy/opensearch/bench, etc.)? Or do you foresee many backends using the same runner scripts and whatnot? For example, could we just make opensearch_faiss_hnsw an additional argument to keep this script agnostic of the backend?
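A minimal sketch of that last idea, assuming the runner reads the algorithm from an environment variable. `ALGORITHMS`, `DRY_RUN`, `run_bench`, and the placeholder dataset name are hypothetical and not part of this PR. The flags are only the ones visible in the snippet above; the real command continues with options that are not shown there.

```shell
#!/bin/sh
# Hypothetical sketch: ALGORITHMS, DRY_RUN and run_bench are illustrative
# names, not something this PR defines.
set -eu

DATASET="${DATASET:-example-dataset}"              # placeholder dataset name
ALGORITHMS="${ALGORITHMS:-opensearch_faiss_hnsw}"  # default keeps today's behaviour
DRY_RUN="${DRY_RUN:-1}"                            # 1 = only print the command

run_bench() {
  # $1 = dataset name, $2 = algorithm(s) to benchmark
  # Rebuild the positional parameters as the full command line.
  set -- python -m cuvs_bench.run --data-export \
    --dataset "$1" \
    --dataset-path /data/datasets \
    --algorithms "$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"   # show what would run
  else
    "$@"        # actually run the benchmark
  fi
}

run_bench "$DATASET" "$ALGORITHMS"
```

With something like this, another backend would only need a different `ALGORITHMS` value rather than its own copy of the script.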
@@ -0,0 +1,216 @@
# OpenSearch kNN Benchmark
This is great, but I think more importantly we'll want this info in the docs (and we should probably point to the docs for this just to eliminate duplication). I know there are a few areas in the codebase where we could say this same thing, but we probably don't want to continue to introduce more as we work to fix the others.
If we need to, we can wait until the Fern docs are merged (should be soon) and then add this to the cuvs-bench/backends section of the new docs.
This PR adds a docker-compose setup for deploying OpenSearch and its new remote index build service. It's also set up to be able to run cuvs-bench tests against the OpenSearch backend being added over in #2012.
Marking as DNM as I don't know where we want this to actually live (and I'd want to clean things up a bit). I could see this being incorporated into a CI build that runs cuvs-bench at some point. Mostly wanted to push this up for visibility and to aid in reviewing #2012.
cc @cjnolet @singhmanas1 @janakivamaraju @afourniernv