data_structures: add progressive_set_intersection in disjoint_set #14490
Closed
Devanik21 wants to merge 1 commit into TheAlgorithms:master
This function computes the intersection of multiple sets efficiently by sorting them by size and using early termination.
Closing this pull request as invalid
@Devanik21, this pull request is being closed as none of the checkboxes have been marked. It is important that you go through the checklist and mark the ones relevant to this pull request. Please read the Contributing guidelines. If you're facing any problem on how to mark a checkbox, please read the following instructions:
NOTE: Only |
Description
This PR adds progressive_set_intersection() to the repository.
Problem it addresses
Python's built-in set.intersection(*others) is already highly optimized in C and automatically iterates over the smallest set. However, when intersecting many sets (50–100+) or dealing with highly imbalanced sizes (e.g., one tiny set of 10 elements vs. several sets with millions of elements), a naive approach can waste time.
This implementation demonstrates the "smallest-first + progressive pruning" heuristic: sort the sets by size, start from the smallest, and terminate early.
This pattern can reduce unnecessary membership checks in practice, because each later pass only tests the candidates that survived the previous one.
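A minimal sketch of the heuristic, for illustration only. The PR's actual source is not shown here, so the body, the handling of zero arguments, and the choice to stop once the running result is empty are all assumptions rather than the author's implementation:
```python
def progressive_set_intersection(*sets: set) -> set:
    """Intersect any number of sets, smallest first, stopping once empty."""
    if not sets:
        # Assumption: with no inputs we return an empty set.
        return set()

    # Smallest-first: the running result can never exceed the smallest set.
    ordered = sorted(sets, key=len)

    # Copy so that the caller's set is never mutated.
    result = set(ordered[0])

    for other in ordered[1:]:
        if not result:
            # Early termination: nothing left that could survive.
            break
        # Progressive pruning: only surviving candidates are tested.
        result = {item for item in result if item in other}

    return result
```
Each pass performs at most len(result) membership checks, so the work is bounded by the size of the smallest set times the number of remaining sets.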
Why add this to the repo?
Note: For most everyday use cases, the built-in set.intersection() remains the best choice. This module is primarily for learning and for scenarios with many/imbalanced sets.
Algorithm Details
k is the number of sets, but much faster in practice due to early pruning.
Related Issue
Closes #14368
Files Changed
data_structures/disjoint_set/progressive_set_intersection.py (new file)
Testing
python3 -m doctest -v progressive_set_intersection.py
Example Usage
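A usage sketch in doctest form, assuming the function takes the sets as positional arguments. The function body below is a hypothetical stand-in, since the file from the PR is not reproduced here, and the sample sets are made up for illustration:
```python
def progressive_set_intersection(*sets: set) -> set:
    """
    Hypothetical stand-in for the function added by the PR.

    >>> progressive_set_intersection({1, 2, 3}, {2, 3, 4}, {3, 4, 5})
    {3}
    >>> progressive_set_intersection({1, 2}, {3, 4}, {1, 2, 3, 4})
    set()
    >>> tiny = set(range(10))
    >>> big = set(range(5, 100_000))
    >>> evens = set(range(0, 100_000, 2))
    >>> sorted(progressive_set_intersection(big, evens, tiny))
    [6, 8]
    >>> progressive_set_intersection(big, evens, tiny) == tiny.intersection(big, evens)
    True
    """
    if not sets:
        return set()  # assumption: no inputs yields an empty set
    ordered = sorted(sets, key=len)  # smallest set first
    result = set(ordered[0])  # copy, so inputs are left untouched
    for other in ordered[1:]:
        result &= other  # keep only elements also present in `other`
        if not result:
            break  # early termination once nothing remains
    return result


if __name__ == "__main__":
    import doctest

    doctest.testmod()
```
Saved as progressive_set_intersection.py, the examples in the docstring are what the doctest command listed under Testing would execute.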
Would be happy to add more functions (e.g., sorted array intersection using two-pointer technique) or a bitmap version in a follow-up PR if needed.
Thanks to @Starglen and @dinakars777 for the discussion in the issue!