[feat] Add component directory scripts #135
base: 01-13-_feat_add_jsonschemas
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.
Pull request overview
This PR adds a comprehensive infrastructure for managing component submissions with automated validation, enrichment, and ranking capabilities.
Changes:
- Added Python scripts for validating component submissions, building compiled catalogs, enriching with external metrics (GitHub, PyPI, pypistats), and computing ranking scores
- Implemented utility modules for HTTP requests, GitHub API interaction, time handling, JSON I/O, and enrichment orchestration
- Added ranking configuration with configurable weights for stars, recency, contributors, and downloads
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| requirements.txt | Added jsonschema and requests dependencies |
| directory/scripts/validate.py | Component and compiled catalog validation with policy checks |
| directory/scripts/run_pipeline.py | Orchestrates full pipeline with configurable steps |
| directory/scripts/enrich_images.py | Validates image URLs for stability and fetchability |
| directory/scripts/enrich.py | Coordinates enrichment from multiple services |
| directory/scripts/compute_ranking.py | Computes ranking scores based on configured weights |
| directory/scripts/build_catalog.py | Compiles individual submissions into single catalog |
| directory/scripts/_utils/*.py | Shared utility modules for common operations |
| directory/scripts/_enrichers/*.py | Service-specific enrichers for GitHub, PyPI, and pypistats |
| directory/ranking_config.json | Ranking algorithm configuration |
| .gitattributes | Marks compiled files as generated |
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 9 comments.
Pull request overview
Copilot reviewed 23 out of 24 changed files in this pull request and generated 1 comment.
"weights": {
  "stars": 1.0,
  "recency": 2.0,
  "contributors": 0.5,
  "downloads": 0.35
}
question: Similar to my question on a previous PR: are we just vibe-weighting here? Any reason downloads is weighted so much lower?
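For context on the question above, here is a minimal sketch of how weights like these could combine into a single score. The weights are copied from the config fragment in this thread; the log-scale normalization, the weighted-average formula, and the names `normalize_count` and `score` are assumptions for illustration only, since the actual formula in compute_ranking.py is not shown here.

```python
from __future__ import annotations

import math

# Weights copied from the ranking_config.json fragment in this thread.
WEIGHTS = {"stars": 1.0, "recency": 2.0, "contributors": 0.5, "downloads": 0.35}


def normalize_count(value: int, cap: int) -> float:
    """Hypothetical normalization: log-scale a raw count into [0, 1]."""
    if value <= 0:
        return 0.0
    return min(math.log1p(value) / math.log1p(cap), 1.0)


def score(signals: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted average of signals that are already normalized to [0, 1].

    Missing signals count as 0.0 (an assumption, not the PR's behavior).
    """
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * signals.get(name, 0.0) for name in weights)
    return weighted / total_weight
```

Under this sketch, a component that is perfect on recency alone would outscore one that is perfect on downloads alone by roughly a factor of six (2.0 versus 0.35), which is the imbalance the question is pointing at.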
_LINK_LAST_RE = re.compile(r'<([^>]+)>;\s*rel="last"')


def _parse_last_page_from_link_header(link: str | None) -> int | None:
praise: On the whole, I like how you've encapsulated the scraping/parsing of these various sources into this _enrichers pattern. I wish some of it could be less painful like this function, but I think that's just the nature of scraping, and if it works it works. 🙈

TL;DR
Add component directory infrastructure with scripts for validation, enrichment, and ranking.
What changed?
- .gitattributes to mark compiled files as generated
How to test?
- pip install -r requirements.txt
- python directory/scripts/run_pipeline.py
- export GH_TOKEN=your_token
- python directory/scripts/validate.py
- python directory/scripts/enrich_images.py --check-only
- python directory/scripts/enrich.py
Why make this change?
This infrastructure enables automated processing of component submissions: validation, enrichment, and ranking.
The system is designed to be maintainable, with separate modules for different concerns and configurable parameters for flexibility.