Pull request overview
Updates the UI embedding/UMAP experience by switching category assignment to a “top 20 + Other” scheme and aligning legend/pinning behavior with the new categorization.
Changes:
- Reworked assay_category/cell_line_category assignment to rank by frequency and cap at 20 (“Other”).
- Updated legend querying/rendering to group long tails into “Other”, show counts, and add “Pin All / Unpin All”.
- Bumped UI package version to 0.12.5.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| ui/src/contexts/mosaic-coordinator-context.tsx | Recomputes assay/cell line category IDs based on frequency and caps at “Other”. |
| ui/src/const.ts | Keeps the current UMAP URL and adds a commented-out alternative. |
| ui/src/components/umap/bed-embedding-view.tsx | Updates legend query + UI (counts, Other bucket, Pin All/Unpin All) and category color handling. |
| ui/src/components/umap/bed-embedding-plot.tsx | Extends the color palette to include an “Other” color. |
| ui/package.json | Version bump for the UI package. |
| CASE WHEN ac.rank < 20 THEN ac.rank ELSE 20 END AS assay_category, | ||
| CASE WHEN cc.rank < 20 THEN cc.rank ELSE 20 END AS cell_line_category | ||
| FROM data d | ||
| JOIN assay_counts ac ON d.assay = ac.assay | ||
| JOIN cell_line_counts cc ON d.cell_line = cc.cell_line` as any, |
The categorization query uses inner joins on assay and cell_line. If either column can be NULL in the UMAP JSON, rows with NULL values will be dropped because NULL = NULL does not match, causing points to disappear from the embedding. Consider using LEFT JOIN plus a default rank/category, or join with IS NOT DISTINCT FROM to preserve NULL-valued rows.
Suggested change:
| CASE | |
| WHEN ac.rank IS NULL OR ac.rank >= 20 THEN 20 | |
| ELSE ac.rank | |
| END AS assay_category, | |
| CASE | |
| WHEN cc.rank IS NULL OR cc.rank >= 20 THEN 20 | |
| ELSE cc.rank | |
| END AS cell_line_category | |
| FROM data d | |
| LEFT JOIN assay_counts ac ON d.assay IS NOT DISTINCT FROM ac.assay | |
| LEFT JOIN cell_line_counts cc ON d.cell_line IS NOT DISTINCT FROM cc.cell_line` as any, |
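To make the intended behavior concrete, here is a minimal TypeScript sketch of the same "top 20 + Other" rule outside SQL. The helper name `assignCategories` is hypothetical and not part of the PR. It assumes ranks are 0-based by descending frequency, that ties are broken alphabetically, and that NULL labels fall into the "Other" bucket (the "default rank" option mentioned in the comment above).

```typescript
// Hypothetical helper, not taken from the PR: a plain TypeScript mirror of the
// "top N + Other" categorization. All names here are illustrative assumptions.
type Label = string | null;

function assignCategories(labels: Label[], cap = 20): number[] {
  // Count non-null labels only; null is handled explicitly further down.
  const counts = new Map<string, number>();
  for (const label of labels) {
    if (label !== null) counts.set(label, (counts.get(label) ?? 0) + 1);
  }

  // 0-based rank by descending frequency; alphabetical tie-break (assumption).
  const ranked = Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0])
  );
  const rankOf = new Map<string, number>();
  ranked.forEach(([label], index) => rankOf.set(label, index));

  // Every input row yields exactly one category, so no point is dropped.
  return labels.map((label) => {
    const rank = label === null ? undefined : rankOf.get(label);
    return rank === undefined || rank >= cap ? cap : rank;
  });
}
```

The property worth checking is that the output has the same length as the input, which is what the inner join in the original query cannot guarantee when NULL values are present.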
| setPinnedCategories(allCategories); | ||
| setPinGrouping(colorGrouping); | ||
| const categories = Array.from(allCategories); | ||
| const predicate = vg.or(...categories.map((cat: any) => vg.eq(colorGrouping, cat))); |
Pin All builds the filter predicate with vg.or(...) unconditionally. Elsewhere in this file, single-category cases are handled with vg.eq(...) to avoid relying on vg.or behavior with a single argument. Update Pin All to mirror that logic so pinning works correctly when the legend only contains one category.
Suggested change:
| const predicates = categories.map((cat: any) => vg.eq(colorGrouping, cat)); | |
| const predicate = | |
| predicates.length === 1 ? predicates[0] : vg.or(...predicates); |
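If the single-category guard is needed in more than one place, it could be factored into a small helper. The sketch below is an assumption, not code from the PR: `combinePredicates` is a hypothetical name, and the `or` combinator is passed in as a parameter (standing in for something like `vg.or`) so the example stays self-contained. It also returns `null` for an empty list, which the suggestion above does not address.

```typescript
// Hypothetical helper, not from the PR. `or` is injected so this sketch does
// not depend on any particular predicate library.
function combinePredicates<P>(
  predicates: P[],
  or: (...clauses: P[]) => P
): P | null {
  if (predicates.length === 0) return null; // nothing selected, so no filter
  if (predicates.length === 1) return predicates[0]; // skip the OR wrapper
  return or(...predicates);
}

// Usage with plain strings standing in for real predicate objects.
const joinWithOr = (...clauses: string[]) => clauses.join(" OR ");
const single = combinePredicates(["assay = 'ATAC-seq'"], joinWithOr);
const several = combinePredicates(["a = 1", "b = 2"], joinWithOr);
```

Both Pin All and the existing single-pin path could then share one code path, which keeps their behavior from drifting apart.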
| const EXAMPLE_URL = `${API_BASE}/bed/example`; | ||
| export const UMAP_URL = 'https://huggingface.co/databio/bedbase-umap/resolve/main/hg38_umap_3_13.json'; | ||
| // export const UMAP_URL = `${window.location.origin}/feb08_3_13.json`; |
Avoid committing commented-out alternative UMAP_URL values (especially ones that reference window.location). If you need an override for local testing, prefer an env var (e.g. VITE_UMAP_URL) or remove this line before release to prevent confusion about the canonical data source.
Suggested change: delete the commented-out UMAP_URL line.
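One way the env-var override could look is sketched below. The function name `resolveUmapUrl` and the constant `DEFAULT_UMAP_URL` are hypothetical; `VITE_UMAP_URL` is the variable name proposed in the comment, not something the project is known to define. The env object is passed in as a parameter so the logic can be exercised outside a Vite build.

```typescript
// Hypothetical sketch, not from the PR: choose the UMAP data source from an
// optional override, falling back to the canonical URL from const.ts.
const DEFAULT_UMAP_URL =
  'https://huggingface.co/databio/bedbase-umap/resolve/main/hg38_umap_3_13.json';

function resolveUmapUrl(env: Record<string, string | undefined>): string {
  // Treat a missing, empty, or whitespace-only value as "no override".
  const raw = env['VITE_UMAP_URL'];
  const override = raw === undefined ? '' : raw.trim();
  return override.length > 0 ? override : DEFAULT_UMAP_URL;
}
```

In a Vite app this would presumably be called as `resolveUmapUrl(import.meta.env)`, with the local override set in an untracked `.env.local` file so the committed source keeps a single canonical URL.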
| "name": "bedhost-ui-2", | ||
| "private": true, | ||
| "version": "0.12.4", | ||
| "version": "0.12.5", |
PR description TODOs mention updating pepdbagent __version__.py and the changelog, but this PR only shows UI changes (UMAP logic + UI version bump). Either update the PR description to match or include the missing version/changelog changes so the release notes are accurate.
Changes:
TODO:
__version__.py file