I find this strange; this problem didn't occur in my local testing. I'll investigate what's causing this later.
cmatKhan left a comment
My inclination is that this isn't worth working on further. It makes very large changes, and it's hard for me to follow why some of them are being made.
Rather than continuing with this, I would suggest taking that issue and making reproducible examples of how the current parsing method fails.
    # Concatenate results, filling NaN for missing columns
    return pd.concat(results, ignore_index=True, sort=False)

def query_dto(
Functions in virtualDB shouldn't be this specific. From the point of view of how the data is stored, DTO isn't meaningfully different from Spearman correlation.
Understood, so for now we should focus on identifying the cause of the bug and finding specific examples of it, rather than making these changes. Also, is VirtualDB only responsible for basic functionality? Is it necessary to encapsulate functions like retrieving DTO data?
Yes, and I think one of the problems is the way the comparative analysis dataset is configured. I'm playing with moving it out to the same level as the other repos, and adding a "links to" field which lists other configured datasets.
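To make the "links to" idea concrete, here is a rough sketch of what such a configuration could look like. Everything in it is an assumption for illustration: `DatasetConfig`, `links_to`, `resolve_links`, and the repo ID `BrentLab/example_comparative` are hypothetical names, not the actual VirtualDB configuration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetConfig:
    """Hypothetical config entry; every dataset sits at the same level."""
    name: str
    repo_id: str
    # Names of other configured datasets that this dataset refers to.
    links_to: List[str] = field(default_factory=list)


def resolve_links(configs: List[DatasetConfig]) -> Dict[str, List[DatasetConfig]]:
    """Map each dataset name to the configs it links to.

    Raises KeyError if a dataset links to a name that is not configured.
    """
    by_name = {c.name: c for c in configs}
    resolved = {}
    for c in configs:
        missing = [n for n in c.links_to if n not in by_name]
        if missing:
            raise KeyError(f"{c.name} links to unconfigured datasets: {missing}")
        resolved[c.name] = [by_name[n] for n in c.links_to]
    return resolved


configs = [
    DatasetConfig("harbison_2004", "BrentLab/harbison_2004"),
    DatasetConfig("hackett_2020", "BrentLab/Hackett_2020"),
    # Placeholder repo ID for the comparative dataset (an assumption).
    DatasetConfig(
        "comparative",
        "BrentLab/example_comparative",
        links_to=["harbison_2004", "hackett_2020"],
    ),
]
links = resolve_links(configs)
```

With this layout the comparative dataset is just another entry, and anything joining it to binding or perturbation data can look up the linked configs by name instead of relying on nesting.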
I've made these improvements, hoping they will be useful:
1. _join_comparative_analyses function to _build_metadata_table to incorporate comparative datasets; _join_comparative_analyses queries the comparative dataset using SQL and then prepares for matching; _parse_composite_identifier: parses the ID from the comparative dataset for matching.
2. _join_comparative_analyses to try both uppercase and lowercase beginnings for the repo ID.
3. query_dto function to specifically handle the intersection of specified binding and perturbation datasets.
4. BrentLab/rossi_2021/rossi_2021_af_combined
Some use semicolons, others use slashes.
Some use uppercase, others use lowercase:
BrentLab/Hackett_2020;hackett_2020;34
BrentLab/harbison_2004;harbison_2004;3
Do we need to unify them? Or should we handle them separately with functions?
5. I failed to read the calling cards data; the program crashed several times, but I haven't found the reason yet, so I haven't continued with the analysis.
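On the identifier-format question in item 4, one option is a single normalizing parser instead of separate handling per format. The sketch below is only an illustration: `normalize_composite_id` and `CompositeId` are hypothetical names, and it assumes the semicolon form means "repo ID; config name; sample ID" and the slash form means "owner/repo/config name" with no sample ID. Those field meanings are my reading of the three examples above and would need to be confirmed.

```python
from typing import NamedTuple, Optional


class CompositeId(NamedTuple):
    repo_key: str              # lowercased repo ID, used for matching only
    config_name: str           # lowercased config name
    sample_id: Optional[str]   # None when the identifier has no sample field


def normalize_composite_id(raw: str) -> CompositeId:
    """Parse 'owner/repo;config;sample' or 'owner/repo/config' (assumed forms)."""
    text = raw.strip()
    if ";" in text:
        # Semicolon form: repo ID, config name, then an optional sample ID.
        parts = text.split(";")
        if len(parts) not in (2, 3):
            raise ValueError(f"unexpected identifier: {raw!r}")
        repo_id, config_name = parts[0], parts[1]
        sample_id = parts[2] if len(parts) == 3 else None
    else:
        # Slash form: exactly owner/repo/config, with no sample ID.
        parts = text.split("/")
        if len(parts) != 3:
            raise ValueError(f"unexpected identifier: {raw!r}")
        repo_id = "/".join(parts[:2])
        config_name = parts[2]
        sample_id = None
    # Lowercasing removes the Hackett_2020 / hackett_2020 mismatch when matching.
    return CompositeId(repo_id.lower(), config_name.lower(), sample_id)


examples = [
    "BrentLab/Hackett_2020;hackett_2020;34",
    "BrentLab/harbison_2004;harbison_2004;3",
    "BrentLab/rossi_2021/rossi_2021_af_combined",
]
parsed = [normalize_composite_id(e) for e in examples]
```

The lowercased `repo_key` is meant as a join key only. The original casing should still be used wherever the repo is actually fetched, since I have not checked whether the lookup is case-sensitive.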