Added method for prompt generation from raw data #272

Open

alright-code wants to merge 4 commits into main from dev/unstructured-translation

Conversation

@alright-code

Two-phase approach for prompt generation:

  1. Extract the relevant information from the raw dataset
  2. Simplify the extracted entries into a concise list

phase2_direct_gen.yaml shows how this can be configured for the medical KDMA; a rough sketch of the two phases follows below.
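
A minimal sketch of the two-phase flow described above. The function name, the `inference_engine.generate` call, and the prompt wiring are illustrative assumptions, not the PR's actual API:

```python
def generate_prompt(raw_dataset, inference_engine,
                    extract_prompt, simplify_prompt):
    # Phase 1: pull the relevant information out of each raw record
    extracted = [
        inference_engine.generate(f"{extract_prompt}\n\n{record}")
        for record in raw_dataset
    ]
    # Phase 2: condense the extracted entries into a concise list
    joined = "\n".join(extracted)
    return inference_engine.generate(f"{simplify_prompt}\n\n{joined}")
```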

extract_cacher = ub.Cacher(
    "generated_system_prompt_template_extract",
    depends=self._cache_repr(self.dataset, self.extract_prompt),
    enabled=self.enable_caching,
)
Contributor
Oh interesting. So this enabled flag turns the caching on or off? Didn't know that was a feature; it could simplify the caching code I have elsewhere.
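
For reference, a minimal sketch of how ubelt's `Cacher` behaves with the `enabled` flag (the function and cache names here are illustrative, not from the PR):

```python
import ubelt as ub

def expensive_extraction():
    # Stand-in for the real extraction step
    return {'entries': ['...']}

# When enabled=False, tryload() always returns None and save() is a
# no-op, so the computation simply runs fresh every time.
cacher = ub.Cacher(
    'demo_cache',
    depends='some-hash-of-the-inputs',  # cache key; recompute when it changes
    enabled=True,
)
data = cacher.tryload()
if data is None:
    data = expensive_extraction()
    cacher.save(data)
```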

dialog.insert(0, DialogElement(
    role='system',
    content=self.per_attribute_templates[attribute]['system_prompt']))
system_prompt_template = self.per_attribute_templates[attribute].get('system_prompt_template')
Contributor

Hmm, I think this change (expecting a system_prompt_template instead of system_prompt) might break some existing configs. Would much prefer to support both as we do with the baseline component here: https://github.com/ITM-Kitware/align-system/blob/main/align_system/algorithms/outlines_baseline_adm_component.py#L74-L86
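
A sketch of the suggested fallback, modeled loosely on the linked baseline component (the exact attribute and key names are assumptions):

```python
templates = self.per_attribute_templates[attribute]
# Prefer the new callable template, but fall back to the legacy
# plain-string 'system_prompt' key so existing configs keep working.
system_prompt_template = templates.get('system_prompt_template')
if system_prompt_template is not None:
    system_prompt = call_with_coerced_args(
        system_prompt_template,
        {'model': self.structured_inference_engine.model})
else:
    system_prompt = templates['system_prompt']
```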

if callable(system_prompt_template):
    system_prompt = call_with_coerced_args(
        system_prompt_template,
        {'model': self.structured_inference_engine.model})
Contributor

Wondering why we aren't initializing this component with structured_inference_engine and are instead opting to pass in the model and create a new generator (looks like we're just passing in structured_inference_engine.model anyway). Are there parameters that need to be different, or are we otherwise invoking this differently in a way that makes this necessary? The main drawback of doing it this way is that we can't freely swap out the inference engine backend.
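
For illustration, the kind of constructor injection being suggested here; the class name and the `run_inference` method are hypothetical placeholders, not the actual align-system API:

```python
class RawDataPromptGenerator:
    def __init__(self, structured_inference_engine):
        # Keep a reference to the whole engine rather than engine.model,
        # so a different backend can be dropped in without code changes.
        self.structured_inference_engine = structured_inference_engine

    def generate(self, prompt):
        # Delegate to the injected engine's own generation entry point
        # (method name is an assumption).
        return self.structured_inference_engine.run_inference(prompt)
```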

Contributor

This file probably makes more sense to go into the prompt_engineering directory (instead of algorithms)
