Releases: ITM-Kitware/align-system
0.5.10
Added
- Added more details and examples to the pipeline ADM documentation
- Added a new Phase 2 medical urgency alignment function (weighted) to reflect program collaboration updates
- Added option for ICL example choice ordering - fixed, swapped, or random
- Added option to swap choice ordering for comparative regression pipeline ADM component
- Added capability to resolve pipeline ADM step output conflicts with a custom function
- Added inference engine for spectrum tuned LLMs that appropriately reformats chat template roles
- Added alignment function based on a random-effects model
- Added "always choose index X" ADM
- Added direct regression ADM component and pipeline
- Added DecisionFlow integration with fine-grained value prompts, unstructured objective function parsing, and JSON retry utilities
- Added ICL similarity score and ICL examples to choice info
- Added July 2025 evaluation configs
- Added Feb 2026 evaluation configs
- Added caching for ICL step and BertRelevance component
- Added alignment info (with source) to choice_info output
- Added per-step timing info for pipeline ADM
- Added ALIGN App link and updated pipeline ADM documentation
- Added support for raw text system prompts for the pipeline baseline ADM
- Added tagging ADM configs
Changed
- Refactored ICL selection strategies to reduce duplication; factored out similarity strategies
- Factored out scenario driver component and updated configs
- Updated TA3 client version and ICL databases; copied Phase 1 TA3 models for backward compatibility
- Removed deprecated `tags` property from Dialog
- Removed lots of old pre-Hydra/Outlines code
- Made comparative regression reasoning length configurable
Fixed
- Removed non-determinism from midpoint alignment functions
0.5.9
Added
- Added new (optional) domain argument for the TA3 interface
- Added support for the TA3 interface domain `p2triage`
- Added `state_hydration_domain` to `IncontextExampleGenerator` and `InputOutputFileInterface`. Providing a value of `None` or `p1` results in Phase 1 behavior, while a value of `p2triage` is meant for hydrating states from the corresponding domain on the TA3 server.
- Added multi-KDMA evaluation support with dedicated baseline multi config and BERT-relevance config for Phase 2 June evaluation
- Added configurations for Phase 2 June evaluation
- Added caching functionality for baseline and comparative regression ADM components
- Added ICL-based relevance prediction ADM with BERT similarity
- Added least_similar_examples option for ICL
- Added Phase 2 midpoint based alignment function (with relevance support) along with unit tests
- Added medical-only alignment function
- Added dedicated Phase 2 regression component config for Kaleido
- Added Kaleido ADM experiment config variants including a "mashup" for Phase 2
- Added pipeline components: relevance aggregation, regression oracle, and relevance oracle
- Added `state_hydration_domain` option to ICL and input/output interface
- Added pipeline component for post-hoc rule-based regression value correction
- Added new KDMAs for Phase 2 June collaboration: "personal_safety" and "search"
- Added local copies of Phase 1 data models from TA3 code to support backward compatibility
- Added Phase 2 alignment target config files
- Added `ubelt` dependency for caching support
Changed
- Changed some Phase 1 components and templates to point at local Phase 1 data models
0.5.8
Fixed
- Fixed non-determinism issues with outlines_adm baseline with shuffle_choices arg and outlines seed
- Fixed configs referenced in top-level README for Phase 1 Evaluation runs
Added
- Added PipelineADM and many pipeline ADM components, configs, and integration tests
- Added new data models for Attributes and Dialogs
- Added `call_with_coerced_args` utility function to handle calling functions with different input requirements
- Added documentation for new pipeline ADM development: Pipeline ADMs
- Added a specialized Hydra instantiation function to support shared objects
- Added shuffle_choices inference_kwarg argument for outlines_adm baseline
- Added option to set outlines RNG seed for outlines_adm
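The idea behind calling functions with different input requirements (the `call_with_coerced_args` entry above) can be sketched as follows. This is only an illustration and not the align-system implementation: the helper name `call_with_matching_args`, the example step functions, and the exact behavior (silently dropping keyword arguments the callee does not declare) are all assumptions.

```python
import inspect


def call_with_matching_args(func, available_args):
    """Call func with only the entries of available_args that it declares.

    Hypothetical helper for illustration; the real utility may behave differently.
    Assumes func has no positional-only parameters.
    """
    params = inspect.signature(func).parameters
    # A function that declares **kwargs can accept everything we have.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**available_args)
    # Otherwise pass only the arguments whose names the function declares.
    accepted = {k: v for k, v in available_args.items() if k in params}
    return func(**accepted)


# Two hypothetical pipeline steps with different input requirements.
def step_a(scenario_state, choices):
    return len(choices)


def step_b(choices, alignment_target=None):
    return alignment_target


shared = {"scenario_state": "s", "choices": ["a", "b"], "alignment_target": 0.7}
result_a = call_with_matching_args(step_a, shared)
result_b = call_with_matching_args(step_b, shared)
```

With this approach, each step can declare only the inputs it needs while the caller keeps a single shared dictionary of working values.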
0.5.7
Fixed
- Fixed alignment target parsing for KaleidoADM
- Fixed issue where action parameters aren't set if there is only 1 remaining heuristic option.
- Fixed crash when using `save_last_unstructured` flag without alignment target
Added
- Added raw log output file (on by default) in addition to rich formatted log output file
- Added choice_info output to KaleidoADM to support MSE analysis
- Added multi-KDMA alignment targets for testing
- Added Outlines based Personas ADM
- Added dedicated utility function for inferring alignment target type
- Added multi-KDMA alignment function that weights distance by relevance
- Added a relevance oracle ADM
- Added an optional relevance prediction step with ICL to the comparative regression ADM
- Added options to use cumulative KDE and/or relevance weighted alignment function to Kaleido ADM
- Added integration testing script (`tests/run_integration_test.py`) and associated configs and data files
- Added basic support for NAACL24 and OpinionQA datasets
- Initial API support for ALIGN-App
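As a rough illustration of what a multi-KDMA alignment function that weights distance by relevance might look like, here is a minimal sketch. The function names, the data layout, and the use of absolute distance are assumptions made for the example; the function shipped in this release may differ.

```python
def relevance_weighted_distance(predicted, relevance, targets):
    """Sum over KDMAs of relevance * |predicted value - target value|."""
    total = 0.0
    for kdma, target in targets.items():
        # A KDMA with no relevance entry contributes nothing to the distance.
        total += relevance.get(kdma, 0.0) * abs(predicted[kdma] - target)
    return total


def choose_most_aligned(choices, targets):
    """Return the choice name with the smallest relevance-weighted distance.

    choices maps a choice name to a (predicted KDMA values, relevance) pair.
    """
    def score(name):
        predicted, relevance = choices[name]
        return relevance_weighted_distance(predicted, relevance, targets)

    return min(choices, key=score)


# Hypothetical targets and predictions for two choices.
targets = {"medical": 0.8, "affiliation": 0.2}
choices = {
    "treat_A": ({"medical": 0.9, "affiliation": 0.9},
                {"medical": 1.0, "affiliation": 0.0}),
    "treat_B": ({"medical": 0.5, "affiliation": 0.2},
                {"medical": 1.0, "affiliation": 1.0}),
}
best = choose_most_aligned(choices, targets)
```

In this example `treat_A` is far from the `affiliation` target, but that KDMA is judged irrelevant to the choice, so only its small `medical` distance counts.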
Changed
- Changed KaleidoADM prompt log output to "info" instead of "debug"
- Exposed `get_dialogs` as a static method for `outlines_adm.py` to support ALIGN-App
0.5.6-hotfix1
0.5.6.hotfix1
Changed
- Default to using KDE target when both KDE and scalar targets are present
Fixed
- Fixed README Phase1 Evaluation run invocations for aligned ADMs
0.5.6
0.5.5
Added
- Added Phase 1 Evaluation experiment configuration files
- Added ICL example selection method that gives larger weight to examples with the same character ids as the current probe. To use, set `incontext.method` to `matching_characters`.
- Added ICL example selection method that gives larger weight to examples with the same action types as the current probe. To use, set `incontext.method` to `matching_actions`.
- Added retrieved ICL examples to input-output.json
0.5.4
Changed
- Changed `incontext` `normalization` setting to be off (null/`rawscores`)
- `incontext.leave_one_out=false` should now be configured as `incontext.leave_one_out_strategy=null`. Default behavior is no leave one out behavior. Previous `incontext.leave_one_out=true` should be specified as `incontext.leave_one_out_strategy=scenario_description`. Additionally, duplicate ICL examples, based on the chosen similarity strategy, are now removed.
- Changed `training_session` flag for TA3 interface from boolean to string (expecting "full" or "solo" or None)
- Changed the comparative regression prompt to only include the structured character information listed in `relevant_structured_character_info` in `kdma_descriptions.yaml`. To include all structured information that is unique across characters in the prompt (as was previously done automatically), specify `relevant_structured_character_info = ['all_unique']`.
- Improved the QoL `description` and `score_examples` in `kdma_descriptions.yaml`
- Changed default treatment parameter selection to use heuristic treatment options
- Updated to transformers>=4.46.2 (and added necessary dependencies) to support newer models
Added
- Added an option for sorting incontext examples responses: `incontext.sort_actions`
- Added character-based leave one out option: `incontext.leave_one_out_strategy=characters`
- Phase 1 experiments directory
- Added the option to filter out TAG CHARACTER responses by setting `filter_tag_character` to true
- Added a history-based alignment function for scalar targets that uses distance to a running mean. To use, specify `inference_kwargs.distribution_matching` as `cumulative_average`
- Added the option to enumerate the valid regression scores in the json schema by specifying `inference_kwargs.enum_scores` as true. Valid score options for each KDMA are added to `align_system/prompt_engineering/kdma_descriptions.yml`. Valid score options may be specified as a list via `values`, or a `range` specified as a dictionary of `min` (inclusive), `max` (inclusive), `step`
- Added option to configure ICL example ordering: `incontext.most_similar_first=true` for the most similar ICL example first, `false` for most similar ICL example last.
- Added the option to normalize KDE targets based on prior data. To use, set `adm.inference_kwargs.kde_norm=priornorm` and `adm.inference_kwargs.priornorm_factor` to the normalization weight you want (1 is fully normalized, 0 is no normalization or `rawscores`, default is 0.5).
- Added KDMA scaling factor option. Scale factors for each KDMA are added to `align_system/prompt_engineering/kdma_descriptions.yml`
- Added heuristic treatment options component
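The history-based alignment function for scalar targets described above (distance to a running mean) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the function name, the argument layout, and the use of absolute distance are all hypothetical.

```python
def choose_by_cumulative_average(predicted_values, target, history):
    """Pick the choice whose KDMA value moves the running mean closest to target.

    predicted_values maps a choice name to its predicted scalar KDMA value;
    history is the list of KDMA values of previously selected choices.
    """
    def distance_after(value):
        # Running mean if this choice were appended to the history.
        new_mean = (sum(history) + value) / (len(history) + 1)
        return abs(new_mean - target)

    return min(predicted_values, key=lambda name: distance_after(predicted_values[name]))


# Hypothetical example: earlier choices overshot a target of 0.7.
target = 0.7
predicted = {"A": 0.7, "B": 0.3}
choice = choose_by_cumulative_average(predicted, target, history=[0.9, 0.9])
first_choice = choose_by_cumulative_average(predicted, target, history=[])
```

With an empty history the choice nearest the target wins (`A`), whereas after two high-valued choices the lower-valued `B` is preferred because it pulls the running mean back toward the target.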
Fixed
- Fixed issue where choice history was persisting across scenarios by supporting a new optional ADM method, `reset_history`, called at the start of each new scenario
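A driver loop that honors an optional `reset_history` method might look like the sketch below. The class names, the `choice_history` attribute, the `run_scenarios` function, and the simplified `choose_action` signature are hypothetical and only illustrate the pattern; they are not taken from the align-system code.

```python
class HistoryTrackingADM:
    """Toy ADM that remembers every choice it has made."""

    def __init__(self):
        self.choice_history = []

    def choose_action(self, choices):
        choice = choices[0]  # placeholder decision logic
        self.choice_history.append(choice)
        return choice

    def reset_history(self):
        self.choice_history = []


class StatelessADM:
    """Toy ADM that does not implement the optional method."""

    def choose_action(self, choices):
        return choices[0]


def run_scenarios(adm, scenarios):
    """Run each scenario, clearing ADM history first when the ADM supports it."""
    for probes in scenarios:
        reset = getattr(adm, "reset_history", None)
        if callable(reset):
            reset()
        for choices in probes:
            adm.choose_action(choices)


adm = HistoryTrackingADM()
run_scenarios(adm, [[["a", "b"], ["c", "d"]], [["e", "f"]]])
run_scenarios(StatelessADM(), [[["a", "b"]]])  # works without reset_history
```

Looking the method up with `getattr` keeps it optional, so ADMs that hold no per-scenario state need no changes.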
0.5.3
Changed
- Moved incontext learning functionality into `incontext_utils.py` and updated the base outlines and comparative regression ADMs to use this module.
- Moved the `format_choices()` function from the `OutlinesTransformersADM` class in `outlines_adm.py` to a new utils file, `adm_utils.py`, so it can be used across ADMs.
- Updated example_data/input_output_files to use DRE training scenarios
- Changed default config to use `outlines_transformers_structured_baseline` (rather than the older `single_kdma_baseline`)
- Adjusted `choose_action()` to enable returning an ADM-specific `choice_info` dictionary that is written to the resulting `input_output.json` file
- When the alignment target is optionally saved out in `run_align_system`, it is now saved as JSON instead of YAML
Added
- Added option to normalize KDMA values in incontext examples
- Added a probabilistic option to alignment utilities. Exposed this option in oracle, comparative regression, and hybrid regression ADMs.
- Example config for deterministic outlines-based ADM runs (`align_system/configs/experiment/examples/outlines_force_determinism.yaml`). Requires setting `force_determinsim` to true and using greedy sampler.
- Added a history-based/cumulative KDE option to alignment utilities. Exposed this option in oracle and comparative regression.
- Added true and predicted KDMA values to the log and `input_output.json` file for comparative regression ADM.
- Added Phase 1 eval alignment targets for SoarTech
Fixed
- Fixed KDE target samples to be between 0 and 1
- Fixed issue in alignment_utils logging (where kdma values can be a float/int rather than a list)
- Now properly hydrating the meta_info field of input_output files
- Fixed possible divide by zero during misaligned alignment
- Properly hydrate Aid list
Deprecated
- Removed old and unused command-line interface scripts
- Removed old template files for integrating custom ADMs
- Removed CLI builder functionality
- Removed old configuration files from before Hydra
0.5.2
Added
- Split out our experiment configuration for our aligned DRE ADM to specific configs for SoarTech and Adept
- Added logging for sampled KDMA target value, and estimated KDMA values in alignment_utils
Fixed
- Fixed issue in Oracle ADM which caused a key error exception when logging probabilities