Releases: ITM-Kitware/align-system

0.5.10

01 Apr 19:27

Added

  • Added more details and examples to the pipeline ADM documentation
  • Added a new Phase 2 medical urgency alignment function (weighted) to reflect program collaboration updates
  • Added option for ICL example choice ordering - fixed, swapped, or random
  • Added option to swap choice ordering for comparative regression pipeline ADM component
  • Added capability to resolve pipeline ADM step output conflicts with a custom function
  • Added inference engine for spectrum tuned LLMs that appropriately reformats chat template roles
  • Added alignment function based on a random-effects model
  • Added "always choose index X" ADM
  • Added direct regression ADM component and pipeline
  • Added DecisionFlow integration with fine-grained value prompts, unstructured objective function parsing, and JSON retry utilities
  • Added ICL similarity score and ICL examples to choice info
  • Added July 2025 evaluation configs
  • Added Feb 2026 evaluation configs
  • Added caching for ICL step and BertRelevance component
  • Added alignment info (with source) to choice_info output
  • Added per-step timing info for pipeline ADM
  • Added ALIGN App link and updated pipeline ADM documentation
  • Added support for raw text system prompts for the pipeline baseline ADM
  • Added tagging ADM configs
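
The ICL choice-ordering option listed above (fixed, swapped, or random) can be pictured with a small sketch. The helper name `order_choices`, its parameters, and the reading of "swapped" as reversing the original order are assumptions made for illustration, not the project's actual implementation.

```python
import random
from typing import List, Optional, Sequence


def order_choices(choices: Sequence[str], mode: str = "fixed",
                  rng: Optional[random.Random] = None) -> List[str]:
    """Return the choices in the requested order (hypothetical helper)."""
    ordered = list(choices)
    if mode == "fixed":
        # Keep the order in which the choices were given.
        return ordered
    if mode == "swapped":
        # Assumption: "swapped" reverses the order, which for a
        # two-choice probe exchanges the first and second options.
        return ordered[::-1]
    if mode == "random":
        # A seeded random.Random can be passed in for reproducible runs.
        rng = rng or random.Random()
        rng.shuffle(ordered)
        return ordered
    raise ValueError(f"unknown ordering mode: {mode!r}")
```

Passing a seeded `random.Random` keeps the random mode reproducible, which matters when runs are expected to be deterministic.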

Changed

  • Refactored ICL selection strategies to reduce duplication; factored out similarity strategies
  • Factored out scenario driver component and updated configs
  • Updated TA3 client version and ICL databases; copied Phase 1 TA3 models for backward compatibility
  • Removed deprecated tags property from Dialog
  • Removed lots of old pre-Hydra/Outlines code
  • Made comparative regression reasoning length configurable

Fixed

  • Removed non-determinism from midpoint alignment functions

0.5.9

13 Jun 16:50

Added

  • Added new (optional) domain argument for the TA3 interface
  • Added support for the TA3 interface domain p2triage
  • Added state_hydration_domain to IncontextExampleGenerator and InputOutputFileInterface. Providing a value of None or p1 results in Phase 1 behavior while a value of p2triage is meant for hydrating states from the corresponding domain on the TA3 server.
  • Added multi-KDMA evaluation support with dedicated baseline multi config and BERT-relevance config for Phase 2 June evaluation
  • Added configurations for Phase 2 June evaluation
  • Added caching functionality for baseline and comparative regression ADM components
  • Added ICL-based relevance prediction ADM with BERT similarity
  • Added least_similar_examples option for ICL
  • Added Phase 2 midpoint-based alignment function (with relevance support) along with unit tests
  • Added medical-only alignment function
  • Added dedicated Phase 2 regression component config for Kaleido
  • Added Kaleido ADM experiment config variants including a "mashup" for Phase 2
  • Added pipeline components: relevance aggregation, regression oracle, and relevance oracle
  • Added state_hydration_domain option to ICL and input/output interface
  • Added pipeline component for post-hoc rule-based regression value correction
  • Added new KDMAs for Phase 2 June collaboration: "personal_safety" and "search"
  • Added local copies of Phase 1 data models from TA3 code to support backward compatibility
  • Added Phase 2 alignment target config files
  • Added ubelt dependency for caching support

Changed

  • Changed some Phase 1 components and templates to point at local Phase 1 data models

0.5.8

02 May 19:07

Fixed

  • Fixed non-determinism issues with outlines_adm baseline with shuffle_choices arg and outlines seed
  • Fixed configs referenced in top-level README for Phase 1 Evaluation runs

Added

  • Added PipelineADM and many pipeline ADM components, configs, and integration tests
  • Added new data models for Attributes and Dialogs
  • Added call_with_coerced_args utility function to handle calling functions with different input requirements
  • Added documentation for new pipeline ADM development: Pipeline ADMs
  • Added a specialized Hydra instantiation function to support shared objects
  • Added shuffle_choices inference_kwarg argument for outlines_adm baseline
  • Added option to set outlines RNG seed for outlines_adm
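
To illustrate the idea behind calling functions that have different input requirements, here is a minimal sketch built on the standard library's `inspect.signature`. The function name `call_with_matching_args` is hypothetical and was chosen so it is not confused with the project's `call_with_coerced_args`, whose real signature and behavior may differ.

```python
import inspect
from typing import Any, Callable, Mapping


def call_with_matching_args(func: Callable[..., Any],
                            available: Mapping[str, Any]) -> Any:
    """Call func with only the entries of `available` that it accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the function accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**available)
    keyword_kinds = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                     inspect.Parameter.KEYWORD_ONLY)
    # Drop anything the function does not declare as a keyword-capable parameter.
    accepted = {
        name: value for name, value in available.items()
        if name in params and params[name].kind in keyword_kinds
    }
    return func(**accepted)
```

With this approach, a pipeline could hand every step the same pool of named values and let each step take only what it declares.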

0.5.7

02 Apr 19:44

Fixed

  • Fixed alignment target parsing for KaleidoADM
  • Fixed issue where action parameters aren't set if there is only 1 remaining heuristic option.
  • Fixed crash when using save_last_unstructured flag without alignment target

Added

  • Added raw log output file (on by default) in addition to rich formatted log output file
  • Added choice_info output to KaleidoADM to support MSE analysis
  • Added multi-KDMA alignment targets for testing
  • Added Outlines-based Personas ADM
  • Added dedicated utility function for inferring alignment target type
  • Added multi-KDMA alignment function that weights distance by relevance
  • Added a relevance oracle ADM
  • Added an optional relevance prediction step with ICL to the comparative regression ADM
  • Added options to use cumulative KDE and/or relevance weighted alignment function to Kaleido ADM
  • Added integration testing script (tests/run_integration_test.py) and associated configs and data files
  • Added basic support for NAACL24 and OpinionQA datasets
  • Initial API support for ALIGN-App
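
As a rough illustration of the multi-KDMA alignment function that weights distance by relevance, the sketch below scores each choice by a relevance-weighted sum of absolute distances to the target. The function names, the data layout, and the use of a single relevance weight per KDMA are assumptions for illustration; the project's function may be defined differently.

```python
from typing import Dict


def relevance_weighted_distance(predicted: Dict[str, float],
                                target: Dict[str, float],
                                relevance: Dict[str, float]) -> float:
    """Sum of |predicted - target| per KDMA, scaled by that KDMA's relevance."""
    return sum(
        relevance.get(kdma, 0.0) * abs(predicted[kdma] - target[kdma])
        for kdma in target
    )


def choose_most_aligned(choices: Dict[str, Dict[str, float]],
                        target: Dict[str, float],
                        relevance: Dict[str, float]) -> str:
    """Pick the choice with the smallest relevance-weighted distance."""
    return min(
        choices,
        key=lambda name: relevance_weighted_distance(choices[name], target, relevance),
    )


# Hypothetical predictions for two choices across two KDMAs.
example_choices = {
    "A": {"medical": 0.7, "search": 0.9},
    "B": {"medical": 0.3, "search": 0.2},
}
example_target = {"medical": 0.8, "search": 0.2}
example_relevance = {"medical": 1.0, "search": 0.1}
```

In this example an unweighted sum would favor choice B, while down-weighting the low-relevance `search` KDMA favors choice A.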

Changed

  • Changed KaleidoADM prompt log output to "info" instead of "debug"
  • Exposed get_dialogs as a static method for outlines_adm.py to support ALIGN-App

0.5.6-hotfix1

07 Jan 16:33

0.5.6.hotfix1

Changed

  • Default to using KDE target when both KDE and scalar targets are present

Fixed

  • Fixed README Phase1 Evaluation run invocations for aligned ADMs

0.5.6

21 Nov 13:41

Changed

  • Updated Phase 1 experiment configs for final Phase 1 Eval delivery

0.5.5

18 Nov 12:57

Added

  • Added Phase 1 Evaluation experiment configuration files
  • Added ICL example selection method that gives larger weight to examples with the same character IDs as the current probe. To use, set incontext.method to matching_characters.
  • Added ICL example selection method that gives larger weight to examples with the same action types as the current probe. To use, set incontext.method to matching_actions.
  • Added retrieved ICL examples to input-output.json

0.5.4

14 Nov 00:36

Changed

  • Changed incontext normalization setting to be off (null/rawscores)
  • The previous incontext.leave_one_out=false should now be configured as incontext.leave_one_out_strategy=null; the default is no leave-one-out behavior.
    The previous incontext.leave_one_out=true should now be specified as incontext.leave_one_out_strategy=scenario_description. Additionally, duplicate ICL examples,
    based on the chosen similarity strategy, are now removed.
  • Changed training_session flag for TA3 interface from boolean to string (expecting "full" or "solo" or None)
  • Changed the comparative regression prompt to only include the structured character information listed in relevant_structured_character_info in kdma_descriptions.yaml. To include all structured information that is unique across characters in the prompt (as was previously done automatically), specify relevant_structured_character_info = ['all_unique'].
  • Improved the QoL description and score_examples in kdma_descriptions.yaml
  • Changed default treatment parameter selection to use heuristic treatment options
  • Updated to transformers>=4.46.2 (and added necessary dependencies) to support newer models
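
The leave-one-out migration described in this list can be summarized in a short sketch. The helper `migrate_incontext_config` is hypothetical and operates on a plain dictionary standing in for the `incontext` config section; only the old and new option names and values come from the notes above.

```python
from typing import Any, Dict


def migrate_incontext_config(incontext: Dict[str, Any]) -> Dict[str, Any]:
    """Translate the legacy leave_one_out flag to leave_one_out_strategy."""
    migrated = dict(incontext)  # leave the caller's dictionary untouched
    if "leave_one_out" in migrated:
        old_value = migrated.pop("leave_one_out")
        # true  -> scenario_description
        # false -> None (null in YAML), meaning no leave-one-out behavior
        migrated.setdefault(
            "leave_one_out_strategy",
            "scenario_description" if old_value else None,
        )
    return migrated
```

An explicitly set `leave_one_out_strategy` takes precedence in this sketch, so partially migrated configs are not overwritten.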

Added

  • Added an option for sorting incontext examples responses: incontext.sort_actions
  • Added character-based leave one out option: incontext.leave_one_out_strategy=characters
  • Phase 1 experiments directory
  • Added the option to filter out TAG CHARACTER responses by setting filter_tag_character to true
  • Added a history-based alignment function for scalar targets that uses distance to a running mean. To use, specify inference_kwargs.distribution_matching as cumulative_average
  • Added the option to enumerate the valid regression scores in the JSON schema by specifying inference_kwargs.enum_scores as true. Valid score options for each KDMA are added to align_system/prompt_engineering/kdma_descriptions.yml.
    Valid score options may be specified as a list via values, or as a range specified as a dictionary of min (inclusive), max (inclusive), and step
  • Added option to configure ICL example ordering: incontext.most_similar_first=true for the most similar ICL example first, false for most similar ICL example last.
  • Added the option to normalize KDE targets based on prior data. To use, set adm.inference_kwargs.kde_norm=priornorm and adm.inference_kwargs.priornorm_factor to the normalization weight you want (1 is fully normalized, 0 is no normalization or rawscores, default is 0.5).
  • Added KDMA scaling factor option. Scale factors for each KDMA are added to align_system/prompt_engineering/kdma_descriptions.yml
  • Added heuristic treatment options component
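
The two ways of specifying valid score options mentioned above (an explicit list, or a range with inclusive min and max plus a step) could be expanded into a flat list as sketched here. The function name `expand_score_options` and the rounding to ten decimal places are illustrative assumptions, not the project's code.

```python
import math
from collections.abc import Mapping
from typing import List


def expand_score_options(spec) -> List[float]:
    """Expand a list of values or a min/max/step mapping into score options."""
    if isinstance(spec, Mapping):
        low, high, step = spec["min"], spec["max"], spec["step"]
        if step <= 0:
            raise ValueError("step must be positive")
        # Number of whole steps that fit between min and max (both inclusive);
        # the small epsilon guards against floating-point undershoot.
        count = int(math.floor((high - low) / step + 1e-9))
        # Rounding hides accumulated floating-point noise such as 0.30000000000000004.
        return [round(low + i * step, 10) for i in range(count + 1)]
    return list(spec)
```

The resulting list could then populate an `enum` entry in a JSON schema so that only listed scores are valid outputs.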

Fixed

  • Fixed issue where choice history was persisting across scenarios by supporting a new optional ADM method, reset_history, which is called at the start of each new scenario
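
To show why resetting history matters for a history-based alignment function, the sketch below keeps a running mean of chosen KDMA values and clears it between scenarios. The class `RunningMeanADM`, its `choose_action` signature, and the exact selection rule (pick the option that brings the running mean closest to the scalar target) are assumptions for illustration; only the `reset_history` method name comes from the notes above.

```python
from typing import Dict, List


class RunningMeanADM:
    """Toy decision maker that tracks a running mean of chosen values."""

    def __init__(self, target: float) -> None:
        self.target = target
        self.history: List[float] = []

    def reset_history(self) -> None:
        # Called at the start of each new scenario so earlier choices
        # do not influence the running mean.
        self.history = []

    def choose_action(self, predicted: Dict[str, float]) -> str:
        def distance_after(value: float) -> float:
            # Running mean if this value were appended to the history.
            mean = (sum(self.history) + value) / (len(self.history) + 1)
            return abs(mean - self.target)

        best = min(predicted, key=lambda name: distance_after(predicted[name]))
        self.history.append(predicted[best])
        return best
```

Without the reset, the second scenario's first decision would be biased by values chosen in the first scenario.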

0.5.3

04 Nov 18:11

Changed

  • Moved incontext learning functionality into incontext_utils.py and updated the base outlines and comparative regression ADMs to use this module.
  • Moved the format_choices() function from the OutlinesTransformersADM class in outlines_adm.py to a new utils file: adm_utils.py so it can be used across ADMs.
  • Updated example_data/input_output_files to use DRE training scenarios
  • Changed default config to use outlines_transformers_structured_baseline (rather than the older single_kdma_baseline)
  • Adjusted choose_action() to enable returning an ADM-specific choice_info dictionary that is written to the resulting input_output.json file
  • When the alignment target is optionally saved out in run_align_system, it is now saved as JSON instead of YAML

Added

  • Added option to normalize KDMA values in incontext examples
  • Added a probabilistic option to alignment utilities. Exposed this option in oracle, comparative regression, and
    hybrid regression ADMs.
  • Example config for deterministic outlines-based ADM runs (align_system/configs/experiment/examples/outlines_force_determinism.yaml). Requires setting force_determinism to true and using the greedy sampler.
  • Added a history-based/cumulative KDE option to alignment utilities. Exposed this option in oracle and comparative regression.
  • Added true and predicted KDMA values to the log and input_output.json file for comparative regression ADM.
  • Added Phase 1 eval alignment targets for SoarTech

Fixed

  • Fixed KDE target samples to be between 0 and 1
  • Fixed issue in alignment_utils logging (where kdma values can be a float/int rather than a list)
  • Now properly hydrating the meta_info field of input_output files
  • Fixed possible divide by zero during misaligned alignment
  • Properly hydrate Aid list

Deprecated

  • Removed old and unused command-line interface scripts
  • Removed old template files for integrating custom ADMs
  • Removed CLI builder functionality
  • Removed old configuration files from before Hydra

0.5.2

27 Aug 20:00

Added

  • Split out our experiment configuration for our aligned DRE ADM to specific configs for SoarTech and Adept
  • Added logging for sampled KDMA target value, and estimated KDMA values in alignment_utils

Fixed

  • Fixed issue in Oracle ADM which caused a KeyError exception when logging probabilities