Improve merging customization

We have a few different ways merging can be customized, but the options for specifying them are not fully implemented in graph specs.

**Make treating properties as lists vs sets configurable**

When merging entities, multivalued properties can be treated as lists (concatenated) or as sets (combined and deduplicated). We used to always treat them as lists, apart from a couple of hard-coded exceptions, but we recently decided that it makes more sense to treat all attributes as sets by default. What we'd really like is to make that configurable, with the option to specify in a graph spec how specific properties should be handled for a specific graph.
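
To make the list vs set distinction concrete, here is a minimal sketch of what a configurable merge could look like. The function name `merge_property`, the `strategies` mapping, and the `"list"`/`"set"` labels are all hypothetical and are not taken from the existing code or graph spec format; the sketch also assumes property values are hashable (e.g. strings).

```python
from typing import Any, Dict, List, Optional

# Assumed default, matching the recent decision to treat attributes as sets.
DEFAULT_STRATEGY = "set"


def merge_property(
    name: str,
    existing: List[Any],
    incoming: List[Any],
    strategies: Optional[Dict[str, str]] = None,
) -> List[Any]:
    """Merge two multivalued property values using a per-property strategy.

    `strategies` is a hypothetical mapping of property name -> "list" or "set",
    standing in for whatever a graph spec would eventually provide.
    """
    strategy = (strategies or {}).get(name, DEFAULT_STRATEGY)
    combined = list(existing) + list(incoming)
    if strategy == "list":
        # Plain concatenation, duplicates are kept.
        return combined
    if strategy == "set":
        # dict.fromkeys deduplicates while preserving first-seen order.
        return list(dict.fromkeys(combined))
    raise ValueError(f"Unknown merge strategy {strategy!r} for property {name!r}")


# Default behaviour: deduplicate.
merged_as_set = merge_property("publications", ["PMID:1", "PMID:2"], ["PMID:2", "PMID:3"])

# Per-graph override: keep every value, including duplicates.
merged_as_list = merge_property(
    "publications",
    ["PMID:1", "PMID:2"],
    ["PMID:2", "PMID:3"],
    strategies={"publications": "list"},
)
```

Preserving first-seen order in the set case is a design choice of this sketch, made so that merged output stays deterministic across runs.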

**Improve the specification of merge key properties**

The key that is generated to determine whether edges should be merged is based on a few properties (subject, predicate, object, etc.). Hong implemented a way to add additional properties to be considered when making that key, as seen [here](https://github.com/RobokopU24/ORION/blob/37fa773ba9efc682ae072a1dc766325b4cf31bb8/orion/merging.py#L42-L66). It would probably be even better to allow specifying the entire list of properties to use, not just additional ones. This is slightly complicated by the primary knowledge source being nested inside the "source" attribute when that format is used, such as in translator KGs.
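
As a rough illustration of specifying the entire key property list, the sketch below builds a merge key from a caller-supplied list and resolves the primary knowledge source from either a flat or a nested representation. It is not the linked implementation. The names `edge_merge_key`, `get_primary_knowledge_source`, and `DEFAULT_KEY_PROPERTIES` are hypothetical, and the nested layout (a `sources` list of dicts with `resource_id` and `resource_role` keys) is an assumption about the nested format that should be checked against the actual data.

```python
import hashlib
import json
from typing import Any, Dict, Optional, Sequence

# Hypothetical default; a graph spec could replace this list entirely.
DEFAULT_KEY_PROPERTIES = ("subject", "predicate", "object", "primary_knowledge_source")


def get_primary_knowledge_source(edge: Dict[str, Any]) -> Optional[str]:
    """Return the primary knowledge source from a flat or nested edge."""
    if "primary_knowledge_source" in edge:
        return edge["primary_knowledge_source"]
    # Assumed nested layout: a list of dicts with resource_id / resource_role.
    for source in edge.get("sources", []):
        if source.get("resource_role") == "primary_knowledge_source":
            return source.get("resource_id")
    return None


def edge_merge_key(
    edge: Dict[str, Any],
    key_properties: Sequence[str] = DEFAULT_KEY_PROPERTIES,
) -> str:
    """Hash the values of the chosen properties into a merge key."""
    values = []
    for prop in key_properties:
        if prop == "primary_knowledge_source":
            values.append(get_primary_knowledge_source(edge))
        else:
            values.append(edge.get(prop))
    # JSON gives a stable text encoding; default=str covers non-JSON values.
    encoded = json.dumps(values, default=str)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


flat_edge = {
    "subject": "A:1",
    "predicate": "biolink:related_to",
    "object": "B:2",
    "primary_knowledge_source": "infores:example",
}
nested_edge = {
    "subject": "A:1",
    "predicate": "biolink:related_to",
    "object": "B:2",
    "sources": [
        {"resource_id": "infores:example", "resource_role": "primary_knowledge_source"}
    ],
}

# Both representations resolve to the same key under the default property list.
same_key = edge_merge_key(flat_edge) == edge_merge_key(nested_edge)
```

Passing a different `key_properties` list replaces the defaults outright, which is the behaviour proposed above, rather than appending to them.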

Improve merging customization #385
