Improve merging customization #385

@EvanDietzMorris

Description

We have a few different ways merging can be customized, but the options for specifying them are not fully implemented in graph specs.

Make treating properties as lists vs sets configurable

When merging entities, multivalued properties can be treated as lists (concatenate them) or as sets (combine and deduplicate). We used to always treat them as lists, with a couple of hard-coded exceptions, but we recently decided that it makes more sense for the default to be treating all attributes as sets. What we'd really like is to make that configurable, with the option to specify in a graph spec how specific properties should be handled for a specific graph.
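
To make the request concrete, here is a minimal sketch of per-property merge strategies. It is not the existing implementation: the names `merge_property`, `merge_entities`, `property_strategies`, and `DEFAULT_STRATEGY` are all hypothetical, and it assumes entities are plain dicts whose multivalued properties are lists of hashable values.

```python
from typing import Any, Dict, List, Optional

# Hypothetical default, matching the "treat everything as a set" decision above.
DEFAULT_STRATEGY = "set"


def merge_property(existing: List[Any], incoming: List[Any], strategy: str) -> List[Any]:
    """Merge two multivalued properties as a list or as a set."""
    combined = list(existing) + list(incoming)
    if strategy == "list":
        # List semantics: plain concatenation, duplicates are kept.
        return combined
    if strategy == "set":
        # Set semantics: deduplicate while keeping first-seen order.
        # Assumes the values are hashable (strings, numbers, ...).
        return list(dict.fromkeys(combined))
    raise ValueError(f"Unknown merge strategy: {strategy!r}")


def merge_entities(
    a: Dict[str, Any],
    b: Dict[str, Any],
    property_strategies: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Merge entity b into a copy of entity a.

    property_strategies maps a property name to "list" or "set"; in a real
    system it would presumably be read from the graph spec (assumption).
    """
    strategies = property_strategies or {}
    merged = dict(a)
    for key, value in b.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], list) and isinstance(value, list):
            strategy = strategies.get(key, DEFAULT_STRATEGY)
            merged[key] = merge_property(merged[key], value, strategy)
        # Scalar conflicts keep the existing value (an assumption of this sketch).
    return merged
```

With something like this, a graph spec could leave most properties on the set default and opt individual ones into list behavior, for example `merge_entities(a, b, {"scores": "list"})`.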

Improve the specification of merge key properties

The key that determines whether edges should be merged is generated from a set of properties (subject, predicate, object, etc.). Hong implemented a way to add additional properties to be considered when making that key, as seen here. It would probably be even better to allow specifying the entire list of properties to use, not just additional ones. This is slightly complicated by the primary knowledge source being nested inside the "source" attribute when that format is used, such as in Translator KGs.
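
As a rough illustration of specifying the full key property list, the sketch below builds a merge key from a configurable tuple of property names. Everything here is hypothetical: `edge_merge_key`, `get_key_property`, and `DEFAULT_KEY_PROPERTIES` are made-up names, and the nested layout (a `sources` list of dicts with `resource_id` and `resource_role`) is an assumption about the Translator-style format, not something confirmed in this issue.

```python
import hashlib
import json
from typing import Any, Dict, Sequence

# Hypothetical default key properties; the real defaults may differ.
DEFAULT_KEY_PROPERTIES = ("subject", "predicate", "object")


def get_key_property(edge: Dict[str, Any], name: str) -> Any:
    """Look up a key property, falling back to the nested source format."""
    if name in edge:
        return edge[name]
    if name == "primary_knowledge_source":
        # Assumed nested layout: a list of source dicts, each with a role and an id.
        for source in edge.get("sources", []):
            if source.get("resource_role") == "primary_knowledge_source":
                return source.get("resource_id")
    return None


def edge_merge_key(
    edge: Dict[str, Any],
    key_properties: Sequence[str] = DEFAULT_KEY_PROPERTIES,
) -> str:
    """Hash the values of the configured key properties, in the given order."""
    values = [get_key_property(edge, name) for name in key_properties]
    payload = json.dumps(values, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Passing the whole tuple, e.g. `edge_merge_key(edge, ("subject", "predicate", "object", "primary_knowledge_source"))`, replaces the defaults rather than extending them, and the lookup helper lets flat and nested edges with the same primary knowledge source produce the same key.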

Metadata

Labels

Biological Context QC: Require validation of biological context to ensure accuracy and consistency
