2025.1 tests update, Excel ingestion fix, and SubjectPseudoIdentifier fix
- Most of the changes are to the unit tests and do not need a close review.
- I spotted a bug in the Excel ingestion that was probably introduced when we defined `ReferenceInterpretationResult` as a supporting concept. Because Excel sheet names are limited to 31 characters, there is an overlap with the concept `ReferenceInterpretation`. I've updated the logic to handle this case and analogous cases.
- There is also an issue with the database schema, related to the `sphn_SubjectPseudoIdentifier` table. According to the schema, the property `sphn_hasSharedIdentifier` has cardinality `0..*`. A record with more than one value cannot be inserted into the table because the primary key is defined by the `id` field only. I therefore removed the primary key on the table (the alternative would be to add `sphn_hasSharedIdentifier` to the primary key). This leads to an issue when multiple values are defined, because the JSON data for `sphn:SubjectPseudoIdentifier` is then a list and no longer an object. I therefore added some logic after the database extraction to unify the multiple objects in the list into a single object, so that we do not need to change the RML mapping rules.
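
For reviewers, the unification step after the database extraction can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: the function name `unify_records` and the shape of the example payload (one object per database row, differing only in `sphn_hasSharedIdentifier`) are hypothetical and chosen for illustration.

```python
from typing import Any


def unify_records(value: Any) -> Any:
    """Collapse a list of row objects into a single object.

    Hypothetical helper: anything that is not a non-empty list is
    returned unchanged, so single-valued records keep their shape.
    """
    if not isinstance(value, list) or not value:
        return value

    merged: dict[str, Any] = {}
    for record in value:
        for key, item in record.items():
            if key not in merged:
                # First time this field is seen: keep the value as-is.
                merged[key] = item
            elif merged[key] != item:
                # Field differs between rows: collect the distinct values.
                # Simplification: a field that is already a list is extended.
                existing = merged[key] if isinstance(merged[key], list) else [merged[key]]
                if item not in existing:
                    existing.append(item)
                merged[key] = existing
    return merged


# Assumed example: two rows for the same id with different shared identifiers.
rows = [
    {"id": "p1", "sphn_hasSharedIdentifier": "A"},
    {"id": "p1", "sphn_hasSharedIdentifier": "B"},
]
unified = unify_records(rows)
```

In this sketch, fields that are identical across rows (such as `id`) stay scalar, and only the fields that differ become lists, so the top-level value remains an object, which is the shape the existing RML mapping rules expect.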