Add subset parameter to /Create_project endpoint and adapt logic to generate schema based on subset
First draft for the filtering of SPHN/project concepts based on a configuration file. Let me know if you have any comments or suggestions, or if there is anything you would like done differently. Thank you.
- New upload field in the `/Create_project` endpoint named `concepts_subset`.
- The field expects a one-column CSV file without a header. For example:
https://biomedit.ch/rdf/sphn-schema/sphn#Allergy
https://biomedit.ch/rdf/sphn-schema/sphn#SourceSystem
https://biomedit.ch/rdf/sphn-schema/sphn#SemanticMapping
https://biomedit.ch/rdf/sphn-schema/sphn#Sample
https://biomedit.ch/rdf/sphn-schema/sphn#Interpretation
https://biomedit.ch/rdf/sphn-schema/sphn#SourceData
https://biomedit.ch/rdf/sphn-schema/sphn#SemanticMapping
sphn:Nationality
sphn:SubjectPseudoIdentifier
sphn:TestConcept
https://biomedit.ch/rdf/sphn-schema/lucid#TestConcept
https://biomedit.ch/rdf/sphn-schema/lucid#QuestionnaireEvent
- The code checks the provided CSV file and rejects it if any entry does not start with `https://` or `sphn:`.
- The list is parsed and passed to the `rml_generator` script, where the concepts are filtered based on the provided list. The schema is generated with the concepts on the list, plus the default concepts `sphn:DataProvider`, `sphn:SubjectPseudoIdentifier`, `sphn:AdministrativeCase`, and `sphn:DataRelease`.
- See output examples in the ticket: https://git.dcc.sib.swiss/sphn-semantic-framework/workingfair/-/issues/1090
- New endpoint `/Check_concepts_subset`, which can be used to validate the CSV subset file. If the file is not valid, it raises an exception, just as the `/Create_project` endpoint does. Otherwise it returns the list of concepts that will be considered for the generated schema. Concepts from the file that are not part of the provided project's schema will not appear in that list.
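
For reviewers, the validation and filtering behaviour described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this merge request: the function names `parse_concepts_subset`, `expand` and `select_concepts` are hypothetical, and rejecting multi-column rows, dropping duplicate entries, and expanding the `sphn:` prefix to the full IRI before comparing are all assumptions made for the example.

```python
import csv
import io

# Default concepts that are always kept, as listed in the description.
DEFAULT_CONCEPTS = (
    "sphn:DataProvider",
    "sphn:SubjectPseudoIdentifier",
    "sphn:AdministrativeCase",
    "sphn:DataRelease",
)
ALLOWED_PREFIXES = ("https://", "sphn:")
# Assumption: "sphn:" abbreviates this namespace (taken from the example IRIs).
SPHN_NAMESPACE = "https://biomedit.ch/rdf/sphn-schema/sphn#"


def parse_concepts_subset(csv_text: str) -> list[str]:
    """Parse a one-column, header-less CSV and reject invalid entries."""
    concepts: list[str] = []
    invalid: list[str] = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue  # ignore blank lines
        entry = row[0].strip()
        # Assumption: rows with more than one column are treated as invalid.
        if len(row) != 1 or not entry.startswith(ALLOWED_PREFIXES):
            invalid.append(",".join(row))
        elif entry not in concepts:
            # Assumption: duplicates are dropped, first occurrence wins.
            concepts.append(entry)
    if invalid:
        raise ValueError(f"Invalid entries in concepts subset: {invalid}")
    return concepts


def expand(concept: str) -> str:
    """Expand the 'sphn:' prefix so both notations compare equal."""
    if concept.startswith("sphn:"):
        return SPHN_NAMESPACE + concept[len("sphn:"):]
    return concept


def select_concepts(schema_concepts: list[str], subset: list[str]) -> list[str]:
    """Keep schema concepts that are in the subset or among the defaults."""
    wanted = {expand(c) for c in (*subset, *DEFAULT_CONCEPTS)}
    return [c for c in schema_concepts if expand(c) in wanted]
```

In this sketch, an entry such as `sphn:TestConcept` passes the prefix check but is silently absent from the result of `select_concepts` when the project's schema does not define it, which mirrors the behaviour described for `/Check_concepts_subset`.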
From the docs: