Bioschemas profile URLs not matched as community-specific standard in R1.3 despite detection in I2 #581

@jeffstazer

Description

Bioschemas Dataset Profile namespaces present in JSON-LD metadata are not being matched as community-specific standards for metric FsF-R1.3-01M-1 ("Community specific metadata standard is detected using namespaces or schemas found in provided metadata"). The standard is detected under I2 (semantic resources) but not under R1.3 (community standards).

Context

The [MIDAS Catalog](https://catalog.midasnetwork.us/) publishes structured JSON-LD metadata for infectious disease modeling resources. Our metadata declares conformance to [Bioschemas Dataset Profile 1.0-RELEASE](https://bioschemas.org/profiles/Dataset/1.0-RELEASE) and includes the Bioschemas namespace in the @context:

"bioschemas": "https://bioschemas.org/profiles/Dataset/1.0-RELEASE"

We also include a conformsTo entry:

{
  "@type": "http://purl.org/dc/terms/Standard",
  "@id": "https://bioschemas.org/profiles/Dataset/1.0-RELEASE",
  "name": "Bioschemas Dataset Profile",
  "description": "A structured data markup specification for describing datasets in the life sciences, extending schema.org Dataset",
  "url": "https://bioschemas.org/profiles/Dataset/1.0-RELEASE",
  "standardType": "metadata schema"
}
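To make the two declarations above easier to inspect, here is a minimal sketch of how a consumer could collect candidate standard URIs from such a document. It is not F-UJI's code. The helper name `candidate_standard_uris`, the top-level key `conformsTo`, and the `schema` prefix in the sample are assumptions for illustration; only the Bioschemas URL and the `@type`/`@id` values come from the metadata shown above.

```python
def candidate_standard_uris(doc: dict) -> set:
    """Collect URIs declared as @context prefixes or as conformsTo entries."""
    uris = set()

    # @context may be a single object/string or a list of them.
    context = doc.get("@context", {})
    contexts = context if isinstance(context, list) else [context]
    for ctx in contexts:
        if isinstance(ctx, str):
            uris.add(ctx)
        elif isinstance(ctx, dict):
            # Keep only plain "prefix": "uri" mappings.
            uris.update(v for v in ctx.values() if isinstance(v, str))

    # conformsTo may be a single entry or a list; entries may be strings or nodes.
    conforms = doc.get("conformsTo", [])
    if not isinstance(conforms, list):
        conforms = [conforms]
    for entry in conforms:
        if isinstance(entry, str):
            uris.add(entry)
        elif isinstance(entry, dict) and isinstance(entry.get("@id"), str):
            uris.add(entry["@id"])
    return uris


# Sample document: the "schema" prefix and the "conformsTo" key are assumed.
doc = {
    "@context": {
        "schema": "https://schema.org/",
        "bioschemas": "https://bioschemas.org/profiles/Dataset/1.0-RELEASE",
    },
    "@type": "schema:Dataset",
    "conformsTo": [
        {
            "@type": "http://purl.org/dc/terms/Standard",
            "@id": "https://bioschemas.org/profiles/Dataset/1.0-RELEASE",
        }
    ],
}

uris = candidate_standard_uris(doc)
print(sorted(uris))
```

Both declarations resolve to the same full profile URL, which is the string any registry lookup would need to recognize.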

Observed behavior

When assessing https://catalog.midasnetwork.us/collection/350 (F-UJI v3.5.1, metric v0.8):

FsF-I2-01M (passes): F-UJI successfully detects the Bioschemas namespace among the active namespaces:

"namespace": "https://bioschemas.org/profiles/Dataset",
"is_namespace_active": true

and

"namespace": "https://bioschemas.org",
"is_namespace_active": true

FsF-R1.3-01M-1 (fails): Despite the above, F-UJI only identifies generic/multidisciplinary standards (Dublin Core, DCAT, Schema.org, PROV). Bioschemas is not matched. The debug log shows:

Found non-disciplinary standard (but RDA listed) -: via ns: Dublin Core - http://purl.org/dc/terms/
Found non-disciplinary standard (but RDA listed) -: via ns: DCAT ...
Found non-disciplinary standard (but RDA listed) -: via ns: Schema.org ...
Found non-disciplinary standard (but RDA listed) -: via ns: PROV ...

The log does not mention Bioschemas. As a result, only FsF-R1.3-01M-3 (multidisciplinary but RDA-listed) passes at maturity level 1, while FsF-R1.3-01M-1 (community-specific) scores 0.

Suspected cause

The issue appears to be a URL-matching gap. F-UJI builds its standards lookup dictionary from FAIRsharing and the RDA Metadata Standards Catalog (RDA MSC), normalizing URLs with strip('#/'), which trims only leading and trailing # and / characters. Two factors prevent a match:

  1. FAIRsharing registers Bioschemas at the project level ([FAIRsharing.20sbr9](https://fairsharing.org/FAIRsharing.20sbr9)), likely with a base URL such as https://bioschemas.org. Our namespace uses the full profile URL https://bioschemas.org/profiles/Dataset/1.0-RELEASE. Since the lookup appears to be exact dictionary key matching rather than prefix matching, these don't match.

  2. The RDA Metadata Standards Catalog does not include Bioschemas at all (checked at [rdamsc.bath.ac.uk/scheme-index](https://rdamsc.bath.ac.uk/scheme-index)), so there is no fallback path for a match.
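The first factor can be illustrated with a small sketch. It is not F-UJI's actual code: the `registry` dictionary, the `normalize` and `exact_lookup` helpers, and the registered base URL are assumptions that mirror the suspicion described above (exact key matching after strip('#/') normalization).

```python
def normalize(url: str) -> str:
    # str.strip removes only leading/trailing '#' and '/' characters;
    # it does not shorten the path.
    return url.strip("#/")


# Hypothetical registry keyed by normalized URL. The base URL is the one
# this issue guesses FAIRsharing holds for Bioschemas.
registry = {
    normalize("https://bioschemas.org/"): "Bioschemas",
    normalize("http://purl.org/dc/terms/"): "Dublin Core",
}


def exact_lookup(namespace: str):
    """Return the standard name only if the normalized URL is an exact key."""
    return registry.get(normalize(namespace))


profile_url = "https://bioschemas.org/profiles/Dataset/1.0-RELEASE"
print(exact_lookup("http://purl.org/dc/terms/"))  # matches after normalization
print(exact_lookup(profile_url))                  # None: no exact key for the profile URL
```

Under these assumptions, normalization handles trailing slashes and fragments but cannot bridge the difference between a registered base URL and a full profile URL.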

Expected behavior

Bioschemas profile URLs (e.g., https://bioschemas.org/profiles/Dataset/1.0-RELEASE) should match the Bioschemas entry in FAIRsharing and be classified as a community-specific (life sciences) standard for R1.3. This could be achieved by:

  • Using prefix matching when comparing namespace URLs against the standards registry (so that https://bioschemas.org/profiles/Dataset/1.0-RELEASE matches a registered https://bioschemas.org entry), or
  • Registering common Bioschemas profile URLs as additional known URLs for the Bioschemas standard entry
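The prefix-matching option could look roughly like the sketch below. It assumes a registry keyed by normalized URL; `normalize`, `prefix_lookup`, and the sample `registry` are hypothetical names and data, not part of F-UJI. The match requires a path or fragment boundary after the registered URL, so look-alike hosts do not match, and the longest registered prefix wins.

```python
def normalize(url: str) -> str:
    # Same normalization the issue describes: trim leading/trailing '#' and '/'.
    return url.strip("#/")


def prefix_lookup(namespace: str, registry: dict):
    """Return the standard whose registered URL is the longest prefix of namespace."""
    ns = normalize(namespace)
    best_key = None
    for key in registry:
        # Require an exact match or a '/' or '#' boundary right after the key.
        if ns == key or ns.startswith(key + "/") or ns.startswith(key + "#"):
            if best_key is None or len(key) > len(best_key):
                best_key = key
    return registry[best_key] if best_key is not None else None


# Hypothetical registry entries (keys already normalized).
registry = {
    "https://bioschemas.org": "Bioschemas",
    "http://purl.org/dc/terms": "Dublin Core",
}

print(prefix_lookup("https://bioschemas.org/profiles/Dataset/1.0-RELEASE", registry))
print(prefix_lookup("https://bioschemas.org.example.com/profile", registry))
```

Prefix matching trades precision for recall: it credits any URL under a registered base, so the second option (registering known profile URLs explicitly) may be preferable where a base URL hosts unrelated content.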

Why this matters

Bioschemas is a recognized community standard endorsed by ELIXIR and the European Research Council. The Bioschemas documentation itself recommends using full profile URLs in dct:conformsTo declarations (see [Bioschemas markup tutorial](https://bioschemas.org/tutorials/howto/howto_add_markup)). Implementers who follow this recommendation currently receive no credit for community standard usage in F-UJI assessments.

Reproduction

Environment
