v0.4.0 Release Notes

This release adds incremental features to both the Nexus Forge core and specialization modules. It also includes many fixes for issues raised by users.

New Features

Querying

  • Add cross-bucket search feature #71 #74 (issue:#63)(usage: 04 - Querying notebook on Github or on Binder)

If cross_bucket is set to True, resources can be retrieved by identifier from multiple buckets. In that case, a specific bucket can be targeted using the bucket argument.

Usage:

resource = forge.search(<resource_id>, cross_bucket=True, bucket=<str>)

Note

The configured store should support cross-bucket search.

  • Add support for targeting a specific search endpoint when searching data. Currently only SPARQL endpoints are supported #71 (issue:#63)(usage: 04 - Querying notebook on Github or on Binder)

A searchendpoints configuration can be added to the Store configuration.

"Store": {
    "name": <str>,
    "endpoint": <str>,
    "bucket": <str>,
    "token": <str>,
    "searchendpoints": {
        "<querytype>": {  # a query paradigm supported by the store (e.g. sparql)
            "endpoint": <str>  # an IRI of a query endpoint
        }
    },
    "versioned_id_template": <str>,
    "file_resource_mapping": <str>
}
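
As a rough illustration of the configuration above, the following sketch writes a Store configuration as a plain Python dictionary and looks up the endpoint registered for a query type. The helper get_search_endpoint, the example.org IRIs and the bucket name are assumptions made for illustration; they are not part of the Nexus Forge API.

```python
from typing import Optional

# Hypothetical Store configuration; every value here is a placeholder.
store_config = {
    "name": "BlueBrainNexus",
    "endpoint": "https://nexus.example.org/v1",
    "bucket": "myorg/myproject",
    "searchendpoints": {
        # One entry per query paradigm supported by the store.
        "sparql": {"endpoint": "https://nexus.example.org/sparql-view"},
    },
}


def get_search_endpoint(config: dict, query_type: str) -> Optional[str]:
    """Return the endpoint configured for query_type, or None if there is none."""
    return config.get("searchendpoints", {}).get(query_type, {}).get("endpoint")


print(get_search_endpoint(store_config, "sparql"))
```

A query type with no entry under searchendpoints simply yields None in this sketch; how the store itself falls back in that case is not covered here.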

Enhancements

Forge

  • Add support for creating a forge session from a configuration URL #91 (issue:#65)

forge = KnowledgeGraphForge(configuration="https://...config.json") # or config.yml

Querying

  • Add query param support when retrieving a resource #79

resource = forge.retrieve(id="https://..?rev=1&p=aparam")

Specifically, when the version argument is provided, as in:

resource = forge.retrieve(id="https://..?rev=1&p=aparam", version=2)

then it supersedes the rev query param, while the other query params (e.g. p=aparam) are kept as they are.
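
The precedence rule described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the function name apply_version and the example URL are hypothetical, and it assumes the store expresses a version through a rev query parameter.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def apply_version(resource_id: str, version: Optional[int] = None) -> str:
    """Return resource_id with its rev query param replaced by version, if given."""
    if version is None:
        # No explicit version: the identifier and its query params stay untouched.
        return resource_id
    parts = urlsplit(resource_id)
    # Keep every query param except rev, preserving their order.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "rev"
    ]
    # Assumption: the explicit version is sent as the rev query param.
    params.append(("rev", str(version)))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(apply_version("https://example.org/resources/1?rev=1&p=aparam", version=2))
```

In this sketch the rev=1 value from the identifier is dropped in favour of version=2, and p=aparam is carried over unchanged.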

Dataset

  • Dataset provenance assertion methods are now all aligned to accept either a Resource object or an identifier. #88 (usage: 02 - Dataset notebook on Github or on Binder)

Dataset(forge: KnowledgeGraphForge, type: str = "Dataset", **properties)
  add_contribution(resource: Union[str, Resource], versioned: bool = True, **kwargs) -> None
  add_generation(resource: Union[str, Resource], versioned: bool = True, **kwargs) -> None
  add_derivation(resource: Union[str, Resource], versioned: bool = True, **kwargs) -> None
  add_invalidation(resource: Union[str, Resource], versioned: bool = True, **kwargs) -> None

When a Resource is provided, forge.freeze will add the resource’s version to its reference within the dataset, enabling an immutable identifier. forge.freeze is not supported when a resource identifier (str) is provided.
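
To make the difference between the two accepted argument types concrete, here is a small self-contained sketch. The Resource dataclass below is a hypothetical stand-in rather than the kgforge class, and make_reference, frozen_id and the ?rev= identifier form are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Resource:
    """Hypothetical stand-in for a forge Resource; not the kgforge class."""
    id: str
    version: Optional[int] = None


def make_reference(resource: Union[str, Resource]) -> dict:
    """Build a reference from a Resource object or a bare identifier."""
    if isinstance(resource, Resource):
        # A Resource carries its version, so the reference can be frozen later.
        return {"id": resource.id, "version": resource.version}
    # A bare identifier carries no version information.
    return {"id": resource, "version": None}


def frozen_id(reference: dict) -> str:
    """Return a versioned identifier when a version is known (assumed ?rev= form)."""
    if reference["version"] is None:
        # Nothing to freeze: in this sketch the identifier is left unchanged.
        return reference["id"]
    return f"{reference['id']}?rev={reference['version']}"


agent = Resource(id="https://example.org/agents/1", version=3)
print(frozen_id(make_reference(agent)))
print(frozen_id(make_reference("https://example.org/agents/1")))
```

Only the reference built from a Resource object ends up with a version in its identifier, which mirrors why freezing is unavailable for plain string identifiers.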

Bug Fixes

Forge

  • Fix has no attribute ‘_rev’ error when calling forge.freeze on a Dataset object. #81 #88 (issue:#80)

  • Fix RuntimeError occurring when calling nest_asyncio.apply() outside a Jupyter notebook session #62 (issue:#60)

  • Fix failing forge.download when the provided path (e.g. distribution.contentUrl) is not present in all distribution JSON objects #85 (issue:#84)(usage: 02 - Dataset notebook on Github or on Binder). This copes with the schema.org/distribution specification, which supports either distribution.contentUrl or distribution.url for recording the location of a dataset.
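
The situation addressed by the forge.download fix can be pictured with plain dictionaries: some distribution objects record their location under contentUrl and others under url, so a path that exists in one may be missing in another. The sketch below is an assumption-laden illustration, not the library's code; collect_values and the example.org locations are hypothetical.

```python
from typing import List


def collect_values(distributions: List[dict], key: str) -> List[str]:
    """Return the value of key for each distribution that has it, skipping the rest."""
    return [dist[key] for dist in distributions if key in dist]


# Hypothetical distribution JSON objects using the two location properties.
distributions = [
    {"name": "a.csv", "contentUrl": "https://example.org/files/a.csv"},
    {"name": "b.csv", "url": "https://example.org/files/b.csv"},
]

print(collect_values(distributions, "contentUrl"))
print(collect_values(distributions, "url"))
```

Skipping distributions that lack the requested key, instead of raising, is the behaviour this sketch is meant to convey.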

Resolving

  • Fix OntologyResolver to resolve ontology terms when notation and prefLabel properties are missing #76 (issue:#69)(usage: 09 - Resolving notebook on Github or on Binder)

Querying

  • Fix BlueBrainNexus store file retrieval support #75 (issue:#64)

Converting

  • Fix converting a list of Resource elements to an RDFLib Graph using forge.as_graph, avoiding the graph merge that results from using graph set operations (such as +) #87