The Big Picture – A list of Multimedia Ontologies for MPEG-7 (part 2 of 2)

Photo: William Bout

A list of Multimedia Ontologies for MPEG-7

This is a continuation of the post about what multimedia ontologies are, and the requirements one should try to apply.

On the basis of the requirements for multimedia ontologies, here follows a list of ontologies for describing MPEG-7 multimedia and multimedia content.

M3O – Multimedia Metadata Ontology

M3O sets out to solve the problem that existing metadata models are too narrow and tied to specific media types, so they cannot be used together to describe multimedia presentations. Unlike existing metadata models, M3O is not locked into particular media types and allows the features of the different models and standards in use today to be integrated (Dasiopoulou et al., 2010).

M3O is based on DOLCE+DnS Ultralight and uses a generic modelling framework to represent multimedia metadata by adopting existing metadata models and metadata standards. It builds on five Ontology Design Patterns (ODPs) representing data structures:

  • Identification of resource.
  • Separation of information objects and realisations.
  • Annotation of information objects and realisations.
  • Decomposition of information objects and realisations.
  • Representation of provenance information.

M3O is aligned with COMM (see below), MRO (see below) and EXIF, and the ontology is targeted at multimedia presentations on the web. M3O has medium quality documentation and high code clarity. Its drawback is the missing options for annotations (Suárez-Figueroa, Ghislain & Corcho, 2013).

Harmony MPEG-7 based ontology

Translations of MPEG-7 definitions follow the original MPEG-7 schema, where content and segments are modelled as classes. Entities can have more than one semantic interpretation. However, this leads to ambiguities when interpretations travel to other parts of the ontology (Dasiopoulou et al., 2010).

MRO – Media Resource Ontology

Developed by the W3C Media Annotation Working Group, the MRO defines a set of minimal annotation properties for describing multimedia content together with a set of mappings between the 23 main metadata formats in use on the Web (IPTC, MPEG-7, XMP, Dublin Core, EXIF 2.2, DIG35, Media RSS, TV-Anytime and YouTube API Protocol, among others).

MRO maps multimedia metadata to ontology elements describing: identification, content description, relational, copyright, distribution, parts, and technical properties. It has strong interoperability among many metadata formats, along with ontology properties describing media resources. MRO is used for annotation and analysis and has high quality documentation and high code clarity. MRO drawbacks are missing options to create annotations (Suárez-Figueroa, Ghislain & Corcho, 2013).
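The mapping idea can be sketched in a few lines of Python. This is a minimal illustration and not the W3C mapping tables: the core property names on the right (`title`, `creator`, `createDate`) and the format-specific field names on the left are assumptions chosen for readability, and the sample records are made up.

```python
# Illustrative only: a tiny, hand-written subset of format-to-core mappings.
# The real MRO mapping tables are far larger; all field names are assumptions.
MAPPINGS = {
    "dublin_core": {
        "dc:title": "title",
        "dc:creator": "creator",
        "dc:date": "createDate",
    },
    "media_rss": {
        "media:title": "title",
        "media:credit": "creator",
    },
}


def to_core(fmt: str, record: dict) -> dict:
    """Translate a format-specific metadata record into core properties.

    Fields without a known mapping are dropped rather than guessed.
    """
    table = MAPPINGS[fmt]
    return {table[key]: value for key, value in record.items() if key in table}


# Two made-up records describing the same image in different formats.
dc_record = {
    "dc:title": "Harbour at dusk",
    "dc:creator": "A. Photographer",
    "dc:format": "image/jpeg",  # no mapping in this sketch, so it is dropped
}
rss_record = {
    "media:title": "Harbour at dusk",
    "media:credit": "A. Photographer",
}

# Both records end up with the same core description, which is the point of
# having a shared set of minimal annotation properties.
print(to_core("dublin_core", dc_record))
print(to_core("media_rss", rss_record))
```

Once both records are expressed in the same core vocabulary they can be compared or merged directly, which is roughly what the interoperability claim above amounts to in practice.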

COMM – Core Ontology for Multimedia

The intention of COMM is to ease multimedia annotation and to satisfy the formal requirements (defined by the creators of COMM) for a high quality multimedia ontology: MPEG-7 compliance, semantic interoperability, syntactic interoperability, separation of concerns, modularity and extensibility. COMM is used for annotation and has a modular design that facilitates extension with other ontologies. It is based on DOLCE and implemented in OWL DL. It uses the design patterns Descriptions and Situations (DnS) for contextualisation and Ontology for Information Object (OIO) for information objects (Suárez-Figueroa, Ghislain & Corcho, 2013). The four design patterns are:

    • Decomposition: structural and localisation information.
    • Content Annotation: attaching metadata to content/segments.
    • Media Annotation: mapping physical instances of content in multimedia.
    • Semantic Annotation: connecting content instances to domain specific descriptions.

(Dasiopoulou et al., 2010)

COMM has high quality documentation and code clarity. Its main drawbacks are the missing options to create disjoints or to set the domain or range of properties (Suárez-Figueroa, Ghislain & Corcho, 2013). COMM covers structural, localisation and media description schemas, as well as low-level descriptors of the visual part, with room for information about the algorithms and parameters used to extract descriptions (Dasiopoulou et al., 2010).

  • OWL DL ontology.
  • Designed manually.
  • Based on the foundational ontology DOLCE.
  • Viewed using Protégé and validated using FaCT++ v1.1.5.
  • Upper-level ontology providing a domain-independent vocabulary that explicitly includes formal definitions of foundational categories.
  • Eases linkage of domain-specific ontologies because of the definition of top level concepts.
  • Covers the most important parts of MPEG-7 used for describing structure and content.

Hunter’s MPEG-7 Ontology

  • Extended and harmonised using the ABC upper ontology for applications in the digital libraries and eResearch fields.
  • OWL Full ontology containing classes defining media types and decompositions from the MPEG-7 Multimedia Description Schemes.
  • Can be viewed in Protégé and validated using the WonderWeb OWL Validator. Used for describing decomposition of images and their visual descriptors.
  • For use in larger semantic frameworks.
  • The ability to query abstract concepts is a result of being harmonised with upper ontologies such as ABC.

The following ontologies (MPEG-7 upper MDS, MPEG-7 Tsinaraki, MSO and VDO, and MPEG-7 Rhizomik) are the results of transforming the MPEG-7 standard to ontology languages based on a monolithic design.

MPEG-7 upper MDS

The aim of the MPEG-7 upper MDS ontology is to be reused by other parties for exchanging multimedia content through MPEG-7, using the upper part of the Multimedia Description Scheme (MDS) of the MPEG-7 standard. It is used for annotation and analysis and is written in OWL Full. The MPEG-7 upper MDS has low quality documentation and low code clarity. Its main drawback is the missing options to assert inverse relationships (Suárez-Figueroa, Ghislain & Corcho, 2013).

MPEG-7 Tsinaraki

Built using OWL DL, MPEG-7 Tsinaraki spans the MPEG-7 MDS, the classification schemes and parts of the MPEG-7 Visual and Audio Parts. It is used for annotation, retrieval and filtering in digital libraries and has low quality documentation and medium code clarity. Its main drawback is that it uses different naming criteria (Suárez-Figueroa, Ghislain & Corcho, 2013).

  • Written in OWL DL and captures the semantics of the MPEG-7 MDS (Multimedia Description Schemes) and the Classification Schemes.
  • Visualised with GraphOnto or Protégé. Validated and classified with the WonderWeb OWL Validator.
  • Integrated with OWL domain ontologies for football and Formula 1.
  • Used in many applications, including audiovisual digital libraries and e-learning. The XML Schema simple data-types defined in MPEG-7 are stored in a separate XML Schema to be imported in the DS-MIRF ontology.
  • XML element names are generally kept in the rdf:IDs of the corresponding OWL entities, except when two different XML Schema constructs have the same name.
  • The mapping ontology also captures the semantics of the XML Schemas that cannot be mapped to OWL constructs making it easy to return to the original MPEG-7 description from the RDF metadata.
  • The original XML Schema is converted into a main OWL DL ontology, and an OWL DL mapping ontology keeps track of the constructs mapped, allowing for conversions later on.
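The role of the mapping ontology in the list above can be pictured with a small Python sketch. It is a toy under stated assumptions, not the DS-MIRF implementation: the renaming rule (appending the construct kind on a name clash), the helper names `build_ids` and `original_construct`, and the sample constructs are all hypothetical.

```python
def build_ids(constructs):
    """Assign an OWL identifier to each (kind, name) XML Schema construct.

    The XML name is reused as the identifier unless it is already taken, in
    which case the construct kind is appended (a hypothetical renaming rule).
    The returned table plays the role of the mapping ontology: it records
    which original construct every identifier came from.
    """
    mapping = {}  # owl_id -> (kind, original_name)
    for kind, name in constructs:
        owl_id = name if name not in mapping else f"{name}_{kind}"
        if owl_id in mapping:
            # The toy rule cannot resolve a second clash; fail loudly.
            raise ValueError(f"cannot find a free identifier for {name!r}")
        mapping[owl_id] = (kind, name)
    return mapping


def original_construct(mapping, owl_id):
    """Look up the XML Schema construct an OWL identifier was derived from."""
    return mapping[owl_id]


# Made-up input: two different constructs share the name "Creator".
constructs = [
    ("complexType", "VideoSegmentType"),
    ("element", "Creator"),
    ("attribute", "Creator"),
]

mapping = build_ids(constructs)
for owl_id, source in mapping.items():
    print(owl_id, "<-", source)
```

Because every identifier can be traced back to its source construct, a description expressed with the OWL names can later be turned back into the original XML vocabulary, which is the property the last two bullets describe.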

MSO – Multimedia Structure Ontology

The aim of MSO is to support audiovisual content analysis and object/event recognition, to create knowledge beyond object and scene recognition through reasoning, and to enable user-friendly and intelligent search and retrieval. MSO covers MPEG-7 MDS and combines high level domain concepts and low level multimedia descriptions, enabling new content analysis.

The purpose of many tools using MSO is to automatically analyse content, create new metadata and support intelligent content search and information retrieval. MSO has medium quality documentation and high code clarity. Its reliability pitfalls lie in the difficulty of merging concepts in the same class and in the missing options for creating disjoints (Suárez-Figueroa, Ghislain & Corcho, 2013).

MSO largely follows the Harmony Ontology. However, MPEG-7 attributes can carry multiple interpretations, and to map these explicitly (for instance to differentiate between frames and keyframes, which helps in prioritising what to search for) MSO introduces new classes and properties not present in the Harmony Ontology. Unlike Harmony, MSO modularises structural and low-level descriptions by splitting the definitions into two ontologies, which makes it easier to model domain specific ontologies by linking them together (Dasiopoulou et al., 2010).

VDO – Visual Descriptor Ontology

Although labelled as a visual ontology and not specifically for multimedia, VDO (available in RDF(S) and aligned to DOLCE) uses the MPEG-7 standard for automatic semantic analysis of multimedia content, similar to MSO. VDO has high quality documentation and medium code clarity, with some reliability pitfalls around merging concepts in the same class and no options for creating annotations (Suárez-Figueroa, Ghislain & Corcho, 2013).

MPEG-7 Rhizomik

The MPEG-7 Rhizomik ontology, in contrast to MSO, VDO and Harmony, assists in automatically translating the MPEG-7 standard to OWL via XSD2OWL and XML2RDF mappings, and covers the complete MPEG-7 standard. Although generally good for automation, it is regarded as challenging to connect to domain ontologies and to deal with naming conflicts.

In principle, linkage to domain ontologies would make MPEG-7 Rhizomik dovetail well with the Semantic DSs; in practice this is difficult because the ontologies use conflicting naming criteria. MPEG-7 Rhizomik’s strict conceptualisation model requires remapping of existing definitions to merge with the MPEG-7 model (Dasiopoulou et al., 2010).

MPEG-7 Rhizomik has low quality documentation and low code clarity. Its main drawbacks are missing domain or range in properties, usage of different naming criteria, the same URI for different ontology elements, and difficulty in merging concepts in the same class (Suárez-Figueroa, Ghislain & Corcho, 2013).

  • Maps XML Schema constructs to OWL constructs following generic XML Schema to OWL together with an XML to RDF conversion.
  • Covers the whole standard and the Classification Schemes and TV Anytime. Visualised with Protégé or Swoop and validated/classified using the WonderWeb OWL Validator and Pellet.
  • Corresponding elements are both defined as containers of complex types and simple types. Automatic mapping of XML Schemas to OWL ontologies via ReDeFer.
  • Used with other large XML Schemas in the Digital Rights Management domain like MPEG-21, ODRL and the E-Business domain.
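To give a feel for what mapping XML Schema constructs to OWL constructs means, here is a minimal Python sketch. It is not ReDeFer or XSD2OWL itself: it assumes just two rules (a named `complexType` becomes an `owl:Class`, and an `extension` becomes `rdfs:subClassOf`), and the inline schema is a simplified, made-up fragment rather than the real MPEG-7 schema.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Simplified, made-up schema fragment (namespace prefixes on types omitted).
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="SegmentType"/>
  <xs:complexType name="VideoSegmentType">
    <xs:complexContent>
      <xs:extension base="SegmentType"/>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
"""


def xsd_to_triples(xsd_text):
    """Translate named complex types into (subject, predicate, object) triples.

    Assumed rules for this sketch only:
      named complexType -> owl:Class
      extension base    -> rdfs:subClassOf
    Everything else in the schema is ignored.
    """
    root = ET.fromstring(xsd_text.strip())
    triples = []
    for ctype in root.iter(f"{XS}complexType"):
        name = ctype.get("name")
        if name is None:  # anonymous types are skipped in this sketch
            continue
        triples.append((name, "rdf:type", "owl:Class"))
        for ext in ctype.iter(f"{XS}extension"):
            triples.append((name, "rdfs:subClassOf", ext.get("base")))
    return triples


for triple in xsd_to_triples(SCHEMA):
    print(triple)
```

A purely structural translation like this is easy to automate, but it also carries every naming decision of the XML Schema straight into the ontology, which fits the naming-conflict drawbacks noted above.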

Conclusion

Photo: Jason Leung

There have been many more attempts by the academic world to crack the multimedia ontology nut, and this has created more or less heterogeneous solutions. In this article we have given a quick overview of the majority of ontologies based on the MPEG-7 standard, listing their main features, technical properties and drawbacks.

There exist many types of metadata and metadata standards, but MPEG-7 is the most widespread multimedia standard, although it too is constrained by the use of XML and the interoperability problems this presents when mapping syntactic data and semantics, i.e. a lack of standardised correspondence between XML schema for definitions and RDF Semantic Web languages.

We find that many ontologies have been built to bridge the semantic gap, but no single ontology fits all scenarios or formats. Semantic elements are far from always properly defined and many alternative ways to model the same descriptions exist, making it difficult to validate or make ontologies understandable by all software. A way to solve ambiguities could be to use a limited set of description tools.

Ontologies wanting to provide full interoperability will have to provide full coverage of the MPEG-7 features, leading to flexible structures, whereas ontologies for reasoning will have to enforce a more rigorous structure, which can become inflexible. It is also worth noting that if metadata is expected to carry semantics this could lead to verbose and large files, which in turn can make the information redundant.

Nonetheless, the COMM ontology highlights the significance of formally founded standardised description models and shows promising results by using a modular multimedia ontology based on an upper ontology (DOLCE), making it extensible and easy to integrate with domain ontologies.

References
