Requirements for Successfully Communicating Terms and Statements

Too Long; Didn't Read

This paper explores a machine-actionable Rosetta Stone Framework for (meta)data, which uses reference terms and schemata as an interlingua to minimize mappings and crosswalks.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Vogt, Lars, TIB Leibniz Information Centre for Science and Technology;

(2) Konrad, Marcel, TIB Leibniz Information Centre for Science and Technology;

(3) Prinz, Manuel, TIB Leibniz Information Centre for Science and Technology.

Requirements for successfully communicating terms and statements

What is needed to communicate terms and statements efficiently and reliably? For communication between human beings to be successful, both the sender and the receiver of the information need to share the same relevant background knowledge. For a given term, sender and receiver have to share the same lexical competences. They need to share the same inferential lexical competences (35) and thus knowledge about the meaning of a given term in the form of an ontological definition that answers the question ‘What is it?’ (36). Inferential lexical competences are needed for the human-interpretability of terms.


Sender and receiver also need to share the same referential lexical competences (35) and thus diagnostic knowledge about the reference of a given term. A proper name refers to an individual entity (i.e., a particular) and a general term or kind term to a set of individuals that meet the defining properties of the term, sometimes also called its extension. A verb or predicate refers to a specific type of action or attribute. Diagnostic knowledge is often communicated in the form of method-dependent recognition criteria, images, or exemplars that answer the question ‘How does it look, how to recognize or identify it?’ (36), enabling the receiver to use the term correctly in designation (i.e., the object is given, and the matching term has to be found) or recognition tasks (i.e., the term is given, and the matching object has to be identified). Referential lexical competences are needed to use a term correctly in different contexts, and thus for the human-actionability of terms.


Given that the meaning of terms is provided by their ontological definitions, one could argue that terms are only placeholders, i.e., surrogates, for these definitions and thus statements, and that only statements carry meaning. In other words, the communication of meaning requires more than a single term, but rather several terms placed in context and related by predicates. The communication of meaning thus requires statements. Statements carry meaning in addition to the meaning of the terms that compose them. This becomes obvious when the positions of terms in a given sentence are changed, such as “Peter travels from Berlin to Paris” versus “Peter travels from Paris to Berlin”—the same set of terms carries two different meanings. Therefore, for the efficient and reliable communication of statements, the sender and the receiver of the information must share a set of rules and conventions for formulating sentences using terms.
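The point about word order can be made concrete with a minimal sketch. It assumes that a sentence can be approximated as a list of whitespace-separated tokens, which is a simplification introduced here for illustration and not part of the framework itself.

```python
# Minimal sketch: the two example sentences contain exactly the same terms,
# yet they are different statements because the terms occupy different positions.
sentence_a = "Peter travels from Berlin to Paris".split()
sentence_b = "Peter travels from Paris to Berlin".split()

# Comparing the sorted token lists ignores position (same multiset of terms).
same_terms = sorted(sentence_a) == sorted(sentence_b)

# Comparing the lists directly takes position into account.
same_statement = sentence_a == sentence_b

print(same_terms, same_statement)
```

Any representation that discards position, such as a plain set of terms, would therefore lose part of the meaning of the statement.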


But how is the meaning of a statement represented in the human brain? Understanding how the human brain creates cognitive representations of semantic content is a very active area of research (38). The human brain is a highly interconnected complex system that is continually influenced by input signals from the body and the world, so that a given neuron does not function in isolation but is substantially influenced by its neural context (39). It is therefore not surprising that there is evidence for at least two forms of object knowledge representation in the human brain, supported by different brain systems, i.e., motor-sensory-derived and conceptual-cognition-derived knowledge (40,41), and that lexical concepts are stored as patterns of neural activity that are encoded in multiple areas of the brain, including taxonomic and distributional structures as well as experience-based representational structures that encode information about sensory-motor, affective, and other features of phenomenal experience (42). These findings suggest that the cognitive representation of the meaning of a statement is likely to take the form of a complex network of associations, analogous to a multidimensional mind-map. Thus, when attempting to communicate a statement, the sender must first translate this multidimensional mind-map into a one-dimensional sequence of terms, i.e., a sentence. This translation step is supported by a set of syntactic and grammatical conventions, shared by the sender and the receiver, for formulating sentences using terms.


According to the predicate-argument-structure of linguistics (43,44), the main verb of a statement, together with its auxiliaries, forms the statement’s predicate. A predicate has a valence that determines the number and types of arguments it requires to complete its meaning. Adjuncts may be additionally related to the predicate, but they are not necessary to complete the predicate’s meaning. Adjuncts provide optional information, such as a time specification in a parthood statement. Therefore, every statement has a subject phrase as one of its arguments, and can have one or more object phrases as further arguments and additional adjuncts, depending on the underlying predicate.
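The distinction between required arguments and optional adjuncts can be sketched as a small data structure. The class names, slot names, and the hand/thumb example are hypothetical and chosen only for illustration; the sketch is not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Predicate:
    name: str
    required: tuple        # argument slots needed to complete the meaning
    optional: tuple = ()   # adjunct slots that may be added

    @property
    def valence(self):
        # Number of arguments the predicate requires.
        return len(self.required)


@dataclass
class Statement:
    predicate: Predicate
    arguments: dict
    adjuncts: dict = field(default_factory=dict)

    def is_well_formed(self):
        # All required slots are filled, and only known adjunct slots are used.
        args_ok = set(self.arguments) == set(self.predicate.required)
        adjuncts_ok = set(self.adjuncts) <= set(self.predicate.optional)
        return args_ok and adjuncts_ok


# Hypothetical parthood predicate: two arguments plus an optional time adjunct.
has_part = Predicate("has part", required=("subject", "object"), optional=("time",))

with_time = Statement(has_part, {"subject": "hand", "object": "thumb"}, {"time": "2023"})
without_time = Statement(has_part, {"subject": "hand", "object": "thumb"})
missing_object = Statement(has_part, {"subject": "hand"})
```

In this sketch, a statement remains well formed when the adjunct is omitted, but not when a required argument is missing.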


In the syntax of a statement, each argument and adjunct of a predicate can be understood as having a specific position with a specific semantic role. This is related to Kipper et al.’s (45) verb lexicon VerbNet, where they extend Levin verb classes (46) to include abstract representations of syntactic frames for each class, with explicit correspondences between syntactic positions (i.e., positions in a syntax tree) and the semantic roles (i.e., thematic roles sensu (45)) that these positions express (Fig. 2B). Each verb class lists the semantic roles allowed by the predicate-argument-structure of the instances of the class and the basic syntactic frames shared by the instances.


The list of arguments of a predicate-argument-structure can be described by a list of thematic labels taken from a set of predefined possible labels (e.g., AGENT, PATIENT, THEME, etc.), and the syntactic frames are represented by an ordered sequence of such thematic labels. The thematic labels function as descriptors of semantic roles that are mapped onto positions in a given syntactic frame (45).
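A syntactic frame understood as an ordered sequence of thematic labels can be sketched as follows. The frame shown is a hypothetical one for a travel-like verb; the labels SOURCE and DESTINATION are assumptions made for this example and are not copied from a VerbNet entry.

```python
# Hypothetical syntactic frame: an ordered sequence of thematic labels.
TRAVEL_FRAME = ("AGENT", "SOURCE", "DESTINATION")


def assign_roles(frame, phrases):
    """Map each phrase onto the thematic label at the same position in the frame."""
    if len(frame) != len(phrases):
        raise ValueError("number of phrases does not match the frame")
    return dict(zip(frame, phrases))


roles_a = assign_roles(TRAVEL_FRAME, ["Peter", "Berlin", "Paris"])
roles_b = assign_roles(TRAVEL_FRAME, ["Peter", "Paris", "Berlin"])
```

The same three terms receive different semantic roles depending on the position they occupy, which is how the frame contributes meaning beyond the terms themselves.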


In PropBank, Palmer et al. (47) define semantic roles at the level of individual verbs by numbering the arguments of the verb, with the first argument generally taking the semantic role of an AGENT and the second argument typically, but not necessarily, taking the semantic role of a PATIENT or a THEME. For higher numbered arguments, there are no consistently defined roles.
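Verb-specific argument numbering can be illustrated with a small lookup table. The role descriptions below are hypothetical and written in the spirit of this numbering scheme; they are not copied from the PropBank resource.

```python
# Hypothetical per-verb role tables; the descriptions are illustrative only.
ROLESETS = {
    "give": {0: "giver", 1: "thing given", 2: "recipient"},
    "break": {0: "breaker", 1: "thing broken", 2: "instrument"},
}


def label_arguments(verb, args):
    """Pair each argument with its numbered label and the verb-specific role."""
    roles = ROLESETS[verb]
    return {f"ARG{i}": (roles[i], arg) for i, arg in enumerate(args)}


labelled = label_arguments("give", ["Mary", "a book", "John"])
```

The first two positions behave alike across both verbs (an agent-like and a patient- or theme-like role), whereas the meaning of the third position has to be looked up per verb.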


Syntax trees, with their different syntactic positions and associated semantic roles, contribute substantially to the meaning of their sentences. They are used to translate a web of ideas in the mind of a sender into a string of words that a receiver can understand and translate back into a web of ideas (48). The clearer the semantic roles of the different positions are, the easier it is for a human being to understand the information. In a sense, we can understand syntax trees as the first knowledge graphs created by humans, and their use seems to be quite straightforward, providing a structure that is interoperable with human cognitive conditions, thus satisfying the need for cognitive interoperability.


In summary, whenever information needs to be communicated efficiently and reliably, the sender and receiver of the information must not only share the same inferential and referential lexical competences regarding the terms used in their communication, but also the same set of syntactic and grammatical conventions for formulating sentences with these terms, resulting in the same syntax tree in both sender and receiver.


Figure 2: Parallels between natural language statements and data schemata. The natural language statement in A) is structured by syntactical and grammatical conventions into syntactic positions of phrases of a syntax tree as shown in B) or


The knowledge shared by the sender and the receiver of the message ensures that the message is:


1. readable: The receiver must be able to identify the basic units of information in a message, i.e., where a term or a statement begins and where it ends;


2. interpretable: The receiver must be able to understand each term and statement and thus recognize its meaning;


3. actionable: The receiver must be able to correctly apply terms and statements from the message, including correctly designating or recognizing referents of terms and correctly placing a statement in context.
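The three properties listed above can be sketched as three checks that a receiver performs against its own background knowledge. Everything in the sketch is an assumption made for illustration: the semicolon-delimited message format, the definitions, the referent identifiers, and the frame labels are hypothetical and do not come from the paper.

```python
# Hypothetical background knowledge held by the receiver.
DEFINITIONS = {  # inferential competence: what is it?
    "Peter": "a person",
    "Berlin": "a city in Germany",
    "Paris": "a city in France",
}
REFERENTS = {  # referential competence: which entity does the term pick out?
    "Peter": "person-001",
    "Berlin": "city-001",
    "Paris": "city-002",
}
FRAME = ("AGENT", "SOURCE", "DESTINATION")  # shared convention for term positions


def read(message):
    """Readable: find where each term begins and ends; None if that fails."""
    parts = message.split(";")
    if len(parts) != len(FRAME) or not all(parts):
        return None
    return parts


def interpret(terms):
    """Interpretable: the receiver knows the meaning of every term."""
    return all(term in DEFINITIONS for term in terms)


def act(terms):
    """Actionable: every term resolves to a referent and is placed in its role."""
    if not all(term in REFERENTS for term in terms):
        return None
    return {role: REFERENTS[term] for role, term in zip(FRAME, terms)}


terms = read("Peter;Berlin;Paris")
```

A message can fail at any of the three stages independently: it may lack recognizable boundaries, contain a term the receiver cannot interpret, or contain a term the receiver cannot resolve to a referent.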