Graphs in the 2020s: Databases, Platforms and The Evolution of Knowledge

February 4th 2020
George Anadiotis (@linked_do)

George's got tech, data, and media, and he's not afraid to use them.

Graphs, and knowledge graphs, are key concepts and technologies for the 2020s. What will they look like, and what will they enable going forward?

What is a Knowledge Graph?

According to Wikipedia:
The Knowledge Graph is a knowledge base used by Google and its services to enhance its search engine's results with information gathered from a variety of sources. The information is presented to users in an infobox next to the search results.
We have been keeping track of the evolution of graphs since the early 2000s, and publishing the Year of the Graph newsletter since 2018. Graphs have numerous applications that span analytics, AI, and knowledge management.
All of the above are built on a common substrate: data. This is why graph databases are a key enabler for all graph applications, which in turn is why we have been dedicating extra effort to keeping track of the progress graph databases are making. To kick off the first Year of the Graph newsletter for the 2020s, we have a little bit of everything.
Use cases from eBay, GitHub, Google, and the UN. Updates and new releases from Arango, AWS, Cambridge Semantics, NebulaGraph, Neo4j, Ontotext, Oracle, and Stardog. New research and ideas.
Polyglot persistence, the term for choosing among data models and data management systems depending on the task at hand, is becoming the new normal. After relational, key-value, document, columnar, and time-series databases, the latest link in this evolutionary proliferation of data structures is graph.
Graph databases and knowledge graphs have been making waves and being included in hype cycles for the last couple of years. Their history goes way back, however, and this is just the beginning.
Knowledge graphs can address key challenges such as data governance but ultimately, they can serve as the digital substrate to unify the philosophy of knowledge acquisition and organization with the practice of data management in the digital age.
The terms knowledge graph and graph are sometimes used interchangeably. They should not be: they are two different things. As noted by Kurt Cagle in his Dictionary of Graph Terms, Knowledge Graphs are semantic graphs, explicitly bound with meaning.
This means that choosing a substrate that can facilitate dealing with semantics is a good idea. Here is how knowledge graph platforms are evolving to support different query languages, promoting interoperability, and trying to meet users where they are.
Knowledge graphs are among the most important technologies for the 2020s. Here is how they are evolving, with vendors and standards bodies listening, and platforms becoming fluent in many query languages.
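To make the distinction between a graph and a knowledge graph concrete, here is a minimal sketch in plain Python. It is an illustration only, not how any particular platform stores data: the "ex:" identifiers are hypothetical, while the "schema:" names stand for terms from the schema.org vocabulary.

```python
# A plain graph: edges only say that two nodes are connected.
plain_edges = {("alice", "acme"), ("acme", "berlin")}

# A semantic graph: every edge is a (subject, predicate, object) triple,
# and predicates and types come from a shared vocabulary.
# "ex:" names are hypothetical; "schema:" names refer to schema.org terms.
triples = {
    ("ex:alice", "rdf:type", "schema:Person"),
    ("ex:acme", "rdf:type", "schema:Organization"),
    ("ex:alice", "schema:worksFor", "ex:acme"),
    ("ex:acme", "schema:location", "ex:berlin"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def instances_of(graph, type_name):
    """Return every subject declared to be of type `type_name`."""
    return {s for s, p, o in graph if p == "rdf:type" and o == type_name}


employers = objects(triples, "ex:alice", "schema:worksFor")
people = instances_of(triples, "schema:Person")
```

In the plain graph there is no way to ask who works where, because the edges carry no meaning. In the triple version the same question is a lookup on a named predicate.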
Neo4j just released what it dubs “the most significant product release in the graph technology market to date”. Whether that holds true is up to you to decide. What is true, however, is that Neo4j 4.0 addresses some chronic pain points.
Somewhat paradoxically at first blush, this ties back to the evolution of knowledge graph platforms. Neo4j takes a page from graph database history to add to its own evolution.
In its new release, Neo4j addresses key concerns for enterprise adoption. Scalability, security, management and architectural changes are here. And so is a strange feeling of déjà vu.
More graph database news: Oracle offers its graph database and graph analytics products free of charge to holders of an Oracle product license. AWS has implemented a flurry of new features, and lists them all in one place. A new open source graph database, NebulaGraph, is getting some attention. And ArangoDB posits that multi-model can deal with knowledge graph challenges.
This article describes some of the challenges and how a multi-model’s flexible data representation can address them.
Google did not only introduce the term knowledge graph to the world. It also employs key people in this space, and steers the evolution of what has unceremoniously become perhaps the most influential schema in the world: schema.org.
Recently, schema.org v.6 was released, featuring lots of small but useful improvements. As Aaron Bradley notes, the new types Guide and Recommendation may be of interest for digital marketing. As WooRank’s data shows, 28% of 20 million websites are already using structured data.
This page lists schema.org releases, most recent first.
A widely known early adopter of knowledge graphs is eBay. Here, some members of its engineering team share their insights. As they note, for eBay, the application/infrastructure knowledge graph is a heterogeneous property graph that improves architectural visibility, operational efficiency and developer productivity.
Learn how eBay’s architecture knowledge graph was developed; the benefits eBay has received from it; and the use cases we see now and in the future for this approach.
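As a rough illustration of what a heterogeneous property graph for applications and infrastructure might look like, here is a small sketch in plain Python. All node names, labels, edge types and properties are hypothetical and are not taken from eBay's system. It shows the kind of visibility question such a graph can answer: what is affected if a given component fails?

```python
from collections import defaultdict

# Nodes: id -> (label, properties). Heterogeneous means several node kinds.
nodes = {
    "checkout": ("Application", {"tier": 1}),
    "payments": ("Application", {"tier": 1}),
    "ledger-db": ("Database", {"engine": "postgres"}),
    "host-17": ("Host", {"region": "us-west"}),
}

# Edges: (source, type, target, properties).
edges = [
    ("checkout", "CALLS", "payments", {"protocol": "http"}),
    ("payments", "READS_FROM", "ledger-db", {}),
    ("ledger-db", "RUNS_ON", "host-17", {}),
]


def dependents(target):
    """Return all nodes that directly or transitively depend on `target`."""
    incoming = defaultdict(set)
    for src, _edge_type, dst, _props in edges:
        incoming[dst].add(src)
    seen, stack = set(), [target]
    while stack:
        for src in incoming[stack.pop()]:
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


# Applications affected if host-17 goes down.
affected_apps = {n for n in dependents("host-17") if nodes[n][0] == "Application"}
```

Here `dependents("host-17")` walks the edges backwards and reaches the database and both applications. Filtering by label then keeps only the two applications.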
GitHub acquired a semantic code analysis engine called CodeQL when it bought Semmle in September 2019. This was made free for research and open source development, to help security researchers find new CVEs and developers automate security checks of their codebase. GitHub uses a semantic library to analyze code, build a graph, and learn from it.
To make new features work, GitHub is generating a semantic code graph of all its public repos. That offers enormous opportunities to understand and improve coding patterns, quality and security.
The UN is working with Linked Data, in order to make its data more useful (and discoverable) through semantic queries. DESA’s Sustainable Development Goals Taxonomy was produced in collaboration with technical experts from across the UN system. A system of Internationalized Resource Identifiers (IRIs) for SDGs, related targets and indicators, and an SDG Interface Ontology were developed.
These common identifiers are deployed to provide a key element of infrastructure that will allow UN system organizations and relevant stakeholders to map their SDG resources to the growing pool of knowledge about the SDGs available on the semantic web. A demo app is also available.
How do search engines retrieve results, and how can we make the UN’s published output relevant for such searchers? Linked data is structured data interlinked with other data, making it more useful (and discoverable) through semantic queries.
Szymon Klarman is an independent knowledge graph expert who has been involved in the UN project. Klarman has long been involved in an effort to bring the benefits of the Semantic Web (such as URIs and schema) to GraphQL. Most related analyses so far have considered GraphQL as an interface for Linked Data. That’s valid, but there’s another way, says Klarman.
What if GraphQL resources were annotated with URIs — global (Semantic Web / linked data) identifiers denoting concepts from shared vocabularies, such as schema.org or other dedicated ontologies?
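One way to picture this idea is the sketch below. It is an assumption-laden illustration, not Klarman's actual design: the mapping tables and the `to_json_ld` helper are hypothetical, and only the schema.org IRIs and the JSON-LD keywords `@context` and `@type` are real.

```python
import json

# Hypothetical annotations: GraphQL field or type name -> IRI from a shared vocabulary.
FIELD_IRIS = {
    "name": "http://schema.org/name",
    "birthDate": "http://schema.org/birthDate",
}
TYPE_IRIS = {"Person": "http://schema.org/Person"}


def to_json_ld(graphql_type, result):
    """Attach a JSON-LD @context so annotated field names resolve to global identifiers."""
    # Only annotated fields get an entry; unannotated fields stay local names.
    context = {field: FIELD_IRIS[field] for field in result if field in FIELD_IRIS}
    return {"@context": context, "@type": TYPE_IRIS[graphql_type], **result}


# A result as it might come back from a GraphQL query on a Person type.
graphql_result = {"name": "Ada Lovelace", "birthDate": "1815-12-10", "nickname": "AAL"}
document = to_json_ld("Person", graphql_result)
print(json.dumps(document, indent=2))
```

With annotations like these, two independent GraphQL APIs that both map a field to `http://schema.org/name` are visibly talking about the same concept, even if their local field names differ.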
Working with knowledge graphs and creating ontologies in a visual and collaborative way can always use some good tooling. Tell a closed group to model the world and they will, often without outside feedback. The result may be powerful, but it leads to ontologies few people use. The people at Zazuko believe this is best done using a collaborative web platform, and this is why they have released their Ontology Manager as open source.
Today we announce the open source release of our Zazuko Ontology Manager. This is something we’ve been working on for a while and we planned to release it as open source pretty much since we’ve started working on it for a customer of ours.
The Allen Institute for AI has been developing a Semantic Scholar Graph of References in Context: a large contextual citation graph of 81.1M academic publications, including parsed full text for 8.1M open access papers, across broad science domains.
A related effort: instead of representing research in static PDF articles, L3S Research Center works on a dynamic knowledge graph. The Open Research Knowledge Graph represents ideas, approaches, and methods in machine-readable form.
We introduce the Semantic Scholar Graph of References in Context (GORC), a large contextual citation graph of 81.1M academic publications, including parsed full text for 8.1M open access papers, across broad domains of science.
Wrapping up with another perspective on knowledge graphs. Weaviate is building open source knowledge graphs in a box, or a Docker container, to be precise. It uses pre-trained graph embedding models that users can further train on their specific domain data, access via a REST API, and deploy in the cloud or on premise.
Bob van Luijt was a guest on Google Cloud’s Stack chat to talk about Weaviate and how we use Google Cloud @ SeMI Technologies.
To get the Year of the Graph Newsletter in your inbox, sign up here.
