SPARQL (pronounced “sparkle”, a recursive acronym[2] for SPARQL Protocol and RDF Query Language) is an RDF query language, that is, a semantic query language for databases, able to retrieve and manipulate data stored in Resource Description Framework (RDF) format.
SPARQL is a query language similar to SQL in syntax, but it works on a knowledge graph database like Wikidata and lets you extract knowledge and information by defining a series of filters and constraints.
#Awarded Chemistry Nobel Prizes
#defaultView:Timeline
SELECT DISTINCT ?item ?itemLabel ?when (YEAR(?when) AS ?date) ?pic
WHERE {
  ?item p:P166 ?awardStat .        # … with an awarded (P166) statement
  ?awardStat ps:P166 wd:Q44585 .   # … that has the value Nobel Prize in Chemistry (Q44585)
  ?awardStat pq:P585 ?when .       # … and the point in time (P585) when the prize was awarded
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
  OPTIONAL { ?item wdt:P18 ?pic }  # image (P18), if the item has one
}
SELECT ?person
WHERE {
  ?person wdt:P106 wd:Q5482740 .
}
Here we define ?person as the subject of interest; this is also what will appear as a column in our query results. Then we specify some constraints with WHERE. The constraint is that wdt:P106 needs to be wd:Q5482740. What? You say. Let me explain it in more detail. wdt is the prefix of a ‘predicate’ or ‘attribute’ of the subject, while wd is the prefix of a value (an object in SPARQL terms, but that’s not important) of the attribute. wdt: means I am going to specify an attribute of the subject here, and wd: means I will specify what the value of this attribute is. So what are P106 and Q5482740? These are just codes for the specific attribute and value. P106 stands for ‘occupation’ and Q5482740 stands for ‘programmer’. This line of code means: I want the ?person subject to have an attribute of ‘occupation’ with the value ‘programmer’. Not that scary anymore, right? You can find these codes easily on the Wikidata page mentioned above.
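If it helps to see what the prefixes stand for, here is a minimal Python sketch (outside of SPARQL itself) that expands a prefixed name such as wd:Q80 into the full IRI it abbreviates, and back. The two namespace IRIs are the ones Wikidata uses for wd: and wdt:; the helper names expand and shorten are made up for this illustration.

```python
# Namespace IRIs behind the two prefixes used in the query above.
WIKIDATA_PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",        # items (values), e.g. Q5482740
    "wdt": "http://www.wikidata.org/prop/direct/",  # direct properties, e.g. P106
}


def expand(prefixed_name: str) -> str:
    """Turn 'wd:Q80' into the full IRI it stands for (hypothetical helper)."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in WIKIDATA_PREFIXES or not local:
        raise ValueError(f"unknown or malformed prefixed name: {prefixed_name!r}")
    return WIKIDATA_PREFIXES[prefix] + local


def shorten(iri: str) -> str:
    """Turn a full IRI back into its prefixed form, if a known prefix fits."""
    for prefix, namespace in WIKIDATA_PREFIXES.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri  # leave IRIs from other namespaces untouched


occupation = expand("wdt:P106")
programmer = expand("wd:Q5482740")
```

So the single constraint line is really a statement about two full IRIs; the prefixes just keep it readable.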
The results are a list of person items with different wd:value. If you look closer at the values, each one is actually the code for a different person. For example, the first one, wd:Q80, is Tim Berners-Lee, the inventor of the WWW. This is not intuitive; we want to be able to see the names directly. To do that, we add a Wikidata label lookup (rdfs:label) that helps us translate the code into a name, like so:
SELECT ?person ?personLabel
WHERE {
  ?person wdt:P106 wd:Q5482740 .
  ?person rdfs:label ?personLabel .
  FILTER ( LANGMATCHES ( LANG ( ?personLabel ), "fr" ) )
}
Here we ask the person to have a ‘label’ attribute, and we define a personLabel variable to hold these values so we can display them in the query results. We also added personLabel to our SELECT phrase so it will be displayed. Note that I also added a FILTER to display only the French-language label; otherwise the query would return labels in multiple languages for one person, which is not what we want.
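To make the language filter a little more concrete, here is a rough Python sketch of what LANGMATCHES ( LANG ( ?personLabel ), "fr" ) decides for each label. It assumes the basic matching rule SPARQL uses for language ranges (case-insensitive, a range also matches its regional variants, and "*" matches any tagged label). The function name lang_matches and the sample labels are invented for the example; they are not real query results.

```python
def lang_matches(tag: str, lang_range: str) -> bool:
    """Sketch of SPARQL's langMatches using basic language-range matching."""
    if lang_range == "*":
        return tag != ""  # any label that carries a language tag at all
    tag, lang_range = tag.lower(), lang_range.lower()
    # Exact match, or the tag is a more specific variant such as "fr-ca".
    return tag == lang_range or tag.startswith(lang_range + "-")


# Hypothetical labels of a single item as (text, language tag) pairs.
labels = [
    ("Example (en)", "en"),
    ("Exemple (fr)", "fr"),
    ("Exemple (fr-CA)", "fr-CA"),
    ("Beispiel (de)", "de"),
]

french = [text for text, tag in labels if lang_matches(tag, "fr")]
```

Note that under this rule "fr" also keeps regional variants such as fr-CA, so an item can still come back with more than one label.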
SELECT ?person ?personLabel ?notableworkLabel
WHERE {
  ?person wdt:P106 wd:Q5482740 .
  ?person rdfs:label ?personLabel .
  FILTER ( LANGMATCHES ( LANG ( ?personLabel ), "fr" ) )
  ?person wdt:P800 ?notablework .
  ?notablework rdfs:label ?notableworkLabel .
  FILTER ( LANGMATCHES ( LANG ( ?notableworkLabel ), "fr" ) )
}
wdt:P800 means the ‘notable work’ attribute; everything else is similar. The results now contain a separate row for every notable work of each person.
SELECT ?person ?personLabel ( GROUP_CONCAT ( DISTINCT ?notableworkLabel; separator="; " ) AS ?works )
WHERE {
  ?person wdt:P106 wd:Q5482740 .
  ?person rdfs:label ?personLabel .
  FILTER ( LANGMATCHES ( LANG ( ?personLabel ), "fr" ) )
  ?person wdt:P800 ?notablework .
  ?notablework rdfs:label ?notableworkLabel .
  FILTER ( LANGMATCHES ( LANG ( ?notableworkLabel ), "fr" ) )
}
GROUP BY ?person ?personLabel
Here, ‘GROUP BY’ is used to merge the rows that belong to the same person. Also, the GROUP_CONCAT function is used to concatenate multiple notableworkLabel values into a new column, works. (I will not explain how these functions work; I just want to quickly show you what SPARQL can do. Please feel free to Google if you want to know more; there are plenty of tutorial articles and videos out there.)
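For a rough feel of what the grouping does, here is a small Python sketch that mimics GROUP BY plus GROUP_CONCAT ( DISTINCT …; separator="; " ) on a handful of rows. The rows and the helper name group_concat are made up for illustration. One assumption to flag: the sketch keeps values in first-seen order, whereas SPARQL itself does not promise any particular order inside the concatenated string.

```python
def group_concat(rows, key, value, separator="; "):
    """Group rows by `key` and join the distinct `value`s of each group."""
    grouped = {}
    for row in rows:
        values = grouped.setdefault(row[key], [])
        if row[value] not in values:  # DISTINCT: skip repeated values
            values.append(row[value])
    return {group: separator.join(values) for group, values in grouped.items()}


# Hypothetical ungrouped results: one row per person and notable work.
rows = [
    {"personLabel": "Person A", "notableworkLabel": "Work 1"},
    {"personLabel": "Person A", "notableworkLabel": "Work 2"},
    {"personLabel": "Person A", "notableworkLabel": "Work 1"},
    {"personLabel": "Person B", "notableworkLabel": "Work 3"},
]

works = group_concat(rows, "personLabel", "notableworkLabel")
```

Each person ends up with exactly one entry, which is what the works column looks like after the query is grouped.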
SELECT ?person ?personLabel ( GROUP_CONCAT ( DISTINCT ?notableworkLabel; separator="; " ) AS ?works ) ?image
WHERE {
  ?person wdt:P106 wd:Q5482740 .
  ?person rdfs:label ?personLabel .
  FILTER ( LANGMATCHES ( LANG ( ?personLabel ), "fr" ) )
  ?person wdt:P800 ?notablework .
  ?notablework rdfs:label ?notableworkLabel .
  FILTER ( LANGMATCHES ( LANG ( ?notableworkLabel ), "fr" ) )
  OPTIONAL { ?person wdt:P18 ?image }
}
GROUP BY ?person ?personLabel ?image
#defaultView:ImageGrid
SELECT ?person ?personLabel ( GROUP_CONCAT ( DISTINCT ?notableworkLabel; separator="; " ) AS ?works ) ?image ?countryLabel ?cood
WHERE {
  ?person wdt:P106 wd:Q5482740 .
  ?person rdfs:label ?personLabel .
  FILTER ( LANGMATCHES ( LANG ( ?personLabel ), "fr" ) )
  ?person wdt:P800 ?notablework .
  ?notablework rdfs:label ?notableworkLabel .
  FILTER ( LANGMATCHES ( LANG ( ?notableworkLabel ), "fr" ) )
  OPTIONAL { ?person wdt:P18 ?image }
  OPTIONAL {
    ?person wdt:P19 ?country .
    ?country rdfs:label ?countryLabel .
    ?country wdt:P625 ?cood .
    FILTER ( LANGMATCHES ( LANG ( ?countryLabel ), "fr" ) )
  }
}
GROUP BY ?person ?personLabel ?image ?countryLabel ?cood
Here we look up each person’s place of birth (wdt:P19, which the query calls the country), put it into a variable country, then find the coordinates (wdt:P625) of the country and put them into a variable cood. With the coordinates, we can activate the ‘map’ view.
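If you want to use the coordinates outside the map view, here is a minimal Python sketch for reading one value of the cood column. It assumes the value comes back as a WKT point literal of the form Point(longitude latitude), which is how the Wikidata query service returns coordinate locations as far as I know; the helper name parse_point and the sample value are made up.

```python
import re

# One signed decimal number, optionally with an exponent.
_NUMBER = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
_POINT = re.compile(
    rf"^\s*POINT\s*\(\s*({_NUMBER})\s+({_NUMBER})\s*\)\s*$", re.IGNORECASE
)


def parse_point(wkt: str) -> tuple[float, float]:
    """Parse 'Point(lon lat)' and return (latitude, longitude)."""
    match = _POINT.match(wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    longitude, latitude = float(match.group(1)), float(match.group(2))
    # WKT lists longitude first; most mapping tools want latitude first.
    return latitude, longitude


latitude, longitude = parse_point("Point(2.5 48.25)")  # made-up sample value
```

The main thing to watch is the order: longitude comes first in the literal, so swapping the two numbers is an easy mistake.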