Tag Archives: Wikidata

Station to station

While slowly going through the Wikidata SPARQL query examples of last year, I ran across this one, which draws the shortest railway link between Narvik, Norway, and Singapore. Amazing, really. It was also interesting to find out that as late as October 2021, the route still went via Sweden. I’m not that well versed in train networks, but AFAIK the opening of the Luleå-Haparanda track at the beginning of April 2021 eventually changed this route in Wikidata, too.

The news came from a tweet by James Benedict Brown, Associate Professor of Architecture at Umeå University. His travel stories make for nice reading. For example, here is a thread on how he and his students traveled in Finland at ground level in summer 2022. Recently, he announced on Mastodon that early next spring the destination will be Athens. Having the above-mentioned SPARQL query at hand, I asked Wikidata to draw me the shortest route from Umeå to Athens.

The query returned zero results. Wikidata does not know of an unbroken chain of adjacent railway stations between these two cities. How far in the direction of Athens does it know the stations, then? Inside the borders of Greece, one of the last stations with a clear, labelled indication of a preceding station is Didymoteicho, near the Turkish border. From that station on towards Athens, the chain quickly breaks off. You can run the query here. Click the big Run/Play button; the query takes a few seconds to complete.

(I couldn’t help choosing the title, thank you Bowie.)


Wikidata and COVID-19

COVID-19 has come to stay with us, but we can still take a deep breath today and look back at what has happened so far. On an individual level, quite a lot of us didn’t survive. How many of these people can we find in Wikidata? What do we know about them? Let’s start close to where I live: Europe.

The following query counts deaths by country of citizenship in the Nordic countries, the EU, the United Kingdom and Switzerland. It also gives the age of the youngest and the oldest person in each country's group. Try it!

SELECT ?countryLabel (MIN(?age) AS ?youngest) (MAX(?age) AS ?oldest) (COUNT(?person) AS ?deathsInCovid)
WITH {
SELECT DISTINCT ?country ?person ?birth_date ?death_date WHERE {
?person wdt:P31 wd:Q5 ;
p:P569/psv:P569 ?birth_date_node ;
p:P570/psv:P570 ?death_date_node ;
wdt:P509 wd:Q84263196 ; # cause of death: COVID-19
wdt:P27 ?country.
?birth_date_node wikibase:timeValue ?birth_date .
?death_date_node wikibase:timeValue ?death_date .
{ ?country wdt:P361 wd:Q52062 } # Nordic countries
UNION
{ ?country wdt:P463 wd:Q458 } # EU
UNION
{ ?country wdt:P17 wd:Q145 } # UK
UNION
{ ?country wdt:P17 wd:Q39 } # Switzerland
}
} AS %personsDeadByCovid

WHERE {
INCLUDE %personsDeadByCovid.
BIND( year(?death_date) - year(?birth_date) - if(month(?death_date)<month(?birth_date) || (month(?death_date)=month(?birth_date) && day(?death_date)<day(?birth_date)),1,0) as ?age )
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

GROUP BY ?countryLabel
ORDER BY ?countryLabel
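
The BIND expression in the query computes age in completed years: the difference between the years, minus one if the person died before their birthday in the year of death. As a minimal sketch of the same rule outside SPARQL (the function name age_at_death is made up for illustration and is not part of the query), it might look like this in Python:

```python
from datetime import date


def age_at_death(birth: date, death: date) -> int:
    """Completed years between birth and death, mirroring the BIND expression.

    Hypothetical helper for illustration only; not part of the Wikidata query.
    """
    # The SPARQL condition subtracts 1 when the death month/day falls before
    # the birth month/day. Tuple comparison expresses the same test.
    died_before_birthday = (death.month, death.day) < (birth.month, birth.day)
    return death.year - birth.year - (1 if died_before_birthday else 0)


# Died before the birthday in the year of death: one year is subtracted.
print(age_at_death(date(1940, 6, 15), date(2020, 4, 2)))   # 79
# Died on the birthday itself: the full year counts.
print(age_at_death(date(1940, 6, 15), date(2020, 6, 15)))  # 80
```

Note that this rule assumes both dates are known to the day. Wikidata also stores dates with year or month precision, and for those the computed age can be off by one.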

What were the names of these 1000+ people? What did they do for a living? Did they die at a young age, or were they already old? Try it!

SELECT (SAMPLE(?countryLabel) AS ?countryOfCitizenship) ?personLabel ?age (GROUP_CONCAT(?occLabel; SEPARATOR=", ") AS ?occupations)
WITH {
SELECT DISTINCT ?country ?person ?birth_date ?death_date ?occ WHERE {
?person wdt:P31 wd:Q5 ;
p:P569/psv:P569 ?birth_date_node ;
p:P570/psv:P570 ?death_date_node ;
wdt:P509 wd:Q84263196 ; # death cause is COVID-19
wdt:P27 ?country .
OPTIONAL { ?person wdt:P106 ?occ. } # occupation
?birth_date_node wikibase:timeValue ?birth_date .
?death_date_node wikibase:timeValue ?death_date .
{ ?country wdt:P361 wd:Q52062 } # Nordic countries
UNION
{ ?country wdt:P463 wd:Q458 } # EU
UNION
{ ?country wdt:P17 wd:Q145 } # UK
UNION
{ ?country wdt:P17 wd:Q39 } # Switzerland
}
} AS %personsDeadByCovid

WHERE {
INCLUDE %personsDeadByCovid.
BIND( year(?death_date) - year(?birth_date) - if(month(?death_date)<month(?birth_date) || (month(?death_date)=month(?birth_date) && day(?death_date)<day(?birth_date)),1,0) as ?age )
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
?person rdfs:label ?personLabel.
?country rdfs:label ?countryLabel.
?occ rdfs:label ?occLabel
}
}
GROUP BY ?personLabel ?age
ORDER BY ?countryOfCitizenship ?age

Version 2.0

Later on I noticed two things.

First, it would have been nice to have a link to the Wikipedia page of the person, if there is one, in some of the major European languages – or my local ones, Finnish and Swedish.

Second, the statistics query included duplicate records and thus gave incorrect counts. The enhanced versions are here and here. To run the enhanced queries, click this or that. (Oh the joy of naming links!)
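
How can duplicates inflate a count like this? One plausible cause, stated here as an assumption rather than as a description of what the enhanced queries actually fix: a person with two birth date statements produces two result rows, and COUNT(?person) counts rows, not people. The Python sketch below illustrates the difference with made-up rows (the identifiers Q_A and Q_B and both function names are hypothetical). In SPARQL, the analogous change would be COUNT(DISTINCT ?person).

```python
from collections import defaultdict

# Hypothetical result rows: (country, person, birth_date).
# Q_A appears twice because it carries two birth date statements.
rows = [
    ("Finland", "Q_A", "1940-06-15"),
    ("Finland", "Q_A", "1940-01-01"),
    ("Finland", "Q_B", "1951-03-09"),
]


def count_rows(rows):
    """Count result rows per country, like COUNT(?person)."""
    counts = defaultdict(int)
    for country, _person, _birth in rows:
        counts[country] += 1
    return dict(counts)


def count_distinct_persons(rows):
    """Count unique persons per country, like COUNT(DISTINCT ?person)."""
    persons = defaultdict(set)
    for country, person, _birth in rows:
        persons[country].add(person)
    return {country: len(ids) for country, ids in persons.items()}


print(count_rows(rows))              # {'Finland': 3}
print(count_distinct_persons(rows))  # {'Finland': 2}
```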