Query in semantic web for beginners

23 Apr 2016, CPOL license, 4 min read
This article introduces using SPARQL to query semantic webpages.

Introduction

My 5-year-old son enjoys using my Android phone to search by voice and listen to the answers, narrated by a lovely semi-human, semi-machine voice. For example, he asks “What is the capital of France?” and the phone replies “Paris”; “What is the largest country in Africa?” “Algeria”. One day, however, he was very frustrated when he asked “Which is bigger, the United States or China?” and didn’t get an answer!

This situation shows the limitation of the current search mechanism, which uses keywords and tags to find the webpages related to your search. The accuracy and quality of the results depend mainly on the popularity of the question you ask. If someone has already answered your question (which is likely, given the large pool of people online), the search engine retrieves that page and you have your answer. If you ask a relatively sophisticated question, however, you are out of luck. In another scenario, if you have a compound question (e.g. "List all states in the USA and how many presidents were born in each state"), you might have to go through multiple webpages to assemble the answer (again, unless someone has already answered it).

Semantic web

This limitation of search relates to the way the web is structured. It is essentially a collection of text documents that are readable by humans but not by machines, so a machine cannot collect data from different webpages to form an answer. Consequently, a new structure has been proposed, known as the Semantic Web, which is readable by humans and, more importantly, by machines.

There are many wonderful articles that explain the Semantic Web in detail, so here I will focus on searching and querying, which is one of its most powerful features. In this demo, I will use two websites that are built on Semantic Web technology (dbpedia.org and wikidata.org). Both publish structured data associated with the giant encyclopedia Wikipedia in RDF/OWL format, which allows a machine to merge data from different pages to answer compound questions. There are some technical differences between these two websites, but they are outside the scope of this article.

To search the Semantic Web, we use a query language called SPARQL, a recursive acronym for "SPARQL Protocol and RDF Query Language" (yes, the S stands for SPARQL itself). SPARQL's syntax looks very similar to SQL. To run a query, either use a SPARQL endpoint, which is a simple webpage where you write a query and see the results (think of the Google homepage), or use a semantic web library to write a custom application. There are many good semantic web libraries, such as Jena (Java) and dotNetRDF (C#). In this article, I will demonstrate both endpoints and the Jena framework.
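Behind the endpoint page, the "Protocol" part of SPARQL is plain HTTP: the SPARQL 1.1 Protocol allows a query to be sent as a GET request with the query text in a `query` parameter. The following is a minimal sketch of building such a URL with the Java standard library only. The class and method names are hypothetical and chosen for illustration, and it assumes the endpoint URL has no query string of its own.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: the class and method names are illustrative only.
final class SparqlRequestUrl {

    // Builds a GET URL that carries the query text in the "query" parameter.
    // Assumes the endpoint URL does not already contain a '?'.
    static String build(String endpoint, String sparql) {
        try {
            // URLEncoder applies form encoding, so spaces become '+'
            // and characters such as '?', '{' and '}' are percent-encoded.
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sparql = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1";
        // Only prints the URL; no network request is made here.
        System.out.println(build("http://dbpedia.org/sparql", sparql));
    }
}
```

A library such as Jena takes care of this step, along with parsing the results, which is why the rest of the article uses it instead of raw HTTP.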

SPARQL endpoint

Each of these websites provides a SPARQL endpoint for writing queries. Let’s go to https://query.wikidata.org and write the following query to get all American presidents with their signatures.

SQL
SELECT ?president ?president_name ?signature
WHERE {
  ?president wdt:P39 wd:Q11696 .
  ?president wdt:P109 ?signature .
  OPTIONAL {
    ?president rdfs:label ?president_name .
    FILTER (lang(?president_name) = "en")
  }
}

 

Now let’s go to the other endpoint, http://dbpedia.org/sparql, and use this query to retrieve all public Canadian universities together with their cities and the populations of those cities, largest city first.

SQL
SELECT *
WHERE {
  ?University dbo:type dbr:Public_university .
  ?University dbp:country dbr:Canada .
  ?University dbp:city ?city .
  ?city dbo:populationTotal ?population
} ORDER BY DESC(?population)
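Conceptually, an endpoint answers a query like this by matching each triple pattern against its stored triples and joining the matches on shared variables such as ?city. The toy sketch below imitates that join in plain Java over an in-memory list. The `ex:` identifiers and the numbers are invented placeholders, not real DBpedia data, and a real triple store would use indexes rather than linear scans.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: names and data are invented placeholders.
final class TriplePatternSketch {

    // Each triple is {subject, predicate, object}.
    static final List<String[]> TRIPLES = Arrays.asList(
            new String[] {"ex:UniA", "ex:city", "ex:CityX"},
            new String[] {"ex:UniB", "ex:city", "ex:CityY"},
            new String[] {"ex:CityX", "ex:population", "1000"},
            new String[] {"ex:CityY", "ex:population", "5000"});

    // Returns the object of the first triple matching (subject, predicate), or null.
    static String objectOf(String subject, String predicate) {
        for (String[] t : TRIPLES) {
            if (t[0].equals(subject) && t[1].equals(predicate)) {
                return t[2];
            }
        }
        return null;
    }

    // Imitates: ?u ex:city ?city . ?city ex:population ?population
    // followed by ORDER BY DESC(?population).
    static List<String[]> universitiesByCityPopulation() {
        List<String[]> rows = new ArrayList<>();
        for (String[] t : TRIPLES) {
            if (t[1].equals("ex:city")) {
                // Join step: the object of the first pattern is the subject of the second.
                String population = objectOf(t[2], "ex:population");
                if (population != null) {
                    rows.add(new String[] {t[0], t[2], population});
                }
            }
        }
        // Sort rows by population, largest first.
        rows.sort((a, b) -> Integer.compare(Integer.parseInt(b[2]), Integer.parseInt(a[2])));
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : universitiesByCityPopulation()) {
            System.out.println(String.join(" ", row));
        }
    }
}
```

The point of the sketch is that the university triples and the city triples can come from different pages: once both are expressed as triples, the shared city identifier is enough to combine them.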

Jena

Besides using SPARQL endpoints, you can also use a semantic web library to write a custom application. Here I will use Jena, which is one of the most powerful libraries supporting Semantic Web technology.

Jena can be downloaded from its website, https://jena.apache.org/, or added to your project by putting the following in your Maven pom.xml file.

XML
<dependency>
    <groupId>org.apache.jena</groupId>
    <artifactId>apache-jena-libs</artifactId>
    <!-- apache-jena-libs is a POM-type artifact that pulls in the Jena modules -->
    <type>pom</type>
    <version>3.0.1</version>
</dependency>

The following code runs a query against the DBpedia endpoint; the attached sample demonstrates Wikidata and LinkedMDB as well.

Java
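import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.ResultSet;
import org.apache.jena.query.ResultSetFormatter;

// The class name is arbitrary; any class can host this main method.
public class DbpediaQuery {

    public static void main(String[] args) {

        // The query. Each fragment ends with a space so the concatenated
        // string stays well separated. Note that nothing restricts ?musician
        // to musicians: the pattern matches every resource with a birth place.
        String queryString =
                "PREFIX dbont: <http://dbpedia.org/ontology/> " +
                "PREFIX dbp: <http://dbpedia.org/property/> " +
                "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> " +
                "SELECT ?musician ?place " +
                "WHERE { " +
                "    ?musician dbont:birthPlace ?place . " +
                "}";

        // Create the query object (Jena parses the string locally)
        Query query = QueryFactory.create(queryString);

        // Initialize the query execution against the remote service
        QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);

        // Run the query and display the results as a text table
        try {
            ResultSet results = qexec.execSelect();
            ResultSetFormatter.out(System.out, results, query);
        } catch (Exception ex) {
            System.err.println(ex.getMessage());
        } finally {
            // Always release the HTTP connection
            qexec.close();
        }
    }
}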
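The listing above declares its prefixes inside the query string, whereas the endpoint queries earlier in the article were written without any PREFIX lines. Because Jena parses the query locally before sending it, a query using an undeclared prefix is rejected by `QueryFactory.create`. One way to keep the declarations in one place is a small helper like the sketch below. The class and method names are hypothetical, and it uses only the standard library; the three namespace IRIs are the usual DBpedia ones for `dbo:`, `dbr:` and `dbp:`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: names are illustrative, not part of Jena.
final class SparqlPrefixes {

    // Common DBpedia prefixes, kept in insertion order.
    static Map<String, String> dbpedia() {
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("dbo", "http://dbpedia.org/ontology/");
        prefixes.put("dbr", "http://dbpedia.org/resource/");
        prefixes.put("dbp", "http://dbpedia.org/property/");
        return prefixes;
    }

    // Prepends one PREFIX declaration per entry to the query body.
    static String withPrefixes(Map<String, String> prefixes, String body) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : prefixes.entrySet()) {
            sb.append("PREFIX ").append(e.getKey())
              .append(": <").append(e.getValue()).append(">\n");
        }
        return sb.append(body).toString();
    }

    public static void main(String[] args) {
        String body = "SELECT * WHERE { ?University dbo:type dbr:Public_university } LIMIT 5";
        System.out.println(withPrefixes(dbpedia(), body));
    }
}
```

The resulting string can be passed to `QueryFactory.create` in place of the hand-written `queryString`.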

Limitations

The advantages of the semantic web over the current web model are hard to dispute. Nonetheless, the semantic web has not yet become popular, nor has it reached the critical mass needed to gain momentum. This can be attributed to the radical change it requires to the current model, which many people are reluctant to accept. In addition, a query’s syntax must be correct for it to run, whereas a simple keyword search has no such requirement.

Conclusion

The semantic web provides a new approach to dealing with data that is “understandable” by machines. Although there are many difficulties associated with this new technology, I expect it to evolve and eventually supersede the current web structure.

History

  • v 1.0 April 23, 2016.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Engineer
Canada
Programming for me is a hobby
