UK government launches public beta for data
The UK, like the US before it, has today launched its government data project. These projects hope to tap into the large pool of developers to produce innovative applications that use this vast amount of data. Web founder Sir Tim Berners-Lee said:
"It's such an untapped resource...Government data is something we have already spent the money on... and when it is sitting there on a disk in somebody's office it is wasted."
According to the W3C's "Putting Government Data online", government data is typically put online for three reasons:
- Increasing citizen awareness of government functions to enable greater accountability;
- Contributing valuable information about the world; and
- Enabling the government, the country, and the world to function more efficiently.
The UK government data is made available in raw format, but the emphasis is placed on using linked data.
Why linked data?
Linked data is about publishing and interlinking data on the web. Applications can link data from various sources to build a more complete picture than is available from any one source.
To enable linked data, you need:
- a standard language/model for storing data, and
- a method for querying the data store.
The W3C has defined a standard for each:
- RDF (Resource Description Framework) for the data model, and
- SPARQL (SPARQL Protocol and RDF Query Language), pronounced "sparkle", for querying it.
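To give a feel for what a query looks like, here is a minimal sketch of a SPARQL query and of the request URL a client might build for it. The SPARQL Protocol allows a query to be sent as the `query` parameter of an HTTP GET request. The endpoint address below is a placeholder, not a real service, and the helper name `build_query_url` is made up for illustration. Nothing is sent over the network.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration, not a real service.
ENDPOINT = "http://example.org/sparql"

# A small SPARQL query: find resources that have a FOAF name.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE { ?person foaf:name ?name }
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Build a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The WHERE clause is a pattern with variables (`?person`, `?name`); the query engine returns every combination of values that makes the pattern match the stored data.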
In the RDF language, all knowledge is expressed as a triple comprising:
- a subject
- a predicate or verb, and
- an object
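The triple structure can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how a real RDF store is implemented: the `Triple` class, the `match` helper, and the `example.org` addresses are all made up for the example. The predicate URIs are the published FOAF and Dublin Core terms.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # the RDF "object"; named obj to avoid shadowing Python's builtin


# Predicates taken from the FOAF and Dublin Core vocabularies.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Hypothetical resources, used only for this example.
ALICE = "http://example.org/people/alice"
REPORT = "http://example.org/reports/1"

graph = [
    Triple(ALICE, FOAF_NAME, "Alice"),
    Triple(REPORT, DC_TITLE, "Road Traffic Counts"),
    Triple(REPORT, DC_CREATOR, ALICE),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if subject in (None, t.subject)
        and predicate in (None, t.predicate)
        and obj in (None, t.obj)
    ]


# Everything known about the report:
for t in match(graph, subject=REPORT):
    print(t.predicate, "->", t.obj)
```

Note that the object of one triple (the report's creator) is the subject of another (the person's name). That is what turns a list of statements into a graph.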
The real power of RDF comes from using the same vocabulary to describe each of the three. Standard vocabularies such as Dublin Core, FOAF, RDFS, and OWL allow applications to easily relate objects from different data stores, and thus to enhance data in one store with data from another.
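As a rough sketch of why shared vocabularies matter, the example below combines two small, hypothetical data stores. Because both identify the person by the same URI and use standard predicates (Dublin Core for the report, FOAF for the person), simply taking the union of the two sets of triples is enough to answer a question neither store can answer alone. The store contents, the `example.org` addresses, and the helper names are assumptions made for illustration.

```python
# Predicates from the Dublin Core and FOAF vocabularies.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical identifiers shared by both stores.
REPORT = "http://example.org/reports/1"
ALICE = "http://example.org/people/alice"

# Store 1: a publications catalogue (knows who created what, but no names).
publications = {
    (REPORT, DC_TITLE, "Road Traffic Counts"),
    (REPORT, DC_CREATOR, ALICE),
}

# Store 2: a staff directory (knows names, but nothing about reports).
people = {
    (ALICE, FOAF_NAME, "Alice"),
}


def objects(graph, subject, predicate):
    """Return all objects for a given subject and predicate, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def creator_names(graph, report):
    """Follow report -> creator -> name across whatever triples are present."""
    names = []
    for creator in objects(graph, report, DC_CREATOR):
        names.extend(objects(graph, creator, FOAF_NAME))
    return names


# Linking the stores is just a set union, since identifiers and predicates agree.
merged = publications | people

print(creator_names(publications, REPORT))  # the catalogue alone has no names
print(creator_names(merged, REPORT))        # the merged data does
```

Had the two stores used private, incompatible names for "creator" or for the person, the union would contain the same facts but the link between them would be lost.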
Resources
If you want to find out more about government data and related technologies, follow the links below:
- Read more about the UK Government data project at "Unlocking innovation | data.gov.uk".
- See the article "How to publish Linked Data on the Web" for more information on linked data.
- Joshua Tauberer's "What is RDF and what is it good for?" provides a concise but thorough introduction to RDF, SPARQL, and Linked Data.
- For more on vocabularies, see Dublin Core Metadata Initiative (DCMI), RDF Vocabulary Description Language (RDFS), The Friend of a Friend (FOAF) project, and OWL 2 Web Ontology Language Overview (OWL).
- To read about the syntax and semantics of SPARQL, see SPARQL Query Language for RDF.
- The article "Understanding SPARQL" introduces SPARQL and the data formats it is based on. It also covers the RDF, RDF Schema, OWL, and Turtle knowledge representation languages.
- Turtle is a textual syntax for RDF that allows RDF graphs to be written in a compact form. Read more at Turtle - Terse RDF Triple Language.
- W3C's RDF Primer aims to provide basic knowledge required to effectively use RDF.
- Joseki is an HTTP engine that supports the SPARQL Protocol and the SPARQL RDF Query language.
- Jena is a Java framework for building Semantic Web applications. Jena is open source and grew out of work with the HP Labs Semantic Web Programme.
- Jeni's Musings has blog articles discussing linked data among other things.
- W3C Semantic Web Activity
