ILRT Technical Report Number: 1065
Publication Date: 2003/10/10
Last Modified: 2003/10/13 11:45
Author(s): Libby Miller, Martin Poulter
The assumption is that you wish to annotate a photo or picture which is somewhere on the internet. You can say things like:
The existing vocabularies for each of these properties can be combined in an RDF/XML document with multiple namespaces. This document can be parsed and imported into a database, and it can also reside on the Web where it can be harvested by external programs, thus forming a part of the semantic web. Here is an example of such a file, with tags from different namespaces in different colours:
By combining many of these documents in a database, one could automatically produce:
Many tools already exist for creating this sort of annotation; see the references for examples. We would like to produce annotations simply and rapidly, using a language that is multiplatform and does not require a download or install. The tool started as a Java application. However, Java on Debian is not straightforward to install, and Java on Windows machines now usually requires administrator access.
However, a significant issue for image annotation is that you need to be able to see the image as you catalogue it. It is also useful to be able to pick from a list of thumbnail images and then annotate several; this limits the usefulness of command line or bot interfaces. In response to user feedback on the first version, a clickable version was produced. The visual cues this gives make cataloguing images faster, although there are several significant problems with the layout of the information.
The tool uses a proxy to download 1) a page of links to thumbnails, 2) a page with images in it, or 3) a single image, into an iframe. The images are accessed using the DOM, and displayed. Clicking on an image triggers a download of the image or html page linked to in the initial thumbnails page, and then the tool figures out if it is an image or an html page. If the latter, it makes a guess about which is the correct image, and makes that the main item to be catalogued. At this stage we have something like this:
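The image-or-page decision described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function and class names are hypothetical, and the rule "the image with the largest declared width × height is the main one" is an assumed heuristic, since the report does not say how the guess is made.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect (absolute URL, declared pixel area) for every <img> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        if not a.get("src"):
            return
        try:
            area = int(a.get("width", 0)) * int(a.get("height", 0))
        except (TypeError, ValueError):
            area = 0  # missing or non-numeric sizes such as "50%" count as 0
        self.images.append((urljoin(self.base_url, a["src"]), area))


def main_image(url, content_type, body=""):
    """Return the URL of the item to catalogue, or None if nothing is found."""
    if content_type.lower().startswith("image/"):
        return url  # the link pointed straight at an image
    collector = ImageCollector(url)
    collector.feed(body)
    if not collector.images:
        return None
    # Assumed heuristic: the largest declared image is the main one.
    return max(collector.images, key=lambda item: item[1])[0]
```

Here the content type is assumed to come from the proxy's HTTP response; a page whose images carry no size attributes would need a different heuristic.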
and the RDF generated looks like this:
<rdf:RDF xmlns='http://xmlns.com/foaf/0.1/'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'>
  <rdf:Description rdf:about="">
    <annotates rdf:resource="http://swordfish.rdfweb.org/photos/2003/06/12/2003-06-12-Images/4.jpg"/>
  </rdf:Description>
  <Image rdf:about="http://swordfish.rdfweb.org/photos/2003/06/12/2003-06-12-Images/4.jpg">
    <thumbnail rdf:resource="http://swordfish.rdfweb.org/photos/2003/06/12/2003-06-12-Thumbnails/4.jpg"/>
  </Image>
</rdf:RDF>
Dan Brickley has produced a service whereby appending a noun to the namespace http://xmlns.com/wordnet/1.6/ gives you the WordNet hierarchy for that noun, if it exists. The image annotation tool uses this trick: if you type 'parrot' into the 'keyword' box, the tool uses Jim Ley's RDF parser to fetch the RDF associated with http://xmlns.com/wordnet/1.6/Parrot and displays it so that the user can check that it describes the term they are interested in, and also see whether a subclass of the main term might be more appropriate.
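The keyword-to-URI step can be sketched in a few lines. This is only an illustration under stated assumptions: the function name is hypothetical, and the capitalisation rule (upper-case the first letter, leave the rest alone) is inferred from the single 'parrot' to Parrot example above; how the service treats multi-word nouns is not covered here.

```python
from urllib.parse import quote

WORDNET_NS = "http://xmlns.com/wordnet/1.6/"


def wordnet_uri(keyword):
    """Build the WordNet URI for a keyword typed into the 'keyword' box."""
    term = keyword.strip()
    if not term:
        raise ValueError("keyword is empty")
    # Assumed rule: class names start with a capital letter, as in 'Parrot'.
    return WORDNET_NS + quote(term[0].upper() + term[1:])
```

Calling `wordnet_uri("parrot")` yields the URI quoted in the text, which the tool would then fetch and parse.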
The WordNet term is then added to the generated RDF by clicking on it, for example:
<rdf:RDF xmlns='http://xmlns.com/foaf/0.1/'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'
         xmlns:wn='http://xmlns.com/wordnet/1.6/'>
  <rdf:Description rdf:about="">
    <annotates rdf:resource="http://swordfish.rdfweb.org/photos/2003/06/12/2003-06-12-Images/4.jpg"/>
  </rdf:Description>
  <Image rdf:about="http://swordfish.rdfweb.org/photos/2003/06/12/2003-06-12-Images/4.jpg">
    <thumbnail rdf:resource="http://swordfish.rdfweb.org/photos/2003/06/12/2003-06-12-Thumbnails/4.jpg"/>
    <depicts>
      <wn:Parrot/>
    </depicts>
  </Image>
</rdf:RDF>
For images containing people, it's useful to be able to say that the image depicts a particular, identified person. See the codepiction experiment for more information about this approach.
One issue is finding a convenient way to obtain people's SHA1-hashed email addresses (or their actual email addresses, which can then be converted using a tool). This is where a remote service backed by a database that already contains this information is useful. This could be, for example, a private address book with a remote interface which produces RDF. In this case, we use an interface to a harvested RDF database.
SHA1-hashed mailboxes and images are shown in response to a query on a substring of a name. Clicking on a returned image or name adds the person to the RDF. If the person is not in the database, they can be added manually using the forms. At no time is an email address made public.
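For readers who want to convert an address themselves, the hashing step can be sketched as below. It assumes the FOAF mbox_sha1sum convention, in which the SHA1 digest is taken over the full mailto: URI rather than the bare address; the function name and the example address are illustrative only.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the hex SHA1 digest of the mailto: URI for an email address."""
    mailbox = "mailto:" + email.strip()
    return hashlib.sha1(mailbox.encode("utf-8")).hexdigest()


# Only the digest is published; the address itself never appears in the RDF.
digest = mbox_sha1sum("alice@example.org")
```

Because the digest is one-way, two annotations that name the same mailbox can be merged without either of them revealing the address.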
We have chosen to use the nearestAirport property to associate an image with location data at this time. This is because information linking airports with latitude and longitude is freely available. As an added bonus, this method preserves privacy.
The key issue in accessing geodata is mapping human-readable place names to latitude and longitude. As a rough pass, the airports data works well because each airport has a human-readable name that includes the nearest town or city. This means we can search the airports data using place names entered by the user and get back the lat/longs. A similar (and more fine-grained) approach would be to use the spacenamespace data; at the moment, however, this is UK-only.
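The place-name search over airports data might look something like this. It is a sketch only: the three rows are illustrative samples with approximate coordinates rather than the dataset the tool uses, and the function name and the (IATA code, name, latitude, longitude) row layout are assumptions.

```python
# Illustrative sample rows only: (IATA code, name, latitude, longitude),
# with coordinates rounded to two decimal places.
AIRPORTS = [
    ("BRS", "Bristol Airport", 51.38, -2.72),
    ("LHR", "London Heathrow Airport", 51.47, -0.45),
    ("EDI", "Edinburgh Airport", 55.95, -3.37),
]


def find_airports(place):
    """Return every airport whose name contains the place, ignoring case."""
    needle = place.strip().lower()
    if not needle:
        return []
    return [row for row in AIRPORTS if needle in row[1].lower()]
```

A query for "bristol" returns the single BRS row, from which the latitude and longitude can be read off; a query that matches several airports would leave the choice to the user.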
Modelling the nearestAirport information was difficult. It is not the nearestAirport to the picture as an artifact (the picture may be held on one or more servers, well away from the location). Nor is it necessarily a picture of a location. Instead, it's the location the camera was in when the picture was taken. Similar arguments apply to the date the picture was taken. An experimental new property, creationEvent, was created to test this out. The use of creationEvent masks a hidden resource - an object representing the event, to which nearestAirport and date can be attached.
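The shape of this model can be illustrated with plain triples. The sketch below is hedged throughout: the function name is hypothetical, the contact namespace for nearestAirport and the use of Dublin Core date are assumptions rather than something the report states, and the airport is passed as an opaque value instead of being modelled as a resource in its own right.

```python
import itertools

FOAF = "http://xmlns.com/foaf/0.1/"
# Assumed namespaces for the two properties hung off the event.
CONTACT = "http://www.w3.org/2000/10/swap/pim/contact#"
DC = "http://purl.org/dc/elements/1.1/"

_ids = itertools.count(1)


def creation_event_triples(image_uri, airport, date):
    """Describe where and when a picture was taken via a hidden event node."""
    event = "_:event%d" % next(_ids)  # blank node standing for the event
    return [
        (image_uri, FOAF + "creationEvent", event),
        (event, CONTACT + "nearestAirport", airport),
        (event, DC + "date", date),
    ]
```

The point of the indirection is visible in the output: the airport and the date are attached to the event node, never to the image itself.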
Users can also add a freetext description. This is coded as the Dublin Core description of the image.
RDF can be used to say anything about anything, and coupled with the ability to annotate any image on the web, this could lead to both
Retaining the source of these annotations within the application and the software is therefore essential, in order to be able to remove annotations where there are privacy implications.
At the moment, users have two things they can do with the annotation once they have created it. If they are authorised, they can automatically upload the finished RDF/XML to Libby's server. Since the interfaces display the RDF as it is being created, users can copy and paste it into a text editor and then save it in their own web space. They would then have to publicise the URL of the resulting document if they want it harvested. The visible, colour-coded RDF/XML serves an educational purpose by making the machine-readable end product easy to understand. For deployment among cataloguers, the RDF itself will have to be invisible to the user.
To handle the aforementioned issues of trust and privacy, tools will have to encode and process information about the provenance of data. As discussed above, the present application made use of the properties foaf:annotator and foaf:creationEvent, which are not in the official FOAF schema. They are here as an experiment and will probably be moved into another namespace.
Further kinds of data that we might want to include in annotations include:
Codepiction (Dan Brickley)
Codepiction search interface (Libby Miller, Dan Brickley)
Codepiction paths interface (Damian Steer, Libby Miller, Dan Brickley)
FOAFFinger (Damian Steer)
A Semantic Web Shoebox - Annotating Photos with RSS and RDF (Matt Biddulph)
Adding SVG outlines to co-depiction photo metadata (Jim Ley)
spacenamespace (Jo Walsh)