Retrieving Photo Metadata from Flickr
Introduction
In this project we use Flickr to store photos, so we need to retrieve the metadata of the photos from Flickr: the absolute URI of each photo, its title, and the tags associated with it. This can be done using the API service provided by Flickr. An API key must be obtained in order to use the Flickr API.
Retrieving Photo Metadata
The API function used in this project is flickr.photos.search. Given the Flickr ID of a user, it returns that user's photos, one page at a time, together with their metadata. The function is called by sending a REST request to the endpoint http://api.flickr.com/services/rest/. The response format, such as XML or JSON, can be specified in the request. The following is an example of calling the function.
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key={APIKey}&user_id=30507791@N04&per_page=20&extras=tags
The above request calls flickr.photos.search and supplies the API key, the Flickr ID of the user, and the number of photos to return per page; the extras=tags parameter additionally requests the tags associated with each photo. It returns a response in XML format, as shown below.
<?xml version="1.0" encoding="utf-8" ?>
<rsp stat="ok">
<photos page="1" pages="1" perpage="20" total="4">
<photo id="3005694515" owner="30507791@N04" secret="4d9565e6d3" server="3199" farm="4" title="P1040662" ispublic="1" isfriend="0" isfamily="0" tags="china beijing cctv" />
<photo id="3006529358" owner="30507791@N04" secret="7e4625c0cb" server="3073" farm="4" title="P1050190" ispublic="1" isfriend="0" isfamily="0" tags="uk bath roman" />
<photo id="3006529014" owner="30507791@N04" secret="886bf11707" server="3171" farm="4" title="P1040423" ispublic="1" isfriend="0" isfamily="0" tags="italy pisa leaningtower" />
<photo id="3006270614" owner="30507791@N04" secret="24b1aab8f5" server="3295" farm="4" title="P1050441" ispublic="1" isfriend="0" isfamily="0" tags="newyork timesquare" />
</photos>
</rsp>
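As an illustration of the request and response above, the following is a minimal sketch, written for Python 3 and using only the standard library. It is not taken from flickr2rdf.py: the function names build_search_url and parse_photos are hypothetical, and the sketch only assembles the request URL and parses a response that has already been downloaded.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Endpoint quoted in the text above.
REST_ENDPOINT = "http://api.flickr.com/services/rest/"


def build_search_url(api_key, user_id, per_page=20):
    """Assemble a flickr.photos.search request URL like the example above."""
    params = [
        ("method", "flickr.photos.search"),
        ("api_key", api_key),
        ("user_id", user_id),
        ("per_page", per_page),
        ("extras", "tags"),
    ]
    # urlencode percent-encodes reserved characters such as "@" in the user ID.
    return REST_ENDPOINT + "?" + urlencode(params)


def parse_photos(xml_text):
    """Return one dict per <photo> element, with the tags split into a list."""
    root = ET.fromstring(xml_text)
    if root.get("stat") != "ok":
        raise ValueError("Flickr reported stat=%r" % root.get("stat"))
    photos = []
    for photo in root.iter("photo"):
        record = dict(photo.attrib)                  # id, secret, server, farm, title, ...
        record["tags"] = photo.get("tags", "").split()  # tags are space-separated
        photos.append(record)
    return photos
```

Applied to the sample response above, parse_photos would yield four records, each holding the attributes of one photo element plus its tags as a list.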
The absolute URI of a photo can be reconstructed from its farm ID, server ID, photo ID and secret, as described in Flickr's URL documentation.
Note that this function returns metadata only for photos that the calling user is allowed to access. If the photos are set to private or semi-private, the request must be authenticated before their metadata can be retrieved. Support for authentication is to be implemented in this project.
Python Script for Retrieving Metadata from Flickr
A Python script, flickr2rdf.py, has been developed to retrieve metadata from Flickr and output the data in RDF format according to the Photo Access Control Ontology. This allows the interfaces in Tabulator to be tested. (Note: the Photo Import Pane in Tabulator also provides a way to import photo metadata directly in Tabulator.)
Requirements
- Python 2.4 or newer.
- Python Flickr API (for interacting with the Flickr API)
- Cwm (for processing RDF data)
Download
- The script can be downloaded here: flickr2rdf.py
Installation
The script can be used immediately after download. Make sure both Python Flickr API and Cwm are installed. The Cwm directory must be placed in the same directory as the script. Otherwise, change the following line in the script to match the path of the Cwm directory.
sys.path.append("cwm/")
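A relative entry such as "cwm/" is resolved against the current working directory, so the line above finds Cwm only when the script is started from its own directory. The following is a hedged sketch of one possible alternative, not the script's actual code: the helper name add_cwm_to_path is hypothetical, and it assumes the Cwm directory is named cwm and sits next to the script.

```python
import os
import sys


def add_cwm_to_path(script_file, cwm_dirname="cwm"):
    """Append the cwm directory located next to script_file to sys.path."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    cwm_dir = os.path.join(script_dir, cwm_dirname)
    if cwm_dir not in sys.path:
        sys.path.append(cwm_dir)
    return cwm_dir


# Inside the script itself this would be called as: add_cwm_to_path(__file__)
```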
Documentation
The flickr2rdf.py script defines a Flickr2RDF class. To create an instance of Flickr2RDF, a Flickr API key has to be provided:
f = Flickr2RDF(api_key="thisistheflickrapikey")
The getPhotosByNSID() method uses the Python Flickr API to communicate with Flickr and obtain a set of photos and their tags. The method requires the NSID of the user, which is used to search Flickr with the following line of code. The results are stored as a list of photos, each with a set of attributes such as farm ID, server ID, and secret, plus a set of tags.
result = self.flickr.photos_search(user_id=nsid, per_page=500, page=1, extras="tags")
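The call above requests only the first page of up to 500 photos. If an account holds more, the remaining pages would have to be requested as well, using the pages attribute shown in the XML response earlier. The following is a minimal sketch of such a loop, not part of flickr2rdf.py. The function fetch_all_photos and the callable search_page are hypothetical; search_page is assumed to wrap the photos_search call and return the photos of one page together with the total number of pages. How the Python Flickr API exposes these values depends on the library version and is not shown here.

```python
def fetch_all_photos(search_page, nsid, per_page=500):
    """Collect the photos from every result page.

    search_page is a hypothetical callable taking (nsid, per_page, page)
    and returning a tuple (photos_on_that_page, total_pages).
    """
    photos = []
    page = 1
    while True:
        batch, total_pages = search_page(nsid, per_page, page)
        photos.extend(batch)
        if page >= total_pages:   # also stops when there are no photos at all
            break
        page += 1
    return photos
```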
The GenerateRDFOutput() method generates RDF output from the result obtained by getPhotosByNSID(). It also requires the URI of the user, the name of the output file, and the desired file format, which is either "xml" (RDF/XML Syntax) or "n3" (Notation3).
The URL of a photo is rebuilt using the PhotoURL() method, which constructs it from the photo's farm ID, server ID, photo ID and secret, as described in Flickr's URL documentation. The URL follows the template below, in which the trailing letter (m, s, b or t) selects the image size.
http://farm{farm-id}.static.flickr.com/{server-id}/{id}_{secret}_[msbt].jpg
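The template above can be filled in with simple string formatting. The following is a minimal sketch, not the script's PhotoURL() method: the function name photo_url and its parameters are hypothetical, and omitting the size suffix entirely (size=None) is an assumption that goes beyond the template as quoted.

```python
def photo_url(farm, server, photo_id, secret, size=None):
    """Build a static photo URL following the template quoted above.

    size is the optional one-letter suffix from the template: m, s, b or t.
    Leaving it out (size=None) is an assumption, not part of the quoted template.
    """
    if size is not None and size not in ("m", "s", "b", "t"):
        raise ValueError("size must be one of m, s, b, t")
    suffix = "" if size is None else "_" + size
    return "http://farm%s.static.flickr.com/%s/%s_%s%s.jpg" % (
        farm, server, photo_id, secret, suffix)
```

For the first photo in the sample response, photo_url("4", "3199", "3005694515", "4d9565e6d3", "m") gives http://farm4.static.flickr.com/3199/3005694515_4d9565e6d3_m.jpg.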
The GenerateRDFOutput() method uses the n3String() and rdfString() methods of a formula object in Cwm to generate the final RDF output. Note that a temporary base URL, http://example, is used when creating RDF triples; it is removed from the final RDF output by passing it to these methods through the base parameter. Hence, the resources appearing in the photo album are all referred to using relative identifiers.
Usage
Run the script with the Python interpreter, passing the following parameters.
$ python flickr2rdf.py FlickrNSID yourURI OutputFileName OutputFormat
The parameters are:
- FlickrNSID: The Flickr NSID of the account from which you want to export photo metadata.
- yourURI: Your URI, which will be used to describe the owner of the photo album generated.
- OutputFileName: The file name of the final output.
- OutputFormat: The format of the final output, either "xml" or "n3".
The output file will be generated in the same directory where the script is located.
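The four positional parameters described above could be checked before any request is sent to Flickr. The following is a hedged sketch of such a check, not the script's actual argument handling; the function name parse_arguments and the dictionary keys are hypothetical.

```python
import sys

# The two output formats named in the parameter list above.
VALID_FORMATS = ("xml", "n3")


def parse_arguments(argv):
    """Validate the four positional parameters and return them as a dict."""
    if len(argv) != 5:
        raise SystemExit(
            "usage: python %s FlickrNSID yourURI OutputFileName OutputFormat"
            % argv[0])
    nsid, user_uri, output_file, output_format = argv[1:]
    if output_format not in VALID_FORMATS:
        raise SystemExit('OutputFormat must be "xml" or "n3"')
    return {
        "nsid": nsid,
        "user_uri": user_uri,
        "output_file": output_file,
        "output_format": output_format,
    }


# In a script this would typically be called as: parse_arguments(sys.argv)
```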
Example
An example of the output of the script can be found here: photo.rdf. If the latest release of the Firefox Tabulator extension has been installed, the photo album can be viewed using the Photo Pane.