Detecting Creative Commons License Violations with Flickr Images on the Web


This document describes the work carried out by Oshani Seneviratne at the Department of Electronics and Computer Science, University of Southampton under the supervision of Prof. Nigel Shadbolt from July 28th to Aug 29th, 2008. This work is part of the WSRI (Web Science Research Initiative) Exchange Program.


Problem Statement

The Short Version:

Wouldn't it be nice to know if someone used your photos on the web without telling you, or without attributing them to you?

The Long Version:

Social networks, blogs, photo sharing sites and other applications, known collectively as the social web, hold large amounts of increasingly complex data. Such over-exposed data raises many accountability issues. This project attempts to solve one instance of these complex data usage relationships. The focus is on photo sharing sites such as Flickr and on blogging platforms such as Blogger or LiveJournal (or simply any web site), with the aim of finding out whether any Creative Commons licenses have been violated when images are reused.

Design and Implementation

Design of the System
Figure 1: System Architecture


Alice is a Flickr user and also has a QDOS account (QDOS is a service that collects a user's online information). Bob uses one of Alice's photos, which happens to be under a CC license, on his Blogger blog without giving proper attribution to Alice.

Overview of the System

This system has four major components.

  1. The Crawler will look at a given site and determine whether it contains any embedded Flickr photos.
  2. If such photos are detected, the License Checker will determine whether each photo is under a Creative Commons license.
  3. The User Checker will find other identifying information related to the original creator of the photo. Since every use of a Creative Commons licensed work should give attribution to the original creator, the Crawler will then check whether the name or any other identifying information of the original creator appears on the page in which the photo is embedded.
  4. If no such attribution is found, the Notifier will notify the original creator about the use of the photo and the violation of the license terms.
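
The detection and attribution steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the URL pattern for Flickr-hosted images, the function names, and the `lookup` callable (a hypothetical stand-in for the License Checker and User Checker) are all introduced here for the example only.

```python
import re
from html.parser import HTMLParser

# Assumed shape of a Flickr static image URL:
#   http://farm4.static.flickr.com/<server>/<photo_id>_<secret>.jpg
# Newer hosts such as live.staticflickr.com follow the same path layout.
FLICKR_IMG = re.compile(
    r"^https?://[\w.-]*static\.?flickr\.com/\d+/(?P<photo_id>\d+)_[0-9a-f]+",
    re.IGNORECASE,
)


class PageScanner(HTMLParser):
    """Collects embedded Flickr photo IDs plus the page's text and link targets."""

    def __init__(self):
        super().__init__()
        self.photo_ids = []
        self.chunks = []  # text nodes and href values, used for attribution checks

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img":
            match = FLICKR_IMG.match(attributes.get("src") or "")
            if match:
                self.photo_ids.append(match.group("photo_id"))
        elif tag == "a":
            self.chunks.append(attributes.get("href") or "")

    def handle_data(self, data):
        self.chunks.append(data)


def scan(html):
    scanner = PageScanner()
    scanner.feed(html)
    return scanner


def find_flickr_photo_ids(html):
    """Step 1: list the IDs of Flickr photos embedded in the page."""
    return scan(html).photo_ids


def is_attributed(html, identifiers):
    """Step 3: does any identifier of the creator appear in text or links?"""
    haystack = " ".join(scan(html).chunks).lower()
    return any(name.lower() in haystack for name in identifiers if name)


def find_unattributed(html, lookup):
    """Return IDs of CC-licensed photos used without attribution.

    `lookup(photo_id)` is a hypothetical callable returning
    (is_cc_licensed, creator_identifiers); in a real system it would
    query Flickr and an identity service.
    """
    violations = []
    for photo_id in find_flickr_photo_ids(html):
        is_cc, identifiers = lookup(photo_id)
        if is_cc and not is_attributed(html, identifiers):
            violations.append(photo_id)
    return violations
```

A simple substring match such as this one will miss attributions that use a nickname or a differently spelled name, which is why gathering several identifiers for the creator matters: the more identifiers `lookup` returns, the fewer false reports the notification step would send.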

Note: For the implementation details please see Milestone 1.

Future Directions