Hands-On AI Part 12: Image Data Collection

Published: 10/12/2017  

Last Updated: 10/12/2017

A Tutorial Series for Software Developers, Data Scientists, and Data Center Managers

This article discusses the methods used for image data collection in the slideshow music project. Please refer to the previous article (article 11) for information about the dataset search. Since then, we ran into limitations that forced us to choose existing picture databases instead of images taken from Flickr*. However, this article describes both approaches so that you can learn how to extract data using the Flickr API*.

As a brief recap, initially the datasets selected for this project were:

  1. A training dataset of 7,000 emotionally charged images taken from Flickr* for the emotion extraction algorithm.
  2. A training dataset of the Bach chorales for the melody completion algorithm.
  3. A set of melodies that serve as a template for emotion modulation.

The datasets must now be collected. As will be shown, the amount of work required for this varies considerably, depending on the choice of dataset.

The collection and exploration of the music data is described in article 20.

Image Data Collection

A set of images that elicit seven different emotions (happiness, sadness, fear, anxiety, awe, determination, anger) was required for this project (refer to article 4 for an in-depth discussion). To collect these images, we decided to use Flickr, a popular photo sharing website, due to its size and Creative Commons* licensing.

Searching through Flickr and finding 7,000 images manually is clearly a daunting task. Luckily, Flickr has an API which provides a set of methods that allow for easy communication with Flickr using a programming language. However, before using the API to collect the images, it was important to know what to search for in order to elicit the specified emotions. To find a list of search terms or tags, an Amazon Mechanical Turk* task was used. This process is described in detail in the earlier article: Augmenting Artificial Intelligence with Human Intelligence with Amazon Mechanical Turk.

Flickr API*

To use the methods provided by the Flickr API, you need to apply for an API key. Applying requires a Flickr or Yahoo!* account, so create one first if you do not have one. Then follow this link to get the key.

Figure 1.  Screenshot from the Flickr* page.

The application process for a non-commercial key was simple, and involved describing the intended usage and agreeing to the terms of use. The API key is a security measure and is used to prevent abuse of the API. It is a required parameter in the methods given in the API.
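Because the key and secret identify your application, it can be convenient to keep them out of the source code. The following is a minimal sketch of one way to do that, reading both values from environment variables. The variable names FLICKR_API_KEY and FLICKR_API_SECRET and the helper load_flickr_credentials are assumptions made for this example; they are not defined by Flickr or by this project.

```python
import os

def load_flickr_credentials(env=os.environ):
    """Return (api_key, api_secret) read from environment variables.

    FLICKR_API_KEY and FLICKR_API_SECRET are hypothetical names chosen
    for this sketch; Flickr does not mandate them.
    """
    try:
        return env['FLICKR_API_KEY'], env['FLICKR_API_SECRET']
    except KeyError as missing:
        # Fail early with a readable message instead of sending an empty key
        raise RuntimeError('Missing environment variable: {}'.format(missing.args[0]))
```

The returned pair can then be passed to the API kit in place of hard-coded strings.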

Once the API key is obtained, you can download and install the API kit for the programming language of your choice from The App Garden. This project used Beej's Python Flickr API*, which can be used with Python* 3. Please follow the Flickr API installation instructions.

The code used to download the images is shown below. Essentially, it uses the API’s walk function to search for images with a specified tag. These tags are stored in .txt files and are listed one per row. Once an image is found, a URL to that image is constructed from a template, at https://farm{farm-id}.staticflickr.com/{server-id}/{id}_{secret}.jpg, replacing the curly braces with attributes of the image. The top 30 images for each tag (sorted by relevance) are then retrieved and organized into folders, depending on the emotion and search term.

import flickrapi
import urllib.request
import os

project_path = '/path/to/your/project'
photos_per_tag = 30
filenames = ['Awe.txt', 'Happiness.txt', 'Fear.txt', 'Determination.txt',
             'Anxiety.txt', 'Tranquility.txt', 'Sadness.txt']

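def download_files(flickr, t, category, num_photos):
    # Downloads the top num_photos images for tag t into <project_path>/<category>/<t>
    target_dir = os.path.join(project_path, category, t)
    os.makedirs(target_dir, exist_ok=True)
    s = []
    # walk() iterates over the search results; stop once enough URLs are collected
    for photo in flickr.walk(tag_mode='all', sort='relevance', tags=t, license=4, per_page=50):
        url = 'https://farm{}.staticflickr.com/{}/{}_{}.jpg'.format(photo.get('farm'),
                             photo.get('server'), photo.get('id'), photo.get('secret'))
        s.append(url)
        if len(s) == num_photos:
            break
    # Save each image under the folder for its emotion and search term
    for i in range(len(s)):
        filename = '{}_{}_{}.jpg'.format(category, t, str(i))
        urllib.request.urlretrieve(s[i], os.path.join(target_dir, filename))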

if __name__ == '__main__':
    # Creates flickr object
    # These keys should be requested from flickr
    api_key = u'xxxxxxxxxxxx'
    api_secret = u'xxxxxxxxxxx'
    flickr = flickrapi.FlickrAPI(api_key, api_secret)

    # Runs the program, cycles through the emotions and downloads the images for each tag.
    for fname in filenames:
        categ = fname[:-4]
        with open(fname, 'r') as f:
            tags = f.read().splitlines()
        for t in tags:
            download_files(flickr, t, categ, photos_per_tag)

Example Python* script used to retrieve the image dataset using the Flickr API*.

To use the code above, clone the repository from the GitHub* link, then follow the README instructions. Remember to replace api_key and api_secret with your own API keys requested from Flickr. As noted above, this only works on Python 3.
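To make the URL construction step easier to follow in isolation, here is a small sketch that fills the template from a set of photo attributes. The helper name build_photo_url and the attribute values are made up for illustration. The helper only relies on a .get() method, so it accepts a plain dictionary as well as the photo elements used in the script.

```python
URL_TEMPLATE = 'https://farm{farm}.staticflickr.com/{server}/{id}_{secret}.jpg'

def build_photo_url(photo):
    """Fill the static-image URL template from a photo's attributes."""
    return URL_TEMPLATE.format(farm=photo.get('farm'), server=photo.get('server'),
                               id=photo.get('id'), secret=photo.get('secret'))

# Made-up attribute values, for illustration only
example = {'farm': '1', 'server': '2', 'id': '3', 'secret': 'abc'}
print(build_photo_url(example))  # prints https://farm1.staticflickr.com/2/3_abc.jpg
```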

The folder looks like this after running the program:

Figure 2.  Resulting dataset from Flickr* search.

In total, roughly 8,800 images were collected. More images were retrieved than were required, as we expected to cut some of the images that were of bad quality and could not be used. Finding these images was the next step.
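A total like this can be checked by counting the files on disk. The sketch below assumes a layout with one top-level folder per emotion and the downloaded .jpg files somewhere beneath it. The helper count_images is hypothetical and is not part of the project's repository.

```python
import os
from collections import Counter

def count_images(root):
    """Count .jpg files under each top-level folder of root (one folder per emotion)."""
    counts = Counter()
    for category in sorted(os.listdir(root)):
        category_dir = os.path.join(root, category)
        if not os.path.isdir(category_dir):
            continue  # skip loose files such as the tag lists
        for _dirpath, _dirnames, files in os.walk(category_dir):
            counts[category] += sum(1 for f in files if f.lower().endswith('.jpg'))
    return counts
```

Calling sum(count_images(root).values()) gives the overall number of images, and the per-emotion counts show whether any category came up short.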

Screening Images

The quality of the collected images was varied. Some search terms such as flower (shown in Figure 3) had high-quality, usable images. However, search terms that were less concrete often returned images that were largely unusable. For example, the tag wonder (under the emotion awe) returned an image of a Wonder Woman*-themed cake, and the tag ambitious (under the emotion determination) returned an image of cabbages that were from Ambitious Farms.

Figure 3.  Unusable images.

A tip for anyone who plans to use the Flickr API to collect images: concrete nouns work much better as search terms than adjectives and abstract nouns. For example, if you want images that inspire awe, use search terms such as ocean or Grand Canyon instead of awe or wonder.
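To see which search terms are worth keeping, the usable and unusable decisions from a manual pass over the images can be tallied per tag. The sketch below is one possible way to do this; the helper unusable_rate_by_tag and the review decisions are hypothetical and do not reflect the project's actual data.

```python
from collections import defaultdict

def unusable_rate_by_tag(reviews):
    """Return {tag: fraction of that tag's images marked unusable}.

    reviews is an iterable of (tag, usable) pairs from a manual review.
    """
    totals = defaultdict(int)
    rejected = defaultdict(int)
    for tag, usable in reviews:
        totals[tag] += 1
        if not usable:
            rejected[tag] += 1
    return {tag: rejected[tag] / totals[tag] for tag in totals}

# Hypothetical review decisions, not the project's actual data
reviews = [('flower', True), ('flower', True), ('wonder', False), ('wonder', True)]
print(unusable_rate_by_tag(reviews))
```

Tags with a high unusable rate are candidates for replacement with more concrete terms.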

After clicking through the images, the team decided that over 40 percent of them were unusable. This called for a reconsideration of the choice of dataset. After discussing a number of possibilities, such as limiting the images to emotional faces, the team decided to use images from existing databases that are typically used in psychological research: the Geneva Affective PicturE Database* (GAPED*), the Open Affective Standardized Image Set* (OASIS*), and Image Stimuli for Emotion Elicitation* (ISEE*). Though there is less variation in the images in existing databases than is possible with a novel dataset, the existing datasets were selected for their higher-quality images and existing stimuli information. The existing stimuli information is a huge advantage, as it eliminates the need for data annotation through Amazon Mechanical Turk, which reduces costs!

Data Source

For the new dataset, data collection was a much simpler process. In particular, it no longer required the Amazon Mechanical Turk and Flickr API steps. The GAPED and OASIS datasets (including stimuli information) were immediately available on the Internet for download. The ISEE dataset was made available after sending an email to the author, requesting access. If the instructions on how to download a dataset are not clear, you can most likely find the contact information for the author of the dataset through a Google* search and ask them directly for access to the dataset.
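If a dataset is distributed as a zip archive (an assumption here; check the format each dataset actually uses), the download and unpacking step can be scripted with the standard library. In the sketch below, the function names are hypothetical and the URL is left as a parameter, since the real download locations are given on each dataset's own page.

```python
import os
import urllib.request
import zipfile

def extract_archive(zip_path, dest_dir):
    """Unpack a zip archive into dest_dir and return the names of its members."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()

def download_and_extract(url, dest_dir):
    """Download a zip archive from url, then unpack it into dest_dir."""
    # With no filename given, urlretrieve saves to a temporary file
    zip_path, _headers = urllib.request.urlretrieve(url)
    return extract_archive(zip_path, dest_dir)
```

Keeping each dataset in its own destination folder makes it easier to match the images with their stimuli information later.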

Figure 4.  Downloading the GAPED* database.

Figure 5.  Downloading the OASIS* dataset.


Two datasets were collected for this project. The first used the Flickr API to download images using emotional tags, and the second was a compilation of existing databases used in psychological research. There are a number of pros and cons for each dataset; however, this project chose the second for its advantages in image quality, existing stimuli information, and cost.

The method used for collecting data clearly depends on what kind of data is required, but hopefully the processes and methods described in this article will help you in your project.

Now that the datasets have been collected, the project is ready to move to the next steps of data exploration and preprocessing.



