Frames Labeled In Cinema (FLIC)

Brought to you as part of our paper on MODEC.

Download

Released April 7, 2013.

FLIC.zip (287 MB): 5003 examples used in our CVPR 2013 MODEC paper.
FLIC-full.zip (1.2 GB): 20928 examples, a superset of FLIC consisting of more difficult examples (see below).
NOTE: please do not use FLIC-full as training data when testing on the FLIC test set. It is a superset of the original FLIC dataset and will lead to overfitting. Choose a sensible split in which no two frames from the same movie shot cross the train/test divide; a sketch of one such split follows.
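
For illustration, here is a minimal MATLAB sketch of a movie-level split. It assumes each entry of the examples struct carries a moviename field (an assumption about the struct layout; check your copy of examples.mat). Holding out whole movies guarantees that no shot straddles the divide:

  % Sketch: hold out whole movies so no movie (hence no shot) contributes
  % frames to both train and test. The moviename field is an assumption
  % about the struct layout.
  load('examples.mat');                        % yields the examples struct
  movies = unique({examples.moviename});
  rng(0);                                      % reproducible shuffle
  movies = movies(randperm(numel(movies)));
  ntest = round(0.2 * numel(movies));          % hold out ~20% of the movies
  istest = ismember({examples.moviename}, movies(1:ntest));
  trainset = examples(~istest);
  testset = examples(istest);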

Please cite as

Benjamin Sapp and Ben Taskar. "MODEC: MultimOdal DEComposable Models for Human Pose Estimation." In Proc. CVPR, 2013.

bibtex:

  @inproceedings{modec13,
    title={MODEC: Multimodal Decomposable Models for Human Pose Estimation},
    author={Sapp, Benjamin and Taskar, Ben},
    booktitle={Proc. CVPR},
    year={2013},
  }

Usage

Simply run the MATLAB script demo_FLIC.m for a demo of how to load and display the annotated joints. The examples struct also contains the fields examples(i).{istrain,istest}, which denote the train/test split used in our paper. The 5 images at the top of this page were randomly sampled from examples([examples.istrain]).
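
If you just want to poke at the data without the demo, here is a hedged sketch. The coords (2 x njoints joint coordinates) and filepath fields are assumptions about the struct layout; demo_FLIC.m remains the authoritative reference:

  % Sketch: load the annotations and overlay the joints on one training image.
  % coords and filepath are assumed field names; see demo_FLIC.m for the
  % released loading/plotting code.
  load('examples.mat');                      % yields the examples struct
  train = examples([examples.istrain]);      % the train split from the paper
  ex = train(1);
  im = imread(fullfile('images', ex.filepath));
  imshow(im); hold on;
  plot(ex.coords(1,:), ex.coords(2,:), 'go', 'MarkerSize', 8, 'LineWidth', 2);
  hold off;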

Description

From the paper: We collected a 5003-image dataset automatically from popular Hollywood movies. The images were obtained by running a state-of-the-art person detector on every tenth frame of 30 movies. People detected with high confidence (roughly 20K candidates) were then sent to the crowdsourcing marketplace Amazon Mechanical Turk to obtain ground-truth labeling. Each image was annotated by five Turkers for $0.01 each to label 10 upper-body joints. In each image, the median of the five labelings was taken per joint, to be robust to outlier annotations. Finally, we manually rejected images in which the person was occluded or severely non-frontal. We set aside 20% (1016 images) of the data for testing.
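
To make the aggregation step concrete, here is a tiny illustrative sketch (not the original pipeline) of the coordinate-wise median-of-five over made-up annotations:

  % Illustrative only: five hypothetical Turker labelings of 10 joints,
  % aggregated by a coordinate-wise median to suppress outlier clicks.
  annotations = rand(2, 10, 5) * 100;   % x/y coords by joint by Turker (made up)
  consensus = median(annotations, 3);   % 2 x 10 median-of-five labeling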

What is FLIC-full?

The FLIC-full dataset is the full set of frames we harvested from movies and sent to Mechanical Turk to have joints hand-annotated. While the annotations from the 5 Turkers were almost always very consistent, many of these frames proved difficult for training / testing our MODEC pose model: occluded, non-frontal, or just plain mislabeled. We encourage ambitious researchers to try their hand at this data; it is an interesting testbed for learning with outliers and for modeling occlusion or profile poses! No results have been published on it to date.
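
As a hedged sketch, assuming the isbad and isunchecked flags on the examples struct mark the problematic and not-yet-reviewed frames respectively (semantics inferred from this page), the hard subset can be pulled out like this:

  % Sketch: count the manually flagged hard examples in FLIC-full.
  % The semantics of isbad/isunchecked are assumptions inferred from this page.
  load('examples.mat');
  hard = examples([examples.isbad] & ~[examples.isunchecked]);
  fprintf('%d of %d FLIC-full examples are flagged as hard.\n', ...
          numel(hard), numel(examples));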

Here is a random sampling of the hard examples (from examples([examples.isbad] & ~[examples.isunchecked])):

License

MIT