Our dataset, Faces in the Wild, consists of 30,281 faces collected from news photographs. The faces were labeled automatically using the system described in "Who's in the Picture"; the labels are approximately 80% accurate. The file faceData.tar.gz contains a MATLAB file, FacesInTheWild.mat, and the face images, stored as year/month/day/imgname.ppm. FacesInTheWild.mat contains two variables: metaData (metaData{i} gives the file name of face i and its label id) and lexicon (lexicon{i} gives the actual name for label i).


To unpack any of these files into your current directory, use the command:
tar zxvf filename.tar.gz
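
If you would rather not unpack into the current directory, the following is a minimal sketch of an alternative. The function name unpack_faces and the destination directory are hypothetical and not part of the dataset; the sketch assumes a tar that supports the -z and -C options (GNU tar and bsdtar both do). It extracts the archive into a directory of your choice and prints the number of .ppm files it finds there.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the dataset): unpack an archive
# into a chosen directory and print how many .ppm images it contains.
# usage: unpack_faces ARCHIVE DEST_DIR
unpack_faces() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # -C makes tar extract into DEST_DIR instead of the current directory.
    tar -xzf "$archive" -C "$dest" || return 1
    # Images are stored as year/month/day/imgname.ppm, so search recursively.
    # tr strips the padding some wc implementations add to the count.
    find "$dest" -type f -name '*.ppm' | wc -l | tr -d ' '
}
```

For example, unpack_faces faceData.tar.gz faces would place the images under faces/ and print the image count, which you can compare against the 30,281 faces quoted above as a rough completeness check.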


This dataset is for academic research purposes only. If you use our dataset, please reference:

  • Who's in the Picture [pdf] [ps]
    Tamara L. Berg, Alexander C. Berg, Jaety Edwards, David A. Forsyth
    Neural Information Processing Systems (NIPS), 2004