An Exploration of How to Learn from Visually Descriptive Text
JHU-CLSP Summer Workshop (2011)

Senior Members: Alexander C. Berg, Tamara L. Berg, Hal Daumé III
Graduate Student Participants: Amit Goyal, Xufeng Han, Margaret Mitchell, Karl Stratos, Kota Yamaguchi.
Undergraduate Student Participants: Jesse Dodge, Alyssa Mensch.
Affiliate Members: Yejin Choi, Julia Hockenmaier, Erik Learned-Miller

Not all content is created equal – as indicated by the descriptions people write (right). Some objects (e.g. man, baby, sling) seem to be more important than others (e.g. ladder, table, chair). Some attributes seem to be more important (e.g. beard) than others (e.g. shirt, or glasses). Sometimes scene words are used (e.g. kitchen), and sometimes they aren’t. One goal of this workshop was to examine the complex relationship between images and their descriptions.


Abstract

This workshop explored learning to identify visually descriptive text, parsing this text and extracting statistical models, and using these models to 1) learn how people describe the visual world, 2) compose descriptions about images, and 3) build more relevant recognition systems in computer vision. It was an exciting opportunity to deal with large-scale text and image data, be exposed to cutting-edge techniques in computer vision, and interactively develop new strategies on the boundary between NLP and computer vision. Specific types of work included data collection, parsing, using Amazon's Mechanical Turk, building and using probabilistic models, and work on applications including image parsing, retrieval, and automatic sentence generation from images.


Publications

  • An Exploration of How to Learn from Visually Descriptive Text [pdf]
    Alexander C. Berg, Tamara L. Berg, Hal Daumé III, Jesse Dodge, Amit Goyal, Xufeng Han, Alyssa Mensch, Margaret Mitchell, Karl Stratos, Kota Yamaguchi
    JHU-CLSP Summer Workshop Whitepaper, 2011.

  • Midge: Generating Image Descriptions From Computer Vision Detections [pdf]
    Margaret Mitchell, Jesse Dodge, Amit Goyal, Kota Yamaguchi, Karl Stratos, Xufeng Han, Alyssa Mensch, Alexander C. Berg, Tamara L. Berg, Hal Daumé III
    European Chapter of the Association for Computational Linguistics, EACL 2012.

  • Understanding and Predicting Importance in Images [pdf]
    Alexander C. Berg, Tamara L. Berg, Hal Daumé III, Jesse Dodge, Amit Goyal, Xufeng Han, Alyssa Mensch, Margaret Mitchell, Aneesh Sood, Karl Stratos, Kota Yamaguchi
    Computer Vision and Pattern Recognition, CVPR 2012.

  • Detecting Visual Text [pdf]
    Jesse Dodge, Amit Goyal, Xufeng Han, Alyssa Mensch, Margaret Mitchell, Karl Stratos, Kota Yamaguchi, Yejin Choi, Hal Daumé III, Alexander C. Berg, Tamara L. Berg
    North American Chapter of the Association for Computational Linguistics, NAACL 2012.


Related Data

    SBU Captioned Photo Dataset (Images and Descriptions)

    If photos or descriptions are used please cite:
    Im2Text: Describing Images Using 1 Million Captioned Photographs
    Vicente Ordonez, Girish Kulkarni, Tamara L. Berg
    Neural Information Processing Systems (NIPS), 2011.

    Pre-Processed Results (Small Sample of 1k Parsed Descriptions, Object Detections, Scene Classifications)
    Pre-Processed Results (All Parsed Descriptions, Object Detections, Scene Classifications)


    If pre-processed results are used, please cite the Im2Text paper and:
    An Exploration of How to Learn from Visually Descriptive Text
    Alexander C. Berg, Tamara L. Berg, Hal Daumé III, Jesse Dodge, Amit Goyal, Xufeng Han, Alyssa Mensch, Margaret Mitchell, Karl Stratos, Kota Yamaguchi
    JHU-CLSP Summer Workshop Whitepaper, 2011.

    Detecting Visual Text Data

    If used please cite:
    Detecting Visual Text
    Jesse Dodge, Amit Goyal, Xufeng Han, Alyssa Mensch, Margaret Mitchell, Karl Stratos, Kota Yamaguchi, Yejin Choi, Hal Daumé III, Alexander C. Berg, Tamara L. Berg
    North American Chapter of the Association for Computational Linguistics, NAACL 2012.


Related Talks

    Final Presentation - All
    Final Presentation - Amit
    Final Presentation - Karl
    Final Presentation - Alyssa