Visual Madlibs Q&A


Version I:

Version II:

  • This version splits the training data into 80% sub-training data and 20% validation data.
  • The validation data consists of easy and hard multiple-choice questions with their answers. Users are encouraged to use it to tune hyper-parameters.
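
Version II already ships with this split, so no code is needed to use it. For readers who want to reproduce a comparable 80/20 hold-out on their own copy of the training data, a minimal sketch follows. It assumes the data can be keyed by image id; the function name `split_train_val`, its parameters, and the integer ids are hypothetical and not part of the released tool.

```python
import random


def split_train_val(image_ids, val_fraction=0.2, seed=0):
    """Hold out a fraction of unique image ids for validation.

    Hypothetical helper, not part of the Visual Madlibs Python tool.
    Splitting by image id keeps every question about one image on the
    same side of the split.
    """
    ids = sorted(set(image_ids))   # deduplicate and fix the starting order
    rng = random.Random(seed)      # local RNG so the split is reproducible
    rng.shuffle(ids)
    n_val = int(round(len(ids) * val_fraction))
    return ids[n_val:], ids[:n_val]


# Toy usage with 100 made-up image ids.
sub_train, val = split_train_val(range(100))
print(len(sub_train), len(val))
```

Fixing the seed makes hyper-parameter comparisons repeatable across runs.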

Python tool:   

  • Used to access our dataset.


Task 1:

Task 2:    (including easy and hard versions)

  • Task 1 is targeted generation: automatically producing a description of the image that fills in the blank.
  • Task 2 is targeted multiple-choice question answering about images.
  • For both tasks, we provide the image, an instruction, and a Madlibs prompt.
  • For some question types, we also indicate the targeted person or object.
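
As an illustration of how Task 2 might be scored, here is a minimal sketch of multiple-choice accuracy. The record fields (`image_id`, `prompt`, `choices`, `answer_idx`) and the scoring function are assumptions made for this example; the actual file layout of the dataset may differ.

```python
def multiple_choice_accuracy(questions, score_fn):
    """Fraction of questions where the top-scoring choice is the correct one.

    `questions` is a list of dicts with hypothetical keys:
    image_id, prompt, choices (list of strings), answer_idx (int).
    `score_fn(image_id, prompt, choice)` returns a number; higher is better.
    """
    if not questions:
        return 0.0
    correct = 0
    for q in questions:
        scores = [score_fn(q["image_id"], q["prompt"], c) for c in q["choices"]]
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == q["answer_idx"])
    return correct / len(questions)


# Toy data and a toy scorer, both invented for this example.
toy_tags = {1: {"bike"}, 2: {"frisbee"}}


def toy_score(image_id, prompt, choice):
    # Count how many words of the choice match the image's toy tags.
    return sum(word in toy_tags[image_id] for word in choice.split())


toy_questions = [
    {"image_id": 1, "prompt": "The person is ___.",
     "choices": ["riding a bike", "eating pizza"], "answer_idx": 0},
    {"image_id": 2, "prompt": "The dog is ___.",
     "choices": ["sleeping", "catching a frisbee"], "answer_idx": 1},
]

accuracy = multiple_choice_accuracy(toy_questions, toy_score)
print(accuracy)
```

In practice `toy_score` would be replaced by a model that rates how well a candidate answer completes the prompt for the given image, and accuracy would be reported separately for the easy and hard versions.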


Paper:   arxiv paper

author = {Licheng Yu and Eunbyung Park and Alexander C. Berg and Tamara L. Berg}, 
title = "{Visual Madlibs: Fill in the blank Image Generation and Question Answering}", 
journal = {arXiv preprint arXiv:1506.00278}, 
year = {2015},