MAE Dataset

What is MAE?

The Multimodal Attribute Extraction (MAE) dataset is the first benchmark dataset for the task of multimodal attribute extraction. It contains mixed-media data for 2.2 million product items. Each item has a textual description, a set of product images, and an open-schema table of product attributes. For more information, read our paper:

Multimodal Attribute Extraction. Logan et al., 2017
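To make the three parts of an item concrete, here is a minimal sketch of how one might be held in memory. The class name, field names, helper function, and sample values are all hypothetical and chosen for illustration; they are not the schema of the released JSON files.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ProductItem:
    """Hypothetical in-memory view of one item (not the official schema)."""
    description: str                                            # textual description
    image_paths: List[str] = field(default_factory=list)        # set of product images
    attributes: Dict[str, str] = field(default_factory=dict)    # open-schema attribute table


def attribute_queries(item: ProductItem) -> List[Tuple[str, str]]:
    """Return the item's (attribute, value) pairs, sorted by attribute name.

    In attribute extraction, each pair can serve as one example: given the
    description, the images, and the attribute name, predict the value.
    """
    return sorted(item.attributes.items())


# Made-up item for illustration only; it is not taken from the dataset.
example = ProductItem(
    description="Stainless steel water bottle, 750 ml.",
    image_paths=["images/0001.jpg"],
    attributes={"material": "stainless steel", "color": "silver"},
)

for name, value in attribute_queries(example):
    print(f"{name}: {value}")
```

Because the attribute table is open-schema, a plain string-to-string mapping is used here rather than a fixed set of fields; different items may carry entirely different attribute names.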

Download MAE (v.0.0)

Training JSON

Validation JSON

Images

Evaluation Instructions

Leaderboard submission instructions are currently under construction.

Support

For support, please reach out to us at the MAE Dataset Google group.

Leaderboard - All Attributes

Last updated: 12/02/2017

Rank	Model	Accuracy
1	Most-Common Value	33.99%
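As a rough illustration of what a most-common-value baseline and its accuracy could look like, the sketch below predicts, for each attribute, the value seen most often for that attribute in training. This is an assumption about the general technique, not the official evaluation script (submission instructions are still under construction); the function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (attribute name, attribute value)


def fit_most_common_value(train_pairs: Iterable[Pair]) -> Dict[str, str]:
    """Map each attribute to the value it takes most often in training."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for attribute, value in train_pairs:
        counts[attribute][value] += 1
    return {attr: ctr.most_common(1)[0][0] for attr, ctr in counts.items()}


def accuracy(model: Dict[str, str], eval_pairs: List[Pair]) -> float:
    """Fraction of evaluation pairs whose value matches the prediction.

    Attributes never seen in training have no prediction and count as wrong.
    """
    if not eval_pairs:
        return 0.0
    correct = sum(1 for attr, value in eval_pairs if model.get(attr) == value)
    return correct / len(eval_pairs)


# Toy data for illustration only; it is not drawn from MAE.
train = [("color", "black"), ("color", "black"), ("color", "red"), ("size", "M")]
held_out = [("color", "black"), ("color", "red"), ("size", "M"), ("brand", "acme")]

model = fit_most_common_value(train)
print(f"accuracy: {accuracy(model, held_out):.2%}")
```

A baseline of this kind ignores the description and images entirely, which makes it a useful floor against which text, image, and multimodal models can be compared.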

Leaderboard - Top 100 Attributes

Last updated: 11/12/2017

Rank	Model	Accuracy
1	Multimodal Baseline - Concat	59.48%
2	Text Baseline	58.41%
3	Multimodal Baseline - GMU	52.92%
4	Most-Common Value	38.81%
5	Image Baseline	38.07%