This work is the first part of the project - "DeepArt", the second part work is to build high performance and practical system for artworks retrieval and annotation, details can be found in this Page.
In this work, we present a unified framework, called DeepArt, to learn joint representations that can simultaneously capture content information and style information of visual arts.The intuition under this framework is of three folds:
1. The main difference between digital artworks and normal images is that artworks always contain style feature coming from its creators, some new representation should be investigated.
2. By studying a vast number of artworks, art experts acquire the ability to capture the unique characteristics of visual arts, similarly, learning representations should be proposed which are more flexible and rubost than handcrafted features and the performance of learned features can be improved with the increasing of the data.
3. For the traditional learning process, for example classification problem, the objective function for categorizing should be defined explicitly, but this problem is impossible to be redefined as an classification problem, a triplet based ranking model is employed to optimize the learning process.

Dataset: Art500K

To evaluate our framework, implement system and facilitate further researches, we have collected a large dataset mainly from four websites: Rijksmuseum, Google Arts & Culture, WikiArt.org - Visual Art Encyclopedia, Web Gallery of Art.
This dataset has three advantages over other similar datasets:
1. The dataset has over 500,000 digital artworks with rich annoations, more than two times larger than the previous largest digital artworks dataset.
2. There are variety of labels of digital artworks, apart from some general labels (e.g., artist, genre, art movement), some special labels (e.g., event, historical figure, description) are included.
3. The dataset is well organized and can be accessed publicly under the use of research propose.

This table summarized the comparision of different digital artworks datasets:

PrintART Painting-91 Rijksmuseum VGG Paintings Art500K
# of Digital Artworks 998 4,266 112,039 18,523 554,198
# of Big Classes 75 2 4 1 10
Contain Special Classes (Yes/No) Yes No No No Yes
Public Availability (Yes/No) No No Yes Yes Yes
Contain Eastern Artworks (Yes/No) No No No No Yes

Some examples and statistical results on Art500K:

The Dataset can be found in this Page.

Live Demo

Based on the proposed framework and the collected large number of dataset (Art500K), we implement an artwork retrieval and annotation system which can be tried out by clicking the demo button:

Access Live Demo

Technical Details

This work presented a framework that can learn joint representations from large number of digital artworks, the learned joint represenations are data driven features which are consistent with cognitive principle. Triplet based deep ranking model is employed to optimize the learning processing. The whole architecture is shown below, (a) is the proposed framework which can extract joint representations automatically, (b) is triplet based deep ranking model which can optimize the learning process of the framework, (c) the learned features can be used in different kinds of tasks related to art domain.


Below are some visualization results on Art500K using joint representations. We can see that similar artworks are closed to each others, and un-similar artworks are far away from each others.


H. Mao, M. Cheung and J. She "DeepArt: Learning Joint Representations of Visual Arts", ACM International Conference on Multimedia (MM), 2017. (Oral, ~7.5%)
[PDF] [Bibtex] [Dataset_mainSite] [Dataset_backupSite] [Code] [Code_backupSite]


This work is supported by the HKUST-NIE Social Media Lab.