Descriptive data

Wassily Kandinsky (1866-1944); Circle in a Circle, 1923

Descriptive data records make it possible to find images within ARTstor and enhance the usefulness of the images for teaching, learning, and research by providing important contextual information. Because ARTstor aggregates heterogeneous data from many contributors, however, descriptive terminology is not standardized, fields are not populated consistently, and each collection's data structure is typically tailored to a contributor's own needs.

A key challenge and goal for ARTstor has been to integrate these descriptions in such a way that users of the Digital Library can search and browse across disparate image collections in a unified way without losing the integrity of the original descriptions.

Data enhancement and advanced search

To overcome some of the challenges inherent in facilitating access to records with such heterogeneous data, we initially identified three key access points that, with the aid of controlled vocabularies, were most suitable for enhancement: classification, country/region, and date.

We concluded that the best way to improve access was to add terms from controlled lists rather than to change the source data itself. These enhancements required us to work through hundreds of thousands of records, adding (or confirming) terms suggested by computer algorithms.

We have added object-type classification terms from an in-house, controlled list (painting, sculpture, etc.); assigned country terms from the Getty's Thesaurus of Geographic Names; and provided an earliest and latest date for each record. Now that virtually all the records in the Digital Library contain these data, we have implemented an advanced search function that greatly improves a user's ability to browse and search across all the ARTstor collections using controlled vocabularies.
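As a rough illustration of why these three access points help, the sketch below keeps the contributor's description untouched and filters only on the added terms. It is a minimal example under stated assumptions, not ARTstor's implementation: the Record fields, the search function, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # The contributor's own description is stored as-is.
    source_description: str
    # Hypothetical enhancement fields added alongside the source data.
    classification: str   # term from a controlled list, e.g. "painting"
    country: str          # controlled country term
    earliest_year: int
    latest_year: int


def search(records, classification=None, country=None, year_from=None, year_to=None):
    """Return the records that satisfy every filter that was supplied."""
    results = []
    for rec in records:
        if classification is not None and rec.classification != classification:
            continue
        if country is not None and rec.country != country:
            continue
        # A record matches a date query when its span overlaps the query span.
        if year_from is not None and rec.latest_year < year_from:
            continue
        if year_to is not None and rec.earliest_year > year_to:
            continue
        results.append(rec)
    return results


# Invented sample records, for illustration only.
records = [
    Record("Sample still life, ca. 1650", "painting", "Netherlands", 1645, 1655),
    Record("Sample portrait, 1913", "painting", "France", 1913, 1913),
    Record("Sample bronze figure, early 20th century", "sculpture", "France", 1900, 1925),
]

hits = search(records, classification="painting", year_from=1910, year_to=1920)
```

Because every record carries an earliest and a latest year, an approximate date such as "ca. 1650" can still be found by a numeric range query.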

Additional use of controlled vocabularies

Because of the diversity among ARTstor collections, standard vocabularies and thesauri are particularly valuable. Our first large-scale use of an external vocabulary has been to match artists' names (frequently contributed to ARTstor in non-standard forms) to the Getty Trust's Union List of Artist Names (ULAN). A user searching ARTstor for works by Gerrit van Honthorst, for example, would previously have failed to retrieve records containing such variant forms of his name as Gherardo della Notte and Gherardo Fiammingo. We plan to extend this method of matching source data to external vocabularies to help normalize other areas of information, such as repository names, geographical locations, and styles and periods.
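The Honthorst example can be sketched as a lookup from any known form of a name to a single preferred form. This is a minimal sketch, not the actual matching process: the AUTHORITY table is a tiny hand-written stand-in rather than real ULAN data, and the function names are hypothetical.

```python
import unicodedata

# Hypothetical excerpt of an authority file: preferred name -> variant forms.
# The two variants are the ones named in the text above.
AUTHORITY = {
    "Gerrit van Honthorst": ["Gherardo della Notte", "Gherardo Fiammingo"],
}


def normalize(name):
    """Fold case, strip accents, and collapse whitespace before comparing."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_accents.casefold().split())


def build_index(authority):
    """Map every normalized form (preferred or variant) to the preferred name."""
    index = {}
    for preferred, variants in authority.items():
        for form in [preferred, *variants]:
            index[normalize(form)] = preferred
    return index


INDEX = build_index(AUTHORITY)


def preferred_name(contributed):
    """Return the preferred name for a contributed form, or None if unmatched."""
    return INDEX.get(normalize(contributed))


match = preferred_name("  GHERARDO  della Notte ")
```

Searching on the preferred name then retrieves records contributed under any matched variant, while unmatched names return None and can be set aside for review.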

Clustering images

One of ARTstor's goals is to move closer to a model where all images and all data records representing a unique work are clustered together. For example, we often have separate images of the same work within individual collections or across different source collections. We recognize the authority of museum cataloging and the quality of original photography and want to promote the highest quality image and description as the primary representation of a work in ARTstor. We are also committed to augmenting primary collections with secondary, scholarly sources that will add depth to the descriptions and enhance subject access. We have begun by clustering duplicate whole views and all details of non-complex paintings and sculptures, and we are exploring ways to group larger complexes together. These efforts should help with access and also with retrieval as the Library increases in scale and complexity.
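The clustering model described above can be sketched as grouping records by the work they depict and then promoting one record as the primary representation. This is only an illustration under assumptions: the work_key field, the from_museum flag, and the ranking rule (museum source first, then image width) are hypothetical, and the text does not say how records are actually matched to a work or ranked.

```python
from collections import defaultdict


def cluster_by_work(records):
    """Group records that share the same (hypothetical) work identifier."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec["work_key"]].append(rec)
    return dict(clusters)


def pick_primary(cluster):
    """Choose one record per cluster: museum-sourced first, then widest image."""
    return max(cluster, key=lambda rec: (rec["from_museum"], rec["pixel_width"]))


# Invented sample records, for illustration only.
records = [
    {"id": "rec_1", "work_key": "work_A", "from_museum": False, "pixel_width": 3000},
    {"id": "rec_2", "work_key": "work_A", "from_museum": True, "pixel_width": 2000},
    {"id": "rec_3", "work_key": "work_A", "from_museum": True, "pixel_width": 4000},
    {"id": "rec_4", "work_key": "work_B", "from_museum": False, "pixel_width": 1500},
]

clusters = cluster_by_work(records)
primaries = {key: pick_primary(group)["id"] for key, group in clusters.items()}
```

The other records in each cluster stay attached to the work, so secondary descriptions and detail views remain available behind the primary record.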

Future explorations

We are also eager to explore other ways to improve access and enrich the descriptive data in ARTstor. For example, because we are able to take advantage of a large body of anonymous usage data, we are investigating the use of collaborative filtering to suggest related images. In addition, since ARTstor is recognized as a community-built resource, we hope to create ways for the community (scholars, visual resource professionals, and other knowledgeable users) to add useful data in the form of controlled, social tagging. We recognize that as long as the ARTstor Digital Library continues to grow at the rate of 100,000 or more images per year, it is essential that the descriptive data improve and expand in as many ways as possible.
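One simple form of the collaborative filtering mentioned above counts how often two images appear together in the same anonymous session and suggests the most frequent companions. The sketch below assumes usage data is available as lists of image identifiers per session. The session format, function names, and identifiers are hypothetical, and it is not a description of any system ARTstor has built.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_occurrence(sessions):
    """Count, for each image, how often every other image shares a session with it."""
    counts = defaultdict(Counter)
    for images in sessions:
        # Use a set so repeated views within one session count once.
        for a, b in combinations(sorted(set(images)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def suggest(counts, image_id, k=3):
    """Return up to k images most often seen with image_id (ties broken by id)."""
    companions = counts.get(image_id, Counter())
    ranked = sorted(companions.items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:k]]


# Invented anonymous sessions, for illustration only.
sessions = [
    ["img_a", "img_b", "img_c"],
    ["img_a", "img_b"],
    ["img_b", "img_d"],
]

counts = co_occurrence(sessions)
related_to_a = suggest(counts, "img_a")
```

Because only co-occurrence counts are stored, this approach needs no information about individual users.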

Image credits

Jean Metzinger, 1883-1956; Woman with Fan; 1913; Art Institute of Chicago; ARTstor ID# AMICO_CHICAGO_1031150677
Photography © The Art Institute of Chicago; Artists Rights Society (ARS), New York / ADAGP, Paris
© 2006 Artists Rights Society (ARS), New York / ADAGP, Paris