Clustering Images
Like the traditional slide library, the ARTstor Digital Library has more than its share of redundant images. Some are literally duplicates – digital images made from the same photographic source. Others are merely functionally redundant – multiple views of the same object that seem to contribute nothing extra to teaching or research. Why does ARTstor have so many duplicative images? There are two primary reasons for this duplication. First, some of ARTstor’s source collections themselves contain these redundancies. Second, as we are constantly adding collections, many of the new images represent works of art that are already in the ARTstor Digital Library. Often, this multiplicity increases the richness with which ARTstor documents these works; sometimes, however, it simply leads to more redundancy. Understandably, while some users welcome – or at least willingly tolerate – this variety, others find it distracting.

In order to enhance our users’ experience while working with the ARTstor Digital Library, ARTstor staff have been working behind the scenes to begin to cluster like images and to reduce this kind of duplication. We have begun to identify redundant images – both literal duplicates and “functionally redundant” images. Initially, we are focusing our efforts on a core component of the Charter Collection: those key works of art that are most frequently sought out and consulted by ARTstor users. By concentrating on de-duplicating those images that are most often searched, viewed, and saved into image groups, we hope to greatly improve the experience of a majority of our users in the very near term. And because much ARTstor use to date has revolved around teaching, our early efforts at de-duplication will likely have the greatest impact on “canonic” works of world art. But we expect to expand our effort over time in order to embrace less frequently consulted images as well, with the understanding that such duplication is much less common outside core areas of art history.

In listening to our users, we have concluded that we should not completely remove such duplicative images from ARTstor. Rather, we are clustering these images so that when users perform searches in ARTstor, they will not be confronted with myriad versions of the same image. Increasingly, they will see a single image of a given work of art, with additional images clustered behind that main image. These clustered images are ones that we believe are duplicative in some meaningful sense. An icon will signal the availability of such supplementary, “clustered” images.

This approach should, over time, begin to address the dissonance some users feel when they encounter multiple versions of the same image. This strategy also preserves the user’s ability to select the image that best meets his or her immediate need as teacher or scholar – whether to illustrate a particular point, or to give a sense of how one image more faithfully represents the original object than another.

Improved Image Quality
In our continuing effort to develop the collections in the ARTstor Digital Library, we are often – and increasingly – able to provide users with truly superior digital images. Sometimes these images represent new high-resolution digital photography from the original object, whether in a museum or in the Gobi Desert. In other cases, they are images scanned from large-format photographs of such objects. In order to highlight and make the most of such superlative images, our effort to cluster duplicative images has taken on an additional dimension. In addition to associating affiliated images, we are also actively drawing the user’s attention to the best image that ARTstor has to offer for a given work of art. As indicated above, we are often hesitant to make such judgement calls ourselves. But when we have access to an image that seems, based on objective criteria, very likely to be superior and of greatest interest to our users, we are assigning this image priority in our clustering efforts.

As a result, you will typically find that a cluster of duplicative images has been appended to an image that was either made via direct digital capture from the original object (increasingly, but not always, an image contributed by the museum that owns that object) or scanned from a large-format photograph of that object (often contributed to ARTstor via collections such as the Carnegie Arts of the United States or collaborations with organizations such as Scala Archives, which create and assemble high-quality photographic archives documenting museum collections, as well as architectural monuments and sites).

In some cases, such an objectively superior image will not yet be available to us for a key work of art that has been identified as a priority for de-duplication due to frequency of use. Despite the temporary absence of a superior image, we feel that it is essential to address the redundancy of these key monuments. For this reason, ARTstor users should also anticipate encountering image “clusters” in which the preferred image may not be a high-resolution image. In such instances, we will continue our ongoing effort to provide superior images, guided as always by the needs of ARTstor users. So please continue to let us know how we can work to address your needs!