Tuesday, January 25, 2011

Trends and Issues related to Image Indexing

Maruja De Villa Lorica
Paper written Spring 2010

There are two major approaches to image analysis and subject representation: text-based image retrieval (TBIR) and content-based image retrieval (CBIR). Text-based image retrieval, also known as the context-based descriptive approach, has long been used by librarians, curators, and archivists to provide access to image collections through manual assignment of text descriptors and classification codes. The context-based descriptive approach uses verbal language, either drawn from a controlled vocabulary or extracted from natural language. This method relies on humans to manually provide captions, keywords, and other descriptions of image data for indexing and retrieval.

Classification Systems in Image Indexing

Early attempts were made to apply existing general classification systems, such as the Dewey Decimal system or the Library of Congress Subject Headings, to image collections. These attempts were generally unsuccessful and led to the conclusion that such systems provide a sparse language for image indexing, particularly for describing image content. Indexing tools geared toward more general image access were found to lack the unique attributes associated with visual materials. Several approaches have therefore been developed to provide precise indexing tools that meet the needs of specialized communities of researchers and provide access at a high level of specificity.

Thesaurus-based Image Indexing System

Three thesaurus-based image indexing systems are described here: the Art and Architecture Thesaurus (AAT), ICONCLASS (derived from “iconography” and “classification”), and TELCLASS.

The Art and Architecture Thesaurus (AAT) provides a controlled vocabulary for the visual arts and architecture. It offers “terminology describing physical attributes, styles and periods, agents, activities, materials and objects; and the terms themselves are derived from existing glossaries, subject lists, thesauri, reference works and scholarly monographs, as well as from the subject expertise of art historians and architects”. Its structured terms are used to describe and index cultural heritage images in art, culture, and architecture, which also makes the AAT a useful tool for cataloging and research.

The second thesaurus-based system, ICONCLASS, is a scheme intended to provide a “consistent classification of all the subjects which mankind has succeeded to portray”. It has 17 volumes; each volume has a textual description of a particular subject, theme, or motif to be found within nine primary divisions of fine art: Religion and magic; Nature; Human being, man in general; Society, civilization, culture; Abstract ideas and concepts; History; Bible; Literature; and Classical mythology and ancient history. ICONCLASS uses alphanumeric classification codes and notations, wherein each notation provides the advantages of indexing using both controlled and uncontrolled vocabularies. While ICONCLASS is a powerful tool, it is very complex to use and is not suitable for “describing ordinary images of common objects”.
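To make the idea of alphanumeric notations more concrete, the following is a minimal sketch of how such codes could support retrieval. It assumes that the leading digit of a notation corresponds to the nine divisions in the order listed above, and that a longer notation denotes a narrower subject than its prefix. The catalog entries, image identifiers, and function names are made up for illustration and are not taken from the ICONCLASS volumes.

```python
# Divisions keyed by the leading digit of a notation.
# Assumption: the digits follow the order in which the divisions are listed above.
DIVISIONS = {
    "1": "Religion and magic",
    "2": "Nature",
    "3": "Human being, man in general",
    "4": "Society, civilization, culture",
    "5": "Abstract ideas and concepts",
    "6": "History",
    "7": "Bible",
    "8": "Literature",
    "9": "Classical mythology and ancient history",
}


def division_of(notation):
    """Return the primary division for a notation, based on its first character."""
    return DIVISIONS.get(notation[:1], "unknown")


def images_under(notation, catalog):
    """Return ids of images whose notation equals or extends the given notation."""
    return sorted(
        image_id for image_id, code in catalog.items() if code.startswith(notation)
    )


# Hypothetical catalog: image id -> assigned notation (illustrative values only).
catalog = {
    "img-001": "25F23",
    "img-002": "25F3",
    "img-003": "71A",
    "img-004": "25G",
}

print(division_of("71A"))            # division for a notation starting with 7
print(images_under("25F", catalog))  # everything indexed under the broader code
```

A search on a short notation returns every image indexed under any of its narrower codes, which is one reason a notation-based scheme can be precise and still support broad queries.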

The third thesaurus-based system, TELCLASS, was developed at the BBC-TV Film and Videotape Library for use with television broadcast material (Baxter and Anderson, 1996). It consists of alphanumeric codes and associated terms, arranged within six main groups: verbal, schematic, actuality, simulation, technical, and formal.

Concerns related to Text-based Image Retrieval

One major concern related to text-based image indexing is the subjectivity of the indexer. Two indexers are likely to assign different terms to the same image, and the same indexer is likely to index “an image differently at different times”. As a result, wide disparities occur in the keywords assigned to the same image. Another concern is that text-based image indexing is very labor-intensive.
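One simple way to quantify this kind of disparity is to measure how much two sets of keywords overlap. The sketch below uses the Jaccard coefficient (shared terms divided by all distinct terms); this measure is chosen here for illustration and is not one prescribed by the sources cited in this paper. The keyword lists are hypothetical.

```python
def keyword_overlap(terms_a, terms_b):
    """Jaccard overlap between two keyword lists, from 0.0 (disjoint) to 1.0 (identical)."""
    # Normalize case and surrounding whitespace so "Dog" and "dog " count as one term.
    a = {term.strip().lower() for term in terms_a}
    b = {term.strip().lower() for term in terms_b}
    if not a and not b:
        return 1.0  # two empty descriptions agree trivially
    return len(a & b) / len(a | b)


# Hypothetical keywords assigned by two indexers to the same photograph.
indexer_1 = ["dog", "beach", "sunset", "running"]
indexer_2 = ["Dog", "sea", "evening", "play"]

score = keyword_overlap(indexer_1, indexer_2)
print(f"overlap = {score:.2f}")
```

Here only one of seven distinct terms is shared, so a keyword search on most of either indexer's terms would miss the other's description of the same image.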

The issues and problems related to text-based image retrieval have contributed to the emergence of techniques of retrieving images through features such as color, texture, and shape, known as content-based image retrieval (CBIR).

Content-based Image Retrieval (CBIR)

Content-based image retrieval is the method of indexing and retrieving images based on automatic processing of textual information and of the image itself. Through CBIR, images are retrieved from large collections based on features that are automatically extracted from the images themselves. There are three levels of image properties analyzed via the content-based approach: a) primitive features such as color, shape, and texture; b) logical features such as the identity of the objects presented; and c) abstract characteristics of the images shown. Through content-based indexing systems, images are indexed without the use of words and are described not with textual elements but with content descriptors such as color, texture, and shape or form.
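As an illustration of indexing by a primitive feature, the following is a minimal sketch of color-based retrieval. It assumes images are available as lists of (r, g, b) pixel tuples, describes each image with a coarse color histogram, and ranks a collection by histogram intersection. The function names, the bin count, and the tiny synthetic images are illustrative choices and are not drawn from any particular CBIR system.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Describe an image by the share of its pixels in each coarse color bin."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    total = len(pixels)
    return [count / total for count in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms: 1.0 means identical color shares."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_color(query_pixels, collection):
    """Return (score, name) pairs for the collection, most similar first."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in collection.items()
    ]
    return sorted(scored, reverse=True)


# Tiny synthetic "images", each just ten pixels (illustrative data only).
red, green, blue = (220, 30, 30), (30, 200, 40), (20, 40, 210)
collection = {
    "mostly_red": [red] * 9 + [blue],
    "mostly_blue": [blue] * 8 + [green] * 2,
    "half_red_half_green": [red] * 5 + [green] * 5,
}

ranking = rank_by_color([red] * 10, collection)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

No words are involved at any point: the query is itself a set of pixels, and the index entry for each image is its histogram.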

The use of automatic machine-processing techniques in image retrieval was derived from research in pattern recognition to "parse" basic attributes within images such as color, texture, and shape. While not expected to solve the more general problems of image retrieval, these techniques could be useful for retrieving subsets of images with specific visual attributes, for segmenting large image collections, or for retrieving specific image configurations.

Advantages and disadvantages of CBIR

Several advantages have been cited for using CBIR in image analysis and retrieval, such as the ease of extracting features from the image, the ability to convert extracted features into other forms such as a histogram, and the ease of building an automatic process. However, the disadvantages of CBIR include the difficulty of deriving the semantic meaning of images from low-level features, unknown usability in handling real-life images, and the difficulty of choosing which features to extract.
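The difficulty of recovering meaning from low-level features can be seen even in a toy case. The sketch below, which uses two hypothetical 4x4 grayscale images, shows that a histogram records how often each gray level occurs but not where it occurs, so two visibly different images can receive the same description.

```python
from collections import Counter


def gray_histogram(image):
    """Count how often each gray level occurs, ignoring pixel positions."""
    return Counter(value for row in image for value in row)


# Two hypothetical 4x4 grayscale images (0 = black, 255 = white).
stripes = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
checkerboard = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]

same_feature = gray_histogram(stripes) == gray_histogram(checkerboard)
same_image = stripes == checkerboard
print(f"identical histograms: {same_feature}, identical images: {same_image}")
```

A retrieval system relying on this feature alone would treat the two images as a perfect match, which is why the choice of features to extract matters so much.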

Sources:

Baxter, G. and Anderson, D. (1996). Image indexing and retrieval: some problems and proposed solutions. Internet Research. 4, 67-76.

Chu, H. (2001). Research in image indexing and retrieval as reflected in the literature. Journal of the American Society for Information Science & Technology. 52 (12), 1011-1018.

Conduit, N. and Rafferty, P. (2007). Constructing an image indexing template for The Children's Society: Users' queries and archivists' practice. Journal of Documentation, 63 (6), 898-919.

Jeong, K.T. (2002). A common representation for multimedia documents. Unpublished dissertation, University of Texas, Denton, Texas.

Jörgensen, C. Image Indexing: An Analysis of Selected Classification Systems in Relation to Image Attributes Named by Naïve Users. http://worldcat.org/arcviewer/1/OCC/2003/06/12/0000003507/viewer/file1.html

Menard, E. (2007). Image indexing: How can I find a nice pair of Italian shoes? 34 (1) 21-25.

Menard, E. (2009). Images: indexing for accessibility in a multi-lingual environment -- challenges and perspectives. Indexer, 27 (2) 70-76.
