%0 Journal Article
%T GLIMIR: Manifestation and Content Clustering within WorldCat
%A Janifer Gatenby
%A Richard O. Greene
%A W. Michael Oskins
%A Gail Thornburg
%J Code4Lib Journal
%D 2012
%I Code4Lib
%X The GLIMIR project at OCLC clusters and assigns an identifier to WorldCat records representing the same manifestation. These include parallel records in different languages (e.g., a record with English descriptive notes and subject headings and one for the same book with French equivalents). It also clusters records that probably represent the same manifestation, but which could not be safely merged by OCLC's Duplicate Detection and Resolution (DDR) program for various reasons. As the project progressed, it became clear that it would also be useful to create content-based clusters for groups of manifestations that are generally equivalent from the end user perspective (e.g., the original print text with its microform, ebook and reprint versions, but not new editions). Lessons from the GLIMIR project have improved OCLC's duplicate detection program through the introduction of new matching techniques. GLIMIR has also had unexpected benefits for OCLC's FRBR algorithm by providing new methods for identifying outliers thus enabling more records to be included in the correct work cluster.
%U http://journal.code4lib.org/articles/6812