AI WorldCat deduplication

cliQ India

OCLC Metadata Quality teams implement a variety of measures—both manual and automated—to improve the quality and usefulness of WorldCat data. These extensive and ongoing efforts ensure that WorldCat data supports the needs of our membership and our global network of thousands of libraries across a wide range of services. As the technologies and tools that allow us to do this important work evolve, we are continually exploring new methods for enriching, repairing, and de-duplicating WorldCat records—data that powers the global discovery and sharing of library resources.

At OCLC, we believe Artificial Intelligence (AI) is at its best when guided by human expertise. Our journey with AI is a partnership—where the insights and values of library professionals shape how AI serves communities. A core component of many AI systems is machine learning, which involves training algorithms on data to enable them to make predictions or decisions without explicit programming.

In August 2023, we implemented our first machine learning model for detecting duplicate bibliographic records as part of our ongoing efforts to reduce their presence in WorldCat. In the lead-up, we invited the cataloging community to participate in data labeling exercises, and more than 300 users gave feedback on approximately 34,000 duplicates, helping validate our model’s understanding of duplicate records in WorldCat. This initiative led to the removal of ~5.4 million duplicates from WorldCat for printed book materials in English and other languages like French, German, Italian, and Spanish.

We’ve now enhanced and extended our AI model to de-duplicate all formats, languages, and scripts in WorldCat. Leveraging the labeled data collected from community participation, we’ve tuned and optimized the machine learning algorithm, completed extensive internal testing, and engaged WorldCat Member Merge libraries to provide external verification of the algorithm’s performance.
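To make the idea of tuning a duplicate detector on labeled pairs more concrete, here is a minimal sketch in Python. It is not OCLC's model: the record fields (`title`, `author`), the string-similarity score, and the function names are all assumptions chosen for illustration, and a production system would use far richer features. The sketch scores each record pair and then picks the similarity cutoff that agrees with the most human labels.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def similarity(rec_a, rec_b):
    """Average string similarity over two hypothetical fields: title and author."""
    scores = []
    for field in ("title", "author"):
        a, b = normalize(rec_a[field]), normalize(rec_b[field])
        scores.append(SequenceMatcher(None, a, b).ratio())
    return sum(scores) / len(scores)


def fit_threshold(labeled_pairs):
    """Pick the cutoff that classifies the most labeled pairs correctly."""
    scored = [(similarity(a, b), is_dup) for a, b, is_dup in labeled_pairs]
    best_t, best_correct = 1.0, -1
    # Only observed scores need to be tried as candidate cutoffs.
    for t in sorted({s for s, _ in scored}):
        correct = sum((s >= t) == is_dup for s, is_dup in scored)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def is_duplicate(rec_a, rec_b, threshold):
    """Flag a pair as a likely duplicate when its score reaches the cutoff."""
    return similarity(rec_a, rec_b) >= threshold


# Made-up labeled pairs standing in for community feedback (illustrative only).
labeled = [
    ({"title": "Moby Dick, or, The whale", "author": "Melville, Herman"},
     {"title": "Moby Dick or the whale", "author": "Melville, Herman"}, True),
    ({"title": "Moby Dick, or, The whale", "author": "Melville, Herman"},
     {"title": "Walden", "author": "Thoreau, Henry David"}, False),
]

threshold = fit_threshold(labeled)
```

With a cutoff learned this way, `is_duplicate(a, b, threshold)` can be applied to unlabeled pairs. The human labels are what anchor the cutoff, which is the role the community labeling exercises play in the description above.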

On 11 February 2025, we will do a test run of 500,000 record pairs, targeting only print English books in WorldCat and merging 500,000 duplicate records. Print English books represent the largest category of duplicates in WorldCat and are the format that has been most rigorously tested and improved in our machine learning de-duplication activities to date. After this initial run, we will pause to evaluate the results before completing more de-duplication passes of WorldCat to address the remaining duplicate pairs for print English books. Once this category of materials is completed, de-duplication runs will be done for all non-book and non-English materials. We will provide updates as we initiate additional runs.

We recommend that libraries not using WorldShare Management Services enable WorldCat updates in WorldShare Collection Manager to ensure they receive the updated OCN for held records that were merged. If you suspect an incorrect merge, report it to bibchange@oclc.org. WorldCat Metadata Quality staff can view the history of merged records and recover them if needed.
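As a rough illustration of what receiving the updated OCN means for a local system, the sketch below rewrites a list of held OCNs using a map of merged-to-retained numbers. This is not the WorldShare Collection Manager interface: the merge map, the OCN values, and the function names are hypothetical, and the actual update files may be structured differently.

```python
def resolve_ocn(ocn, merges):
    """Follow merge links until reaching an OCN that was not merged away."""
    seen = set()
    while ocn in merges:
        if ocn in seen:
            # Guard against malformed data that points back to itself.
            raise ValueError(f"cycle in merge map at OCN {ocn}")
        seen.add(ocn)
        ocn = merges[ocn]
    return ocn


def update_holdings(held_ocns, merges):
    """Return held OCNs rewritten to their current values, without repeats."""
    updated = []
    for ocn in held_ocns:
        current = resolve_ocn(ocn, merges)
        if current not in updated:
            updated.append(current)
    return updated


# Made-up numbers: 1002 was merged into 1001, and 1003 into 1002.
merges = {1002: 1001, 1003: 1002}
held = [1003, 1001, 2000]
current_holdings = update_holdings(held, merges)
```

Following the chain matters because a record can be merged more than once. A holding on the hypothetical OCN 1003 ends up on 1001, the same record the library already held, so the two entries collapse into one.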

Cleaning up duplicate records is one of the most impactful ways to improve the quality of WorldCat. WorldCat’s scale presents challenges, with data from various sources, cataloging practices, and languages. Amplifying manual efforts by metadata professionals with the latest AI technology has led to significant success in reducing the number of duplicates. This approach reinforces our commitment to quality, so AI can help libraries deliver accurate, streamlined experiences for users.

Thank you to our community members who have participated so far in this effort. Your collaboration helps advance the profession and the mission of libraries worldwide by helping us hone and scale automated resolution of duplicate records in WorldCat, which saves countless hours and improves the experience for the entire library community.

http://www.oclc.org/content/marketing/publish/en_us/news/announcements/2025/ai-worldcat-deduplication.html

© 2026 cliQ India. All Rights Reserved.
