Clarivate Report Highlights Importance of Evolution in Data Categorization to Promote Responsible Research Metrics

Clarivate Plc (NYSE:CLVT), a global leader in providing trusted information and insights to accelerate the pace of innovation, today released a report that examines the organization of information in the global scientific community and introduces a flexible new data-driven approach to citation-based classification. The report showcases new technology, developed in collaboration with the leading academic scientometrics team at the Centre for Science and Technology Studies (CWTS) at Leiden University in the Netherlands. 

Entitled “Data categorization: understanding choices and outcomes”, the report from the Institute for Scientific Information (ISI)™ outlines existing research categorization systems from around the world and the analytical consequences of applying them to national and institutional data. It introduces a new and highly innovative approach to data aggregation based on trusted research data in the Web of Science™ citation network. It also promotes the need for good practice in data management to improve knowledge, competency and confidence and to ensure the responsible use of research metrics.

The team’s research found that a categorization scheme informed by article metadata is stronger than one arranged by human concepts, such as journal-based, top-down schemes that rely on expert input to split domains into related sub-categories. Instead, a citation-based classification of articles and reviews progressively links individual elements into larger units with shared characteristics based on features in the underlying data. This innovative approach, demonstrated in InCites™ Citation Topics, more accurately represents microclusters, or specialties, provides more uniform content and improves citation normalization. It also allows novel groups to emerge that were not previously possible with journal-based schemes.
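
The bottom-up idea described above can be illustrated with a minimal sketch. It is not the method used in the report or in InCites Citation Topics, which is not specified here; it is a simple greedy agglomeration chosen for illustration. The function names, the toy paper identifiers, the citation pairs and the `min_density` threshold are all hypothetical. Starting from single papers, the sketch repeatedly merges the two groups with the densest citation links between them and stops when no remaining pair is linked densely enough.

```python
from itertools import combinations


def count_links(cluster_a, cluster_b, citations):
    """Count citation links, in either direction, between two clusters."""
    return sum(
        1
        for citing, cited in citations
        if (citing in cluster_a and cited in cluster_b)
        or (citing in cluster_b and cited in cluster_a)
    )


def cluster_bottom_up(papers, citations, min_density):
    """Greedy bottom-up grouping of papers by citation links (illustrative only).

    Link density is the number of links between two clusters divided by the
    number of possible paper pairs across them. Merging stops when no pair of
    clusters reaches min_density (a hypothetical tuning parameter).
    """
    clusters = [frozenset([p]) for p in papers]  # every paper starts alone
    while True:
        best_pair, best_score = None, None
        for a, b in combinations(clusters, 2):
            links = count_links(a, b, citations)
            if links == 0:
                continue  # never merge clusters that share no citations
            density = links / (len(a) * len(b))
            if density < min_density:
                continue
            score = (density, links)  # ties on density go to more links
            if best_score is None or score > best_score:
                best_pair, best_score = (a, b), score
        if best_pair is None:
            return sorted(sorted(c) for c in clusters)
        a, b = best_pair
        clusters = [c for c in clusters if c not in best_pair] + [a | b]


# Toy data (invented): two tight groups joined by a single citation C -> X.
papers = ["A", "B", "C", "X", "Y", "Z"]
citations = [
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("X", "Y"), ("Y", "Z"), ("X", "Z"),
    ("C", "X"),
]

fine = cluster_bottom_up(papers, citations, min_density=0.5)
coarse = cluster_bottom_up(papers, citations, min_density=0.1)
print(fine)    # [['A', 'B', 'C'], ['X', 'Y', 'Z']]
print(coarse)  # [['A', 'B', 'C', 'X', 'Y', 'Z']]
```

With the stricter threshold the two tightly linked groups stay separate, loosely analogous to fine-grained topics; relaxing the threshold folds them into one broader group, which is one simple way a multi-level hierarchy could arise from the same citation data.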

In addition to its wide range of data selections, tests and visualizations, InCites provides multiple choices of top-down data classifications and now also offers Citation Topics as a bottom-up citation-based classification. The current implementation of Citation Topics is composed of 10 macro topics, 326 meso topics and 2,444 micro topics, with built-in monthly and annual updates that will allow it to evolve over time.

Jonathan Adams, Chief Scientist at the Institute for Scientific Information at Clarivate and a co-author of the report, said: “There are clear strengths and weaknesses of the variety of classification systems currently available, and our aim in introducing Citation Topics is to promote good practice in data management as part of the responsible use of research metrics.”

Ludo Waltman, Deputy Director at CWTS, Leiden University, said: “Bottom-up citation-based classifications play a prominent role in many of the scientometric analyses that we carry out at CWTS. It is great to see that InCites users will now also be able to benefit from these powerful classifications.”

Joel Haspel, SVP Strategy, Science at Clarivate, said: “This new report highlights the evolving nature of data categorization. It addresses the way we recognize natural divisions of knowledge and research and how we categorize publications for discovery, analysis, management and policy. Being aware of the characteristics and limitations of the ways we categorize research publications is important to research management because it influences the way we think about established and innovative research topics, the way we analyze research activity and performance, and even the way we set up organizations to do research.”