COAR Launches AI Bots and Repositories Task Force

A growing number of aggressive crawlers are interacting with repositories. While many crawlers are rather innocuous, others are aggressive enough that they are increasingly causing service disruptions. A survey undertaken by COAR in April 2025 found that 90% of respondents indicated their repository is encountering aggressive crawlers (referred to here colloquially as “AI Bots”), usually more than once a week, and often leading to slowdowns and service outages. While there is no way to be 100% certain of the purpose of these bots, the assumption in the community is that they are AI bots gathering data for generative AI training.

This type of traffic has increased markedly in the last two years or so, and is having a considerable impact on repositories, both in the quality of service provision and in the time and resources required to deal with the issue. In addition, a variety of measures are being used to minimize or stop aggressive crawlers from accessing repositories; however, many of these measures also unintentionally block benign systems and individual human users. The recent rise in aggressive bot activity could result in repositories limiting access to their resources for both human and machine users, leading to a situation where the value of the global repository network is substantially diminished.

On July 15, 2025, COAR launched the “AI Bots and Repositories Task Force” in order to help the repository community navigate this rapidly evolving landscape and develop solutions that allow repositories to remain as open as possible. The Task Force brings together technical experts and representatives from repositories to assess the potential solutions to this problem and develop recommendations for the repository community.

Objectives

  1. Articulate the problem space, and provide evidence where possible
  2. Understand and document the available mitigation strategies
  3. Reiterate the importance of allowing legitimate machine access to repositories
  4. Recommend ways to mitigate the problems repositories are experiencing without impeding legitimate remote system access

The aim is to have a report available for the community sometime in the fall of 2025.