# crawler-user-agents **Repository Path**: mirrors_Olical/crawler-user-agents ## Basic Information - **Project Name**: crawler-user-agents - **Description**: Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. pull-request welcome :star: - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2020-09-25 - **Last Updated**: 2026-05-24 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # crawler-user-agents This repository contains a list of of HTTP user-agents used by robots, crawlers, and spiders as in single JSON file. ## Install ### Direct download Download the [`crawler-user-agents.json` file](https://raw.githubusercontent.com/monperrus/crawler-user-agents/master/crawler-user-agents.json) from this repository directly. ### Npm / Yarn Install using npm or Yarn, or d ```sh npm install --save "https://github.com/monperrus/crawler-user-agents.git" # OR yarn add "https://github.com/monperrus/crawler-user-agents.git" ``` In Node.js, you can `require` the package to get an array of crawler user agents. ```js const crawlers = require('crawler-user-agents'); console.log(crawlers); ``` ## Usage Each `pattern` is a regular expression. It should work out-of-the-box wih your favorite regex library: * JavaScript: `if (RegExp(entry.pattern).test(req.headers['user-agent']) { ... }` * PHP: add a slash before and after the pattern: `ìf (preg_match('/'.$entry['pattern'].'/', $_SERVER['HTTP_USER_AGENT'])): ...` * Python: `if re.search(entry['pattern'], ua): ...` ## Contributing I do welcome additions contributed as pull requests. The pull requests should: * contain a single addition * specify a discriminant relevant syntactic fragment (for example "totobot" and not "Mozilla/5 totobot v20131212.alpha1") * contain the pattern (generic regular expression), the discovery date (year/month/day) and the official url of the robot * result in a valid JSON file (don't forget the comma between items) Example: { "pattern": "rogerbot", "addition_date": "2014/02/28", "url": "http://moz.com/help/pro/what-is-rogerbot-", "instances" : ["rogerbot/2.3 example UA"] } ## License The list is under a [MIT License](https://opensource.org/licenses/MIT). The versions prior to Nov 7, 2016 were under a [CC-SA](http://creativecommons.org/licenses/by-sa/3.0/) license. ## Related work If you are using Ruby, [Voight-Kampff](https://github.com/biola/Voight-Kampff) and [isbot](https://github.com/Hentioe/isbot) provide libraries for accessing this data. Other systems for spotting robots, crawlers, and spiders that you may want to consider include [isBot](https://github.com/gorangajic/isbot) (Node.JS), [Crawler-Detect](https://github.com/JayBizzle/Crawler-Detect) (PHP), [BrowserDetector](https://github.com/mimmi20/BrowserDetector) (PHP), and [browscap](https://github.com/browscap/browscap) (JSON files).