# url-spider

**Repository Path**: mirrors_iansinnott/url-spider

## Basic Information

- **Project Name**: url-spider
- **Default Branch**: master
- **Created**: 2020-09-02
- **Last Updated**: 2026-05-16

## README

* URL Spider

A simple script to spider all the URLs for a given domain _and_ its subdomains.

Example:

#+BEGIN_SRC shell
yarn start:run 'https://iansinnott.com'
#+END_SRC

This will spider all the URLs at my site as well as all URLs at =blog.iansinnott.com=, =lab.iansinnott.com=, etc. URLs to external sites will be skipped.

Once the script finishes, it dumps all the information to a temp file. The file's location on your system depends on the built-in =mktemp= util.

** Usage

#+BEGIN_SRC shell
yarn start:run
#+END_SRC

Will spider the given domain and all its subdomains.

* FIXME

This script does no stream processing. In other words, it will quite happily eat up all the JS heap memory if the site you're spidering has many URLs.
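
One possible direction for the FIXME, as a rough sketch only and not the repository's actual code: yield each URL as soon as it is discovered so the caller can write it out immediately instead of collecting everything first. The names `isSameSite`, `spider`, and `fetchLinks` are hypothetical; `fetchLinks` is assumed to be a caller-supplied function that returns the absolute links found on a page. The sketch also shows one way to express the "domain and its subdomains, skip external sites" rule.

```javascript
// Hypothetical sketch; none of these names come from the url-spider source.

// True when `candidate` is on `rootHost` itself or on one of its subdomains.
// Unparseable URLs are treated as external and skipped.
function isSameSite(candidate, rootHost) {
  let host;
  try {
    host = new URL(candidate).hostname;
  } catch (err) {
    return false;
  }
  return host === rootHost || host.endsWith('.' + rootHost);
}

// Async generator that yields every in-scope URL as it is discovered.
// `fetchLinks` is an assumed helper: (url) => Promise<string[]> of absolute links.
async function* spider(startUrl, fetchLinks) {
  const start = new URL(startUrl);
  const rootHost = start.hostname;
  const seen = new Set([start.href]); // still grows with the number of URLs
  const queue = [start.href];

  while (queue.length > 0) {
    const url = queue.shift();
    yield url; // hand the URL to the caller right away
    for (const link of await fetchLinks(url)) {
      if (!isSameSite(link, rootHost)) continue; // skip external sites
      const href = new URL(link).href; // normalized form for de-duplication
      if (seen.has(href)) continue;
      seen.add(href);
      queue.push(href);
    }
  }
}
```

A caller could consume this with `for await (const url of spider(start, fetchLinks)) out.write(url + '\n')`, where `out` is a writable stream for the temp file. The `seen` set still holds every visited URL, so this only avoids buffering per-URL results; it does not make memory use constant.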