Crawling
What is Search Engine Crawling?
Search engine crawling is the process by which search engine bots (also called crawlers or spiders) explore and analyse webpages.
Search engines send these bots out to find content across the internet and inspect it to understand each page's purpose and relevance. Along the way, the crawlers also discover links to other pages on the same website, which they follow to find and examine more content.
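As a rough illustration of that follow-the-links process, here is a minimal Python sketch of a crawler that fetches a page, extracts its links, and queues up other pages on the same site. The function names and the max_pages cap are illustrative, not part of any real search engine:

```python
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag encountered on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=10):
    """Breadth-first crawl that stays on the start URL's website."""
    domain = urlparse(start_url).netloc
    queue, seen = [start_url], {start_url}
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        try:
            html = urlopen(url).read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip pages that fail to load
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            # Follow only links leading to pages on the same website,
            # as described above.
            if urlparse(absolute).netloc == domain and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen
```

A production crawler adds politeness delays, robots.txt checks, and content analysis on top of this skeleton, but the discover-follow-repeat loop is the core idea.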
Search engines use various algorithms to determine how frequently to crawl different websites and the pages within them.
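Search engines don't publish these scheduling algorithms, but a common idea is to revisit pages that change often more frequently than pages that stay static. A hypothetical sketch of that heuristic (the function name, bounds, and halving/doubling rule are all illustrative assumptions):

```python
import hashlib

def next_crawl_interval(current_interval, old_content, new_content,
                        min_interval=3600, max_interval=30 * 86400):
    """Adaptive recrawl heuristic: visit changing pages more often.

    Both content arguments are raw page bytes. A page whose hash changed
    since the last visit gets its interval halved; an unchanged page gets
    it doubled, clamped between one hour and thirty days.
    """
    changed = (hashlib.sha256(old_content).digest()
               != hashlib.sha256(new_content).digest())
    interval = current_interval / 2 if changed else current_interval * 2
    return max(min_interval, min(interval, max_interval))
```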
Besides HTML pages, search engines can also crawl non-text files such as images, audio clips, and PDF documents when these resources are linked from webpages. However, search engine bots can only gather limited information, such as the file name and metadata, from these non-text files; they cannot interpret their actual content.
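To make that limitation concrete, here is a small sketch of how a crawler might record just the file name and surrounding metadata of linked media, without ever decoding the files themselves. The class name and the sample HTML are hypothetical:

```python
from html.parser import HTMLParser

class MediaMetadataExtractor(HTMLParser):
    """Records the file name and nearby metadata of linked media files,
    mirroring what a crawler can learn without reading the file itself."""
    def __init__(self):
        super().__init__()
        self.media = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.media.append({
                "file": attrs["src"].rsplit("/", 1)[-1],  # file name only
                "alt": attrs.get("alt") or "",            # descriptive metadata
                "type": "image",
            })
        elif tag == "a" and (attrs.get("href") or "").lower().endswith(".pdf"):
            self.media.append({
                "file": attrs["href"].rsplit("/", 1)[-1],
                "type": "pdf",
            })

extractor = MediaMetadataExtractor()
extractor.feed('<img src="/photos/team.jpg" alt="Our team"> '
               '<a href="/docs/report.pdf">Annual report</a>')
print(extractor.media)
# [{'file': 'team.jpg', 'alt': 'Our team', 'type': 'image'},
#  {'file': 'report.pdf', 'type': 'pdf'}]
```

This is why descriptive file names and alt text matter: they are often the only signals a crawler gets from a non-text file.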
Website owners can control which pages on their site search engines crawl by using directives specified in the robots.txt file. For further information, see the dedicated guide to robots.txt.
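Well-behaved crawlers consult robots.txt before fetching a page. Python's standard library includes a parser for it, so a minimal permission check looks like this (example.com and MyCrawlerBot are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Load and parse the site's robots.txt file.
robots = RobotFileParser()
robots.set_url("https://example.com/robots.txt")
robots.read()

# A well-behaved crawler checks permission before fetching a page.
if robots.can_fetch("MyCrawlerBot", "https://example.com/private/page.html"):
    print("Allowed to crawl")
else:
    print("Disallowed by robots.txt")
```

For instance, a robots.txt containing "User-agent: *" followed by "Disallow: /private/" would make the check above print "Disallowed by robots.txt".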