SEO Technical Factors: Part Two

See the first part of this article.
Don't want to dig into all of this yourself? Contact us and get a free test of our system!


Managing robots.txt
The robots.txt file contains a set of directives that let you control how search robots crawl the site. It tells search engines which directories, pages, or files of the site may be crawled and which may not.

Here are the basic rules for using this file.

The file is plain text. Save it in ASCII or UTF-8 encoding under the name robots.txt.
Create the file in a plain-text editor such as Notepad, not in Word or another word processor that saves files in its own format.
The robots.txt file is located in the root directory of the site. To control crawling of all pages of the site http://dom-tonirovka.ru, place the file at http://dom-tonirovka.ru/robots.txt. It must not be placed in a subdirectory (for example, at http://dom-tonirovka.ru/pages/robots.txt).
There should be only one robots.txt file per site. If the site was built with a CMS, the file is often generated automatically. For example, the site http://dom-tonirovka.ru/ is made on Tilda, which generates the robots.txt file automatically.
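
As a small illustration of the location rule above, the sketch below derives the robots.txt address from any page address on the same site. The function name robots_txt_url and the page path /pages/about.html are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host: robots.txt always sits at the root path,
    # never in a subdirectory.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address used only for illustration.
print(robots_txt_url("http://dom-tonirovka.ru/pages/about.html"))
```

Whatever page the address points to, the result is always the robots.txt file at the root of that host.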


In the robots.txt file, the search robot looks for groups of rules that start with a User-agent line. This directive names the search robot to which the rules that follow apply. Paths in the rules may contain the wildcard “*”, which means “any sequence of characters” and can stand for the prefix or suffix of the path to a directory or page on the site (or for the entire path). The following directives can be used in a group after User-agent.

Each group must contain at least one rule, that is, at least one Disallow: or Allow: directive.
Disallow: Indicates a directory or page, relative to the root domain, that must not be crawled by the robot named in User-agent. If it is a directory, the path to it must end with a slash. The “*” wildcard is supported to indicate the prefix or suffix of the path, or the entire path.
Allow: Indicates a directory or page, relative to the root domain, that may be crawled by the robot named in User-agent. It is also used to override a Disallow: directive and allow crawling of a specific subdirectory or page inside a directory that is otherwise closed. If a directory is specified, the path to it must end with a slash. The “*” wildcard is supported to indicate the prefix or suffix of the path, or the entire path.
Sitemap: An optional directive; there may be several or none at all. Indicates the location of a sitemap file. You can list multiple sitemaps, each on a separate line. The requirements for sitemaps are covered in the next section.
Unknown directives are ignored. To write a comment in the robots.txt file, start it with the # character.
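
To see how these directives work together, here is a minimal sketch that feeds a small robots.txt to Python's standard urllib.robotparser module and asks which paths may be crawled. The rules, the /private/ paths, and the sitemap address are hypothetical and are not taken from any real site. Note that this particular parser applies the first rule that matches and does not interpret “*” inside paths, so the Allow line is placed before the Disallow line; search engines may resolve such conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only.
ROBOTS_TXT = """\
# Comments start with a hash sign.
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
Sitemap: http://dom-tonirovka.ru/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "*") -> bool:
    """Ask the parsed rules whether the given robot may crawl the path."""
    return parser.can_fetch(agent, "http://dom-tonirovka.ru" + path)


for path in ("/index.html", "/private/secret.html", "/private/public-page.html"):
    print(path, is_allowed(path))

# Sitemap lines are collected separately from the crawl rules (Python 3.8+).
print(parser.site_maps())
```

Under these rules the /private/ directory is closed, a single page inside it is opened again by Allow, and everything else stays crawlable by default.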

Some directives vary from one search engine to another. The current requirements of the main search engines for the robots.txt file can be found at the links below:
https://yandex.ru/support/webmaster/controlling-robot/
https://support.google.com/webmasters/answer/6062596?hl=en
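
The “*” wildcard described earlier can be illustrated with a short sketch. It treats a rule path as a prefix in which every “*” stands for any sequence of characters. The function name path_matches and the sample paths are assumptions made for this example, and the sketch leaves out engine-specific details, so the documentation linked above remains the authority on how each search engine actually matches paths.

```python
import re


def path_matches(pattern: str, path: str) -> bool:
    """Check whether a rule path with "*" wildcards matches the start of path."""
    # Escape each literal piece, then join the pieces with ".*" for every "*".
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    # re.match anchors only at the beginning, so the pattern acts as a prefix.
    return re.match(regex, path) is not None


# Hypothetical rule paths and page paths used only for illustration.
print(path_matches("/private/", "/private/page.html"))
print(path_matches("/*.pdf", "/docs/price.pdf"))
print(path_matches("/private/", "/public/"))
```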
