FAQ

Answers to frequently asked questions

Why is SiteAnalyzer better than its competitors (Screaming Frog, Sitebulb, Netpeak Spider, etc.)?

We believe that the main advantages of SiteAnalyzer over competitors are as follows:

  1. High-speed scanning that is on par with competitors.
  2. The ability to list all projects at once instead of loading them one by one, as most competitors require.
  3. 80% of SiteAnalyzer's features match those of paid competitors, so you have all the tools needed for site analysis.

We plan to implement the main tools that competitors offer in the near future, and to add our own to provide an even better user experience.

Is it feasible to scan 1 million web pages or even more?

The program was tested on two operating systems: Windows XP and Windows 10.

  • Windows XP x32 (3 GB RAM):
    • scanned URLs (HTML pages): 92 000
    • analyzed URIs (pages, images, scripts, documents, etc.): 296 316
    • elapsed time: 2 hours 50 minutes
  • Windows 10 x64 (8 GB RAM):
    • scanned URLs (HTML pages): 118 000
    • analyzed URIs (pages, images, scripts, documents, etc.): 2 334 260
    • elapsed time: 6 hours 12 minutes

As these results show, the program can scan sites of almost any size; the limit depends on the amount of RAM in your PC. The more RAM you have, the more pages it can scan.
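To compare the two test runs, the figures above can be converted into an average throughput. This is a minimal sketch that only restates the numbers from the list; the helper name `uris_per_minute` is an illustrative assumption, and the result says nothing about how speed scales on other sites or hardware.

```python
def uris_per_minute(uris, hours, minutes):
    """Average number of analyzed URIs per minute for one test run."""
    return uris / (hours * 60 + minutes)


# Figures copied from the test results above: (analyzed URIs, hours, minutes).
runs = {
    "Windows XP x32 (3 GB RAM)": (296_316, 2, 50),
    "Windows 10 x64 (8 GB RAM)": (2_334_260, 6, 12),
}

for name, (uris, hours, minutes) in runs.items():
    rate = uris_per_minute(uris, hours, minutes)
    print(f"{name}: {rate:.0f} URIs per minute")
```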

P.S. Send us your record numbers of scanned pages (preferably with screenshots) and we'll include them in this FAQ.

How many iterations are recommended for calculating PageRank?

The more iterations you run, the smaller the error in the calculations.
For instance, after the 10th iteration the weight values change only in the thousandths and ten-thousandths, i.e. by amounts small enough to be neglected.
Even a couple of iterations is enough to see the difference in weight.
We recommend using 10 iterations.
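The effect of the iteration count can be illustrated with a minimal sketch. It assumes the textbook PageRank power iteration with a damping factor of 0.85 on a hypothetical four-page site; the function name `pagerank_deltas`, the link graph, and the damping value are illustrative assumptions and not SiteAnalyzer's actual implementation. The script prints the largest change in any page's weight at each iteration.

```python
def pagerank_deltas(links, iterations=10, damping=0.85):
    """Run PageRank power iteration on a link graph.

    links maps each page to the list of pages it links to; every link
    target must also appear as a key. Returns (ranks, deltas), where
    deltas[i] is the largest weight change during iteration i + 1.
    """
    pages = sorted(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    deltas = []
    for _ in range(iterations):
        # Every page starts with the "random jump" share.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            # A page without outgoing links spreads its weight evenly.
            receivers = targets if targets else pages
            share = damping * ranks[page] / len(receivers)
            for target in receivers:
                new_ranks[target] += share
        deltas.append(max(abs(new_ranks[p] - ranks[p]) for p in pages))
        ranks = new_ranks
    return ranks, deltas


# Hypothetical four-page site used only for illustration.
links = {
    "home": ["catalog", "about"],
    "catalog": ["home", "product"],
    "product": ["catalog"],
    "about": ["home"],
}

ranks, deltas = pagerank_deltas(links, iterations=10)
for number, delta in enumerate(deltas, start=1):
    print(f"iteration {number:2d}: largest change = {delta:.5f}")
```

On this toy graph the largest change shrinks with every iteration, so the weights after 10 iterations differ only marginally from those of later iterations.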

How do I preserve all projects when updating SiteAnalyzer?

Usually, when you update SiteAnalyzer, the results of previous scans are preserved. However, when an update changes the database structure (this happens with major updates), you have to scan the projects again.

The best way to do this:

  • Run the current version.
  • Select the sites you need in the list of projects.
  • Use the context menu to copy them to the clipboard.
  • Launch a new version of the program.
  • Add the copied sites in bulk (click the plus icon and paste the URLs from the clipboard).
  • Rescan the necessary sites if needed.
  • Profit!

What does the «lack of resources» error in the program log mean?

Sometimes, when you scan large sites with many threads, the following error message appears in the log: "The project scan is stopped due to lack of computer resources. We recommend changing the scan settings".

This error message appears when the operating system runs short of system resources while the program is running. In this case, scanning of the current project stops automatically. This is done to prevent system errors and to ensure that data is written to the database correctly.

To prevent this type of error, it is recommended to increase the amount of RAM on your computer, switch to 64-bit Windows, and optimize the scanning parameters in the program settings (reduce the number of threads, limit the number of pages for parsing).

Why are sites built with Tilda not scanned?

When you try to scan sites built with the Tilda site builder, you often get a 307 redirect response code and the scan stops. This is caused by Tilda's built-in DDoS protection, which blocks attempts to parse the site. A possible workaround is to set long delays between requests in the program settings (5-10 seconds), but this might not resolve the issue.

Tilda Technical Support Official Comment: This redirect is done by the integrated DDoS protection system. If you continue to use site scanners, the protection will trigger more frequently. In this case, it triggered the "human" check and the service failed it. We cannot disable the protection system, since it is designed for the security of all sites. The protection is disabled for search bots.

Scanning sites with JavaScript

Q: Hello. I added a site for scanning, but the program sees only a few pages, although the site has a fairly large catalog of over 3000 pages. Can you please tell me what I am doing wrong?

A: Your site is built with JavaScript. SiteAnalyzer cannot render such sites yet, so it cannot see all the links to the other pages of the site and returns only a few results. Support for rendering JavaScript sites is planned for upcoming versions.

While scanning, the program finds many pages with a 403 error

Q: However, when I open these pages in a browser, they load and return code 200. Can you tell me why this is the case?

A: If the site is Bitrix-based, it most likely has a special module enabled that protects against DDoS attacks and frequent requests from a single IP address, so after some time the pages that returned a 403 error may return code 200 again. Proxies can be used to bypass such blocking.

How many threads are optimal for parsing sites?

Q: I was parsing a site with 100k+ links yesterday and today. The second attempt used 40 threads. Everything finished and I am happy with it. How many threads are best for parsing such volumes? I understand that the more threads, the higher the probability of missing a bad link.

A: The more threads you run, the greater the load on the server and the higher the probability of errors such as error 500. I usually test with 10 threads. If the server is powerful, you can use 20 or more.
Also, the number of threads affects the speed of writing data to disk. With many threads, a regular HDD can be significantly slower than an SSD.

When I enter the key, the program says that it is already registered

In this case, delete all your devices in the «Devices» section of your personal account, and then re-enter the key in the program registration window.

The program does not scan the site. What should I do?

If SiteAnalyzer does not scan your site, try changing the «User-Agent» parameter to Googlebot or another bot in the «User-Agent» section of the program settings.
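To check outside the program whether a site answers differently depending on the User-Agent, a minimal sketch using Python's standard `urllib` is shown here. The helper names `build_request` and `status_for`, the browser-like User-Agent string, and the example URL in the usage note are illustrative assumptions; the sketch does not reproduce how SiteAnalyzer sends requests.

```python
import urllib.error
import urllib.request

# Publicly documented Googlebot User-Agent string.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)
# Hypothetical browser-like User-Agent, used only for comparison.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_request(url, user_agent):
    """Create a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_for(url, user_agent, timeout=10):
    """Return the final HTTP status code the server sends for this User-Agent."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; report their code.
        return error.code
```

For example, calling `status_for("https://example.com/", BROWSER_UA)` and `status_for("https://example.com/", GOOGLEBOT_UA)` and comparing the two codes shows whether the server treats the two User-Agents differently. Note that `urlopen` follows redirects, so the code returned is that of the final response.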

How do I crawl a site behind Cloudflare?

Q: I'm trying to crawl my own site, which is behind Cloudflare (a popular DDoS protection system). I get a 403 error in response. What should I do?

A: You can get around this by adding an exception for SiteAnalyzer in your Cloudflare settings.
The rule is created in Cloudflare under Security -> WAF -> Create rule.

I can't register the program. What should I do?

The error «Failure when receiving data from the peer», which occurs when entering the registration key, indicates that the program cannot reach the site to check the validity of the key. Most likely, the connection is blocked by an antivirus or firewall. Try disabling them and adding SiteAnalyzer to the list of exceptions or «trusted» applications.

Requests per second

Q: How do I configure the settings to limit site scanning to no more than 4 requests per second?

A: Four (4) requests per second corresponds to one thread with the «Requests interval» parameter set to 250 (milliseconds).
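The arithmetic behind this answer can be sketched as follows. It assumes that each thread waits the full interval between its own requests and that server response time is negligible; the source does not state how the interval is applied when several threads run, so the `threads` parameter and the function name `requests_interval_ms` are illustrative assumptions.

```python
def requests_interval_ms(max_requests_per_second, threads=1):
    """Interval in milliseconds that keeps the total rate at or below the limit.

    Assumes every thread pauses for the full interval between its own
    requests and that responses arrive instantly.
    """
    if max_requests_per_second <= 0 or threads < 1:
        raise ValueError("rate must be positive and threads must be at least 1")
    return 1000.0 * threads / max_requests_per_second


# One thread at 4 requests per second: 1000 ms / 4 = 250 ms.
print(requests_interval_ms(4))
# Under the stated assumption, two threads need twice the interval.
print(requests_interval_ms(4, threads=2))
```

Because real responses take time, the actual rate with these settings is at most the target rate, so the value works as an upper bound rather than an exact throughput.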

SiteAnalyzer System Requirements

SiteAnalyzer does not have high system requirements. If your computer runs Windows XP or later, you can run the program to audit a site without any problems.

Microsoft Windows versions: 11/10/8/7/Vista/XP (32 & 64-bit).

Is SiteAnalyzer compatible with Linux and macOS?

SiteAnalyzer currently runs only on Windows, and we are not currently developing it for other platforms.

However, there is a workaround!

  • On Linux, you can run SiteAnalyzer using Wine (an alternative implementation of the Win32 API that lets users of UNIX-like operating systems run 32- and 64-bit Windows applications).
    Read more about Wine.
  • On macOS, you can run SiteAnalyzer using CrossOver (a program that lets you run applications written for Microsoft Windows on Linux and macOS without having Windows installed).
    Read more about CrossOver.
