Website SEO Analysis – Screaming Frog SEO Spider 12.1

Licence Activation

If you wish to use the free version, you can ignore this step. However, if you wish to crawl more than 500 URLs, save and re-open crawls and access the advanced features, then you can buy a licence.

When you purchase a licence, you are provided with a username and licence key which should be entered within the application under ‘Licence > Enter Licence Key’.

When entered correctly the licence will say it’s valid and show the expiry date. You will then be required to restart the application to remove the crawl limit and enable access to the configuration and paid features.

If the licence says it’s invalid, please read our FAQ on troubleshooting licence issues.

6) Updated SERP Snippet Emulator

Earlier this year in May, Google increased the column width of the organic SERPs from 512px to 600px on desktop, which means title and description snippets are longer. Google displays and truncates SERP snippets based on pixel width rather than the number of characters, which can make them challenging to optimise.

Our previous research showed Google used to truncate page titles at around 482px on desktop. With the change, we have updated our research and the logic in the SERP snippet emulator to match Google’s new truncation point before an ellipsis (…), which for page titles on desktop is around 570px.

Our research shows that while the space for descriptions has also increased, they are still truncated far earlier, at a similar point to the older 512px-wide SERP. The SERP snippet emulator will only bold keywords within the snippet description, not in the title, in the same way as the Google SERPs.

Please note – You may occasionally see our SERP snippet emulator be a word out in either direction compared to what you see in the Google SERP. There will always be some pixel differences, which mean the pixel boundary might not be in exactly the same spot that Google calculates 100% of the time.

We are also still seeing Google play by different rules at times, where some snippets have a longer pixel cut-off point, particularly for descriptions! The SERP snippet emulator is therefore not always exact, but it’s a good rule of thumb.
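
To make the pixel-based logic concrete, here’s a rough sketch of how a title can be measured and truncated at a pixel boundary using Pillow. The font file, font size and 570px limit are assumptions for illustration, not the emulator’s actual implementation:

```python
# Rough sketch of pixel-based title truncation (illustrative only).
# Assumptions: Arial at ~20px approximates Google's desktop title font,
# and 570px is the truncation boundary described above.
from PIL import ImageFont

FONT = ImageFont.truetype("arial.ttf", 20)  # path to a local font file (assumption)
LIMIT_PX = 570

def truncate_title(title: str, limit: int = LIMIT_PX) -> str:
    if FONT.getlength(title) <= limit:
        return title
    words = title.split()
    # Drop trailing words until the title plus an ellipsis fits within the boundary.
    while words and FONT.getlength(" ".join(words) + " …") > limit:
        words.pop()
    return (" ".join(words) + " …").strip()

print(truncate_title("A Very Long Page Title That Keeps Going And Going And Going On"))
```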

4) URL Mapping For Staging / Different URL Structure Comparison

As part of crawl comparison, we have introduced a URL Mapping feature, which allows you to compare two different URL structures effectively. This is helpful when comparing a production site against staging, or a website that uses separate mobile URLs against its desktop equivalent.

To set this up, click on the compare configuration and ‘URL Mapping’. Then input a regex to map the previous crawl URLs onto the current crawl. For example, we recently switched host to Kinsta and tested the staging site prior to migrating using this new feature.

The Kinsta staging site URLs are then rewritten to the production site, and the equivalent URLs are compared against each other for Overview tab data, issues and opportunities, the Site Structure tab, and change detection.
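
As a rough illustration of the kind of regex involved (the hostnames below are hypothetical, not our actual Kinsta staging URL), a single capture group is usually enough:

```python
# Illustrative URL mapping: rewrite staging URLs to their production equivalents.
import re

pattern = r"https://staging\.example\.com/(.*)"   # previous crawl (staging)
replacement = r"https://www.example.com/\1"       # current crawl (production)

staging_url = "https://staging.example.com/blog/seo-spider-update/"
print(re.sub(pattern, replacement, staging_url))
# https://www.example.com/blog/seo-spider-update/
```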

You can still switch between the current and previous crawls to view the different URL structures.

If a client has made much smaller changes, like removing trailing slashes from URLs in staging, this feature can also be used to compare the different versions of the same pages with minimal effort.

This will hopefully make SEOs’ lives a little easier when testing new site releases and migrations.

1) Improved JavaScript Crawling

Five years ago we launched JavaScript rendering as the first crawler in the industry to render web pages, using Chromium (before headless Chrome existed) to crawl content and links populated client-side with JavaScript.

As Google, technology and our understanding as an industry have evolved, we’ve updated our integration with headless Chrome to improve efficiency, mimic Google’s crawl behaviour more closely, and alert users to more common JavaScript-related issues.

JavaScript Tab & Filters

The old ‘AJAX’ tab has been updated to ‘JavaScript’, and it now contains a comprehensive list of filters for common issues when auditing websites that use client-side JavaScript.

This will only populate in JavaScript rendering mode, which can be enabled via ‘Config > Spider > Rendering’.

Crawl Original & Rendered HTML

One of the fundamental changes in this update is that the SEO Spider will now crawl both the original and rendered HTML to identify pages that have content or links only available client-side and report other key differences.

This is more in line with how Google crawls and can help identify JavaScript dependencies, as well as other issues that can occur with this two-phase approach.

Identify JavaScript Content & Links

You’re able to clearly see which pages have content that is only available in the rendered HTML after JavaScript execution.

For example, our homepage apparently has 4 additional words in the rendered HTML, which was new to us.

By storing the HTML and using the lower window ‘View Source’ tab, you can also switch the filter to ‘Visible Text’ and tick ‘Show Differences’, to highlight which text is being populated by JavaScript in the rendered HTML.

Aha! There are the 4 words. Thanks, Highcharts.

Pages that have JavaScript links are reported and the counts are shown in columns within the tab.

There’s a new ‘link origin’ column and filter in the lower window ‘Outlinks’ (and inlinks) tab to help you find exactly which links are only in the rendered HTML of a page due to JavaScript. For example, products loaded on a category page using JavaScript will only be in the ‘rendered HTML’.

You can bulk export all links that rely on JavaScript via ‘Bulk Export > JavaScript > Contains JavaScript Links’.

Compare HTML Vs Rendered HTML

The updated tab will tell you if page titles, descriptions, headings, meta robots or canonicals depend upon or have been updated by JavaScript. Both the original and rendered HTML versions can be viewed simultaneously.

This can be useful when determining whether all elements are only in the rendered HTML, or if JavaScript is used on selective elements.

The two-phase approach of crawling the raw and rendered HTML can help pick up on easy to miss problematic scenarios, such as the original HTML having a noindex meta tag, but the rendered HTML not having one.

Previously, by crawling only the rendered HTML, the page would be deemed indexable, when in reality Google sees the noindex in the original HTML first and subsequently skips rendering, meaning the removal of the noindex won’t be seen and the page won’t be indexed.
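
To see why the original HTML matters here, a minimal sketch that flags a noindex in the raw (pre-rendered) HTML, assuming the requests and BeautifulSoup libraries and a hypothetical URL – this isn’t how the SEO Spider does it internally, just an illustration of the principle:

```python
# Flag pages whose *raw* HTML carries a noindex, since Google reads this
# before rendering and may skip the render phase entirely.
import requests
from bs4 import BeautifulSoup

def raw_html_noindex(url: str) -> bool:
    html = requests.get(url, timeout=10).text        # original (pre-render) HTML
    soup = BeautifulSoup(html, "html.parser")
    robots = soup.find("meta", attrs={"name": "robots"})
    return bool(robots and "noindex" in robots.get("content", "").lower())

if raw_html_noindex("https://example.com/some-page"):  # hypothetical URL
    print("noindex in the original HTML - removing it via JavaScript won't help")
```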

Shadow DOM & iFrames

Another enhancement we’ve wanted to make is to improve our rendering to better match Google’s own behaviour. Giacomo Zecchini’s recent ‘Challenges of building a search engine like web rendering service‘ talk at SMX Advanced provides an excellent summary of some of the challenges and edge cases.

Google is able to flatten and index Shadow DOM content, and will inline iframes into a div in the rendered HTML of a parent page, under specific conditions (some of which I shared in a tweet).

After research and testing, both of these are now supported in the SEO Spider, as we try to mimic Google’s web rendering service as closely as possible.

Troubleshooting

If set up correctly, this process should be seamless, but occasionally Google might catch wind of what you’re up to and stop your fun with an annoying anti-bot captcha test.

If this happens, just pause your crawl, load a PSI page in a browser to solve the captcha, then jump back into the tool, highlight the URLs that did not extract any data, and right click > Re-Spider.

If this continues, the likelihood is that your crawl speed is set too high; lowering it a little in the options mentioned above should put you back on track.

I’ve also noticed a number of comments reporting the PSI page not rendering properly and nothing being extracted. If this happens, it might be worth clearing to the default configuration (File > Configuration > Clear to default). Next, make sure the user-agent is set to Screaming Frog. Finally, ensure you have the following configuration options ticked (Configuration > Spider):

  • Check Images
  • Check CSS
  • Check JavaScript
  • Check SWF
  • Check External Links

If, for any reason, the page is rendering correctly but some scores weren’t extracted, double check the XPaths have been entered correctly and the dropdown is changed to ‘Extract Text’. Secondly, it’s worth checking that PSI actually has that data by loading it in a browser – much of the real-world data is only available for high-volume pages.

2) Google Sheets Export

You’re now able to export directly to Google Sheets.

You can add multiple Google accounts and quickly connect to any of them to save your crawl data, which will appear in Google Drive within a ‘Screaming Frog SEO Spider’ folder and be accessible via Sheets.

Many of you will already be aware that Google Sheets isn’t really built for scale and has a 5m cell limit. This sounds like a lot, but when you have 55 columns by default in the Internal tab (which can easily triple depending on your config), it means you can only export around 90k rows (55 x 90,000 = 4,950,000 cells).
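
For a quick back-of-the-envelope check of how many rows will fit your own column count:

```python
# Rows that fit within Google Sheets' 5 million cell limit for a given column count.
CELL_LIMIT = 5_000_000

def max_rows(columns: int) -> int:
    return CELL_LIMIT // columns

print(max_rows(55))   # 90909 - roughly the ~90k rows mentioned above
print(max_rows(165))  # 30303 - if your configuration triples the column count
```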

If you need to export more, use a different export format that’s built for the size (or reduce your number of columns). We had started work on writing to multiple sheets, but really, Sheets shouldn’t be used in that way.

This has also been integrated into scheduling and the command line interface. This means you can schedule a crawl which automatically exports any tabs, filters, exports or reports to a Sheet within Google Drive.

You’re able to choose to create a timestamped folder in Google Drive, or overwrite an existing file.

This should be helpful when sharing data in teams, with clients, or for Google Data Studio reporting.

Small Update – Version 14.3 Released 17th March 2021

We have just released a small update to version 14.3 of the SEO Spider. This release is mainly bug fixes and small improvements –

  • Add Anchor Text, Alt Text & Link Path to ‘Redirect’ reports.
  • Show the display URL for duplicate content reports rather than the URL encoded URL.
  • Update right click ‘History >’ checks to be HTTPS.
  • Fix issue with Image Details tab failing to show images if page links to itself as an image.
  • Fix issue with some text labels being truncated.
  • Fix issue where API settings can’t be viewed whilst crawling.
  • Fix issue with GA E-commerce metrics not showing when reloading a DB crawl.
  • Fix issue with ‘Crawls’ UI not sorting on modified date.
  • Fix PageSpeed CrUX discrepancies between master and details view.
  • Fix crash showing authentication Browser.
  • Fix crash in visualisations.
  • Fix crash removing URLs.
  • Fix crash after editing SERP panel.
  • Fix odd colouring on fonts on macOS.
  • Fix crash during JavaScript crawling.
  • Fix crash viewing PageSpeed details tab.
  • Fix crash using Wacom tablet on Windows.

Extract

Now that the tool can crawl and render our chosen URLs, we need to tell it what data we actually want to pull out (i.e. those glorious PageSpeed scores).

Open up the custom extraction panel (Configuration > Custom > Extraction) and enter the XPath expression for each metric you want to pull – for example, ‘Desktop Estimated Input Latency’.

If done correctly you should have a nice green tick next to each entry, a bit like this:

(Be sure to add custom labels to each one, set the type to XPath and change the far right dropdown from ‘Extract HTML’ to ‘Extract Text’.)

(There are also quite a lot of metrics, so you may want to split your crawl by mobile & desktop, or take just a selection of metrics you wish to report on.)

Hit OK.

3) Input Your Syntax

Next up, you’ll need to input your syntax into the relevant extractor fields. A quick and easy way to find the relevant CSS Path or XPath of the data you wish to scrape is to open up the web page in Chrome, ‘inspect element’ on the HTML line you wish to collect, then right click and copy the relevant selector path provided.

For example, you may wish to start scraping the ‘authors’ of blog posts, and the number of comments each has received. Let’s take the Screaming Frog website as an example.

Open up any blog post in Chrome, right click and ‘inspect element’ on the author’s name, which appears on every post; this will open up the ‘Elements’ HTML window. Simply right click again on the relevant HTML line (with the author’s name), copy the relevant CSS Path or XPath and paste it into the respective extractor field in the SEO Spider. If you use Firefox, you can do the same there too.

You can rename the ‘extractors’, which correspond to the column names in the SEO Spider. In this example, I’ve used CSS Path.

The ticks next to each extractor confirm the syntax used is valid. If there’s a red cross next to one, the syntax is invalid and you may need to adjust it a little.

When you’re happy, simply press the ‘OK’ button at the bottom. If you’d like to see more examples, then skip to the bottom of this guide.

Please note – this is not the most robust method for building CSS Selectors and XPath expressions. The expressions given by this method can be very specific to the exact position of the element in the code, which can change: the inspected view is the rendered version of the page / DOM, whereas by default the SEO Spider looks at the HTML source, and HTML clean-up can also occur when the SEO Spider processes a page with invalid mark-up.

These can also differ between browsers, e.g. for the above ‘author’ example the following CSS Selectors are given –

Chrome: body > div.main-blog.clearfix > div > div.main-blog--posts > div.main-blog--posts_single--inside_author.clearfix.drop > div.main-blog--posts_single--inside_author-details.col-13-16 > div.author-details--social > a

Firefox: .author-details--social > a:nth-child(1)

The expressions given by Firefox are generally more robust than those provided by Chrome. Even so, this should not be used as a complete replacement for understanding the various extraction options and being able to build these manually by examining the HTML source.
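
For instance, a hand-written expression anchored on a class name alone is usually more stable than the full positional path copied from the browser. A minimal sketch, assuming the requests and lxml libraries, the class names shown above and a hypothetical blog post URL:

```python
# Extract the author using a short, class-based expression rather than the
# long positional path copied from the browser (illustrative sketch).
import requests
from lxml import html

post_url = "https://www.screamingfrog.co.uk/example-blog-post/"  # any blog post URL (hypothetical)
page = html.fromstring(requests.get(post_url, timeout=10).content)

# XPath equivalent of the '.author-details--social > a' CSS selector above.
authors = page.xpath('//div[contains(@class, "author-details--social")]/a/text()')
print(authors)
```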

The w3schools guide on CSS Selectors and their XPath introduction are good resources for understanding the basics of these expressions.

Small Update – Version 4.1 Released 16th July 2015

We have just released a small update to version 4.1 of the Screaming Frog SEO Spider. There’s a new ‘GA Not Matched’ report in this release, as well as some bug fixes. This release includes the following –

GA Not Matched Report

We have released a new ‘GA Not Matched’ report, which you can find under the ‘reports’ menu in the top level navigation.

Data within this report is only available when you’ve connected to the Google Analytics API and collected data for a crawl. It essentially provides a list of all URLs collected from the GA API that were not matched against the URLs discovered within the crawl.

This report can include anything that GA returns, such as pages in a shopping cart, or logged in areas. Hence, often the most useful data for SEOs is returned by querying the path dimension and ‘organic traffic’ segment. This can then help identify –

  1. Orphan Pages – These are pages that are not linked to internally on the website, but do exist. These might just be old pages, pages missed in an old site migration, or pages only found externally (via external links or referring sites). This report allows you to browse through the list, see which are relevant, and potentially upload them in list mode.
  2. Errors – The report can include 404 errors, which sometimes include the referring website within the URL as well (you will need the ‘all traffic’ segment for these). This can be useful for chasing up websites to correct external links, or simply 301 redirecting the erroring URL to the correct page! This report can also include URLs which might be canonicalised or blocked by robots.txt, but are actually still indexed and delivering some traffic.
  3. GA URL Matching Problems – If data isn’t matching against URLs in a crawl, you can check to see what URLs are being returned via the GA API. This might highlight issues with the particular Google Analytics view, such as filters on URLs like ‘extended URL’ hacks etc. For the SEO Spider to return data against URLs in the crawl, the URLs need to match up. So changing to a ‘raw’ GA view, which hasn’t been touched in any way, might help.

Other bug fixes in this release include the following –

  • Fixed a couple of crashes in the custom extraction feature.
  • Fixed an issue where GA requests weren’t going through the configured proxy.
  • Fixed a bug with URL length, which was being incorrectly reported.
  • We changed the GA range to be 30 days back from yesterday to match GA by default.

I believe that’s everything for now. Please do let us know via our support if you have any problems or spot any bugs. Thanks again to everyone for all the support and feedback as usual.

Now go and download version 4.1!

5) Improved UX Bits

We’ve found some new users could get confused between the ‘Enter URL to spider’ bar at the top, and the ‘search’ bar on the side. The size of the ‘search’ bar had grown, and the main URL bar was possibly a little too subtle.

So we have adjusted sizing, colour, text and included an icon to make it clearer where to put your URL.

If that doesn’t work, then we’ve got another concept ready and waiting for trial.

The ‘Image Details’ tab now displays a preview of the image, alongside its associated alt text. This makes image auditing much easier!

You can highlight cells in the higher and lower windows, and the SEO Spider will display a ‘Selected Cells’ count.

The lower windows now have filters and a search, to help find URLs and data more efficiently.

Site visualisations now have improved zoom, and the tree graph node spacing can be set much closer together to view a site in its entirety. So pretty.

Oh, and in the ‘View Source’ tab, you can now click ‘Show Differences’ and it will perform a diff between the raw and rendered HTML.

3) Crawl Overview Right Hand Window Pane

We received a lot of positive response to our crawl overview report when it was released last year. However, we felt it was a little hidden away, so we have introduced a new right hand window which includes the crawl overview report by default. This overview pane updates alongside the crawl, which means you can see at a glance which tabs and filters are populated during the crawl, and their respective percentages.

This means you don’t need to click through the tabs and filters to uncover issues; you can just browse and click on these directly as they arise. The ‘Site Structure’ tab provides more detail on crawl depth and the most linked-to pages without needing to export the ‘crawl overview’ report or sort the data. The ‘Response Times’ tab provides a quick overview of response times for the SEO Spider’s requests. This new window pane will be updated further in the next few weeks.

You can choose to hide this window, if you prefer the older format.

3) Improved Link Data – Link Position, Path Type & Target

Some of our most requested features have been around link data. You want more, to be able to make better decisions. We’ve listened, and the SEO Spider now records some new attributes for every link.

Link Position

You’re now able to see the ‘link position’ of every link in a crawl – whether it’s in the navigation, the content of the page, the sidebar or the footer, for example. The classification is performed using each link’s ‘link path’ (as an XPath) and known semantic substrings, and can be seen in the ‘inlinks’ and ‘outlinks’ tabs.

If your website uses semantic HTML5 elements (or well-named non-semantic elements, such as div id=”nav”), the SEO Spider will be able to automatically determine different parts of a web page and the links within them.

But not every website is built in this way, so you’re able to configure the link position classification under ‘Config > Custom > Link Positions’. This allows you to use a substring of the link path to classify links as you wish.

For example, we have mobile menu links outside the nav element that are classified as ‘content’ links. This is incorrect, as they are just additional sitewide navigation on mobile.

The ‘mobile-menu__dropdown’ class name (which is in the link path as shown above) can be used to define its correct link position using the Link Positions feature.

These links will then be correctly attributed as a sitewide navigation link.

This can help identify ‘inlinks’ to a page that come only from in-body content, for example, ignoring any links in the main navigation or footer, for better internal link analysis.
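
Conceptually, the classification boils down to a ‘first matching substring wins’ lookup over the link path. Here’s a minimal sketch – the substrings and their order are illustrative, not the SEO Spider’s actual rules:

```python
# Classify a link by matching known substrings against its XPath link path.
# The rules below are illustrative; real defaults and custom rules live
# under 'Config > Custom > Link Positions'.
RULES = [
    ("mobile-menu__dropdown", "Navigation"),  # custom rule from the example above
    ("/nav", "Navigation"),
    ("/header", "Header"),
    ("/footer", "Footer"),
    ("/aside", "Sidebar"),
]

def classify(link_path: str) -> str:
    for substring, position in RULES:
        if substring in link_path:
            return position
    return "Content"

print(classify('/html/body/div[@class="mobile-menu__dropdown"]/ul/li[3]/a'))  # Navigation
print(classify('/html/body/main/article/p[2]/a'))                             # Content
```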

Path Type

The ‘path type’ of a link is also recorded (absolute, path-relative, protocol-relative or root-relative), which can be seen in inlinks, outlinks and all bulk exports.

This can help identify links which should be absolute, as there are some integrity, security and performance issues with relative linking under some circumstances.
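
If you want to reproduce the same classification outside the tool, the four path types can be told apart with a few simple checks. A minimal sketch:

```python
# Rough classification of a link's path type, mirroring the categories above.
from urllib.parse import urlparse

def path_type(href: str) -> str:
    if href.startswith("//"):
        return "protocol-relative"   # //example.com/page
    if urlparse(href).scheme:
        return "absolute"            # https://example.com/page
    if href.startswith("/"):
        return "root-relative"       # /page
    return "path-relative"           # page, ../page

for href in ("https://example.com/page", "//example.com/page", "/page", "../page"):
    print(href, "->", path_type(href))
```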

Target Attribute

Additionally, we now show the ‘target’ attribute for every link, to help identify links which use ‘_blank’ to open in a new tab.

This is helpful when analysing usability, but also performance and security – which brings us onto the next feature.

2) Automated Crawl Reports For Data Studio

Data Studio is commonly the tool of choice for SEO reporting today, whether that’s for your own reports, clients or the boss. To help automate this process to include crawl report data, we’ve introduced a new Data Studio friendly custom crawl overview export available in scheduling.

This has been purpose-built to allow users to select crawl overview data to be exported as a single summary row to Google Sheets. It will automatically append new scheduled exports to a new row in the same sheet in a time series.

The new crawl overview summary in Google Sheets can then be connected to Data Studio to be used for a fully automated Google Data Studio crawl report. You’re able to copy our very own Screaming Frog Data Studio crawl report template, or create your own better versions!

This allows you or a team to monitor site health and be alerted to issues without having to even open the app. It also allows you to share progress with non-technical stakeholders visually.

Please read our tutorial on ‘How To Automate Crawl Reports In Data Studio‘ to set this up.

We’re excited to see alternative Screaming Frog Data Studio report templates, so if you’re a Data Studio whizz and have one you’d like to share with the community, let us know and we will include it in our tutorial.

Smaller Updates & Fixes

That’s the main features for our latest release, which we hope you find useful. Other bug fixes and updates in this release include the following –

  • The Analytics and Search Console tabs have been updated to allow URLs blocked by robots.txt to appear, where we believe them to be HTML based upon file type.
  • The maximum number of Google Analytics metrics you can collect from the API has been increased from 20 to 30. Google restrict the API to 10 metrics for each query, so if you select more than 10 metrics (or multiple dimensions), then we will make more queries (and it may take a little longer to receive the data).
  • With the introduction of the new ‘Accept-Language’ configuration, the ‘User-Agent’ configuration is now under ‘Configuration > HTTP Header > User-Agent’.
  • We added the ‘MJ12Bot’ to our list of preconfigured user-agents after a chat with our friends at Majestic.
  • Fixed a crash in XPath custom extraction.
  • Fixed a crash on start up with Windows Look & Feel and JRE 8 update 60.
  • Fixed a bug with character encoding.
  • Fixed an issue with Excel file exports, which wrote numbers with decimal places as strings rather than numbers.
  • Fixed a bug with the Google Analytics integration where the use of hostname in some queries was causing ‘Selected dimensions and metrics cannot be queried together’ errors.

Exporting Data

You’re able to export all data from the crawl into spreadsheets. Simply click the ‘export’ button in the top left hand corner to export data from the top window tabs and filters.

To export lower window data, right click on the URL(s) that you wish to export data from in the top window, then click on one of the options.

There’s also a ‘Bulk Export’ option located under the top level menu. This allows you to export the source links, for example the ‘inlinks’ to URLs with specific status codes such as 2XX, 3XX, 4XX or 5XX responses.

In the above, selecting the ‘Client Error 4XX In Links’ option will export all inlinks to error pages – in other words, the pages that link to 404 error pages.

Small Update – Version 14.2 Released 16th February 2021

We have just released a small update to version 14.2 of the SEO Spider. This release includes a couple of cool new features, alongside lots of small bug fixes.

Core Web Vitals Assessment

We’ve introduced a ‘Core Web Vitals Assessment’ column in the PageSpeed tab with a ‘Pass’ or ‘Fail’ using field data collected via the PageSpeed Insights API for Largest Contentful Paint, First Input Delay and Cumulative Layout Shift.

For a page to ‘pass’ the Core Web Vitals Assessment it must be considered ‘Good’ in all three metrics, based upon Google’s thresholds for each. If there’s no data for the URL, then this will be left blank.
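
Google’s published ‘Good’ thresholds (LCP ≤ 2.5s, FID ≤ 100ms, CLS ≤ 0.1) make the pass/fail logic straightforward to reproduce. A minimal sketch, assuming field data in the units Google reports – milliseconds for LCP and FID, unitless for CLS:

```python
# Pass/fail a page's Core Web Vitals using Google's 'Good' thresholds.
def cwv_assessment(lcp_ms: float, fid_ms: float, cls: float) -> str:
    good = lcp_ms <= 2500 and fid_ms <= 100 and cls <= 0.1
    return "Pass" if good else "Fail"

print(cwv_assessment(lcp_ms=2100, fid_ms=40, cls=0.05))  # Pass
print(cwv_assessment(lcp_ms=2100, fid_ms=40, cls=0.25))  # Fail - CLS above 0.1
```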

This should help identify problematic sections and URLs more efficiently. Please see our tutorial on How To Audit Core Web Vitals.

Broken Bookmarks (or ‘Jump Links’)

Bookmarks are a useful way to link users to a specific part of a webpage using named anchors on a link, also referred to as ‘jump links’ or ‘anchor links’. However, they frequently become broken over time – even for Googlers.

To help with this problem, there’s now a check in the SEO Spider which crawls URLs with fragment identifiers and verifies that a matching ID exists within the HTML of the page for the bookmark.

You can enable ‘Crawl Fragment Identifiers’ under ‘Config > Spider > Advanced’, and then view any broken bookmarks under the URL tab and new ‘Broken Bookmark’ filter.

You can view the source pages these are on by using the ‘inlinks’ tab, and export in bulk via a right click ‘Export > Inlinks’. Please see our tutorial on How To Find Broken Bookmark & Jump Links.
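
The check itself is easy to picture: strip the fragment from the URL, fetch the page, and look for a matching id (or legacy named anchor) in the HTML. A minimal sketch, assuming the requests and BeautifulSoup libraries and a hypothetical URL:

```python
# Check whether a URL's #fragment has a matching anchor in the page HTML.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urldefrag

def bookmark_ok(url_with_fragment: str) -> bool:
    url, fragment = urldefrag(url_with_fragment)
    if not fragment:
        return True  # nothing to verify
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    # A bookmark resolves to either an element id or a legacy <a name="..."> anchor.
    return bool(soup.find(id=fragment) or soup.find("a", attrs={"name": fragment}))

print(bookmark_ok("https://example.com/guide#installation"))  # hypothetical URL
```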

14.2 also includes the following smaller updates and bug fixes.

  • Improve labeling in all HTTP headers report.
  • Update some column names to make them more consistent – for those with scripts that work from column naming, these changes include using capital case for ‘Length’ in the h1 and h2 columns, and pluralising the ‘Meta Keywords’ columns to match the tab.
  • Update link score graph calculations to exclude self-referencing links via canonicals and redirects.
  • Make srcset attributes parsing more robust.
  • Update misleading message in visualisations around respecting canonicals.
  • Treat HTTP response headers as case insensitive.
  • Relax Job Posting value property type checking.
  • Fix issue where right click ‘Export > Inlinks’ sometimes doesn’t export all the links.
  • Fix freeze on M1 Mac during crawl.
  • Fix issue with Burmese text not displayed correctly.
  • Fix issue where Hebrew text can’t be input into text fields.
  • Fix issue with ‘Visualisations > Inlink Anchor Text Word Cloud’ opening two windows.
  • Fix issue with Forms Based Auth unlock icon not displaying.
  • Fix issue with Forms Based Auth failing for sites with invalid certificates.
  • Fix issue with Overview Report showing incorrect site URL in some situations.
  • Fix issue with Chromium asking for webcam access.
  • Fix issue on macOS where launching via a .seospider/.dbseospider file doesn’t always load the file.
  • Fix issue with Image Preview incorrectly showing 404.
  • Fix issue with PSI CrUX data being duplicated in Origin.
  • Fix various crashes in JavaScript crawling.
  • Fix crash parsing some formats of HTML.
  • Fix crash when re-spidering.
  • Fix crash performing JavaScript crawl with empty user agent.
  • Fix crash selecting URL in master view when all tabs in details view are disabled/hidden.
  • Fix crash in JavaScript crawling when web server sends invalid UTF-8 characters.
  • Fix crash in Overview tab.

Export

You can simply use the ‘export’ button on any tab as usual to export data from list mode.

However, if you want to export list mode data in the same order it was uploaded, so you can match it up against other data, use the ‘Export’ button that appears next to the ‘Upload’ and ‘Start’ buttons at the top of the user interface.

The data in the export will be in the same order and will include all of the exact URLs from the original upload, including duplicates and any fix-ups performed.

The ‘Original URL’ is the URL that was uploaded, and the ‘Address’ is the URL the SEO Spider crawled.

3) GA & GSC Not Matched Report

The ‘GA Not Matched’ report has been replaced with the new ‘GA & GSC Not Matched Report’, which now provides consolidated information on URLs discovered via the Google Search Analytics API, as well as the Google Analytics API, that were not found in the crawl.

This report can be found under ‘reports’ in the top level menu and will only populate when you have connected to an API and the crawl has finished.

There’s a new ‘source’ column next to each URL, which details the API(s) the URL was discovered in (sometimes both GA and GSC) but not matched against any URL found within the crawl.

You can see in the example screenshot above from our own website, that there are some URLs with mistakes, a few orphan pages and URLs with hash fragments, which can show as quick links within meta descriptions (and hence why their source is GSC rather than GA).

I’ve discussed how this data can be used in more detail previously, and it’s a real hidden gem, as it can help identify orphan pages and other errors, as well as matching problems between the crawl and the API(s) to investigate.

Export & Sort

After a coffee/nap/cat-vid, you should hopefully come back to a 100% completed crawl with every page speed score you could hope for.

Navigate over to the custom extraction tab (Custom > Filter > Extraction) and hit export to download it all into a handy .xls spreadsheet.

Once the export is open in Excel, hit the find and replace option and replace https://developers.google.com/speed/pagespeed/insights/?url= with nothing. This will bring back all your URLs in the original order alongside all their shiny new speed scores for mobile and desktop.
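
If you’d rather do that clean-up outside Excel, the same find and replace is only a few lines, e.g. with pandas over the exported file (the file name and ‘Address’ column are assumptions):

```python
# Strip the PSI prefix from the exported URLs to recover the original addresses.
import pandas as pd

PREFIX = "https://developers.google.com/speed/pagespeed/insights/?url="

df = pd.read_excel("custom_extraction.xlsx")          # exported file name (assumption)
df["Address"] = df["Address"].str.replace(PREFIX, "", regex=False)
df.to_excel("custom_extraction_clean.xlsx", index=False)
```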

After a tiny bit of formatting you should end up with a spreadsheet that looks something like this:


Bonus

What I find particularly powerful is the ability to combine this data with other metrics the spider can pull through in a separate crawl. As list mode exports in the same order it’s uploaded in, you can run a normal list mode crawl with your original selection of URLs connected to any API, export this, and combine it with your PSI scores.
This essentially allows you to build an amalgamation of session data, PSI scores, response times and GA trigger times, alongside any other metrics you want!

Small Update – Version 13.2 Released 4th August 2020

We have just released a small update to version 13.2 of the SEO Spider. This release is mainly bug fixes and small improvements –

  • We first released custom search back in 2011 and it was in need of an upgrade. So we’ve updated functionality to allow you to search within specific elements, entire tracking tags and more. Check out our custom search tutorial.
  • Sped up near duplicate crawl analysis.
  • Google Rich Results Features Summary export has been ordered by number of URLs.
  • Fix bug with Near Duplicates Filter not being populated when importing a .seospider crawl.
  • Fix several crashes in the UI.
  • Fix PSI CrUX data incorrectly labelled as sec.
  • Fix spell checker incorrectly checking some script content.
  • Fix crash showing near duplicates details panel.
  • Fix issue preventing users with dual stack networks from crawling on Windows.
  • Fix crash using Wacom tablet on Windows 10.
  • Fix spellchecker filters missing when reloading a crawl.
  • Fix crash on macOS around multiple screens.
  • Fix crash viewing gif in the image details tab.
  • Fix crash canceling during database crawl load.

Advanced List Mode Crawling

List mode is really powerful when configured correctly. There are some interesting advanced uses that can help you focus your analysis and save time and effort.

Crawling A List Of URLs & Another Element

List mode can be very flexible and allows you to crawl a list of uploaded URLs plus another element.

For example, you might want to crawl a list of URLs and their images. Or you might need to check a list of URLs and their newly implemented canonicals, AMP or hreflang, rather than the whole site. Or you might want to collect all the external links from a list of URLs for broken link building. You can do all of this in list mode, and the process is pretty much the same.

After switching to list mode, remove the crawl depth limit, which is automatically set to ‘0’. Go to ‘Config > Spider > Limits’ and untick the configuration.

This means the SEO Spider will now crawl your list of URLs – and all URLs on the same subdomain that they link to.

You therefore need to control exactly what is crawled using the granular configuration options. Go to ‘Config > Spider > Crawl’ and disable all the ‘Resource Links’ and ‘Page Links’ options in the ‘Crawl’ configuration.

Then select the elements you want to crawl alongside the list of URLs. For example, if you want to crawl a list of URLs and their images, the set-up would look like this.

And if you upload a single URL, such as the SEO Spider page, you’ll see that the page and its images are crawled.

This advanced configurability lets you laser-focus your audit on exactly the link elements you need.

Auditing Redirects

If you’re auditing redirects in a site migration, it can be particularly useful to crawl their target URLs and any redirect chains encountered. This removes the need to upload multiple lists of target URLs each time to get to the end of a chain.

In this case, we recommend using the ‘always follow redirects’ configuration under ‘Config > Spider > Advanced’. Enabling this configuration means the crawl depth limit is ignored and redirects will be followed until they reach a non-3XX response (or until the redirect limit under ‘Config > Spider > Limits’ is reached).

If you then use the ‘All Redirects’ report, it will show the full redirect chain in a single report.
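
Outside the SEO Spider, you can sanity-check an individual chain in much the same way; a small sketch using the requests library with a hypothetical URL:

```python
# Follow a redirect chain and print each hop, ending at the final response.
import requests

response = requests.get("https://example.com/old-page", allow_redirects=True, timeout=10)

for hop in response.history:               # each 3XX response in the chain
    print(hop.status_code, hop.url)
print(response.status_code, response.url)  # final, non-3XX destination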

Please read our guide on auditing redirects in a site migration for more detail on this process.

Connecting To APIs

In list mode, you can also connect to APIs, including those of backlink analysis tools, to fetch data. For example, you can connect to a backlink tool and pull in data such as referring domains, keywords, traffic and value, which are then displayed in the ‘Link Metrics’ tab.

This can be really useful, for example, when collecting data for competitor analysis.

5) rel=“next” and rel=“prev” Elements Now Crawled

The SEO Spider can now crawl rel=“next” and rel=“prev” elements, whereas previously the tool merely reported them. Now, if a URL has not already been discovered, it will be added to the queue and crawled, provided the configuration is enabled (‘Configuration > Spider > Basic Tab > Crawl Next/Prev’).

rel=“next” and rel=“prev” elements are not counted as ‘Inlinks’ (in the lower window tab) as they are not links in a traditional sense. Hence, if a URL does not have any ‘Inlinks’ in the crawl, it might well be due to discovery from a rel=“next” and rel=“prev” or a canonical. We recommend using the ‘Crawl Path Report‘ to show how the page was discovered, which will show the full path.

There’s also a new ‘respect next/prev’ configuration option (under ‘Configuration > Spider > Advanced tab’) which will hide any URLs with a ‘prev’ element, so they are not considered as duplicates of the first page in the series.
