Analyzing your competitors' content can give you invaluable insights into their operations and targets. This simple Python script can provide data on n-grams in seconds.
This Python script is an elementary version of a competitor content analysis. The main idea is to get a quick summary of what the writing focus looks like. A lean approach is to fetch all URLs in the sitemap, parse out the URL slugs, and run an n-gram analysis on them. If you want to learn more about n-gram analysis, have a look at our Free N-Gram Tool. You can apply it not only to URLs but also to keywords, titles, and so on.
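The slug-parsing idea can be sketched with nothing but the standard library (a minimal sketch; the function name `url_to_slug` and the example URL are made up for illustration, while the full script further down uses pandas string methods instead):

```python
from urllib.parse import urlparse

def url_to_slug(url):
    """Return the last non-empty path segment of a URL, hyphens replaced by spaces."""
    path = urlparse(url).path
    # A trailing slash produces an empty last segment, so drop empty parts
    segments = [s for s in path.split("/") if s]
    return segments[-1].replace("-", " ") if segments else ""

print(url_to_slug("https://example.com/blog/ngram-analysis-guide/"))
# ngram analysis guide
```

Running this on every URL in a sitemap gives you the list of slugs that the n-gram analysis works on.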
As a result, you get a list of the n-grams used in the URL slugs together with the number of pages that use each n-gram. This analysis takes only a couple of seconds, even on big sitemaps, and runs with fewer than fifty lines of code.
Additional approaches
If you want to get deeper insights, I suggest carrying on with these approaches:
- Fetch the content of every URL in the sitemap
- Create n-grams from the headlines
- Create n-grams found in the content
- Extract keywords with TextRank or RAKE
- Extract named entities for your SEO business
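As a sketch of the headline approach, the snippet below counts bigrams found in h1-h3 tags using only the standard library (the class and function names and the sample HTML are made up for illustration; a real analysis would first fetch each page's HTML):

```python
from collections import Counter
from html.parser import HTMLParser

class HeadlineParser(HTMLParser):
    """Collects the text inside <h1>, <h2>, and <h3> tags."""
    def __init__(self):
        super().__init__()
        self.in_headline = False
        self.headlines = []
    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.in_headline = True
    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3"):
            self.in_headline = False
    def handle_data(self, data):
        if self.in_headline and data.strip():
            self.headlines.append(data.strip().lower())

def headline_ngrams(html, n=2):
    """Count n-grams (default bigrams) across all headlines in an HTML string."""
    parser = HeadlineParser()
    parser.feed(html)
    counts = Counter()
    for headline in parser.headlines:
        words = headline.split()
        counts.update(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts

html = "<h1>Content Marketing Tips</h1><p>ignored</p><h2>Marketing Tips for SEO</h2>"
print(headline_ngrams(html).most_common(3))
```

The same counting logic carries over to body content; only the tags you collect change.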
But let's start simple and take a first look at the output of this script. Based on your feedback, I may add more sophisticated approaches. Before you run the script, you only have to enter the sitemap URL you want to analyze. After running the script, you will find your results in sitemap_ngrams.csv. Open it in Excel or Google Sheets and have fun analyzing the data.
Here is the Python code:
# Pemavor.com Sitemap Content Analyzer
# Author: Stefan Neefischer

import advertools as adv
import pandas as pd

def sitemap_ngram_analyzer(site):
    sitemap = adv.sitemap_to_df(site)
    sitemap = sitemap.dropna(subset=["loc"]).reset_index(drop=True)

    # Some sitemaps keep URLs with a trailing "/", some without.
    # If the URL ends with "/", the second-to-last path segment is the slug;
    # otherwise, the last segment is the slug.
    loc = sitemap["loc"].dropna()
    slugs = loc[loc.str.endswith("/")].str.split("/").str[-2].str.replace("-", " ")
    slugs2 = loc[~loc.str.endswith("/")].str.split("/").str[-1].str.replace("-", " ")

    # Merge the two series
    slugs = list(slugs) + list(slugs2)

    # adv.word_frequency automatically removes stop words
    word_counts_onegram = adv.word_frequency(slugs)
    word_counts_twogram = adv.word_frequency(slugs, phrase_len=2)

    output_csv = (
        pd.concat([word_counts_onegram, word_counts_twogram], ignore_index=True)
        .rename({"abs_freq": "Count", "word": "Ngram"}, axis=1)
        .sort_values("Count", ascending=False)
    )

    # Save the output CSV with the scores
    output_csv.to_csv("sitemap_ngrams.csv", index=False)
    print("csv file saved")

# Provide the sitemap to be analyzed
site = "https://searchengineland.com/sitemap_index.xml"
sitemap_ngram_analyzer(site)
# The results will be saved to the sitemap_ngrams.csv file
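If you prefer to inspect the top n-grams straight from Python instead of a spreadsheet, a few lines of the standard library are enough (a minimal sketch; it assumes the Ngram and Count column names written by the script):

```python
import csv
import os

def top_ngrams(path, n=10):
    """Read the output CSV and return the first n (Ngram, Count) pairs."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        return [(row["Ngram"], int(row["Count"])) for row in reader][:n]

# Only attempt to read the file if the analyzer has already produced it
if os.path.exists("sitemap_ngrams.csv"):
    for ngram, count in top_ngrams("sitemap_ngrams.csv", n=5):
        print(f"{count:>5}  {ngram}")
```

Since the script already sorts by Count before saving, the first rows are the most frequent n-grams.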