Keywords have been central to achieving top rankings on Google and other search engines for many years. Simply increasing the keyword density for certain terms has long been insufficient; in the worst case, it even leads to a devaluation of the page. The TF*IDF formula offers a more sophisticated approach to term weighting, and anyone who wants to do SEO should be familiar with it. I will show you how TF*IDF analysis works and introduce you to some analysis tools.

What is TF*IDF?

TF*IDF is the product of two values that play a role in the analysis of keywords on web pages. TF stands for "Term Frequency" and indicates how often a certain term occurs within a certain document, e.g. a web page. The TF is calculated logarithmically, so that additional mentions of an already frequent term carry progressively less weight.

IDF stands for "Inverse Document Frequency" and is also calculated logarithmically. It relates the term or keyword to a set of comparison documents: the fewer of these documents contain the term, the higher its IDF value. In practice, of course, it is not possible to compare one's own website with all other websites on the Internet. Instead, a selection of the top-ranked pages for the keyword is used.

The TF*IDF product makes it possible to determine more than a reasonable keyword density for the given term. The analysis also shows which topic-relevant terms can be found in one's own document and across the entire document pool. After the analysis, the minimum and maximum number of mentions on a web page can be optimized for these terms as well.
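To make the two values more tangible, here is a minimal Python sketch. The exact formulas are an assumption: the article only says that both values are logarithmic, and analysis tools use different variants. This sketch uses log(1 + count) for TF and log(1 + N / n) for IDF, where N is the number of comparison documents and n the number of them that contain the term. The example texts and all function names are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequency(term, tokens):
    """Log-dampened term frequency: log(1 + raw count). Assumed variant."""
    return math.log(1 + Counter(tokens)[term])


def inverse_document_frequency(term, pool):
    """log(1 + N / n) for N pool documents, n of which contain the term."""
    containing = sum(1 for tokens in pool if term in tokens)
    if containing == 0:
        return 0.0  # assumption: a term unknown to the pool gets no weight here
    return math.log(1 + len(pool) / containing)


def tf_idf(term, tokens, pool):
    """Product of the two values for one term on one page."""
    return term_frequency(term, tokens) * inverse_document_frequency(term, pool)


# Hypothetical example texts, not real pages.
own_page = tokenize("Hiking boots for alpine hiking: waterproof boots with good grip")
comparison_pool = [
    tokenize("Waterproof hiking boots with a vibram sole"),
    tokenize("Best hiking boots and trekking shoes compared"),
    tokenize("Trail running shoes with a light sole"),
]

for term in ("hiking", "boots", "waterproof", "grip"):
    print(term, round(tf_idf(term, own_page, comparison_pool), 3))
```

A term that appears in only a few comparison documents receives a higher IDF than one that appears in all of them, which is why the product says more than plain keyword density.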

Where does the TF*IDF analysis come from?

The basic principle of TF*IDF predates web search engines and the widespread use of the Internet. The idea of inverse document frequency goes back to Karen Spärck Jones's work in 1972, and Donna Harman did pioneering work on term weighting in the information sciences in the early 1990s. These methods were designed to weight terms within document collections, not web pages.

After Google began penalizing excessive keyword mentions ("keyword stuffing") in its rankings in the first decade of the new millennium, the relative frequency of keywords came to the fore. This is a major strength of Donna Harman's method, which also helps in the search for topic-relevant keywords.

What exactly does a TF*IDF analysis measure?

TF*IDF analysis compares all words and terms on the web page being analyzed with the text content of other web pages that compete for top rankings in search engines. Multiplying the TF and IDF values yields, for every relevant term, a measure of how frequently it appears on one's own page compared with the competitors' pages.

The pool of comparison pages yields reference values for how often or how rarely certain terms should appear in the document. If individual terms occur more frequently than on the competitors' pages, the risk of keyword stuffing increases. If the frequency is too low, Google and other search engines might consider the comparison pages more relevant.
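The idea of reference values can be sketched in a few lines of Python. This is a simplified illustration, not how any particular tool works: it assumes the range is simply the lowest and highest relative frequency of a term across the comparison pages, and the function names and example texts are hypothetical.

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Map each word to its share of all words in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    return {term: n / len(tokens) for term, n in counts.items()}


def classify(term, own_text, pool_texts):
    """Compare a term's share on the own page with the min/max seen in the pool."""
    if not pool_texts:
        raise ValueError("the comparison pool must not be empty")
    own = relative_frequencies(own_text).get(term, 0.0)
    pool = [relative_frequencies(text).get(term, 0.0) for text in pool_texts]
    low, high = min(pool), max(pool)
    if own > high:
        return "above pool range"  # possible keyword stuffing
    if own < low:
        return "below pool range"  # competitors mention the term more often
    return "within pool range"


# Hypothetical example texts, not real pages.
own_text = "boots boots boots for hiking in summer"
pool_texts = [
    "hiking boots with waterproof membrane",
    "waterproof boots and shoes for long hiking trips in the alps",
]

for term in ("boots", "hiking", "waterproof"):
    print(term, "->", classify(term, own_text, pool_texts))
```

In this toy example, "boots" lands above the range, "hiking" within it, and "waterproof" below it, which mirrors the three situations described above.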

Because the method relates not only the main keyword but also all other terms on one's own web page to the comparison documents, it is a more complex form of keyword analysis. The goal of contemporary SEO is to place as many relevant terms as possible in one's own text, each in the right quantity.
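One way to look for relevant terms that are still missing from one's own text is sketched below. It rests on an assumption that goes beyond the article: a term counts as "relevant" here if at least a given share of the comparison pages use it. The threshold `min_share`, the function names and the example texts are hypothetical.

```python
import re


def vocabulary(text):
    """Return the set of distinct lowercase word tokens in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def missing_terms(own_text, pool_texts, min_share=0.5):
    """Terms used by at least `min_share` of the pool pages but absent from the own page."""
    own_vocab = vocabulary(own_text)
    pool_vocabs = [vocabulary(text) for text in pool_texts]
    candidates = set().union(*pool_vocabs) - own_vocab
    return sorted(
        term
        for term in candidates
        if sum(term in vocab for vocab in pool_vocabs) / len(pool_vocabs) >= min_share
    )


# Hypothetical example texts, not real pages.
own_text = "hiking boots for summer"
pool_texts = [
    "waterproof hiking boots",
    "waterproof boots with sole",
    "hiking shoes with sole",
]

print(missing_terms(own_text, pool_texts, min_share=0.6))
```

The result also contains filler words such as "with", so a practical analysis would additionally filter out stop words before drawing conclusions.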

Attention: The comparison pool in a TF*IDF analysis is not static. Competitors also optimize their pages for search engines, and some of them will use TF*IDF analysis themselves. This changes the list of top pages that serve as the comparison pool for the IDF values. It is therefore advisable to re-analyze existing pages regularly.

What is the benefit of a TF*IDF analysis for SEO?

Using this analysis method for search engine optimization has become a must for many website operators. The main advantages of TF*IDF analysis include:

  • The danger of keyword stuffing caused by mentioning the keyword too frequently on a web page can be largely avoided.
  • In addition to the keyword to be optimized, topic-relevant terms can be integrated into the text, which increases the added value.
  • The analysis method provides a first, practicable point of orientation as to what good texts on one's own website might look like.
  • By including the direct competition, a well thought-out optimization of one's own content can be carried out over months and years.

What tools are available for TF*IDF analysis?


As a specialist in on-page optimization, SEObility offers a wide range of tools for webmasters, including TF*IDF analysis. In addition to the commercial version, SEObility provides a simplified, free tool for beginners online.


Ryte is a well-known developer that provides various analysis tools for the user experience. Its service packages also include a free version with which basic TF*IDF analyses can be performed.


Sistrix is probably the best-known SEO tool in Germany, and the Sistrix Toolbox is one of the best-known collections of SEO tools for webmasters. It also includes a TF*IDF tool that, alongside other factors, can help improve the ranking of individual web pages.


SEOlyze, with its on-page optimization offering, is more of a specialist when it comes to TF*IDF analyses. The company offers various analysis concepts with different focuses, supplemented by further research tools for keywords and more.


Xovi is a comprehensive SEO software suite with a wide range of analysis options. It also includes keyword analysis based on TF*IDF and, with its wide range of additional tools, is particularly suited to professional agencies.

Are there any disadvantages of TF*IDF?

Unfortunately, there is no royal road in search engine optimization that always leads to success. TF*IDF analysis also has weak points and disadvantages, which you should understand when optimizing your website:

  • The analysis always provides insights about a complete web page, never about individual passages of the page.
  • The analysis method is not really suitable for short texts, which becomes a problem especially for product descriptions in online stores.
  • Depending on the type and number of comparison documents, the analysis sometimes yields significantly different results.
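The dependence on the comparison documents can be illustrated with a small Python sketch. It assumes the IDF variant log(1 + N / n), which is only one of several formulas in use, and the two pools are invented sets of words rather than real pages.

```python
import math


def idf(term, pool):
    """log(1 + N / n) for a pool of token sets; 0.0 if no document contains the term."""
    containing = sum(1 for doc in pool if term in doc)
    return math.log(1 + len(pool) / containing) if containing else 0.0


# Two hypothetical comparison pools for the same keyword.
pool_a = [{"hiking", "boots", "waterproof"}, {"hiking", "boots"}, {"boots", "sole"}]
pool_b = [{"hiking", "shoes"}, {"trail", "shoes"}, {"running", "shoes"}, {"boots"}]

for term in ("boots", "hiking"):
    print(term, round(idf(term, pool_a), 3), round(idf(term, pool_b), 3))
```

The same term receives a clearly different weight depending on which pages end up in the pool, so results from different tools or different points in time are only comparable to a limited extent.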

My conclusion and relevance in 2022

Despite the disadvantages mentioned, TF*IDF analysis is recommended to all website operators as an SEO tool. It provides much more than the identification of individual relevant keywords and, thanks to the powerful tools of the providers mentioned, is easy to use even for laypeople.
