Tuesday , December 24 2024

What Is Latent Semantic Indexing (LSI)?

It would be easier to state what Latent Semantic Indexing is not, than try to explain what it is. Even a statistical mathematician would find it extremely difficult to explain the concept of LSI, sometimes referred to as latent semantic analysis, to a layman in just a few words!

LSI is not what most SEO experts claim it to be. It is certainly not a concept that can be used by the average web designer or webmaster to improve their search engine listings, and it is not what many people, myself included, have written it to be. However, first some background.

The term semantics is applied to the science and study of meaning in language, and the meaning of characters, character strings and words. This covers not just the language and words themselves, but the true meaning being conveyed in the context in which they are being used.

In 2002 a company called Applied Semantics, an innovator in the use of semantics in text processing, launched a program known as AdSense, which was a form of contextual advertising whereby adverts were placed on website pages which contained text that was relevant to the subject of the adverts.

The matching up of text and adverts was carried out by software in the form of mathematical formulae known as algorithms. It was claimed that these formulae used semantics to analyze the meaning of the text within the web page. In fact, what it initially seemed to do was to match keywords within the page with keywords used in the adverts, though some further interpretation of meaning was evident in the way that some relevant adverts were correctly placed without containing the same keyword character string as used on the web page.
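The plain keyword matching described above can be illustrated with a minimal sketch. This is not the algorithm that Applied Semantics or Google actually used; the function names, the sample adverts and the scoring rule (count of shared words) are all assumptions made purely for illustration.

```python
import re


def tokens(text):
    """Lower-case the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_advert(page_text, adverts):
    """Return the name of the advert sharing the most words with the page.

    `adverts` maps a hypothetical advert name to its list of keywords.
    Returns None when no advert shares a single word with the page.
    """
    page_words = tokens(page_text)
    best_name, best_score = None, 0
    for name, keywords in adverts.items():
        score = len(page_words & tokens(" ".join(keywords)))
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical adverts and pages, invented for this example.
adverts = {
    "outdoor_gear": ["hiking", "boots", "jackets"],
    "car_insurance": ["car", "insurance", "quotes"],
}
literal_page = "Our guide to hiking boots and waterproof jackets for mountain trails"
synonym_page = "Sturdy footwear for long walks in the hills"

print(best_advert(literal_page, adverts))
print(best_advert(synonym_page, adverts))
```

The second page is clearly about the same subject, yet it shares no character string with the outdoor advert, so pure keyword matching finds nothing. Placing a relevant advert on such a page requires some interpretation of meaning beyond the literal keywords.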

Google launched its own contextual advertising system in March 2003, and acquired Applied Semantics just over a month later. AdSense as we know it was launched and webmasters could make considerable sums of money by attracting visitors to web pages specifically designed for the purpose. Every click on an advert earned cash from Google for the owner of the website displaying it.

It became commonplace for websites to comprise hundreds, and even thousands, of software-generated pages containing repetitions of keywords and long-tailed key phrases, but little else. Thousands of pages could be generated, the only difference between them being the keyword or phrase used, with no content whatsoever for the visitor. Such software is still being sold on the internet in spite of all the attention given to the so-called LSI algorithm.

Google searched each webpage that was registered for the AdSense system and determined the theme of the page by means of semantic analysis. At this time there was no differentiation made in the analysis between sites using only the same keyword repeatedly and those with genuine content relevant to the theme. Adverts related to this theme were then added to the page by Google.

These pages were ranked highly due to their high keyword density, and there were so many generated that only a small proportion needed to become visible in the listings for their owners to make money from the adverts that Google placed on them. These sites could generate several thousands of dollars for their owners every single day without contributing any worth to the internet at all.
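Keyword density is simply the share of a page's words taken up by the keyword. The sketch below shows one common way to compute it; the function name and the sample texts are assumptions for illustration, and search engines have never published the exact measure they use.

```python
import re


def keyword_density(text, keyword):
    """Return the fraction of words in `text` equal to `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# Invented sample texts: a keyword-stuffed page and a page with real content.
stuffed = "cheap flights cheap flights book cheap flights cheap flights now"
genuine = "Compare fares from several airlines and book cheap flights in advance"

print(f"stuffed: {keyword_density(stuffed, 'flights'):.0%}")
print(f"genuine: {keyword_density(genuine, 'flights'):.0%}")
```

A ranking signal based on this number alone rewards the stuffed page, which is exactly the weakness the software-generated sites exploited.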

In order to control this ‘spamming’ of its search indices with worthless websites, Google decided to add what it termed LSI, or latent semantic indexing, to its indexing algorithm, very similar to what it was using to determine the theme of AdSense pages. What this claims to do is to analyze the semantic content of websites and determine the true value of the site to any visitor using a specific search term.

This value was analyzed by looking for words and phrases semantically similar to the keywords used, rather than only the keywords themselves. In this way, pages containing keywords with little other contextually similar content were rooted out and the pages either de-listed or demoted down Google’s search index for these keywords.

LSI is now regarded as being a major means of optimizing webpages to conform to the requirements of the Google algorithms. Minimal use of keywords, and more use of synonyms and phrases relevant to the contextual meaning of the keyword relating to the page, became the way to use LSI to achieve higher listings. Or so the SEO experts informed us. In fact, the concept of latent semantic indexing has been known in statistical analysis for decades, and is not something that can be ‘used’ as such on a website.

There are many SEO websites suggesting that they can provide a service to make our website LSI friendly, or meet ‘LSI requirements’. One way of doing this, it is suggested, is to stuff the page full of synonyms and other related terms. I have written articles myself about how this can be done, and tried to suggest the correct way to use LSI. Although my suggested ‘use’ of LSI was erroneous in scientific terms, the ideas introduced are nevertheless good practice and will help you to produce webpages containing genuine content.

Having taken the time to do some research into what latent semantic indexing, or analysis, really means, I now know that webmasters cannot use LSI as such; to suggest otherwise is blatant nonsense. To date, I have not seen any explanation from SEO experts as to what latent semantic indexing truly is. I have read several LSI papers and reviews, written by mathematical statisticians, that attempt to explain the subject to the layman. This was achieved with extreme difficulty, and I doubt that anyone who is not an expert on semantics fully understands what the term means.

It appears to be commonly used in SEO as a general definition for the way that the mathematical detection of synonyms, and how certain words are related to others in a piece of text, is applied to the indexing of webpages by search engines. It has little to do with ‘latency’, more to do with the actual usage of semantics within a text.
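For readers who want a feel for the statistical technique itself, here is a minimal sketch of latent semantic analysis as it is usually described in the textbooks: build a term-by-document count matrix, take its singular value decomposition, and keep only the strongest few dimensions. It assumes NumPy is installed. The four tiny documents, the choice of two dimensions and the helper names are all invented for illustration, and nothing here is claimed to reflect what any search engine runs.

```python
import numpy as np

# Toy corpus, invented for this example. Note that "car" and "automobile"
# never appear in the same document.
docs = [
    "car engine wheel",
    "automobile engine wheel",
    "banana fruit smoothie",
    "apple fruit smoothie",
]

vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}

# Term-by-document matrix: A[i, j] = count of term i in document j.
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        A[index[word], j] += 1.0

# Singular value decomposition, truncated to the k strongest dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2  # assumed number of latent dimensions for this toy corpus
term_vectors = U[:, :k] * s[:k]


def cosine(u, v):
    """Cosine similarity of two vectors, or 0.0 if either has zero length."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


def raw_similarity(a, b):
    """Similarity of two terms using the original document counts."""
    return cosine(A[index[a]], A[index[b]])


def lsi_similarity(a, b):
    """Similarity of two terms in the reduced k-dimensional space."""
    return cosine(term_vectors[index[a]], term_vectors[index[b]])


print("raw car/automobile:", round(raw_similarity("car", "automobile"), 3))
print("LSI car/automobile:", round(lsi_similarity("car", "automobile"), 3))
print("LSI car/banana:    ", round(lsi_similarity("car", "banana"), 3))
```

On the raw counts, "car" and "automobile" look unrelated because they never share a document. In the reduced space they end up close together, because both keep company with "engine" and "wheel", while "car" and "banana" stay apart. The relationship is computed from a whole collection of documents, which is why it is not something a webmaster can switch on for a single page.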

Too many people, me included, have professed to understand its use by Google and other search engines, without fully understanding what the term itself means. While it may be necessary for an SEO expert to be able to explain to clients what the concept of LSI means to them, it is difficult to see to what practical use it can be put.

It is far better for people to forget about trying to manipulate the use of language, and to concentrate on writing honest and relevant content, while spending more time on building an intelligent and useful marketing campaign. One of the better ways to achieve this is to use article directories to promote their website through the publication of well-written and relevant articles.
