Inverse Document Frequency is a measure used to help determine the position of a term in a vector space model.
IDF = log ( total documents in database / documents containing the term )
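As a sketch, the formula above can be computed directly (the corpus numbers here are invented for illustration):

```python
import math

def idf(total_documents, documents_containing_term):
    # IDF = log(total documents in database / documents containing the term)
    return math.log(total_documents / documents_containing_term)

# A term found in 100 of 1,000,000 documents is rare, so it carries more
# weight in the vector space model than a term found in half the collection.
rare = idf(1_000_000, 100)
common = idf(1_000_000, 500_000)
```

Rare terms score higher, which is why they do more to position a document in the vector space model.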
Inbound Link
Link pointing to one website from another website.
Most search engines allow you to see a sample of links pointing to a document by searching using the link: function. For example, using link:www.seobook.com would show pages linking to the homepage of this site (both internal links and inbound links). Due to canonical URL issues www.site.com and site.com may show different linkage data. Google typically shows a much smaller sample of linkage data than competing engines do, but Google still knows of and counts many of the links that do not show up when you use their link: function.
Index
Collection of data used as a bank to search through to find a match to a user-fed query. The larger search engines have billions of documents in their catalogs.
Search engines search via reverse indexes by words, returning results based on matching relevancy vectors. Stemming and semantic analysis allow search engines to return near matches. Index may also refer to the root of a folder on a web server.
Internal Link
Link from one page on a site to another page on the same site.
It is preferable to use descriptive internal linking to make it easy for search engines to understand what your website is about. Use consistent navigational anchor text for each section of your site, emphasizing other pages within that section. Place links to relevant related pages within the content area of your site to help further show the relationship between pages and improve the usability of your website.
Information Architecture
Designing, categorizing, organizing, and structuring content in a useful and meaningful way.
Good information architecture considers both how humans and search spiders access a website. Information architecture suggestions:

  • focus each page on a specific topic
  • use descriptive page titles and meta descriptions which describe the content of the page
  • use clean (few or no variables) descriptive file names and folder names
  • use headings to help break up text and semantically structure a document
  • use breadcrumb navigation to show page relationships
  • use descriptive link anchor text
  • link to related information from within the content area of your web pages
  • improve conversion rates by making it easy for people to take desired actions
  • avoid feeding search engines duplicate or near-duplicate content

Information Retrieval
The field of science based on sorting or searching through large data sets to find relevant information.
Inktomi
Search engine which pioneered the paid inclusion business model. Inktomi was bought by Yahoo! at the end of 2002.
Internal Navigation (see Navigation)
Internet
Vast worldwide network of computers connected via TCP/IP.
Internet Explorer
Microsoft's web browser. After they beat out Netscape's browser on the marketshare front they failed to innovate on any level for about 5 years, until Firefox forced them to.
Inverted File (see Reverse Index)
Invisible Web
Portions of the web which are not easily accessible to crawlers due to search technology limitations, copyright issues, or information architecture issues.
IP Address
Internet Protocol Address. Every computer connected to the internet has an IP address. Some websites and servers have unique IP addresses, but most web hosts host multiple websites on a single IP address.
Many SEOs refer to unique C class IP addresses. Every site is hosted on a numerical address like aa.bb.cc.dd. In some cases many sites are hosted on the same IP address. Many SEOs believe that if links come from different IP ranges, with a different number somewhere in the aa.bb.cc portion, then each link may count more than links from the same local range and host.
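The aa.bb.cc grouping described above can be sketched in a few lines (the addresses below are made-up examples, not real linking sites):

```python
def c_class(ip_address):
    """Return the aa.bb.cc portion of a dotted-quad IP address."""
    return ".".join(ip_address.split(".")[:3])

# Three inbound links, two of which come from the same C class range,
# so only two distinct C classes remain.
linking_ips = ["208.97.177.1", "208.97.177.9", "64.233.160.5"]
unique_c_classes = {c_class(ip) for ip in linking_ips}
```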
IP delivery (see cloaking)
ISP
Internet Service Providers sell end users access to the web. Some of these companies also sell usage data to web analytics companies.
Italics (see emphasis)


JavaScript
A client-side scripting language that can be embedded into HTML documents to add dynamic features.
Search engines do not index most content in JavaScript. In AJAX, JavaScript has been combined with other technologies to make web pages even more interactive.


Keyword
A word or phrase which implies a certain mindset or demand that targeted prospects are likely to search for.
Long tail and brand related keywords are typically worth more than shorter and vague keywords because they typically occur later in the buying cycle and are associated with a greater level of implied intent.
Keyword Density
An old measure of search engine relevancy based on how frequently keywords appeared within the content of a page. Keyword density is no longer a valid measure of relevancy over a broad open search index.
When people use keyword stuffed copy it tends to read mechanically (and thus does not convert well and is not link worthy), plus some pages that are crafted with just the core keyword in mind often lack semantically related words and modifiers from the related vocabulary (and that causes the pages to rank poorly as well).
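As a rough sketch of how the old measure worked, keyword density is simply occurrences of the keyword divided by total words; this simplified single-word version uses invented sample copy:

```python
def keyword_density(text, keyword):
    """Share of the page's words taken up by a single-word keyword."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w == keyword.lower())
    return hits / len(words)

# "seo" appears 3 times in 8 words of this (deliberately stuffed) copy.
copy = "seo book is an seo book about seo"
density = keyword_density(copy, "seo")
```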
Keyword Funnel
The relationship between various related keywords that searchers search for. Some searches are particularly well aligned with others due to spelling errors, poor search relevancy, and automated or manual query refinement.
Keyword Not Provided
When Google shifted to using secured search they continued to pass referrers, but they stripped the keyword information from most organic Google referrals, making it much harder for SEOs to close the loop and figure out which keywords drove conversions via organic search. While Google stripped the organic keyword information overnight, they kept passing the equivalent keyword data to AdWords advertisers for years.
Keyword Research
The process of discovering relevant keywords and keyword phrases to focus your SEO and PPC marketing campaigns on.
Example keyword discovery methods:

  • using keyword research tools
  • looking at analytics data or your server logs
  • looking at page copy on competing sites
  • reading customer feedback
  • placing a search box on your site and seeing what people are looking for
  • talking to customers to ask how and why they found and chose your business

Keyword Research Tools
Tools which help you discover potential keywords based on past search volumes, search trends, bid prices, and page content from related websites.
Short list of the most popular keyword research tools:

  • SEO Book Keyword Research Tool – free keyword tool cross references all of my favorite keyword research tools. In addition to linking to traditional keyword research tools, it also links to tools such as Google Suggest, Buzz related tools, vertical databases, social bookmarking and tagging sites, and latent semantic indexing related tools.
  • Bing Ad Intelligence – Excel plugin offering keyword data from Microsoft. Requires an active Bing Ads advertiser account.
  • Google AdWords Keyword Planner – powered from Google search data, but requires an active AdWords advertiser account.
  • Wordtracker – paid, powered from Dogpile and MetaCrawler. Due to small sample size their keyword database may be easy to spam.
  • Overture Term Suggestion – free, powered from Yahoo! search data. Heavily biased toward over representing commercial queries, combines singular and plural versions of a keyword into a single data point. This tool was originally located at http://inventory.overture.com/d/searchinventory/suggestion/ but was taken offline years ago.

Please note that most keyword research tools used alone are going to be highly inaccurate at giving exact quantitative search volumes. The tools are better for qualitative measurements. To test the exact volume for a keyword it may make sense to set up a test Google AdWords campaign.
Keyword Stuffing
Writing copy that uses excessive amounts of the core keyword.
Keyword Suggestion Tools (see Keyword Research Tools)
Kleinberg, Jon
Scientist largely responsible for much of the research that went into hubs and authorities based search relevancy algorithms.
Knowledge Graph
Search result enhancements where Google scrapes third party information & displays it in an extended format in the search results.
The goal of the knowledge graph is largely three-fold:

  • Answer user questions quickly without requiring them to leave the search results, particularly for easy to answer questions about known entities.
  • Displace the organic search result set by moving it further down the page, which in turn may cause a lift in the clickthrough rates on the ads shown above the knowledge graph.
  • Some knowledge graph listings (like hotel search, book search, song search, lyric search) also include links to other Google properties or other forms of ads within them, further boosting the monetization of the search result page.


Landing Page
The page on which a visitor arrives after clicking on a link or advertisement.
Landing Page Quality Scores
A measure used by Google to help filter noisy ads out of their AdWords program.
When Google AdWords launched, affiliates and arbitrage players made up a large portion of their ad market, but as more mainstream companies have spent on search marketing, Google has taken many measures to try to keep their ads relevant.
Link
A citation from one web document to another web document or another position in the same document.
Most major search engines consider links as a vote of trust.
Link Baiting
The art of targeting, creating, and formatting information that provokes the target audience to point high quality links at your site. Many link baiting techniques are targeted at social media and bloggers.
Link Building
The process of building high quality linkage data that search engines will evaluate to trust your website is authoritative, relevant, and trustworthy.
Link Bursts
A rapid increase in the quantity of links pointing at a website.
When links occur naturally they generally develop over time. In some cases it may make sense that popular viral articles receive many links quickly, but in those cases there are typically other signs of quality as well, such as:

  • increased usage data
  • increase in brand related search queries
  • traffic from the link sources to the site being linked at
  • many of the new links coming from new pages on trusted domains

Link Churn
The rate at which a site loses links.
Link Disavow (see Disavow)
Link Equity
A measure of how strong a site is based on its inbound link popularity and the authority of the sites providing those links.
Link Farm
Website or group of websites which exercises little to no editorial control when linking to other sites. FFA pages, for example, are link farms.
Log Files
Server files which show you what your leading sources of traffic are and what people are searching for to find your website.
Log files do not typically show as much data as analytics programs would, and if they do, it is generally not in a format that is as useful beyond seeing the top few stats.
Link Hoarding
A method of trying to keep all your link popularity by not linking out to other sites, or linking out using JavaScript or through cheesy redirects.
Generally link hoarding is a bad idea for the following reasons:

  • many authority sites were at one point hub sites that freely linked out to other relevant resources
  • if you are unwilling to link out to other sites people are going to be less likely to link to your site
  • outbound links to relevant resources may improve your credibility and boost your overall relevancy scores

“Of course, folks never know when we’re going to adjust our scoring. It’s pretty easy to spot domains that are hoarding PageRank; that can be just another factor in scoring. If you work really hard to boost your authority-like score while trying to minimize your hub-like score, that sets your site apart from most domains. Just something to bear in mind.” – Quote from Google’s Matt Cutts
Link Popularity
The number of links pointing at a website.
For competitive search queries link quality counts much more than link quantity. Google typically shows a smaller sample of known linkage data than the other engines do, even though Google still counts many of the links they do not show when you do a link: search.
Link Reputation
The combination of your link equity and anchor text.
Link Rot
A measure of how many and what percent of a website’s links are broken.
Links may break for a number of reasons; four of the most common are:

  • a website going offline
  • linking to content which is temporary in nature (due to licensing structures or other reasons)
  • moving a page’s location
  • changing a domain’s content management system

Most large websites have some broken links, but if too many of a site's links are broken it may indicate outdated content and provide website users with a poor experience, both of which may cause search engines to rank a page as less relevant.
Link Velocity
The rate at which a page or website accumulates new inbound links.
Pages or sites which receive a huge spike of new inbound links in a short duration may hit automated filters and/or be flagged for manual editorial review by search engineers.
Live.com
Microsoft portal which was used as their search brand after MSN Search and before rebranding as Bing.
Long Tail
Phrase describing how for any category of product being sold there is much more aggregate demand for the non-hits than there is for the hits.
How does the long tail apply to keywords? Long tail keywords are more precise and specific, and thus have a higher value. As of writing this definition in the middle of October 2006 my leading keywords for this month are as follows:

#reqs search term
1504 seo book
512 seobook
501 seo
214 google auctions
116 link bait
95 aaron wall
94 gmail uk
89 search engine optimization
86 trustrank
78 adsense tracker
73 latent semantic indexing
71 seo books
69 john t reed
67 dear sir
67 book.com
64 link harvester
64 google adwords coupon
58 seobook.com
55 adwords coupon
15056 [not listed: 9,584 search terms]

Notice how the nearly 10,000 unlisted terms account for roughly 10 times as much traffic as I got from my core brand related term (and this site only has a couple thousand pages and has a rather strong brand).
LookSmart
Company originally launched as a directory service which later morphed into a paid search provider and vertical content play.
LSI
Latent Semantic Indexing is a way for search systems to mathematically understand and represent language based on the similarity of pages and keyword co-occurrence. A relevant result may not even contain the search term; it may be returned based solely on the fact that it contains many words similar to those appearing in relevant pages which do contain the search words.
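A minimal sketch of the idea using a toy term-document matrix and a truncated SVD. The vocabulary and counts are invented, and real systems work at vastly larger scale; the point is that documents 0 and 2 share no words, yet land near each other in concept space because "car" and "auto" co-occur in document 1:

```python
import numpy as np

# Toy term-document matrix (rows = terms, columns = documents).
terms = ["car", "auto", "flower"]
A = np.array([
    [1, 1, 0, 0, 0],  # car
    [0, 1, 1, 0, 0],  # auto
    [0, 0, 0, 1, 1],  # flower
], dtype=float)

# Truncated SVD keeps only the k strongest "concept" dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
docs = (np.diag(s[:k]) @ Vt[:k]).T  # one row per document in concept space

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Raw term vectors for docs 0 and 2 share no words (similarity 0), yet
# their concept-space similarity is near 1.
raw_sim = cosine(A[:, 0], A[:, 2])
concept_sim = cosine(docs[0], docs[2])
```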
Malda, Rob
Founder of Slashdot.org, a popular editorially driven technology news forum.
Manual Penalty
Website penalties which are applied to sites after a Google engineer determines they have violated the Google Webmaster Guidelines. Recoveries from manual penalties may time out years later, or a person can request a review in Google Webmaster Tools after fixing what they believe to be the problem.
Sites which had a manual penalty would typically have a warning show in Google Webmaster Tools, whereas sites which have an automated penalty like Panda or Penguin would not.
Manual Review
All major search engines combine a manual review process with their automated relevancy algorithms to help catch search spam and train their relevancy algorithms. Abnormal usage data or link growth patterns may also flag sites for manual review.
Mechanical Turk
Amazon.com program which allows you to hire humans to perform easy tasks that computers are bad at.
Meme
In The Selfish Gene Richard Dawkins defines a meme as "a unit of cultural transmission, or a unit of imitation." Many people use the word meme to refer to self spreading or viral ideas.
Meta Description
The meta description tag is typically a sentence or two of content which describes the content of the page.
A good meta description tag should:

  • be relevant and unique to the page;
  • reinforce the page title; and
  • focus on including offers and secondary keywords and phrases to help add context to the page title.

Relevant meta description tags may appear in search results as part of the page description below the page title.
The code for a meta description tag looks like this:
<meta name="Description" content="Your meta description here." />
Meta Keywords
The meta keywords tag is a tag which can be used to highlight keywords and keyword phrases which the page is targeting.
The code for a meta keywords tag looks like this:
<meta name="Keywords" content="keyword phrase, another keyword, yep another, maybe one more">
Many people spammed meta keyword tags and searchers typically never see the tag, so most search engines do not place much (if any) weight on it. Many SEO professionals no longer use meta keywords tags.
Meta Refresh
A meta tag used to make a browser refresh to another URL location.
A meta refresh looks like this:
<meta http-equiv="refresh" content="10;url=http://www.site.com/folder/page.htm">
In most cases it is preferred to use a 301 or 302 redirect rather than a meta refresh.
Meta Search
A search engine which pulls top ranked results from multiple other search engines and rearranges them into a new result set.
Meta Tags
People generally refer to meta descriptions and meta keywords as meta tags. Some people also group the page title in with these.

Microsoft
Maker of the popular Windows operating system and Internet Explorer browser, owner of the search engine Bing.
Mindshare
A measure of the number of people who think of you or your product when thinking of products in your category.
Sites with strong mindshare, top rankings, or a strong memorable brand are far more likely to be linked at than sites which are less memorable and have less search exposure. The link quality of mindshare related links most likely exceeds the quality of the average link on the web. If you sell non-commodities, personal recommendations also typically carry far greater weight than search rankings alone.
Mirror Site
Site which mirrors (or duplicates) the contents of another website.
Generally search engines prefer not to index duplicate content. The one exception to this is that if you are a hosting company it might make sense to offer free hosting or a free mirror site to a popular open source software site to build significant link equity.
Movable Type
Commercially sold blogging software which allows you to host a blog on your own website.
Movable Type is typically much harder to install than WordPress is.
MSN Search
Search engine built by Microsoft. MSN is the default search provider in Internet Explorer.
Multi Dimensional Scaling
The process of taking snapshots of documents in a database to discover topical clusters through the use of latent semantic indexing. Multi dimensional scaling is more efficient than singular value decomposition since only a rough approximation of relevance is necessary when combined with other ranking criteria.
MySpace
One of the most popular social networking sites, largely revolving around connecting musicians to fans and having an easy to use blogging platform.
Natural Language Processing
Algorithms which attempt to understand the true intent of a search query rather than just matching results to keywords.
Natural Link (see Editorial Link)
Natural Search (see Organic Search Results)
Navigation
Scheme to help website users understand where they are, where they have been, and how that relates to the rest of your website.
It is best to use regular HTML navigation rather than coding your navigation in JavaScript, Flash, or some other type of navigation which search engines may not be able to easily index.
Negative SEO
Attempting to adversely influence the rank of a third-party site.
Over time Google shifts many link building strategies from being considered white hat to gray hat to black hat. A competitor (or a person engaging in reputation management) can point a bunch of low-quality links with aggressive anchor text at a page in order to try to get the page filtered from the search results. If these new links cause a manual penalty, then the webmaster who gets penalized may not only have to disavow the new spam links, but they may have to try to remove or disavow links which were in place for 5 or 10 years already which later became “black hat” ex-post-facto. There are also strategies to engage in negative SEO without using links.
Netscape
Originally a company that created a popular web browser by the same name, Netscape is now a social news site similar to Digg.com.
Niche
A topic or subject which a website is focused on.
Search is a broad field, but as you drill down each niche consists of many smaller niches. An example of drilling down to a niche market:

  • search
  • search marketing, privacy considerations, legal issues, history of, future of, different types of vertical search, etc.
  • search engine optimization, search engine advertising
  • link building, keyword research, reputation monitoring and management, viral marketing, SEO copywriting, Google AdWords, information architecture, etc.

Generally it is easier to compete in small, new, or underdeveloped niches than trying to dominate large verticals. As your brand and authority grow you can go after bigger markets.
Nofollow
Attribute used to prevent a link from passing link authority. Commonly used on sites with user generated content, like in blog comments.
The code to use nofollow on a link looks like this:
<a href="http://www.seobook.com" rel="nofollow">anchor text</a>
Nofollow can also be used in a robots meta tag to prevent a search engine from counting any outbound links on a page. That code looks like this:
<meta name="robots" content="nofollow">
Google's Matt Cutts also pushes webmasters to use nofollow on any paid links, but since Google is the world's largest link broker, their advice on how other people should buy or sell links should be taken with a grain of salt. Please note that it is generally not advised to practice link hoarding as that may look quite unnatural. Outbound links may also boost your relevancy scores in some search engines.
Not Provided (see Keyword (Not Provided))


Ontology
In philosophy, ontology is the study of being. As it relates to search, it is the attempt to create an exhaustive and rigorous conceptual schema about a domain. An ontology is typically a hierarchical data structure containing all the relevant entities and their relationships and rules within that domain.
Open Directory Project, The (see DMOZ)
Open Source
Software which is distributed with its source code such that developers can modify it as they see fit.
On the web open source is a great strategy for quickly building immense exposure and mindshare.
Opera
A fast standards-based web browser.
Organic Search Results
Most major search engines have results that consist of paid ads and unpaid listings. The unpaid / algorithmic listings are called the organic search results. Organic search results are organized by relevancy, which is largely determined based on linkage data, page content, usage data, and historical domain and trust related data.
Most clicks on search results are on the organic search results. Some studies have shown that 60 to 80% + of clicks are on the organic search results.
Outbound Link
A link from one website pointing at another external website.
Some webmasters believe in link hoarding, but linking out to useful relevant related documents is an easy way to help search engines understand what your website is about. If you reference other resources it also helps you build credibility and leverage the work of others without having to do everything yourself. Some webmasters track where their traffic comes from, so if you link to related websites they may be more likely to link back to your site.
Overture
The company which pioneered search marketing by selling targeted searches on a pay per click basis. Originally named GoTo, they were eventually bought out by Yahoo! and branded as Yahoo! Search Marketing.
Overture Keyword Selector Tool
Popular keyword research tool, based largely on Yahoo! search statistics. Heavily skewed toward commercially oriented searches, also combines singular and plural versions of a keyword into a single version.
Page, Larry
Co-founder of Google.
PageRank
A logarithmic scale based on link equity which estimates the importance of web documents.
Since PageRank is widely bartered Google’s relevancy algorithms had to move away from relying on PageRank and place more emphasis on trusted links via algorithms such as TrustRank.
The PageRank formula is:
PR(A) = (1 - d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
PR(A) = PageRank of page A
d = dampening factor (~0.85)
C(T) = number of outbound links on page T
PR(T1)/C(T1) = PageRank of page T1 divided by the total number of links on page T1 (the PageRank it transfers)
In text: for any given page A, the PageRank PR(A) equals one minus the dampening factor, plus the dampening factor multiplied by the sum of the partial PageRank passed from each page pointing at A.
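The formula above can be iterated to a fixed point on a toy link graph. This is a sketch of the original non-normalized formulation quoted here, not Google's production algorithm; real systems handle dangling pages and use sparse matrices:

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over pages T linking to A.

    `links` maps each page to the list of pages it links to.
    """
    pages = list(links)
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(
                pr[t] / len(links[t]) for t in pages if page in links[t]
            )
            for page in pages
        }
    return pr

# Hypothetical three-page graph: A links to B and C, B to C, C back to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
# C collects the most inbound votes, so it ends up with the highest PageRank.
```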
Page Title (see Title)
Paid Inclusion
A method of allowing websites which pass editorial quality guidelines to buy relevant exposure.
Paid Link (see Text Link Ads)
Panda Algorithm
A Google algorithm which attempts to sort websites into buckets based on perceived quality. Signals referenced in the Panda patent include the link profile of the site & entity (or brand) related search queries.

Sites which have broad, shallow content (like eHow) along with ad heavy layouts, few repeat visits, and high bounce rates from search visitors are likely to get penalized. Highly trusted & well known brands (like Amazon.com) are likely to receive a ranking boost. In addition to the article databases & other content farms, many smaller ecommerce sites were torched by Panda.
Pay for Performance
Payment structure where affiliated sales workers are paid commission for getting consumers to perform certain actions.
Publishers publishing contextual ads are typically paid per ad click. Affiliate marketing programs  pay affiliates for conversions – leads, downloads, or sales.
Penalty
Search engines prevent some websites suspected of spamming from ranking highly in the results by banning or penalizing them. These penalties may be applied algorithmically or manually.
If a site is penalized algorithmically the site may start ranking again after a certain period of time after the reason for being penalized is fixed. If a site is penalized manually the penalty may last an exceptionally long time or require contacting the search engine with a reinclusion request to remedy.
Some sites are also filtered for various reasons.
Penguin Algorithm
Google algorithm which penalizes sites with unnatural link profiles.

When Google launched the Penguin algorithm they obfuscated the emphasis on links by also updating some on-page keyword stuffing classifiers at the same time. Initially they also failed to name the Penguin update & simply called it a spam update, only later naming it after plenty of blowback due to false positives. In many cases they run updates on top of one another or in close proximity to obfuscate which update caused an issue for a particular site. Sometimes Google would put out a wave of link warnings and manual penalties around the time Penguin updated & other times Penguin and Panda would run right on top of one another.
Sites which were hit by later versions of Penguin could typically recover on the next periodic Penguin update after disavowing low quality links inside Google Webmaster Tools, but sites which were hit by the first version of Penguin had a much harder time recovering.
Personalization
Altering the search results based on a person's location, search history, content they recently viewed, or other factors relevant to them on a personal level.
PHP
PHP Hypertext Preprocessor is an open source server side scripting language used to render web pages or add interactivity to them.
Pigeon Update
An algorithmic update to local search results on Google which tied in more signals which have been associated with regular web search.
Piracy Update
Search algorithm update Google performed which lowered the rankings of sites which had an excessive number of DMCA takedown requests. Google has exempted both Blogspot and YouTube from this “relevancy” signal.
Poison Word
Words which were traditionally associated with low quality content that caused search engines to want to demote the rankings of a page.
PDF
Portable Document Format is a universal file format developed by Adobe Systems that allows files to be stored and viewed in the original printer friendly context.
Pogo Rate
The percent of users who click on a site listed in the search results only to quickly click back to the search results and click on another listing.
A high POGO rate can be seen as a poor engagement metric, which in turn can flag a site to be ranked lower using an algorithm like Panda.
Portal
Web site offering common consumer services such as news, email, other content, and search.
PPC
Pay Per Click is a pricing model through which most search ads and many contextual ad programs are sold. PPC ads only charge advertisers if a potential customer clicks on an ad.
Precision
The ability of a search engine to list results that satisfy the query, usually measured as a percentage. (If 20 of the 50 returned results match the query, the precision is 40%.)
Search spam and the complexity of language challenge the precision of search engines.
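The worked example above (20 matching results out of 50 returned) can be sketched as:

```python
def precision(returned, relevant):
    """Fraction of returned results that are relevant to the query."""
    returned, relevant = set(returned), set(relevant)
    if not returned:
        return 0.0
    return len(returned & relevant) / len(returned)

# 50 returned results, of which the first 20 are relevant -> precision 0.40.
results = range(50)
matching = range(20)
p = precision(results, matching)
```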
Profit Elasticity
A measure of the profit potential of different economic conditions based on adjusting price, supply, or other variables to create a different profit potential where the supply and demand curves cross.
Proximity
A measure of how close words are to one another.
A page which has words near one another may be deemed to be more likely to satisfy a search query containing both terms. If keyword phrases are repeated an excessive number of times, and the proximity is close on all the occurrences of both words it may also be a sign of unnatural (and thus potentially low quality) content.
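One simple proximity measure is the minimum word distance between two terms in a document; this is a sketch with an invented sample sentence, not any engine's actual scoring function:

```python
def min_proximity(text, term_a, term_b):
    """Smallest distance in words between two terms, or None if either is absent."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)

# "cheap" and "flights" appear adjacent, so the minimum distance is 1.
doc = "cheap flights to paris and other cheap european flights"
dist = min_proximity(doc, "cheap", "flights")
```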


QDF
Query Deserves Freshness is an algorithmic signal based on things like a burst in search volume and a burst in news publication on a topic, which tells Google that a particular search query should rank recent / fresh results.
Newly published content may be seen as fresh content, but older content may also be seen as fresh if it has been recently updated, has a big spike in readership, and/or has a large spike in its link velocity (the rate of growth of inbound links).
Quality Content
Content which is linkworthy in nature.