Total of "indexed pages"


briarrosebb
Well-known member · Joined: Jun 21, 2008 · Messages: 65 · Reaction score: 0
All,
This post is SEO inside baseball, FWIW. I'm posting because this is subtle enough that I've had trouble remembering it, but important enough that I want to remember it... so I think it may be helpful to some here.
"Indexed pages" is simply the count of pages that a search engine has indexed from your website. It has some value in understanding how well your website is doing. The slacker's way of arriving at this is to use Google's site: command... however, this command is notoriously inaccurate.
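For anyone who hasn't tried it, you type the operator straight into the Google search box against your own domain (example.com here is just a placeholder):

    site:example.com     <- returns an approximate count of pages Google has indexed from that domain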
Then I ran into this post from a leading SEO authority:
http://www.seomoz.org/blog/indexation-for-seo-real-numbers-in-5-easy-steps
Basically, what he is saying is: open "Traffic Sources" in Google Analytics, narrow to "Search Engines", then narrow to the particular search engine, and then filter by "Landing Page"... changing it from the default "Keywords" filter. Each unique landing page that received at least one search visit counts as one indexed page.
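If you export that filtered report to CSV, the same count is easy to reproduce outside of Analytics. Here's a minimal sketch in Python/pandas; the file name and column headers ("analytics_export.csv", "Source", "Landing Page") are assumptions about the export layout, not anything from his post:

    # Approximate "indexed pages" from a Google Analytics CSV export.
    import pandas as pd

    df = pd.read_csv("analytics_export.csv")

    # Keep only the visits that arrived from Google search.
    google = df[df["Source"].str.contains("google", case=False, na=False)]

    # Each unique landing page is one page we know is in the index.
    print("Landing pages with at least one Google visit:",
          google["Landing Page"].nunique())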
I think he is incorrectly assuming that you must have traffic on all of your indexed pages... a bad assumption, I think. I also think his steps are insufficient. I add to them: export to Excel, then delete a lot of junk (a filtering sketch follows this list):
  • query strings will create multiple entries for a single page... delete
  • Rezovation's sloppy ASPX booking engine is going to create a lot of bogus booking-engine pages... delete
  • referrals from Google's cache... delete
  • we do some development in the production environment... so we'll get test pages indexed... delete
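Continuing the sketch above, that cleanup is easy to automate too. The URL patterns below are placeholders for illustration; substitute whatever junk actually shows up in your own export:

    pages = google["Landing Page"]

    # Query strings make one real page look like several: strip everything after "?".
    pages = pages.str.split("?").str[0]

    # Drop bogus booking-engine and test pages (patterns are placeholders).
    junk = pages.str.contains(r"/booking|/test", case=False, na=False)
    cleaned = pages[~junk].drop_duplicates()

    # Cache referrals would be filtered on the referrer column instead;
    # omitted here because export layouts vary.
    print("Cleaned indexed-page estimate:", len(cleaned))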
Anyway, the report for the last three months shows us with 84 pages, when the reality is that we have 90-plus pages. But it's still better than the site: command.
 
Rand Fishkin's method is interesting, and it gives you a different set of numbers to look at. It can be pretty useful if you are dealing with a site with thousands of pages. He is really introducing it as a method for web analysts to use as a number to communicate with "bosses" who want to know how their huge online store is doing. Consider it a landing-page count. I wouldn't put much stock in the method for small sites. It's a lot of hoops to jump through to come up with what amounts to a 5 or 6% variation in your particular site's count. Google's index can easily fluctuate by that much from day to day.
For the most part, the site: command in Google is accurate, especially when looking at the size of sites that are typical for a B&B (not to confuse the site: command with the link: command, which shows only a small sampling of reality). If the count changes from one day to the next, it is usually a sign that something has happened and the actual number of pages in the active index has changed. The change may not be for any particular reason, but it actually changed. For example, this site has nearly 6000 pages, but a few weeks back Google was showing only 400 indexed... then 40,000 indexed... then 500. Things have finally settled out to around 1600, which is about right for this site; Google doesn't bother with the old threads too much. That doesn't mean that when the count was showing only 400, the count wasn't accurate... it was accurate for that instant in time and that particular datacenter.
Large variations like that are usually a sign of some large index re-vamping on Google's part (a roll-back to an earlier index, or an adjustment to let a penalty filter get applied to all the sites in the index). Google "dances" used to play out over the course of a week or so... now they play out practically in real time, and sometimes the dance wobbles are quite visible as Google makes changes to its indexing and filtering methods (which are probably best thought of as separate from the ranking algorithm).
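For anyone who hasn't compared the two operators side by side (example.com is a placeholder):

    site:example.com     <- roughly everything Google has indexed from the domain
    link:example.com     <- only a small sample of the pages linking to the domain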
It is also important to note that the site: command is more accurate than the indexing count that appears in Google Webmaster Tools.
 
I'm with swirt on this one, but I like that there is another way to do it; I never like having only one game in town. For the small B&B sites like we generally have, the small percentage of error is inconsequential.
The Google dance is much less wiggly lately than it was back in Nov-Feb, thanks be to the powers that be, and I am very interested to see what changes in my neck of the woods with site load time now factoring into page ranking.
 
"I am very interested to see what changes in my neck of the woods with site load time now factoring into page ranking."
I don't think we'll see site load time play a role... nothing that can be pinned down to it, anyway. It may lower some local directories that depend on lots of database queries that slow things down significantly, and maybe some really bad sites that just have uncompressed graphics being resized at the browser level. Even for those uncompressed-graphics sites, though, I don't think the results will change much in most cases, because they probably weren't showing up all that competitively before. Also, any changes in the results pages have already happened, because site speed has already been figured in.
 
He does say:
Now, technically I'm being a bit cheeky here. This number doesn't tell you the full story - it's not showing the actual number of pages a search engine has crawled or indexed on your site, but it does tell you the unique number of URLs that received at least 1 visit from the engine.
It actually is a good idea, but I'd think a site that has only a dozen or so pages might not see a lot of benefit from this.
 