A standard SEO website analysis consists of checking everything on your site to make sure it is search engine friendly and optimized. The analyst will run tools on your site to check internal links, links coming from other sites (backlinks), keywords, keyword metrics, code quality, currently indexed pages, and a whole lot more, depending on how deep they want the report to go.
What I am going to run through are some tools and files you should have in place to ensure everything is working as intended. These include link checking, text-only browsing, and files that ensure your site is indexed properly.
Xenu’s Link Sleuth™ will check your site for broken links of any type.
Link checking is done on:
As the software runs, it displays a continuously updated list of the good and bad links found on every page of your site. You can sort each column by different criteria to pinpoint problems, and a report can be produced at any time.
This program will find broken URLs in your CSS and JavaScript files and will show what type of file each URL points to (text/html, image/gif, and more). Another nice feature for SEO is the ability to sort by the page title column. This allows you to identify duplicated titles, misspellings, un-optimized titles, and pages without titles.
If you have a site with a lot of external links to other sites, you should run this at least weekly, if not more often, to ensure the resources are still working.
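If you would rather script the title check than sort a column by eye, here is a rough Python sketch of the idea. It has nothing to do with how Xenu works internally: the names `TitleParser`, `extract_title`, and `audit_titles` are my own, and it assumes you already have each page's HTML in hand as a string.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title with whitespace collapsed, or '' if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split())


def audit_titles(pages):
    """pages maps URL -> HTML source (hypothetical input format).

    Returns (duplicates, missing): titles shared by more than one URL,
    compared case-insensitively, and URLs that have no title at all.
    """
    by_title = defaultdict(list)
    missing = []
    for url, html in pages.items():
        title = extract_title(html)
        if title:
            by_title[title.lower()].append(url)
        else:
            missing.append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    return duplicates, missing
```

Calling `audit_titles` with a dictionary of your pages gives you the duplicated titles and the untitled pages in one pass. Spotting misspellings and un-optimized titles still needs a human eye.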
Download from http://home.snafu.de/tilman/xenulink.html
Lynx is a web browser like Firefox, IE, or Chrome, except that Lynx only shows you text. It renders your web page and shows you roughly how it would look to a search engine's bot crawling your site, such as Googlebot, Bingbot, or Yahoo's Slurp. You want to know what your site looks like in the eyes of the search engines.
Lynx is a very old piece of software originally written for Unix systems. It is available for Windows in two forms: the easy one and the hard one. The hard one is the original program, and most people who are not used to installing software without a nice installer that does the work for them will not stand a chance. The easy one is just an extension for Firefox called Yellowpipe Lynx Viewer.
Remember that what is closest to the top is the most important part of the page. If you don't see any important text there, your page is not optimized. The reason alt attributes are so important on images is that they give the web bots a name to put with what is there. The image itself is not displayed, only the words inside the img tag's alt attribute. If your code was <img src="logo.gif" alt="Company's Name Logo"/>, the text displayed would be "Company's Name Logo".
The Lynx browser puts a number next to each link that is displayed as text. The link URLs are then listed at the very bottom of the page, in the order they were found on the page. That is the reason for the number next to the text.
My suggestion would be to use the simple version, the Firefox add-on. I do support using the original version and finding someone technical to set it up for you, and there are some precompiled versions available, but the Firefox add-on is more convenient for quick page checking.
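To make the idea concrete, here is a small Python sketch that imitates this kind of text-only view. It is not Lynx and does not reproduce its exact output. The class `TextOnlyView`, the function `render_text`, and the `[IMAGE]` placeholder for images without alt text are all my own assumptions for illustration.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Keeps visible text, swaps images for their alt text, numbers links."""

    SKIP = {"script", "style"}  # content a text-only view would not show

    def __init__(self):
        super().__init__()
        self.parts = []   # pieces of visible text, in page order
        self.links = []   # link URLs, in the order they were found
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
            self.parts.append(f"[{len(self.links)}]")
        elif tag == "img":
            # Assumption: fall back to a generic placeholder when alt is missing.
            self.parts.append(attrs.get("alt") or "[IMAGE]")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(" ".join(data.split()))


def render_text(html):
    """Return (body_text, references) for an HTML string."""
    view = TextOnlyView()
    view.feed(html)
    view.close()
    body = " ".join(view.parts)
    refs = [f"{i}. {url}" for i, url in enumerate(view.links, start=1)]
    return body, refs
```

Feeding your own page source to `render_text` is a quick way to see whether the important words really do come first, and whether any image is missing its alt text.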
http://download.cnet.com/Yellowpipe-Lynx-Viewer-Tool
http://en.wikipedia.org/wiki/Lynx_web_browser
A robots.txt file is a file specifically for search engine bots, crawlers, or spiders. It tells them, or rather suggests to them, what they may go and see. A well-behaved web bot that finds a link on your site will scan your robots.txt file before indexing the link, to make sure it is allowed to.
A de facto standard robots.txt file that allows every file on your site to be indexed would be:
User-agent: *
Disallow:
If, however, you have a folder of PDFs that you do not want indexed directly in a search engine, your robots.txt file would look like:
User-agent: *
Disallow: /pdf/
Or say you do not want web bots to crawl your site at all:
User-agent: *
Disallow: /
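If you want to double-check how a well-behaved bot would read rules like these, Python's standard library ships a robots.txt parser. The sketch below is only an illustration: the helper name `can_crawl`, the `Googlebot` user agent string, and the example.com URLs are my own choices, and a real search engine's parser may differ in its details.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


allow_all = "User-agent: *\nDisallow:"
block_pdfs = "User-agent: *\nDisallow: /pdf/"
block_all = "User-agent: *\nDisallow: /"

pdf_url = "http://www.example.com/pdf/report.pdf"
home_url = "http://www.example.com/index.html"

print(can_crawl(allow_all, pdf_url))    # empty Disallow blocks nothing
print(can_crawl(block_pdfs, pdf_url))   # the /pdf/ folder is off limits
print(can_crawl(block_pdfs, home_url))  # the rest of the site is still open
print(can_crawl(block_all, home_url))   # "/" blocks the whole site
```

Testing your rules this way before you upload the file is cheap insurance against accidentally blocking your whole site.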
Badly behaved web bots, like the ones that scan for email addresses or perform other blackhat acts, will not even consider obeying your robots.txt. They may even read it to find public, unprotected places that you do not want anyone to see. So do not try to hide anything with just a robots file.
A great place to make a robots.txt file: http://www.mcanerin.com/en/search-engine/robots-txt.asp
Your sitemap.xml file is like a robots.txt file in that it talks to web bots. This time, however, instead of telling them where they can go, the sitemap.xml file acts as the table of contents for your entire site.
The sitemap file also allows you to include additional info about each page, such as when it was last modified, how often it changes, and its priority relative to your other pages.
This allows search engines and other web bots to crawl the site more intelligently and faster.
An example of a sitemap file:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
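If you are curious what a generator does under the hood, here is a minimal Python sketch that builds a file like the example above. The function name `build_sitemap` and the list-of-dictionaries input are my own assumptions. Only the tag names and the namespace come from the sitemap format itself.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML (as UTF-8 bytes) from a list of page dictionaries.

    Each dictionary needs a "loc" key and may also carry "lastmod",
    "changefreq", and "priority" (hypothetical input format).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


xml_bytes = build_sitemap([
    {
        "loc": "http://www.example.com/",
        "lastmod": "2005-01-01",
        "changefreq": "monthly",
        "priority": 0.8,
    },
])
print(xml_bytes.decode("utf-8"))
```

The output comes out on one long line rather than nicely indented, which makes no difference to a bot reading it.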
If you would like the technical explanation of the sitemap.xml format, I suggest going to the resource below. The easiest way to make a sitemap.xml file is to use a generator; these are available all over the web.
The sitemap.xml file complements the robots.txt file: you can add a line to robots.txt that references the location of your sitemap.xml file. In the robots.txt you would add:
Sitemap: http://www.yoursite.com/sitemap.xml
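As a quick sanity check that the line is picked up, Python's standard robots.txt parser (version 3.8 or newer) can list the sitemaps a robots.txt file declares. The helper name `sitemap_urls` is my own, and this only shows how one parser reads the line, not how any particular search engine does.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_txt):
    """Return the sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


robots_txt = """User-agent: *
Disallow:
Sitemap: http://www.yoursite.com/sitemap.xml
"""

print(sitemap_urls(robots_txt))
```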
You will see some sitemaps with the extension .gz. That extension indicates a compressed version of the XML file. Sites with thousands of pages can have a very large sitemap file, and compressing it reduces bandwidth, among other things.
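To get a feel for how much compression helps, here is a small Python sketch using the standard gzip module. The 1,000 example.com page URLs are made up purely to produce a large file, and `compress_sitemap` is a hypothetical helper name.

```python
import gzip


def compress_sitemap(xml_bytes):
    """Return the gzip-compressed form of a sitemap, as found in a .gz file."""
    return gzip.compress(xml_bytes)


# A made-up sitemap with 1,000 very similar entries, to show the size difference.
entries = "".join(
    f"<url><loc>http://www.example.com/page-{n}.html</loc></url>"
    for n in range(1000)
)
sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    f"{entries}</urlset>"
).encode("utf-8")

packed = compress_sitemap(sitemap)
print(len(sitemap), "bytes before,", len(packed), "bytes after")
```

Sitemap entries are highly repetitive, which is exactly the kind of text that compresses well.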
http://www.sitemaps.org/
http://www.xml-sitemaps.com/
Google, Yahoo, and Bing all have portals that let you help them. You can submit your site, add the location of your sitemap.xml file, and see any errors or problems the search engine is having with your site.
Of the three, Google Webmaster Tools (GWT, as we refer to it) is the most full-featured. It lets you see a lot of great information, including:
That sums up the features of Google Webmaster Tools. I would suggest starting with Google, then going on to Yahoo and Bing. If you spend enough time in Google, you will run through Yahoo and Bing in minutes, because the features are about the same. Google has a lot more functionality, so the Yahoo and Bing portals are currently much more scaled-down versions.
Google Webmaster Tools
http://www.google.com/webmasters/tools/
Yahoo Site Explorer
https://siteexplorer.search.yahoo.com/
Bing Webmaster Tools
http://www.bing.com/webmaster
The main idea is to get in and let Google, Yahoo, and Bing know you exist. It is like opening the door to your store and throwing a program guide and menu at them.
This concludes my starter guide to SEO website analysis. These tools are designed to get you started and to show the three top search engines that you exist, if they do not know already. You will also find out about any problems they are having while trying to index your website, along with any suggestions they have. I suggest you follow this guide as it is written: