Saturday, February 16, 2008

SEO FAQ 3

What is the best way to redirect a site?

The best method is the so-called 301 redirect, since it also tells the search engines that the page in question has been moved permanently (a 302 means temporary and is risky for SEO purposes).

Here are the ways to do it:

Examples using .htaccess

Redirect 301 /oldfolder http://www.toanewdomain.com
Redirect 301 /oldurl.html http://www.yourdomain.com/newurl.html


If the pages themselves are PHP or ASP (for example on a Windows/IIS server), you can instead do the redirect in code by adding these lines at the top of the file that is being moved:

PHP

header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.newdomain.com/newdir/newpage.htm");
exit();


ASP 301

<%@ Language=VBScript %>
<%
Response.Status="301 Moved Permanently"
Response.AddHeader "Location", "http://www.newdomain.com/newdir/newpage.asp"
Response.End
%>


Redirecting with a meta refresh or JavaScript is not recommended and may even be harmful, as these methods have been used heavily by spammers and will most likely get your site penalised by the search engines.
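If you want to double-check what your server actually sends back, request the old URL and look at the status line: a correct setup answers with a 301, while a meta refresh gives a plain 200. Here is a minimal PHP sketch using cURL; the URL below is just a placeholder for your own old address:

<?php
// Minimal check of what status code the old URL returns
// (the URL is a placeholder -- put your own old address here).
$url = "http://www.yourdomain.com/oldurl.html";

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, we only need the headers
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the response instead of printing it
curl_setopt($ch, CURLOPT_HEADER, true);          // include the headers in the returned string
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // do not follow the redirect, we want the first answer

$headers = curl_exec($ch);
$status  = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

echo "Status code: " . $status . "\n"; // should print 301

// Show where the redirect points
if (preg_match('/^Location:\s*(.+)$/mi', $headers, $m)) {
    echo "Redirects to: " . trim($m[1]) . "\n";
}
?>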

What is TrustRank?

TrustRank, as defined in a whitepaper by Jan Pedersen (Yahoo!), Hector Garcia-Molina (Stanford), and Zoltan Gyongyi (Stanford), located at http://www.vldb.org/conf/2004/RS15P3.PDF, is a system of techniques for determining whether a page is reputable or spam. The system is not totally automated, as it does need some human intervention.

TrustRank is designed to help identify pages and sites that are likely to be spam and those that are likely to be reputable. The algorithm first selects a small seed set of pages to be manually evaluated by humans. To select the seed sites, it uses a form of inverse PageRank, choosing sites that link out to many other sites. From that set, many sites are then removed, such as DMOZ clones and sites not listed in major directories. The final set is culled down to sites with a strong authority (such as a governmental or educational institution or company) that controls the contents of the site. Once the seed set is determined, a human examines each seed page and rates it as either spam or reputable. The algorithm can then take this reviewed set of seed pages and rate other pages based on their connectivity with the trusted seed pages.

The authors of the TrustRank method assume that spam pages are built to fool search engines rather than to provide useful information. They also assume that trusted pages rarely point to spam pages, except when they are tricked into it (such as users posting spam URLs in a forum post).

The farther away a page is (via the link structure) from a trusted page, the less likely it is that the page is also trusted, with two or three steps away being roughly the maximum. In other words, trust is reduced as the algorithm moves further and further away from the good seed pages. Several formulas are used to determine the amount of trust dampening or splitting to be assigned to each new page. Using these formulas, some portion of a page's trust level is passed along to the pages it links to.

TrustRank can be used on its own to filter the index, or in combination with PageRank to determine search engine rankings.
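To make the dampening and splitting idea more concrete, here is a simplified PHP sketch of how trust could spread a few steps out from a hand-reviewed seed set. This is only an illustration of the principle, not the actual TrustRank implementation; the link graph, seed pages and damping factor are made up:

<?php
// Simplified illustration of trust propagation (not the real TrustRank code).
// Each page passes a dampened share of its trust to the pages it links to,
// split evenly among its outgoing links, for a few steps only.

// Made-up link graph: page => pages it links to
$links = array(
    'seed1.gov' => array('a.com', 'b.com'),
    'seed2.edu' => array('b.com', 'c.com'),
    'a.com'     => array('d.com'),
    'b.com'     => array('d.com', 'e.com'),
    'c.com'     => array(),
    'd.com'     => array(),
    'e.com'     => array(),
);

// Manually reviewed seed pages start with full trust, everything else with none
$trust = array_fill_keys(array_keys($links), 0.0);
$trust['seed1.gov'] = 1.0;
$trust['seed2.edu'] = 1.0;

$damping  = 0.85; // how much trust survives each hop
$maxSteps = 3;    // trust fades out after two or three steps

for ($step = 0; $step < $maxSteps; $step++) {
    $passed = array_fill_keys(array_keys($links), 0.0);
    foreach ($links as $page => $outlinks) {
        if ($trust[$page] <= 0 || count($outlinks) == 0) {
            continue;
        }
        // split this page's dampened trust evenly among its outlinks
        $share = ($damping * $trust[$page]) / count($outlinks);
        foreach ($outlinks as $target) {
            $passed[$target] += $share;
        }
    }
    // a page keeps the larger of its current trust and what it just received
    foreach ($passed as $page => $value) {
        $trust[$page] = max($trust[$page], $value);
    }
}

arsort($trust);
print_r($trust); // the seeds stay at 1.0, pages further out get less
?>

Running it, trust shrinks along each path as it is dampened and split among outlinks, which is the effect described above.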

What is the sandbox?

The sandbox is a kind of filter implemented by Google in March 2004 that applies to maybe 99% of all new sites.

Its function is to push new web sites down in the SERPs.

My theory is that it was Google's solution to stop new spam sites that were created to rank high in the SERPs. It probably took some time before the Google spiders could detect such a site and ban/penalise it, and by then the creator had probably already made several new ones.

When this phenomenon was first noticed in March 2004, it could take two months before a new site was "released" and could rank normally again. By now (October 2005), however, half a year or more is normal, and waits of more than a year have been reported.

From my own observations, I have seen that new sites can rank unusually high in the Google SERPs for a few weeks before the sandbox filter gets activated.

How do I get out of the sandbox faster?


One theory is that it has to do with link aging.

This means that as soon as you put your site live, you should start getting quality links to it and try to keep them there permanently.

The sandbox is a collective filter that is still surrounded by a lot of confusion and speculation. One of the biggest questions is how one can "escape" it, or at least make one's stay there shorter. The theories (and I stress that they are theories) break down into the following areas.

Link Speed
Some SEOs claim the sandbox actually has nothing to do with time and is more a function of linking. There is a double-edged sword here: to rank high you need lots of links, but to avoid the sandbox you can't get too many links too quickly. As a compromise, it is proposed that you build your links gradually, which basically means that if you get 10 the first week you would get 15 the next week, 20 the following, and so on. It is believed that this slow, steady and consistent increase in your number of incoming links causes Google to see the site as legitimate and emerging. So in the first month you may have accumulated 300 links, but by spreading them out over the month you have avoided the sandbox filter. Alternatively, you can buy your domain before you even start designing your site and slowly accumulate links; by the time you are completely up and ready, you will not be subject to the sandbox.
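If you want to see what such a gradual schedule adds up to, a few lines of arithmetic are enough. This is only an illustration; the starting figure and the weekly increase below are the made-up numbers from the example above, not a recommendation:

<?php
// Sketch of a slow, steady link-building schedule: 10 new links in week 1,
// then 5 more each following week (purely illustrative numbers).
$linksFirstWeek = 10;
$weeklyIncrease = 5;
$weeks          = 8;

$total = 0;
for ($week = 1; $week <= $weeks; $week++) {
    $thisWeek = $linksFirstWeek + ($week - 1) * $weeklyIncrease;
    $total   += $thisWeek;
    printf("Week %d: %d new links (%d in total)\n", $week, $thisWeek, $total);
}
?>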

Link Quality and Relevance
The majority of SEOs and webmasters tend to go after reciprocal and directory links when starting their first campaign. After all, they most likely have a PR0 site no one has ever heard of, so they go with the one technique where those factors matter little. What results is hundreds of unrelated and unimportant links. Google has been devaluing the importance of directories over the last year, and with hundreds of new ones appearing every day, this trend seems here to stay. Many theories hold that if webmasters get the majority of their links from "authority" sites, they will be viewed as important and therefore not be subject to the filters referred to as the sandbox.

Purely Age
There are still those who feel that age is the most important factor. If you are registering a new domain name, there is no way around this. However, many SEOs buy old, existing domain names and add their own content. If you are able to get a domain that is listed in DMOZ, you may improve your chances even more, as it will have an existing link structure. There are many services on the web that offer lists of upcoming expirations, so you may be able to grab one at a bargain price. Be aware, though, that a number of SEOs believe these domains are "reset" when ownership changes, so you may be subject to the sandbox anyway.

Good luck with your pending escape from the sandbox.
