> How To Prosper in the New Google

Moderator Alumni

Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
post Jan 5 2004, 06:38 PM
I spotted a new report by Dan Thies over at Jill's place this morning. I finally got a chance to read through it and thought that it was most definitely worth posting here.

How To Prosper In the New Google (PDF)

It goes through and explains a lot of the recent changes (many of which are things I've been trying to say for a while, but my mastery of being able to get things out of my brain and onto a page is lacking). There are a few things in there I don't agree with (most notably in the site structure area), but that's neither here nor there. There are going to be things in there that you don't agree with, too.

The point of this document is to try and sift through what is mere speculation (and to identify some speculation that's been flying around that is pure hogwash) and what is likely going on. There are also some suggestions at the end of this 17 page report on what you can do to be okay if you were harmed by the recent update.

The interesting thing about this report is that there really is nothing new in what you need to do. It's what we (meaning the staff here at cre8asite, and many, but not all, of the users) have been telling you for ages; there are just some new things on the list that you probably shouldn't do. And understanding where Google is going (which is explained, too) will help you extrapolate what will likely be the "next" batch of things that'll get you in trouble.

If you haven't, you should read the report. If you've been affected by the last update, then you must read this report. It addresses every question and discussion we've faced here over the past month. (And if you ask a question in the Google forum that has to do with the update, make sure you've read this, because my answer is most likely just going to be: Read This.)

Cheers.

G.

Quarter Grand Poster

Group: Members
Joined: 28-November 03
Posts: 386
From: Waterloo, ON, Canada
post Jan 5 2004, 06:53 PM
I'm going to read the report. But I'm surprised that the recommendation comes from you, Grumpus. You have adamantly maintained that nobody knows anything, and that we should not even be discussing the subject.

I think that is an accurate reflection of the many times you have admonished those of us who are bold enough to suggest that we know at least some of the things that have changed and how to compensate for them.

I will read the report now. I assume you will allow us to discuss and debate it.

Moderator Alumni

Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
post Jan 5 2004, 07:06 PM
In the days where everything was first happening, yes, I said that it wasn't productive to discuss theories until the results were constant.

Of late, I haven't said that people shouldn't discuss what's going on; I've said that looking at one element of it and saying that it is the key to everything is ridiculous. The good thing about this article is that it doesn't look at any one thing and say, "This is new or changed," but rather looks at the big picture and shows how that has changed.

I still maintain that it is ridiculous to say that "Such-and-so is what Google wants now."

G.

Moderator

Group: Moderators
Joined: 20-August 03
Posts: 1,248
From: New York
post Jan 5 2004, 07:10 PM
I thought I'd share here the thoughts I posted at Jill's forum as well, lucky guys.

Good article. Summed up everything well.

I find it interesting that Google will start to use Teoma's "Subject-Specific Popularity" concept within its PageRank algo.

How sure are you about this theory?

I think it will be a wonderful thing, but I am not too sure this is occurring at a global level at Google. Maybe they have a lot more work to do before I can see it globally.

Thanks.

Moderator Alumni

Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
post Jan 5 2004, 07:34 PM
lol - I just got done posting a response to this question over at Jill's. I didn't go linking all over the place at her place, but I'll do it here.

This has actually been going on, though in a much cruder form, since about December of 2002, when I first started talking about it over at WMW. I had no concept of what was going on back then, and it wasn't until this post (PageRank on the Fly) that the things I was seeing (while most people were calling me a crackpot) could be put into perspective. At the beginning of that thread, we did a lot of speculation as the discombobulated observations began to gel in my mind as I held them up against the patent.

If you also go to the very last post in that thread, you'll see a link Ammon posted that goes to something that is actually called topic sensitive pagerank.

I highly doubt the patent from the first part of that thread was ever used in its complete form, but there are most certainly elements of that and the TSPR that Ammon linked to in the current algo and going at least as far back as December of 2002.

G.

Moderator

Group: Moderators
Joined: 20-August 03
Posts: 1,248
From: New York
post Jan 5 2004, 07:41 PM
Ha,

I find it very interesting that they are using Teoma's concepts of Subject-Specific Popularity and Community Sites.

Teoma's concept was one of the major reasons I wrote "Teoma - The Superior Search Engine?"

It kind of bugs me that Google would copy (for lack of a better word) Teoma on this. Everyone else is supposed to copy Google. And then Teoma came around, came up with something no one else could do, and made it work.

Moderator Alumni

Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
post Jan 5 2004, 07:48 PM
Did Google copy Teoma?

Google applied for that patent in January of 2001 (and so it's been on public record since then) - so they've had the concept at least that long. It just took them 2 years to get technology that could do something worthwhile with it. I don't know when Teoma put theirs into action, though.

G.

Moderator

Group: Moderators
Joined: 20-August 03
Posts: 1,248
From: New York
post Jan 5 2004, 08:30 PM
QUOTE
Did Google copy Teoma? 

Google applied for that patent in January of 2001 (and so it's been on public record since then) - so they've had the concept at least that long. It just took them 2 years to get technology that could do something worthwhile with it. I don't know when Teoma put theirs into action, though. 


I'm not talking legally but I think you understand that.

So you think Google had this strategy in mind 3 years ago and finally put it into practice now?

Internet speeds

Moderator Alumni

Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
post Jan 5 2004, 08:45 PM
Google's had their strategy in mind since they started out. But, yes, this aspect of context/topic valued linking is 3 years old and has been in use (at least in part) for about 1.

G.

Honorary Member

Group: Members
Joined: 19-April 03
Posts: 695
From: South Carolina
post Jan 5 2004, 10:44 PM
Sometimes the goal is to innovate, sometimes the goal is to improve an existing concept.

All smart businesses use the success of their competitors to improve their own products - the key is copying the right ideas...

Moderator

Group: Moderators
Joined: 20-August 03
Posts: 1,248
From: New York
post Jan 5 2004, 11:01 PM
Anyone feel bad for Teoma or Mr. Jeeves?

Quarter Grand Poster

Group: Members
Joined: 28-November 03
Posts: 386
From: Waterloo, ON, Canada
post Jan 5 2004, 11:05 PM
I've finally read all of Dan's article. Over all it is well done and balanced. I have a few things I would like to discuss arising out of it.

1. He claims there is no "filter", just a new system of ranking. If that is so, how can we explain the fact that immediately after Nov 16 you could duplicate the old ranking by using a '-garbageterm' in conjunction with a search term? If there was a brand new ranking, where did the original ranking come from when this trick was used?

2. He says:
QUOTE
Stemming has actually been in play for a while, or at least they've been experimenting with it. Now it's official. Hopefully, this will mean that we can start writing more naturally when it comes to Google, and let them figure out whether the plural and singular are both relevant. For now, it doesn't seem to be active in a lot of searches.

This paragraph is self contradictory. He says stemming is now official and then he says it "doesn't seem to be active". Which is it?

For weeks I have been saying that stemming has been turned off since Nov 16. If I look, I can show you where I said this on this forum. So to say Nov 16 proves stemming is inaccurate.

3. He continues with the motherhood advice about doing what we were always doing and that nothing has really changed. The problem with that is that he never defines "what we were doing".

Standard SEO practice as preached by the self proclaimed ethical paradigm Jill Whalen included the duplication of exact keyword phrases in titles, headings and content. If you read the advice in the SEO copywriting part of her forum you will find multiple examples of clever ways to splice an exact keyword phrase between the end of one sentence and the beginning of the next.

Her advice was to ignore the company name in the title, or put it last at best. The prime keyword phrase had to be the first thing in the title no matter how unnatural that might seem.

She was so concerned about keyword density that when confronted with a client's requirement to include some legal information on a home page she put the entire section in as a graphic so not to dilute the keyword density.

When anybody says nothing has changed, just keep doing what we have been doing, or continue using standard SEO practices, I get upset. Because for many people the practices I just outlined were the universal standard practices.

4. Dan says:
QUOTE
The rumor here is that Google is trying to drop "optimized" pages. Not only does this not hold up under close scrutiny, it doesn't make any sense to begin with. Another way to describe an "optimized" web page would be "a well structured page that clearly indicates the relevant topics."

Again he does not define "optimized". Well, he kind of does in the last sentence, but many people reading that will think that all the practices that I detailed above are what is meant by an "optimized" page.

5. He acknowledges applied semantics and topic sensitive search engines. I agree with him here, but for me the conclusion then is that we should immediately stop using the stilted and artificial structures that I think the vast majority of people understand as Search Engine Optimization.

Maybe we need a new term. Maybe we should be practicing "relevance enhancement". Stop trying to manipulate the SEs. Start trying to show what information or benefit really exists in our sites for our viewers.

Moderator Alumni

Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
post Jan 6 2004, 12:00 AM
QUOTE
If there was a brand new ranking, where did the original ranking come from when this trick was used?


He also suggested that the new ranking only affected certain terms (where there was data available to do the new relational things), so if you change that term by adding a trigger, then that term no longer trips the new relational portion of the algo. It's not a filter, it's just that if you have no data to base your calculations on, you can't use that element in your calculations.

QUOTE
He continues with the motherhood advice about doing what we were always doing and that nothing has really changed. The problem with that is that he never defines "what we were doing".


The article assumes that you have a certain level of knowledge in SEO. There's nothing wrong with wanting to know what to do, but that would have been 500 pages if he told you what to do there.

Read through these forums. There's a nice sticky note by Ammon at the top of the Google forum that'll give you all the basics of "What we were doing."

QUOTE
When anybody says nothing has changed, just keep doing what we have been doing, or continue using standard SEO practices, I get upset. Because for many people the practices I just outlined were the universal standard practices.


Without encouraging any more naming of names and such: the process you are talking about is a specific tactic, not a fundamental SEO concept. The tactics do change, and they change frequently. The core concepts are fairly steadfast.

The trick is to "Do as I say, not as I do" when you are trying to learn SEO (whoever that "I" may be). In these forums (and in other forums) you will see a lot of advice to people trying to learn SEO that doesn't exactly match what you see the person practicing. They don't do it to mess with you; they do it because they know more than you (at least about that aspect of it), and with that knowledge they have assessed the risk of using the particular tactic. If you go out and start using tactics without ever understanding the core principles, then you'll spend your whole life coming here (or wherever) and saying, "Why did I get dropped????!!!!???" Once you have a solid foundation in the core concepts, you then have the power to decide which tactics to use on your own, and you'll have the ability to change those tactics when they stop working.

QUOTE
many people reading that will think that all the practices that I detailed above are what is meant by an "optimized" page.


And some people will think that the tactics you detailed above were optimization principles.

Some tactics don't work anymore. Actually, that happens with most every update - and even between updates. This one affected more tactics than most.

Again, this article assumes that you have an understanding of the core SEO principles.

QUOTE
for me the conclusion then is that we should immediately stop using the stilted and artificial structures that I think the vast majority of people understand as Search Engine Optimization.


I can only assume that I know what you mean when you say "stilted and artificial structures." If it means what I think it means, I think he did say that that was a part of it, but that the technology isn't far enough along yet to be taken for granted. So at this point, he still recommends using each possible form of the word on the page until it has come along far enough.

G.

Moderator Alumni

Group: Hall Of Fame
Joined: 1-September 02
Posts: 9,213
From: UK
post Jan 6 2004, 06:16 AM
QUOTE(RustyBrick)
I find it interesting that Google will start to use Teoma's "Subject-Specific Popularity" concept within its PageRank algo.


Well, actually they are all copying really. These concepts all originated many decades ago, but are predominantly based on the refined work (and actual published algorithms) of Jon Kleinberg, originator (1998) of the HITS (PDF file link) algorithm, which is the father of the idea of Hubs and Authorities. In fact, Teoma is pretty much an exact copy of its direct predecessor, DiscoWeb, and is incredibly similar to IBM's CLEVER Project (though of course, IBM owns the copyright on CLEVER). Teoma is a Gaelic word meaning "expert".
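
For readers who have not met Hubs and Authorities before, here is a minimal sketch of the HITS iteration in Python. It is a toy illustration on a made-up four-page graph, not what Teoma, CLEVER, or Google actually run; the page names and the `hits` function are assumptions introduced for the example.

```python
from math import sqrt

def hits(links, iterations=50):
    """Return (hub, authority) score dicts for a graph given as {page: [linked pages]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of the pages linking to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # A page's hub score is the sum of the authority scores of the pages it links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both vectors so the scores stay bounded.
        a_norm = sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth

# Hypothetical link graph: two pages that link out, two pages that get linked to.
toy_links = {
    "directory": ["guide", "tutorial"],
    "blogroll": ["guide"],
    "guide": [],
    "tutorial": [],
}
hub, auth = hits(toy_links)
```

In this toy graph the page with the most inbound links from good hubs ends up as the strongest authority, and the page linking to the most authorities ends up as the strongest hub.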

You might like to peruse this group project too, especially looking at the group's members and citations.

Moderator Alumni

Group: Hall Of Fame
Joined: 31-August 02
Posts: 15,634
post Jan 6 2004, 07:15 AM
Excellent post, Ammon. Thanks.

The Yuntis project looks fascinating.

I came across an interesting paper on filtering that explains the topic well, and might add to the comprehension of how that might work under a semantic approach.

One of the articles that came out when Google purchased Applied Semantics surmised that Applied Semantics has "filtering" technology, which would likely be folded into the search engine's ads at some point. It is also possible that it has been used in the search engine itself.

It's important to note that there is more than one type of filtering, and this paper does a great job of describing how some of the different types could be applied to a search engine:

Using Semantic Analysis to Classify Search Engine Spam (pdf)

Moderator

Group: Moderators
Joined: 20-August 03
Posts: 1,248
From: New York
post Jan 6 2004, 08:38 AM
QUOTE(Black_Knight)
Well, actually they are all copying really. These concepts all originated many decades ago, but are predominantly based on the refined work (and actual published algorithms) of Jon Kleinberg, originator (1998) of the HITS algorithm, which is the father of the idea of Hubs and Authorities. In fact, Teoma is pretty much an exact copy of its direct predecessor, DiscoWeb, and is incredibly similar to IBM's CLEVER Project (though of course, IBM owns the copyright on CLEVER). Teoma is a Gaelic word meaning "expert".


You've got a point there. It's all based on something. Well, we could go back to calculus if we wanted to. But good point. I am just on the side of Teoma and I will fight for them! 8)

Moderator

Group: Moderators
Joined: 6-March 03
Posts: 7,962
From: Langley, British Columbia, Canada
post Jan 6 2004, 08:47 AM
Thanks, Ammon and Bill, there are some great references there. It helps to pull it all together and try to understand what may be going on. I spent too much time on this but it was worth it.

As a statistician by training and knowing the difficulty of trying to infer mechanisms from noisy data, I despair that any of us on the outside can really know how things may be evolving. Even those on the inside probably find it rough to see the forest for the trees. That Stanford paper you cited, Bill, is an excellent example of the difficulty of getting clear mechanisms even with "hand picked" data for comparison.

One or two random thoughts that came to mind. First, on the HITS algorithm ("Hyperlink-Induced Topic Search"): it really is important that none of us spend too much time discussing inferior web pages and giving extra air time to websites that should not be viewed as experts. Conversely, we see the importance of human moderated directories such as the Open Directory Project, where experts can give prominence to expert pages.

Another thought was to note that the Stanford paper seemed to be based on the Altavista search engine and I see this coming up elsewhere. For the last few months, the Altavista spider has been almost as vigorous as the Googlebots in visiting my website. Is this an indicator of the next evolution of the search engines? If so, what will be the main Google challenger and who will be behind it?

Member

Group: Members
Joined: 12-October 03
Posts: 49
From: Berlin/London
post Jan 6 2004, 12:33 PM
I checked out the paper he refers to by Taher H. Haveliwala. On first inspection, it seems that if they follow the topic-sensitive model, then Global page rank is gone and we will only see a page rank based on a specific query.

QUOTE
This score can be used in conjunction with other IR-based scoring schemes to produce a final rank for the result pages with respect to the query.


So one would assume we might be seeing a topicality rank indication alongside search results. Of course, we wouldn't need to see this in reality, because the algorithm would have placed the best topic match as the top listing in the SERP.
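
As a rough illustration of what a query-specific score could look like, the sketch below blends precomputed per-topic ranks by how likely the query is to belong to each topic. This is my reading of the general idea in the Haveliwala paper, not a description of Google's system; the topic names, probabilities, rank values, and the `query_sensitive_score` function are all made up for the example.

```python
def query_sensitive_score(topic_probs, topic_ranks, page):
    """Weight each topic's rank for a page by the probability that the query is about that topic."""
    return sum(prob * topic_ranks[topic].get(page, 0.0)
               for topic, prob in topic_probs.items())

# Hypothetical inputs: a query judged 80% "finance" and 20% "travel".
topic_probs = {"finance": 0.8, "travel": 0.2}
# Hypothetical precomputed per-topic rank vectors.
topic_ranks = {
    "finance": {"bank.example": 0.5, "hotel.example": 0.1},
    "travel": {"bank.example": 0.1, "hotel.example": 0.6},
}

bank_score = query_sensitive_score(topic_probs, topic_ranks, "bank.example")
hotel_score = query_sensitive_score(topic_probs, topic_ranks, "hotel.example")
```

The result is one number per page per query, which, as the quoted sentence says, could then be combined with other scoring schemes rather than shown to the searcher.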

This sort of technology might be able to link you directly to the highest rated page for your query, rather than making you look at a list of SERPs and then make up your mind. Perhaps, built into the Google toolbar: enter a query and go straight to the best page. Which of course is almost exactly what the "I'm Feeling Lucky" search does!!

Moderator Alumni

Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
post Jan 6 2004, 01:10 PM
QUOTE
then Global page rank is gone and we will only see a page rank based on a specific query.


I don't think so. It's definitely not happening now and I don't think it'll be practical for a while.

In this post debunking a conspiracy theory I talk about how relational comparisons are pretty server intensive. I then go on to explain that there are really several steps that occur between when we click the "search" button and when the results come out.

The first step is to simply come up with a set of documents relevant to the search. This is done by looking for instances of the word on the page, density factors, and, yes, PageRank (among some other things).

Now it has that set of 1000 pages. There may be 9 million results, but only the top 1000 are used; those are chosen from the quantitative values of PR and other tests run during phase one. It's now got a set of 1000 pages that are unranked, but they are the highest scoring on the inherent page factors.

In step two it does some sorting based upon those inherent factors and their values.

In step three the topic sensitive PR and other parts of it that deal with how these pages relate to and against each other are figured in. (Note, it's quite possible that the full battery of comparative tests only happens on a limited number of pages - the ones ranking highest after step 2. What this number is is questionable - 50? 100? 500? I don't know).

Then you get your results.

PR is still important to that first step, and as long as there are limitations to computing power, there are going to be limitations in the number of pages that you can actually compare to one another in a reasonable time. Thus, you need something quantitative in the inherent properties of a page in order to produce that set of 1000 results that you're going to work with.

PR is also still critical in areas like depth of crawl and Google's general interest in a site in the first place. Since there's no search term involved in this phase, it only has inherent properties to go by, and PR is still a good one for these purposes.
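
The three steps described above can be sketched in Python as follows. This is only a toy model of the staged idea: the "inherent" score (keyword density times PageRank), the `topic_score` callback, the limits, and the sample pages are all hypothetical, and nobody outside Google knows the real factors.

```python
import heapq

def staged_rank(query_terms, pages, topic_score, candidate_limit=1000, rerank_limit=100):
    """Toy three-step ranking: select candidates, sort them, re-rank the head."""
    def inherent(page):
        # Hypothetical "inherent" score: keyword density scaled by PageRank.
        words = page["text"].lower().split()
        matches = sum(words.count(term) for term in query_terms)
        return matches / max(len(words), 1) * page["pagerank"]

    # Step 1: keep only pages containing every query term, capped at candidate_limit.
    matching = [p for p in pages
                if all(term in p["text"].lower().split() for term in query_terms)]
    candidates = heapq.nlargest(candidate_limit, matching, key=inherent)

    # Step 2: order the candidate set by the inherent score alone.
    ordered = sorted(candidates, key=inherent, reverse=True)

    # Step 3: apply the costlier topic-aware score only to the first rerank_limit pages.
    head = sorted(ordered[:rerank_limit],
                  key=lambda p: inherent(p) * topic_score(p), reverse=True)
    return head + ordered[rerank_limit:]

# Hypothetical pages and topic scores.
pages = [
    {"url": "a.example", "text": "keyword research tools and keyword research tips", "pagerank": 6.0},
    {"url": "b.example", "text": "keyword research for beginners", "pagerank": 4.0},
    {"url": "c.example", "text": "gardening tips", "pagerank": 8.0},
]
topic_scores = {"a.example": 0.1, "b.example": 0.9}

result = staged_rank(["keyword", "research"], pages,
                     lambda p: topic_scores.get(p["url"], 0.0))
urls = [p["url"] for p in result]
```

Keeping the expensive comparison confined to `rerank_limit` pages is the design point: the cheap inherent score decides who gets into the room, and the topic-aware score only reorders the pages already inside it.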

G.

Member

Group: Members
Joined: 22-July 03
Posts: 34
From: Texas, y'all
post Jan 6 2004, 06:28 PM
Wow, people are talking... this is good.

I'll address a couple of questions that came up, and I'm happy to participate in the discussion. Keeping up with this on as many forums as there are these days is gonna be tough.

First, the question of why using "negative matching" would change the results.

I think over time this might change, but IF they are doing a topic-sensitive algorithm, and using some kind of semantic algorithm to match queries to topics, they probably can't determine a topic if you have a negative match. "money market" is probably easy to match to a topic or two; for "money market" minus something, they'd have to come up with a way to exclude topics, and I don't see how they'd do it.

Second, the question of whether the "global" PageRank would go away.

I don't think that they could get rid of the generic PageRank. Some queries aren't going to match any topics, or they'll match too many topics. Some queries will have negative matching or other search options that they can't deal with.
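
A minimal sketch of that fallback logic, in Python, might look like the following. It is purely speculative, in the same spirit as the rest of this post: the keyword-overlap topic matcher, the `max_topics` threshold, and the `topics_for_query` name are assumptions made up for the example. An empty result stands for "use the generic PageRank".

```python
def topics_for_query(query, topic_keywords, max_topics=2):
    """Return the topics to score against, or [] to signal a fallback to generic PageRank."""
    terms = query.lower().split()
    # Negative matching (e.g. "-garbageterm"): no topic can be determined, so fall back.
    if any(term.startswith("-") for term in terms):
        return []
    # Hypothetical matcher: a topic matches if it shares any keyword with the query.
    matched = sorted(topic for topic, keywords in topic_keywords.items()
                     if keywords & set(terms))
    # Too many matching topics is treated the same as none at all.
    if len(matched) > max_topics:
        return []
    return matched

# Hypothetical topic vocabulary.
topic_keywords = {
    "finance": {"money", "market", "loan"},
    "travel": {"hotel", "flight"},
}

plain = topics_for_query("money market", topic_keywords)
negated = topics_for_query("money market -garbageterm", topic_keywords)
unknown = topics_for_query("zzz", topic_keywords)
```

Under these assumptions the plain query is scored against a topic, while the negated query and the unmatched query both drop back to the generic score, which would be one way to explain the old rankings reappearing.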

Topic-Sensitive PageRanks (TSPR) would be calculated for some number of topics, but there's no reason not to compute a generic PageRank score at the same time.

It's also possible that they're only calculating topic-sensitive ranks for sites that already have a generic PageRank score above a certain amount. That amount could be different depending on the topic, but lacking a TSPR for a topic would *kill* a site's ranking for queries that matched the topic.
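
To make the generic versus topic-sensitive distinction concrete, here is a small Python sketch that computes both with the same power iteration, changing only where the "random jump" is allowed to land. It is a simplified textbook-style version on a made-up five-page graph; the page names, the damping value, and the `pagerank` function are assumptions for illustration, not Google's implementation.

```python
def pagerank(links, teleport=None, damping=0.85, iterations=100):
    """Power-iteration PageRank; a non-uniform teleport vector biases it toward a topic."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    if teleport is None:
        # Generic PageRank: the random jump lands on any page with equal probability.
        teleport = {p: 1.0 / len(pages) for p in pages}
    else:
        # Topic-sensitive: the random jump lands only on the topic's seed pages.
        total = sum(teleport.values())
        teleport = {p: teleport.get(p, 0.0) / total for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new = {p: (1 - damping) * teleport[p] for p in pages}
        for src in pages:
            targets = links.get(src, [])
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Pages with no outlinks hand their rank back via the teleport vector.
                for p in pages:
                    new[p] += damping * rank[src] * teleport[p]
        rank = new
    return rank

# Hypothetical graph: a finance cluster, a travel cluster, and a portal linking to both.
links = {
    "finance_hub": ["bank"],
    "bank": ["finance_hub"],
    "travel_hub": ["hotel"],
    "hotel": ["travel_hub"],
    "portal": ["bank", "hotel"],
}

generic = pagerank(links)
finance = pagerank(links, teleport={"finance_hub": 1.0})
```

In this toy graph the hotel page has a perfectly respectable generic score but a finance score of zero, because nothing in the finance cluster links to it. That is the "no on-topic links, no TSPR" effect described above, in miniature.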

As we have seen, a lot of sites went from the top 10 to somewhere below the top 1000 for certain search terms. SEO Research Labs was #1 or #2 for keyword research most of the year; it has been out of the top 1000 since November 15, but it hasn't had any real on-topic links because I made no real effort to establish links.

So maybe we had no TSPR for whatever topic "keyword research" matches, and wham the site sinks like a stone. The site still appears on a lot of searches, though, so we haven't been penalized, banned, or anything like that.

It's also possible that I am completely, 100% wrong.
Nobody "out here" knows what Google is doing. I believe that what I have described fits the facts, but saying that "a giant dragon eats the sun every night, and potties it out on the other side of the world in the morning" used to fit the facts as well.