Moderator Alumni | Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
Jan 5 2004, 06:38 PM
I spotted a new report by Dan Thies over at Jill's place this morning. I finally got a chance to read through it and thought it was most definitely worth posting here.

How To Prosper In the New Google (PDF)

It explains a lot of the recent changes (many of which are things I've been trying to say for a while, but my ability to get things out of my brain and onto a page is lacking). There are a few things in there I don't agree with (most notably in the site structure area), but that's neither here nor there. There are going to be things in there that you don't agree with, too. The point of the document is to sift through what is mere speculation (and to identify some speculation that's been flying around that is pure hogwash) and what is likely going on. There are also some suggestions at the end of this 17-page report on what you can do to recover if you were harmed by the recent update.

The interesting thing about this report is that there really is nothing new in what you need to do. It's what we (meaning the staff here at cre8asite, and many, but not all, of the users) have been telling you for ages; there are just some new things on the list that you probably shouldn't do. And by understanding where Google is going (which is explained, too), you can extrapolate what will likely be the "next" batch of things that'll get you in trouble.

If you haven't, you should read the report. If you've been affected by the last update, then you must read it. It addresses every question and discussion we've faced here over the past month. (And if you ask a question in the Google forum that deals with the update, make sure you've read this, because my answer is most likely just going to be: Read This.)

Cheers.
G.

Moderator | Group: Moderators
Joined: 20-August 03
Posts: 1,248
From: New York
Jan 5 2004, 07:10 PM
I thought I'd share here the thoughts I posted at Jill's forum as well, lucky guys.
Good article. Summed up everything well. I find it interesting that Google will start to use Teoma's "Subject-Specific Popularity" concept within its PageRank algo. How sure are you about this theory? I think it will be a wonderful thing, but I am not too sure this is occurring at a global level at Google. Maybe they have a lot more work to do before I see it globally. Thanks.

Moderator Alumni | Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
Jan 5 2004, 07:34 PM
lol - I just got done posting a response to this question over at Jill's. I didn't go linking all over the place at her place, but I'll do it here.
This has actually been going on, though in a much cruder form, since about December of 2002, when I first started talking about it over at WMW. I had no concept of what was going on back then, and it wasn't until this post (PageRank on the Fly) that the things I was seeing (and most people were calling me a crackpot for) could be put into perspective. At the beginning of that thread, we did a lot of speculating as the discombobulated observations began to gel in my mind while I held them up against the patent. If you also go to the very last post in that thread, you'll see a link Ammon posted that goes to something that is actually called topic-sensitive PageRank. I highly doubt the patent from the first part of that thread was ever used in its complete form, but there are most certainly elements of that, and of the TSPR that Ammon linked to, in the current algo, going at least as far back as December of 2002.

G.

Moderator | Group: Moderators
Joined: 20-August 03
Posts: 1,248
From: New York
Jan 5 2004, 07:41 PM
Ha,
I find it very interesting that they are using Teoma's concepts of Subject-Specific Popularity and Community Sites. Teoma's concept was one of the major reasons I wrote Teoma - The Superior Search Engine? It kind of bugs me that Google would copy (for lack of a better word) Teoma on this. Everyone else is supposed to copy Google. And then Teoma came around, came up with something no one else could do, and made it work.

Moderator | Group: Moderators
Joined: 20-August 03
Posts: 1,248
From: New York
Jan 5 2004, 11:01 PM
anyone feel bad for Teoma or Mr. Jeeves?

Moderator Alumni | Group: Hall Of Fame
Joined: 1-September 02
Posts: 9,213
From: UK
Jan 6 2004, 06:16 AM
QUOTE(RustyBrick) I find it interesting that Google will start to use Teoma's "Subject-Specific Popularity" concept within its PageRank algo.

Well, actually they are all copying, really. These concepts originated many decades ago, but the current versions are predominantly based on the refined work (and actual published algorithms) of Jon Kleinberg, originator (1998) of the HITS (PDF file link) algorithm, which is the father of the idea of Hubs and Authorities. In fact, Teoma is pretty much an exact copy of its direct predecessor, DiscoWeb, and is incredibly similar to IBM's CLEVER Project (though of course, IBM owns the copyright on CLEVER). Teoma is a Gaelic word meaning "expert". You might like to peruse this group project too, especially looking at the group's members and citations.

Moderator Alumni | Group: Hall Of Fame
Joined: 31-August 02
Posts: 15,634
Jan 6 2004, 07:15 AM
Excellent post, Ammon. Thanks.
The Yuntis project looks fascinating. I came across an interesting paper on filtering that explains the topic well, and it might add to the understanding of how that could work under a semantic approach. One of the articles that came out when Google purchased Applied Semantics surmised that Applied Semantics has "filtering" technology, which would likely be folded into the search engine's ads at some point. It is also possible that it has been used in the search engine itself. It's important to note that there is more than one type of filtering, and this paper does a great job of describing how some of the different types could be applied to a search engine: Using Semantic Analysis to Classify Search Engine Spam (pdf)

Moderator Alumni | Group: Hall Of Fame
Joined: 7-November 02
Posts: 6,179
From: New England, USA
Jan 6 2004, 01:10 PM
QUOTE then Global page rank is gone and we will only see a page rank based on a specific query.

I don't think so. It's definitely not happening now, and I don't think it'll be practical for a while. In this post debunking a conspiracy theory I talk about how relational comparisons are pretty server intensive. I then go on to explain that there are really several steps that occur between when we click the "search" button and when the results come out.

The first step is simply to come up with a set of documents relevant to the search. This is done by looking for instances of the word on the page, density factors, and, yes, PageRank (among some other things). Now it has that set of 1000 pages. There may be 9 million results, but only the top 1000 are used; those are chosen from the quantitative values of PR and the other tests run during phase one. It's now got a set of 1000 pages that are unranked, but they are the highest scoring on the inherent page factors.

In step two it does some sorting based upon those inherent factors and their values.

In step three the topic-sensitive PR and the other parts that deal with how these pages relate to and against each other are figured in. (Note: it's quite possible that the full battery of comparative tests only happens on a limited number of pages, the ones ranking highest after step two. What this number is is questionable. 50? 100? 500? I don't know.) Then you get your results.

PR is still important to that first step, and as long as there are limitations to computing power, there are going to be limitations in the number of pages that you can actually compare to one another in a reasonable time. Thus, you need something quantitative in the inherent properties of a page in order to produce that set of 1000 results that you're going to work with.

PR is also still critical in areas like depth of crawl and Google's general interest in a site in the first place. Since there's no search term involved in this phase, it only has inherent properties to go by, and PR is still a good one for these purposes.

G.
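
For anyone who finds code easier to follow than prose, the three steps described above could be sketched roughly like this. It is only an illustration of the staged idea, not how Google actually works: the `Page` fields, the density-times-PageRank formula, the link-counting "endorsement" boost, and both limits are assumptions invented for the example.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Page:
    url: str
    text: str
    pagerank: float                              # query-independent score (made-up values)
    links_to: set = field(default_factory=set)   # URLs this page links to


def inherent_score(page, term):
    """Score from the page's own properties only: term density times PageRank.
    The formula is an assumption chosen for illustration."""
    words = page.text.lower().split()
    if not words:
        return 0.0
    density = words.count(term.lower()) / len(words)
    return density * page.pagerank


def search(pages, term, candidate_limit=1000, compare_limit=100):
    # Step 1: build the candidate set from inherent factors alone.
    scored = [(inherent_score(p, term), p) for p in pages]
    scored = [(s, p) for s, p in scored if s > 0]
    candidates = heapq.nlargest(candidate_limit, scored, key=lambda sp: sp[0])

    # Step 2: order the candidates by those same inherent scores.
    candidates.sort(key=lambda sp: sp[0], reverse=True)

    # Step 3: the expensive relational pass, run only on the top few pages.
    head, tail = candidates[:compare_limit], candidates[compare_limit:]
    endorsements = {p.url: 0 for _, p in head}
    for _, p in head:
        for target in p.links_to:
            if target in endorsements and target != p.url:
                endorsements[target] += 1  # linked from another on-topic page

    head.sort(key=lambda sp: sp[0] * (1 + endorsements[sp[1].url]), reverse=True)
    return [p.url for _, p in head] + [p.url for _, p in tail]
```

Calling `search(pages, "google", compare_limit=50)` would run the page-to-page comparison on only the 50 best candidates from step two, which is the cost trade-off the post describes: the cheap inherent score narrows the field so the comparative work stays affordable.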