Is there any point in submitting to DMOZ?
Posted 02 August 2005 - 05:42 AM
One of my sites isn't doing too well on Google; it started well but is now slipping back. The home page has been optimised with appropriate keywords, navigation and CMS issues have been ironed out, etc., and there really doesn't seem to be anything that should cause Google a problem.
Then I realised that it's not in DMOZ. What I want to know is: if I do submit it, is it going to have any effect on the way the site is ranked by Google et al?
Posted 02 August 2005 - 08:07 AM
One link by itself is unlikely to have a dramatic effect on the ranking of your website, even a link from DMOZ. How many links from other related and reputable pages do you have? What pages do these link to? Are the links on pages that are 'on-theme'? Does the anchor text include the search terms (or variants) that you are trying to target?
Posted 02 August 2005 - 08:22 AM
I have a links programme ongoing for the site at the moment: all relevant, themed links from reputable sites with good PR. The link text is keyword rich, etc. There aren't a huge number at the moment, but the sister site to this one has fewer links and is doing better, and has only had the same amount of optimisation done.
The reason I was looking into DMOZ was because we were unaware whether Google's LSI thingy might be influenced by the DMOZ category listing? Just a stab in the dark but ain't it always?
Posted 02 August 2005 - 09:36 AM
Up until last week, the DMOZ listing was proving to be highly influential. Not any longer, thanks to Google's recent update.
Posted 02 August 2005 - 09:42 AM
Posted 02 August 2005 - 01:12 PM
This is another one of those buzz-expressions that people in the SEO communities have latched onto without any real understanding of what it is or what it entails because "it makes sense" (that is, someone says something that for lack of a clear rebuttal sounds plausible).
An explanation of Latent Semantic Indexing can be found here:
In brief, what LSI attempts to do is reduce words to core concepts and then look at the relationships between the core concepts of separate documents to determine which documents in a collection should be placed together. It is a theory of conceptual categorization.
Adapting that model to a search engine's query resolution process would indeed require a lot of resources.
Google may attempt it, may be attempting it, may have abandoned it, may never even have considered it. We don't know. But the effects of external links, text, and other factors upon document rankings in search results are attributable to much less complicated concepts which have, at least, been documented by Google through their technical papers, patents, and/or generalized Web site explanations.
Posted 03 August 2005 - 09:02 AM
We also know that just because something requires extensive pre-processing, and cannot be calculated in the milliseconds required for a search result, does not mean that it cannot be applied in some fashion. The original PageRank implementation took a very long time to calculate, and when combined with the need to then distribute those final calculations to worldwide data centers, actually took several days.
Furthermore, we know that shortcuts can often be found whereby one might lose a little accuracy and dramatically reduce calculation times, as has been done with Google's current 'rolling' PageRank calculations. Google has not needed to 'dance' in order to calculate PageRank for a considerable time now, and one only sees a 'dance' when a new algorithm tweak needs to be deployed to the link popularity part of the index.
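To make the cost and the accuracy trade-off concrete, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's code and says nothing about how the 'rolling' calculation really works; the function name `pagerank`, the `tol` parameter and the toy link graph are assumptions made for illustration. The point is only that a looser tolerance stops after fewer passes over the link data, at the price of a slightly less exact result.

```python
def pagerank(links, damping=0.85, tol=1e-8, max_iter=1000):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    Returns (rank, iterations_used). Illustrative sketch only.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        # Total change between passes; stop once it falls below the tolerance.
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank, iteration


# Hypothetical four-page web, used only to exercise the function.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
precise, precise_passes = pagerank(toy_web, tol=1e-10)
rough, rough_passes = pagerank(toy_web, tol=1e-2)
print(precise_passes, rough_passes)
```

On a real link graph every pass touches every link, which is why the number of passes matters so much.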
I would point to the known changes to the way Google regards link text (the changes that have led SEOs to widely advise people not to want all external links to their site to have the same link text) to suggest that applying LSI solely to the link index (an index that already requires very extensive calculation) would not only be possible, but might well explain a lot of what people are reporting...
Posted 03 August 2005 - 01:14 PM
Posted 04 August 2005 - 04:22 AM
I'm in agreement, Black_Knight. We've been keeping an eye on this but have no definite proof; my colleague is convinced that this is in fact the case and is watching closely to try to get something to back up our thoughts. He's a determined man!
In terms of link text, I have rewritten the external link text, giving several combinations, all still keyword rich. These are being changed shortly, so hopefully that should have some bearing on the situation, and in the meantime I've submitted the site to DMOZ anyway, just in case!
Posted 04 August 2005 - 07:20 AM
Posted 04 August 2005 - 09:49 AM
Michael, can you think of any method more suitable than a term-vector database for tying the immense variety of words in and around links across potentially millions of documents (in the case of popular links) to the url of the link itself?
I can think of several ideas I'd want to try, if I were in the business of indexing data. Term vectors require linear processing, and I would want to move into a more dynamic environment.
Think of a glob of words that are all related. They would be like a miniature web of words. For example, let's start with the words "car", "automobile", and "vehicle". They can all be used interchangeably, but they each have different meanings which can lead in different directions.
"car" can refer to that gas-guzzling thing we drive to and from work, or it can refer to a module in a train, or it can refer to something else (we're getting into obscure territory, so let me veer back).
"automobile" can refer to "car" or "truck". I'm not sure I've seen it used of any other kind of vehicle, but the potential application is there.
"vehicle" can refer to "spaceship", "boat", "ship", "car", "truck", "tractor", "train", etc.
So, let's say that "vehicle" is the heart of this glob. It branches out to "car", which is the heart of a sub-glob. Think of each word being the center of a wheel with spokes going out to other words.
You can assign a word to more than one glob (and possibly to more than one sub-glob if you incorporate jargon). So, the challenge is to identify the correct glob. By examining the words in a query, I would identify all the globs each word is attached to and look for an intersection of those sets of globs, that is, the globs every query word has in common (I would not always expect to find such a conjunction).
If you get a non-empty intersection, you should have a fairly small set of globs to work with. That set of globs would probably identify the concepts behind the query.
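The glob lookup described above can be sketched in a few lines. This is only one possible reading of the idea: the `GLOBS` table, its contents and the function names are hypothetical, and the "conjunction" is treated as the set of globs shared by every query word (a set intersection).

```python
# Hypothetical glob table: hub word -> the words attached to that hub.
GLOBS = {
    "vehicle": {"vehicle", "car", "automobile", "truck", "boat", "ship",
                "spaceship", "tractor", "train"},
    "car": {"car", "automobile", "truck"},
    "train": {"train", "car", "locomotive", "railway"},
}


def globs_for_word(word, globs=GLOBS):
    """Return the hubs of every glob that contains the word."""
    return {hub for hub, members in globs.items() if word in members}


def globs_for_query(query, globs=GLOBS):
    """Return the globs shared by all query words that belong to any glob."""
    per_word = [globs_for_word(word, globs) for word in query.lower().split()]
    per_word = [found for found in per_word if found]  # skip unknown words
    if not per_word:
        return set()
    return set.intersection(*per_word)


print(globs_for_query("car railway"))
print(globs_for_query("truck locomotive"))
```

A query like "car railway" narrows down to the "train" glob because that is the only glob both words belong to, while "truck locomotive" has no glob in common, which is the case where no conjunction is found.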
At this point, I would look for documents which can be tied to those globs. There may be 1,000 or more such documents. So far, I haven't actually looked at any term vectors. All I have done is identify groups of related words which may or may not occur in the documents I have selected, which may or may not occur in the query.
From this point, I would see if I could match the query with an EXACT FIND. I would position all exact finds at the top of the 1,000 selected documents.
Next, I would look for proximity matches (there could be intervening words, maybe stop words and white space, etc.).
Finally, I would round out the top 1,000 (if necessary) by including pages which include similar phrases (substituting synonyms for one or more of the words in the query). The globs would tell me which synonyms would work.
Now, we haven't actually ranked anything. At this point, once we have our set of 1,000 documents, striped into at most three bands of relevance (you could actually step it down to four bands if you allow for a proximity match with synonyms), we can use any method to rank the sites within each band.
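The three bands (exact find, proximity match, synonym substitution) might be sketched as below. Everything here is an assumption made for illustration: the function names, the `max_gap` limit of two intervening words, and the tiny synonym table standing in for what the globs would supply. No ranking happens inside a band; documents simply keep their incoming order.

```python
import re
from itertools import product


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(tokens, phrase):
    """True if the phrase occurs as consecutive tokens."""
    size = len(phrase)
    return any(tokens[i:i + size] == phrase for i in range(len(tokens) - size + 1))


def proximity_match(tokens, phrase, max_gap=2):
    """True if the words occur in order with at most max_gap words between neighbours."""
    def follow(position, index):
        if index == len(phrase):
            return True
        window = range(position + 1, min(position + 2 + max_gap, len(tokens)))
        return any(tokens[j] == phrase[index] and follow(j, index + 1) for j in window)
    return any(token == phrase[0] and follow(i, 1) for i, token in enumerate(tokens))


def synonym_match(tokens, phrase, synonyms):
    """True if the phrase occurs after swapping one or more words for synonyms."""
    options = [[word, *sorted(synonyms.get(word, ()))] for word in phrase]
    return any(contains_phrase(tokens, list(variant))
               for variant in product(*options) if list(variant) != phrase)


def band(document, query, synonyms):
    """Return 1 (exact), 2 (proximity), 3 (synonym) or None (no match)."""
    tokens, phrase = tokenize(document), tokenize(query)
    if not phrase:
        return None
    if contains_phrase(tokens, phrase):
        return 1
    if proximity_match(tokens, phrase):
        return 2
    if synonym_match(tokens, phrase, synonyms):
        return 3
    return None


def stripe(documents, query, synonyms, limit=1000):
    """Group matching documents into bands, best band first, capped at limit."""
    banded = [(b, doc) for doc in documents
              if (b := band(doc, query, synonyms)) is not None]
    banded.sort(key=lambda pair: pair[0])  # stable: order within a band is kept
    return banded[:limit]
```

Calling `stripe(pages, "used car", {"car": {"automobile"}})` on a list of page texts returns `(band, text)` pairs, after which any ranking method could be applied within each band.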
Is that better? I have no way of knowing. I don't have a database of millions of Web sites that I can run a test against.
But intuitively that looks like an attractive method of finding matches for a query. The advantage it has over term-vector matching is that it allows for the flexibility of language and idiom. No two people say everything the same way. The disadvantage of that kind of system (versus term-vector matching) is that your chances of getting bizarre results (especially where you have no exact matches) would be somewhat higher.
Posted 04 August 2005 - 01:08 PM
Of course, Term Vectors are what inspired the original theorizing about Themes and Theming, largely through a misunderstanding of what Term Vectors were really all about. This may help a little:
http://www9.org/w9cdrom/159/159.html - The Term Vector Database: fast access to indexing terms for Web pages by Raymie Stata, Krishna Bharat and Farzin Maghoul
It is greatly simplified, with regard to the 'theme' connection, in my own "Vertical Themes Vectoring and theme-based ranking".
However, by applying Term Vectors solely to link text, and words in proximity to links, one could easily find the most commonly associated words for that link url that excluded the most commonly used words around all links.
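As a rough illustration of that last idea, the sketch below builds a simple word-count vector per link URL from the text in and around each link, then drops words that turn up around most links. It is an assumption-laden toy, not the Term Vector Database from the paper: the function name, the `common_fraction` cutoff and the example.com URLs are all made up for the example.

```python
from collections import Counter, defaultdict


def link_term_vectors(link_contexts, common_fraction=0.5):
    """Build {url: Counter of words} from (url, nearby text) pairs.

    Words seen around more than common_fraction of all URLs are treated
    as generic link filler and removed from every vector.
    """
    per_url = defaultdict(Counter)
    for url, text in link_contexts:
        per_url[url].update(text.lower().split())

    # Count, for each word, how many distinct URLs it appears around.
    url_frequency = Counter(word for counts in per_url.values() for word in counts)
    cutoff = common_fraction * len(per_url)
    common = {word for word, seen in url_frequency.items() if seen > cutoff}

    return {
        url: Counter({w: c for w, c in counts.items() if w not in common})
        for url, counts in per_url.items()
    }


# Hypothetical link contexts gathered from pages that link to three URLs.
contexts = [
    ("http://example.com/a", "click here for cheap widgets"),
    ("http://example.com/a", "click here widgets store"),
    ("http://example.com/b", "click here for garden gnomes"),
    ("http://example.com/c", "click here to read more"),
]
vectors = link_term_vectors(contexts)
print(vectors["http://example.com/a"].most_common(3))
```

Words such as "click" and "here" fall out because they surround nearly every link, leaving the words most specifically associated with each URL.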
Posted 04 August 2005 - 02:00 PM
Posted 19 November 2005 - 12:47 PM
If an editor sees a submission that doesn't have to be rewritten, they might be more apt to accept it.
I am getting the sites I submit to DMOZ added in as little as 2 weeks, and for a number of them few or no changes have been made (17 sites in the regional Aylmer category). I have been following the editor's style.
So if DMOZ is important to you and you badly want to get in, work on your content. DMOZ is looking for original, quality content.