> Accented Words, How do SEs deal with them?

Honorary Member

Group: Members
Joined: 30-August 02
Posts: 341
From: Fairfield, Iowa, USA
post Sep 5 2006, 11:23 AM
How do search engines deal with the accents commonly used in Spanish, German and other languages? Do they index them? Do people use them in their query strings?

Star Member

Group: 1000 Post Club
Joined: 22-May 06
Posts: 1,632
post Sep 5 2006, 11:33 AM
Search engines do index them. They will even index this 会在位. Users also run searches in languages other than English.

Yannis



Moderator

Group: Moderators
Joined: 15-January 04
Posts: 4,736
From: Rimouski, Canada
post Sep 5 2006, 12:00 PM
Although I don't have any data to back this up, I do believe people use this in their queries. Most of these languages have their own keyboard layout. If you're used to a word being spelled with an accent you're very likely to type it the way you always do.

Search engines do index them.

According to "How search results may differ based on accented characters and interface languages", Google initially treats the accented and non-accented versions as the same. The user's chosen interface language and actual location then seem to decide what gets listed (first) and what does not:

QUOTE
The searcher's interface language is taken into account during this process. For instance, the set of accented characters that are treated as equivalent to non-accented characters varies based on the searcher's interface language, as language-level rules for accenting differ.

Also, documents in the chosen interface language tend to be considered more relevant. If a searcher's interface language is English, our algorithms assume that the queries are in English and that the searcher prefers English language documents returned.

This means that the search results for the same query can vary depending on the language interface of the searcher. They can also vary depending on the location of the searcher (which is based on IP address) and if the searcher chooses to see results only from the specified language.
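To make the point about language-level rules concrete, here is a minimal sketch of accent folding that depends on the interface language. The `KEEP_BY_LANGUAGE` table and the `fold_for_language` function are hypothetical names made up for illustration; the only rule encoded is that Spanish treats ñ as its own letter. This is not Google's actual rule set, which the quoted article does not spell out.

```python
import unicodedata

# Hypothetical per-language exceptions: characters that are NOT folded to
# their base letter. Illustrative only; the real rules are not public here.
KEEP_BY_LANGUAGE = {
    "es": {"ñ", "Ñ"},  # ñ is a separate letter in Spanish
    "en": set(),       # assumption: an English interface folds everything
}

def fold_for_language(text, lang):
    """Strip accents from text, except characters the language keeps distinct."""
    keep = KEEP_BY_LANGUAGE.get(lang, set())
    out = []
    # NFC first so each accented letter is a single code point we can look up.
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            out.append(ch)
            continue
        # NFD splits a letter into base letter + combining marks; drop the marks.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)
```

With this sketch, `fold_for_language("año", "en")` folds the ñ away, while `fold_for_language("año", "es")` leaves the word unchanged, so the same query could match a different set of documents depending on the interface language.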

Technical Administrator

Group: Technical Administrators
Joined: 8-March 06
Posts: 2,650
From: Minneapolis/Saint Paul, MN
post Sep 5 2006, 12:01 PM
And, to be a bit more thorough, Google at least treats accented characters differently depending on your interface language.

How Search Results May Differ Based On Accented Characters and Interface Language

Basically, Google treats certain accented characters as equivalents, and will return results for both versions when either is searched. This is important for words which are commonly spelled with and without accents - the example provided above uses "Mexico" and "México." Your interface language impacts this because Google will prefer results in the correct spelling for your interface language.
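As a rough illustration of what "treating accented characters as equivalents" can look like, here is a small sketch of a toy index that folds accents at both indexing and query time, so "Mexico" and "México" find the same documents. The function names (`strip_accents`, `build_index`, `search`) and the sample documents are assumptions for this example; it says nothing about how Google implements it or how it ranks the results.

```python
import unicodedata
from collections import defaultdict

def strip_accents(text):
    """Remove combining marks after canonical decomposition (é -> e)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def build_index(docs):
    """Map each accent-folded, lowercased word to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[strip_accents(word).lower()].add(doc_id)
    return index

def search(index, query):
    """Look up a single-word query using the same folding as the index."""
    return index.get(strip_accents(query).lower(), set())

# Made-up sample documents
docs = {1: "Viaje a México", 2: "Travel to Mexico"}
index = build_index(docs)
```

Here both `search(index, "Mexico")` and `search(index, "México")` return documents 1 and 2. Preferring the spelling that fits the interface language would be a separate ranking step on top of this.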


Star Member

Group: 1000 Post Club
Joined: 10-March 05
Posts: 1,065
From: Montreal Canada
post Sep 5 2006, 12:08 PM
I have a site in French and find a lot of searches are without the accents but I have no stats. One of my top words is the French word îles (the little hat on top of the i). Most just type iles.

I also find that in languages other than English the SEs are not as good at ignoring noise words as they are in English. Also they are not too good at considering all variations of a letter (e.g. e, é, è) as the same.

Honorary Member

Group: Members
Joined: 30-August 02
Posts: 341
From: Fairfield, Iowa, USA
post Sep 5 2006, 12:42 PM
Thanks. Very helpful responses. Here's a related question. A client of mine, whose site is all in English, plans to translate about 10 pages into Spanish. Should he just add those to his current site, which is well-established and has good SE rankings? Would there be some compelling advantages to setting up a new site with a new domain name for the Spanish content? I can think of plenty of cons and only a few pros, which might include being able to submit the site to some Spanish search engines and directories. Then again, maybe I could do that with just his Spanish pages if he were to add them to his existing site. Your thoughts?

Technical Administrator

Group: Technical Administrators
Joined: 8-March 06
Posts: 2,650
From: Minneapolis/Saint Paul, MN
post Sep 5 2006, 12:51 PM
I think that a single bilingual site has a lot of advantages over two separate monolingual sites.

If the Spanish language pages have a Spanish interface (that is, you can "switch" the site into Spanish), then I think you could make a fair argument for adding the site to Spanish language directories regardless - so, as I see it, there are no benefits at all to making a new site. It would add expense, it would require you to start over in promotion, etc., etc.

Adding more content to the existing site would improve that site - and that's probably the more logical direction for you to go.

It may be worthwhile for your client to consider just translating the entire site into Spanish - although this depends a lot on the total number of documents this would involve, and on how much regular updates would add to the ongoing translation work.

Star Member

Group: 1000 Post Club
Joined: 10-March 05
Posts: 1,065
From: Montreal Canada
post Sep 5 2006, 08:40 PM
I looked at my stats for the past 5 months.

In French (Québec, Canada):

For the letters é, à, and è (the most popular), 39-44% of people use accents. These usually require 1 keystroke.

For î (circumflex i) only 20% use accents. My guess is that this is because it requires 2 keystrokes. I bet all 2-keystroke letters have about this average.
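For anyone who wants to pull the same kind of number out of their own referrer logs, here is a minimal sketch of one way to do it. The `accent_share` function and the sample queries are made up for illustration (they are not my actual stats); it counts how many of the query terms that match a word once accents are ignored were actually typed with the accents.

```python
import unicodedata

def strip_accents(text):
    """Remove combining marks after canonical decomposition (î -> i)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def accent_share(queries, accented_word):
    """Fraction of matching query terms typed with accents, or None if no match."""
    wanted = unicodedata.normalize("NFC", accented_word).lower()
    target = strip_accents(wanted)
    total = accented = 0
    for query in queries:
        for term in unicodedata.normalize("NFC", query).lower().split():
            if strip_accents(term) == target:
                total += 1            # term matches once accents are ignored
                if term == wanted:
                    accented += 1     # term was typed with the accents
    return accented / total if total else None

# Invented sample queries, for illustration only
sample_queries = [
    "îles de la madeleine",
    "iles de la madeleine",
    "hotel iles",
    "carte Îles",
    "iles",
]
share = accent_share(sample_queries, "îles")
```

In this invented sample, 2 of the 5 matching terms carry the circumflex, so `share` is 0.4.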