Cre8asiteforums Internet Marketing
and Conversion Web Design


Social Responsibility Mucks Up Indexing

11 replies to this topic

#1 naturalwoman


    Ready To Fly Member

  • Members
  • 11 posts

Posted 23 July 2007 - 10:06 AM

I'm working on a site with Flash-heavy content that needs users to complete an age verification script before they can enter any page (lucky me!).

The sIFR technique would allow me to present indexable content first -- making the site better for usability and SEO -- but since visitors without scripting and Flash could be search engine spiders OR underage visitors, it does not address the social responsibility goal.

I know some subscription-based sites have addressed indexing by offering optimized teaser copy but again that wouldn't meet the social responsibility goal of keeping the content from people who have not completed the age verification.

Does anyone have a thought on this? I'm really stumped.


#2 bobbb


    Sonic Boom Member

  • Hall Of Fame
  • 3379 posts

Posted 23 July 2007 - 10:47 AM

I'm quite certain I saw something about giving Google a password when you are using their sitemap stuff.

#3 Ron Carnell

    Honored One Who Served Moderator Alumni

  • Invited Users For Labs
  • 2069 posts

Posted 23 July 2007 - 11:04 AM

I applaud your attention to social responsibility.

But I think that has to be carried over to the search engines, too. If you can't show your content to underage surfers then the engines shouldn't show it to underage surfers either. And that's exactly what will happen, via snippets and cached pages, if the site is indexed.

#4 naturalwoman


    Ready To Fly Member

  • Members
  • 11 posts

Posted 23 July 2007 - 12:34 PM

I believe that snippets and cached pages can be controlled. The plan would be to use the NOARCHIVE robots meta tag to keep pages out of the cache and <META NAME="GOOGLEBOT" CONTENT="NOSNIPPET"> to suppress snippets in Google.
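
As a rough illustration, the robots meta tags in question (NOSNIPPET, plus NOARCHIVE, which is the directive that suppresses the cached copy) could be generated by a small helper like the one below. This is only a sketch: the function name `robots_meta` and its defaults are hypothetical, and how each engine honors the directives should be checked against its own documentation.

```python
from html import escape


def robots_meta(bot="robots", directives=("noarchive", "nosnippet")):
    """Build a robots meta tag in the same upper-case style used in this thread.

    `bot` is "robots" for all crawlers or a crawler name such as "googlebot".
    Both the helper and its defaults are illustrative assumptions.
    """
    content = ", ".join(d.upper() for d in directives)
    return f'<META NAME="{escape(bot.upper())}" CONTENT="{escape(content)}">'


if __name__ == "__main__":
    # One tag for every crawler, one aimed only at Google's crawler.
    print(robots_meta())
    print(robots_meta("googlebot", ["nosnippet"]))
```

These tags only limit what the engines display; they do nothing to gate the page itself.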

The point would be to describe what the page is about ... not provide the content in the SERPs. Any user clicking the listing in the SERPs would see the age verification window before any content.

Last week 11 major brands announced policies to curtail marketing of things like Happy Meals and cereal with more than 12 g of sugar to kids under 12.

My 10-year-old daughter was actually deterred from signing up for cyworld by their age verification page, and I actually know parents who restrict their kids' TV watching and do not take kids under 13 to PG-13 movies. Maybe that's just because I live in the Midwest :angel: but I don't think you can claim that those efforts are wasted.

My point is that this indexing issue isn't just a problem for sites marketing to people over 21.

#5 sharkeo


    Unlurked Energy

  • Members
  • 3 posts

Posted 24 July 2007 - 09:59 AM

Hi Naturalwoman,

Interesting topic. I'm curious to know the topic/theme of the website, as this can also have a bearing on the SEO strategy. I could be way off track, but my thoughts are below:

I recently discussed a similar topic with a colleague regarding a financial website. The client wanted the site to remain indexable but prevent users from viewing any page of the site unless they had accepted the terms and conditions. This was due to the strict financial regulations.

* Prevent users from viewing the website unless they have accepted the terms and conditions, irrespective of where they arrive at the site from.

Solution 1
Allow visitors to arrive at the site via search engines, but prevent them from viewing or selecting any page or link, unless they accept the terms and conditions. The terms and conditions are initiated in a pop-up as soon as a user selects a link on the page.

Solution 2
Provide a standalone page, which remains indexable for search engines, but only provide limited information about the site and why verification is required.

Although I agree Social Responsibility is a must, the other problem is that, no matter what measures are put in place, as with anything, if underage users wish to access something, they will nearly always find a way around it.

The same could be said for other industries such as the film industry and movie trailers, specifically "Harry Potter And The Order Of The Phoenix" (12+ Film). The film is for kids over 12 only, but I think it would be almost impossible to prevent underage children from finding out about the film. (Maybe this is a silly example, but I'm sure you get the idea).

I hope this helps. :thumbs:

#6 naturalwoman


    Ready To Fly Member

  • Members
  • 11 posts

Posted 24 July 2007 - 10:53 AM

Hi sharkeo,
Ideally, they would want their content indexed, but the age verification hides the content until visitors complete it. My guess is that corporations believe they just have to put up the roadblock; they aren't responsible for the people who circumvent it. Still, I would NOT argue with anyone who believes that these roadblocks make the content that much more attractive to underage visitors. But I haven't seen any more robust technical solutions being used for age verification.

In Solution 1 you said:
"The terms and conditions are initiated in a pop-up as soon as a user selects a link on the page."
I don't think that would work in this situation because the client wouldn't want that content visible for any period of time until the AV was completed.

Solution 2 wouldn't help either because the client's goal is to get all their content indexed ... not just a placeholder page. Since it's a competitive space, you'd need fully optimized content to be competitive.

bobbb, thanks for the suggestion. The current age verification doesn't require registration, so there's no password. It's just JavaScript.

Google has suggested using IP delivery for subscription-based news sites
(http://www.google.co...0543&topic=8871) ... However, when this client asked the question they were told that option was only for publishers.

#7 JohnMu


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 3519 posts

Posted 24 July 2007 - 11:31 AM

The IP delivery for Google's crawlers is only for news sites. The password/authentication which Bobbb mentioned is for the AdSense bot, not for the web crawler.

As I understand it, the problem is that the user should always see an age verification page before being able to view the content while a search engine should be able to crawl the site naturally. That idea alone goes against the Google Webmaster Guidelines, "Don't deceive your users or present different content to search engines than you display to users (...)" -- no matter how you solve the issue technically.

Personally, I do not think that there is a way to handle it without breaking that guideline.

There are a few ways it could be done, if you are willing to ignore that guideline, e.g.:
- IP+user-agent delivery
- client side coding (javascript) + cookies
- server side coding + cookies

Are you willing to take that risk? Would your client be willing?
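
To make the "server side coding + cookies" option concrete, here is a minimal sketch of the gate by itself, with no crawler exception. The cookie name `age_verified`, the `/verify-age` path, and the `respond` function are all hypothetical, and a real site would need a signed or server-side session value rather than a plain cookie a visitor can set by hand.

```python
from http.cookies import SimpleCookie
from urllib.parse import quote

AGE_COOKIE = "age_verified"   # hypothetical cookie name
VERIFY_PATH = "/verify-age"   # hypothetical age verification page


def is_verified(cookie_header):
    """Return True if the request's Cookie header carries the verification flag."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    morsel = jar.get(AGE_COOKIE)
    return morsel is not None and morsel.value == "1"


def respond(path, cookie_header=""):
    """Return (status, headers, body) for a request to `path`."""
    if is_verified(cookie_header):
        return 200, {}, f"<p>Full content for {path}</p>"
    # Unverified visitors are sent to the gate and never receive the content.
    location = f"{VERIFY_PATH}?next={quote(path, safe='')}"
    return 302, {"Location": location}, ""


if __name__ == "__main__":
    print(respond("/products", "age_verified=1"))
    print(respond("/products"))
```

A crawler sends no cookie, so under this sketch it is redirected like any other unverified visitor and the content is not indexed. Letting it through would mean adding a user-agent or IP exception, which is exactly the different-content-for-engines behavior the guideline describes.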


#8 naturalwoman


    Ready To Fly Member

  • Members
  • 11 posts

Posted 24 July 2007 - 12:54 PM


Yep, that's my conclusion as well. The client's IT department came back with an approach that was definitely cloaking but wanted assurance that they won't damage their branded url.

The best practice solution for the engines is to present the content at page load and the age verification after ... this is obviously not the best practice for social responsibility.

I think I'm turning gray. They say you start to see things more in shades of gray as you age. Or is it just your hair color that's supposed to turn?

#9 Jozian


    Light Speed Member

  • Members
  • 583 posts

Posted 24 July 2007 - 01:53 PM

Google Guidelines: Don't deceive your users or present different content to search engines than you display to users

This guideline certainly seems necessary in order for Google to be able to mechanically index information effectively, but it has a major flaw that naturalwoman has stumbled across: How can content owners and Search Engines provide access to information that is not completely free? (subscription, registration, verification, advice).

I think that this limitation is the primary contributor to search engine dissatisfaction. Often seekers can't find an answer because not everything is indexed/displayed: books, periodicals, subscription data, subject matter expertise... And of course PPC and Organic rank are both based upon a commercialized ideal of value.

Search engines work, most of the time, for most of the people, with most of the information needed. No doubt about that. Google's valuation and omnipresence makes that clear. But I think naturalwoman has found a big hole where that value equation breaks down.

Sidebar: Isn't Google already, in a sense, violating its own policy with full-text search services like Google Scholar? I could be wrong, but it's my understanding that Google Scholar provides a full-text search of articles, but that the full articles are not always displayed - you might have to subscribe or register to see the whole document. Even if Scholar isn't doing full-text search (maybe they only do abstract search), certainly that will change at some point if they want the value proposition to mature. The same goes for the Google Library digitization projects.

What about marketing research, for example? Why do I have to search Google, then Google Scholar, then my library OPAC, then a federated search of subscription data, then site/service-specific searches of Forrester, Nielsen, eMarketer....

Sorry I'm ranting more than providing you a solution :) But your problem points to a larger one.


#10 Black_Knight


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 9417 posts

Posted 24 July 2007 - 03:09 PM

Okay, there is a way to balance those needs against the 'fair play' guidelines for Google, and that is to actually sort through the content in such a way as to separate 'safe' areas of content from 'unsafe - verification requiring' content on your end, dynamically through the server.

Think of it like posting spoilers, where those spoiler parts are removed from the overall page unless the user is logged in. To 'log in' requires age verification. But the rest of the page, minus those words or passages marked as sensitive in your database, would be displayed to all users, including googlebot, and thus could be indexed.

That's about the best 'safe' compromise solution I can come up with without knowing a lot more specifics about the site/client than I tend to cover outside of paid consultancy. However, it should be plenty to show that the way out of this problem is to simply think creatively about what needs to be achieved, and not ways around one or other extreme end of the spectrum of solutions.

Googlebot must get enough content to index the site reasonably.
Any unverified user (regarding age) must not be presented certain content.
Googlebot is unverified in age, and must be presented that content.
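
A minimal sketch of the spoiler-style approach described above, assuming each passage carries a "sensitive" flag in the database. The `Passage` class and `render` function are hypothetical names for illustration, not anything the site in question actually uses.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    sensitive: bool = False  # hypothetical flag stored alongside the content


def render(passages, verified):
    """Join the passages a visitor may see.

    Sensitive passages are dropped for anyone who has not completed age
    verification, which includes search engine crawlers.
    """
    visible = [p.text for p in passages if verified or not p.sensitive]
    return "\n\n".join(visible)


if __name__ == "__main__":
    page = [
        Passage("General description of the product line."),
        Passage("Age-restricted details.", sensitive=True),
        Passage("Company history and contact information."),
    ]
    print(render(page, verified=False))  # what Googlebot and unverified users get
    print("-----")
    print(render(page, verified=True))   # what a verified visitor gets
```

Because the unverified view is the same for crawlers and people, the check depends only on the visitor's verification state and never on who is asking.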

PICS ratings are the only really effective online signal of content unsuitable for children. Correctly choosing/setting an adult PICS rating can help your site/pages be blocked automatically by 'net-nanny' type software for child safety, and blocked from all "family-safe" search and other SERPs where the user has not chosen to allow adult-oriented results.

#11 Jozian


    Light Speed Member

  • Members
  • 583 posts

Posted 25 July 2007 - 10:17 PM

I found an example of subscription data being indexed as full-text in Google.

Only a portion of the text is displayed in the landing page - unless you subscribe to a service. In fact the portion that matched my search - 'Harry Seldon' - does not appear on the landing/abstract page at all.

It ranks very high - Number 6 on the first page of results.

* I searched for 'Harry Seldon' in G and got the following returned:

SOCIOLOGY: Network Theory-the Emergence of the Creative Enterprise ...
In the Foundation Trilogy, Isaac Asimov placed psychohistorian Harry Seldon so far into the future that Earth, the birthplace of the Galactic civilization, ...
www.sciencemag.org/cgi/content/full/308/5722/639 - Similar pages - Note this

* Clicking the link shows a page that doesn't contain 'Harry Seldon'

Network Theory--the Emergence of the Creative Enterprise
Albert-László Barabási
In both the arts and sciences, there are certain characteristics that set apart creative "dream teams". But can these characteristics be explained mathematically, asks Barabasi in his Perspective? A new study that mathematically examines the creative teams responsible for Broadway shows and landmark scientific papers reveals the key elements that underpin team creativity and success (Guimerá et al.).
The author is in the Center for Complex Network Research and the Department of Physics, University of Notre Dame, Notre Dame, IN 46556, USA. E-mail: alb@nd.edu

Read the Full Text

* Clicking the Read the Full Text link tells me I have to sign up.

Clearly a guideline violation - but should it be?

In this particular search instance, I would rather have a partial result than miss the reference completely. If it wasn't indexed like this by Google, I would have missed the reference completely. The only way to find it would have been a subscription to a service that I might not actually need (or a trip to the library ;) ).

It seems to me that when we search for different things, or search in different ways, there ought to be different ways to rank and display the data. And maybe a way to control the depth of search. You can accomplish this somewhat by visiting different search engines that rank things differently, or visiting specific free or fee databases.

But why not develop a taxonomy of the types of information people are seeking and apply different page rank heuristics based upon the desired outcome?

If I'm doing marketing research rather than trying to find a hotel in Orlando, shouldn't the PR equation know that periodical references should, for instance, be given more weight?


#12 DianeV


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 7216 posts

Posted 27 July 2007 - 03:52 AM

One quick note: I see that you're targeting Google SERPs, but the solution should actually be workable for all search engines.
