The post is also a write-up of what goes on behind the scenes when developing a tool. Website development is a lot like this process, except repeated many times and scaled up to a larger set of required functionality. So there are lessons in here for budding coders, and for the experts, please don't laugh :pieinface:
The project is called Open Keyword. It's a set of PHP scripts that work together to get keywords from Google Suggest, Yahoo! Live Search, Yahoo! Related Search, and Google Hot Trends. It's released under an Open Source licence, and you can download the first version from the Open Keyword home page on eKstreme.com. The idea here is to kick-start a collaborative development process in which we can shape this tool into something useful for everyone.
So how does it do its magic? Simple really: all the keyword sources I mentioned give out their data in XML format. In short, Open Keyword gets the XML and parses it. In my 1000th post, we talked about the basics of XML parsing. The code is still the same, but the source of the data has changed:
For Yahoo! Live Search, the source XML is http://livesearch.al...command=KEYWORD
For Yahoo! Related Search the source is http://api.search.ya...p;query=KEYWORD
For Google Suggest, the source is http://www.google.co...e...&qu=KEYWORD
Open Keyword works as follows: each of those URLs is retrieved for the keyword of interest and parsed. The output is then displayed as an HTML list. Tada! Keywords.
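For the budding coders, here is a minimal sketch of that parse-and-display step. It is not the actual Open Keyword code: the sample XML, its element names (`suggestions`, `suggestion`) and the two function names are made up for illustration, and the real feeds will use their own structure. It assumes PHP 7 or later with the SimpleXML extension.

```php
<?php
// Hypothetical sample response. The real feeds use their own element names.
$xml = <<<'XML'
<suggestions>
  <suggestion>seo tools</suggestion>
  <suggestion>seo tips &amp; tricks</suggestion>
</suggestions>
XML;

// Pull the keyword strings out of the (assumed) XML structure.
function extract_keywords(string $xml): array
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return []; // not well-formed XML, so no keywords
    }
    $keywords = [];
    foreach ($doc->suggestion as $node) {
        $keywords[] = trim((string) $node);
    }
    return $keywords;
}

// Turn the keywords into an HTML unordered list, escaping each one.
function render_keyword_list(array $keywords): string
{
    $items = '';
    foreach ($keywords as $keyword) {
        $items .= '  <li>' . htmlspecialchars($keyword, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return "<ul>\n" . $items . "</ul>\n";
}

echo render_keyword_list(extract_keywords($xml));
```

In a real run the XML string would come from fetching one of the source URLs with KEYWORD filled in, rather than from a hard-coded sample.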
Wait, that's it?
No.
See, we have a very cool member here on create who posts SE newslets about once a week (hey, Lee
XML is what's known as a meta-language, which means it is a language used to define other languages. RSS and Atom feeds are examples (or dialects, if you will) of XML, so you can see how fundamental XML is to Web 2.0. It turns out that the hourly hottest searches are released as an Atom feed. Since Atom is a standard feed format, we simply find a feed retrieval library. The very same question we're dealing with here was asked a few days ago and I mentioned two solutions: MagpieRSS or SimplePie. Coders are very loyal to one or the other, and it's a fight like Canon vs Nikon: it will never end and they're both as good as each other. My loyalty lies with Magpie, so that's what we use.
So we get MagpieRSS and tell it: please, oh please, fetch me the Google Hot Trends feed and tell me what it says. When you do that, you find that the feed's content is simply an HTML ordered list of links. Parsing that is easy in PHP: first we use regular expressions to get the list of links, then we use PHP's built-in functions to parse each link into its components. This gives us the keyword and the date it was hot on (a useful tidbit to store!). While we're at it, we also save the rank of each keyword, for a future feature that plots keyword trends over time.
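The two parsing steps can be sketched roughly like this. Everything specific here is an assumption made for illustration: the sample HTML, the `trends.example.com` URLs, the `q` and `date` query parameters and the `parse_hot_trends` function name are invented, and the real feed may lay out its links differently. The built-ins used (`preg_match_all`, `parse_url`, `parse_str`) are standard PHP.

```php
<?php
// Hypothetical feed content: an ordered list of links, one per hot keyword.
$html = <<<'HTML'
<ol>
<li><a href="http://trends.example.com/hot?q=open+source&amp;date=2008-01-15">open source</a></li>
<li><a href="http://trends.example.com/hot?q=php+sqlite&amp;date=2008-01-15">php sqlite</a></li>
</ol>
HTML;

function parse_hot_trends(string $html): array
{
    $rows = [];
    // Step 1: a regular expression collects every href in the list.
    if (!preg_match_all('/<a\s[^>]*href="([^"]+)"/i', $html, $matches)) {
        return $rows;
    }
    foreach ($matches[1] as $index => $href) {
        // Step 2: built-in functions split each link into its components.
        $url = html_entity_decode($href, ENT_QUOTES, 'UTF-8'); // &amp; becomes &
        $query = parse_url($url, PHP_URL_QUERY);
        if (!is_string($query)) {
            continue; // link without a query string
        }
        parse_str($query, $params); // also URL-decodes the values
        if (!isset($params['q'])) {
            continue; // no keyword parameter in this link
        }
        $rows[] = [
            'keyword' => $params['q'],
            'date'    => $params['date'] ?? null,
            'rank'    => $index + 1, // position of the link in the list
        ];
    }
    return $rows;
}

print_r(parse_hot_trends($html));
```

A regular expression is good enough for a small, predictable snippet like this one. For messier HTML, a proper parser such as PHP's DOM extension would be the safer choice.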
Where do we store the keywords? I racked my brains about this one, deciding whether to go with MySQL or SQLite. In the end, SQLite won out because I wanted to learn to use it. The nice thing about SQLite is that it's fast, easy to set up (Open Keyword takes care of it automatically) and built into PHP 5 and above. No need to trouble yourself with phpMyAdmin or whatnot. It's all ready.
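To give a feel for how little setup SQLite needs, here is a rough sketch using PHP's PDO interface. It is not Open Keyword's actual schema or code: the `hot_keywords` table, its columns and the three function names are invented for this example, and the in-memory database stands in for a real file path. It assumes PHP 7.1 or later with the `pdo_sqlite` extension enabled.

```php
<?php
// Open (or create) the database and make sure the table exists.
function open_keyword_db(string $dsn): PDO
{
    $db = new PDO($dsn);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec(
        'CREATE TABLE IF NOT EXISTS hot_keywords (
            keyword    TEXT NOT NULL,
            hot_date   TEXT NOT NULL,
            trend_rank INTEGER NOT NULL,
            PRIMARY KEY (keyword, hot_date)
        )'
    );
    return $db;
}

// Save a batch of rows; a repeated keyword/date pair overwrites the old rank.
function save_keywords(PDO $db, array $rows): void
{
    $stmt = $db->prepare(
        'INSERT OR REPLACE INTO hot_keywords (keyword, hot_date, trend_rank)
         VALUES (:keyword, :hot_date, :trend_rank)'
    );
    $db->beginTransaction();
    foreach ($rows as $row) {
        $stmt->execute([
            ':keyword'    => $row['keyword'],
            ':hot_date'   => $row['date'],
            ':trend_rank' => $row['rank'],
        ]);
    }
    $db->commit();
}

// Find stored keywords containing the search term.
function search_keywords(PDO $db, string $term): array
{
    $stmt = $db->prepare(
        'SELECT keyword FROM hot_keywords
         WHERE keyword LIKE :term
         ORDER BY hot_date DESC, trend_rank ASC'
    );
    $stmt->execute([':term' => '%' . $term . '%']);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// In-memory database for the demo; use something like "sqlite:/path/to/file" to persist.
$db = open_keyword_db('sqlite::memory:');
save_keywords($db, [
    ['keyword' => 'open source', 'date' => '2008-01-15', 'rank' => 1],
    ['keyword' => 'php sqlite',  'date' => '2008-01-15', 'rank' => 2],
]);
print_r(search_keywords($db, 'sqlite'));
```

Keeping one row per keyword per day is just one possible design. Storing every hourly snapshot instead would suit trend plotting better, at the cost of a bigger table.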
To collect the keywords, you need to set up a cron job or a scheduled task on the server that runs the script every hour. The script will fetch the keywords and save them to the database, where they'll be available for searching along with Google Suggest, Yahoo! Live Search and Related Search.
So in summary, the lessons are:
1. Learn about XML. Now. Do it.
2. Find libraries and other code you can use to make your life easier. A lot of programmers suffer from the "Not Invented Here" (NIH) syndrome. Cure yourself of this terrible illness!
3. The best way to learn a programming language or tool is to have a reason to learn it. For me, this time, that was SQLite.
4. ALWAYS save all the data a feed gives you. You never know when it will become useful. The Google Hot Trends data tells us the date and the rank of the keyword, so we save those. In the future, they'll come in handy.
5. (This one is not yet proven in SEO!) Open Source collaboration works. Look at Linux, Apache, Firefox and many others. Let's pitch in here to make something useful for all SEOs around the world!
And of course, the Open Keyword home page.
All this is really a little thank you to the community that shaped my development (pun intended) as a web programmer and as an SEO. Without you guys and gals here, I don't know what I would be doing with all my free time these days. So thank you, and come on, let's start coding! :thankyou:
Pierre






