Let's Write A Keyword Research Tool
Posted 19 October 2007 - 06:10 PM
This post is also a write-up of what goes on behind the scenes when developing a tool. Website dev is a lot like this process, except repeated many times and scaled up to a larger set of required functionality. So there are lessons in here for the budding coders, and for the experts, please don't laugh.
The project is called Open Keyword. It's a set of PHP scripts that work together to get keywords from Google Suggest, Yahoo! Live Search, Yahoo! Related Search, and Google Hot Trends. It's released under an Open Source licence, and you can download the first version from the Open Keyword home page on eKstreme.com. The idea here is to kick-start a collaborative development process where we can shape this tool into something useful for everyone.
So how does it do its magic? Simple really: all the sources of keywords I mentioned give their data out in XML format. In short, Open Keyword gets the XML and parses it. In my 1000th post, we talked about the basics of XML parsing. The code is still the same, but the source of data has changed:
For Yahoo! Live Search, the source XML is http://livesearch.al...command=KEYWORD
For Yahoo! Related Search the source is http://api.search.ya...p;query=KEYWORD
For Google Suggest, the source is http://www.google.co...e...&qu=KEYWORD
Open Keyword works as follows: each of those URLs is retrieved for the keyword of interest and parsed. The output is then displayed as an HTML list. Tada! Keywords.
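For anyone who wants to see the parse-and-display step in code, here is a minimal sketch. It is not the actual Open Keyword code: the sample XML, the element names (`suggestions`, `suggestion`) and the function names `extractKeywords` and `renderKeywordList` are all made up for illustration, and each real source has its own XML layout. It also skips the retrieval step and starts from an XML string that has already been fetched.

```php
<?php
// Hypothetical response body. The real sources each use their own element names.
$sampleXml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<suggestions>
  <suggestion>keyword research</suggestion>
  <suggestion>keyword research tool</suggestion>
  <suggestion>keyword tool &amp; tips</suggestion>
</suggestions>
XML;

/**
 * Pull the text of every <suggestion> element out of an XML string.
 * Returns an empty array if the XML cannot be parsed.
 */
function extractKeywords($xml)
{
    $previous = libxml_use_internal_errors(true); // keep parse warnings quiet
    $doc = simplexml_load_string($xml);
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        return array();
    }

    $keywords = array();
    foreach ($doc->suggestion as $node) {
        $keywords[] = trim((string) $node);
    }
    return $keywords;
}

/** Render the keywords as an HTML unordered list, escaping each entry. */
function renderKeywordList(array $keywords)
{
    $html = "<ul>\n";
    foreach ($keywords as $keyword) {
        $html .= '  <li>' . htmlspecialchars($keyword, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return $html . "</ul>\n";
}

echo renderKeywordList(extractKeywords($sampleXml));
```

Escaping on output matters here because the keywords come from a third party and end up inside your own page.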
Wait, that's it?
See, we have a very cool member here on create who posts SE newslets about once a week (hey, Lee). In one of his posts, Lee mentioned that Google Hot Trends started releasing an hourly feed of the top keyword searches of that hour. Surely we can do something with that, no? You know the answer is yes, so let's find out how exactly.
XML is what's known as a meta-language, which means it is a language used to describe other languages. So RSS and Atom feeds are examples (or dialects if you will) of XML, and so you can see how fundamental XML is to Web 2.0. It turns out that the hourly hottest searches are released in an Atom feed. Since Atom is a standard feed language, we simply find a feed retrieval library. The very same question we're dealing with here was asked a few days ago and I mentioned two solutions: MagpieRSS or SimplePie. Coders are very loyal to either of them and it's a fight like Canon vs Nikon: it will never end and they're both as good as each other. My loyalty is to Magpie, so that's what we use.
So we get MagpieRSS and tell it: please oh please, fetch me the Google Hot Trends feed and tell me what it says. When you do that, you find that the feed's content is simply an HTML ordered list of links. Parsing that is easy using PHP: first we use regular expressions to get the list of links. Then we use the PHP built-in functions to parse each link into its components. This gives the keyword and the date it was hot on (useful tidbit to store!). While we're at it, we also save the rank of each keyword for a future feature of plotting trends of keywords over time.
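Here is a rough sketch of that regex-then-parse step. The sample markup, the example.com URLs, the query parameter names (`q` and `date`) and the function name `parseHotTrends` are assumptions for illustration only, so check them against the real feed before relying on any of it. The rank is taken to be the link's position in the ordered list.

```php
<?php
// Hypothetical feed content. The real markup and parameter names may differ.
$feedHtml = '<ol>'
    . '<li><a href="http://example.com/hottrends?q=pumpkin+carving&amp;date=2007-10-19">pumpkin carving</a></li>'
    . '<li><a href="http://example.com/hottrends?q=world+series&amp;date=2007-10-19">world series</a></li>'
    . '</ol>';

/**
 * Turn an HTML ordered list of links into rows of keyword, date and rank.
 * Assumes the keyword is in the "q" parameter and the date in "date".
 */
function parseHotTrends($html)
{
    $rows = array();

    // Step 1: a regular expression grabs the href of each link inside a list item.
    if (!preg_match_all('/<li[^>]*>\s*<a[^>]+href="([^"]+)"/i', $html, $matches)) {
        return $rows;
    }

    foreach ($matches[1] as $index => $href) {
        // Step 2: built-in functions split each link into its components.
        $url = html_entity_decode($href, ENT_QUOTES, 'UTF-8'); // &amp; becomes &
        $query = parse_url($url, PHP_URL_QUERY);
        if (!is_string($query)) {
            continue; // no query string on this link
        }
        parse_str($query, $params);
        if (!isset($params['q'])) {
            continue; // not a keyword link
        }

        $rows[] = array(
            'keyword' => $params['q'],
            'date'    => isset($params['date']) ? $params['date'] : null,
            'rank'    => $index + 1, // position in the ordered list
        );
    }
    return $rows;
}

print_r(parseHotTrends($feedHtml));
```

A regex is fine for markup this regular. If the feed's HTML ever gets messier, a proper HTML parser would be the safer choice.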
Where do we store the keywords? I racked my brains about this one, deciding whether to go with MySQL or SQLite. In the end, SQLite won out because I wanted to learn to use it. The nice thing about SQLite is that it's fast, easy to set up (Open Keyword takes care of it automatically) and built into PHP 5 and above. No need to trouble yourself with phpMyAdmin or whatnot. It's all ready.
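To give a feel for how little setup SQLite needs, here is a small sketch of creating the database on first use, saving keywords and searching them. It uses PDO's SQLite driver, which is my assumption for the example and not necessarily the interface Open Keyword uses. The table name `hot_keywords`, its columns and the three function names are also invented here.

```php
<?php
/** Open (or create) the SQLite database and make sure the table exists. */
function openKeywordStore($path)
{
    $db = new PDO('sqlite:' . $path);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    // Hypothetical schema: one row per keyword per date, with its rank.
    $db->exec(
        'CREATE TABLE IF NOT EXISTS hot_keywords (
            keyword    TEXT NOT NULL,
            hot_date   TEXT NOT NULL,
            trend_rank INTEGER NOT NULL,
            UNIQUE (keyword, hot_date)
        )'
    );
    return $db;
}

/** Save rows of keyword, date and rank; a repeat of the same keyword and date is overwritten. */
function saveKeywords(PDO $db, array $rows)
{
    $insert = $db->prepare(
        'INSERT OR REPLACE INTO hot_keywords (keyword, hot_date, trend_rank) VALUES (?, ?, ?)'
    );
    $db->beginTransaction();
    foreach ($rows as $row) {
        $insert->execute(array($row['keyword'], $row['date'], $row['rank']));
    }
    $db->commit();
}

/** Find stored keywords containing the search term. */
function searchKeywords(PDO $db, $term)
{
    $select = $db->prepare(
        'SELECT keyword, hot_date, trend_rank FROM hot_keywords
         WHERE keyword LIKE ? ORDER BY hot_date DESC, trend_rank ASC'
    );
    $select->execute(array('%' . $term . '%'));
    return $select->fetchAll(PDO::FETCH_ASSOC);
}

// An in-memory database keeps the example self-contained; use a file path in practice.
$db = openKeywordStore(':memory:');
saveKeywords($db, array(
    array('keyword' => 'pumpkin carving', 'date' => '2007-10-19', 'rank' => 1),
    array('keyword' => 'world series', 'date' => '2007-10-19', 'rank' => 2),
));
print_r(searchKeywords($db, 'series'));
```

Because the table is created with IF NOT EXISTS, the first run sets everything up and later runs reuse the same file.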
To collect the keywords, you need to set up the server to run a cron job or a scheduled task to run the script every hour. The script will fetch the keywords and save them to the database where they'll be available for searching along with Google Suggest, Yahoo! Live Search and Related Search.
So in summary, the lessons are:
1. Learn about XML. Now. Do it.
2. Find libraries and other code you can use to make your life easier. A lot of programmers suffer from the "Not Invented Here" (NIH) syndrome. Cure yourself of this terrible illness!
3. The best way to learn a programming language is to have a reason to learn it. SQLite was what I set out to learn here.
4. ALWAYS save all the data a feed gives you. You never know when it will become useful. The Google Hot Trends data tells us the date and the rank of the keyword, so we save those. In the future, they'll come in handy.
5. (This one is not yet proven in SEO!) Open Source collaboration works. Look at Linux, Apache, Firefox and many others. Let's pitch in here to make something useful for all SEOs around the world!
And of course, the Open Keyword home page.
All this is really a little thank you to the community that shaped my development (pun intended) as a web programmer and as an SEO. Without you guys and gals here, I don't know what I would be doing with all my free time these days. So thank you, and come on, let's start coding!
Posted 19 October 2007 - 06:16 PM
And congratulations to you for this milestone.
Thanks very much for all you have contributed to our community here and that would be a lot. You are a valued member. I've enjoyed your participation greatly AND you've given me some chuckles along the way too. Which is way important.
Here's to many more years at Cre8!
Posted 19 October 2007 - 06:40 PM
Here's a w00t w00t hooray for Pierre, and a shout out for the tool he wrote about in his 1000th post.
edited for typo
Edited by AbleReach, 19 October 2007 - 09:07 PM.
Posted 19 October 2007 - 08:07 PM
And thank you for this, and 1,999 other thoughtful and engaging posts.
Posted 20 October 2007 - 12:13 AM
I hope the movement catches on
Now, I need time to learn PHP and SQLite and maybe more coding languages to join, though.
Posted 20 October 2007 - 06:44 AM
Yuri, now is the time to learn PHP
Posted 20 October 2007 - 11:56 AM
This new venture certainly sounds like a very worthwhile endeavour and should give some further stimulating posts.
Posted 20 October 2007 - 12:14 PM
Posted 21 October 2007 - 01:53 PM
Just a few weeks ago I was wondering when the keyword-tools with Trends-links would come up - great on you for jumping into it! I've been thinking of some really neat applications for the Trends data, heh. I can't wait for some more neat stuff built from that.
Posted 21 October 2007 - 03:22 PM
Wow, people making so many milestone posts here!
Thank you, Pierre, for the many, many times you have helped me and for all of your thoughtful posts.
Posted 21 October 2007 - 05:21 PM
This is an idea I got a month or so ago and the 2k mark was motivation to actually do something about it. Hopefully with a ton of data to analyze, we can figure out some funky stuff. Like for example, can we assign a scale to the trends graphs? I wonder...