SEO URL Canonicalization

PostPosted: Wed Sep 03, 2008 11:03 pm
by oneelephantpickle
I have been thinking about URL canonicalization these days, so I ran a test against a single Google data center, just to be sure and objective. I picked the Google data center IP from Analytics and tested how Google treats word separators in URLs. My objective was to find out what gives us the maximum weight for a term in the URL. What I have found is that our SEO naming convention for URLs is a bit of a hangover from PHP: all URLs are underscored.

This test should amply demonstrate that it's better to use hyphens (-) than underscores (_).


Do a Google search for the term daryl-kennedy (daryl hyphen kennedy) on Google.com and you'll get 2280 results.
Now do the same search with an underscore, which is very much the existing practice at TechWyse now: Daryl_kennedy (daryl underscore kennedy). You see a measly 3 results. That's because the underscore is treated as a character, not a word separator, by the Google algo.
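
To make the difference concrete, here is a minimal sketch of a tokenizer that behaves the way the test suggests. It is only an analogy, not Google's actual algorithm: it assumes the engine splits on anything that is not a regex "word" character, and in Python's `\w` the underscore counts as a word character while the hyphen does not. The function name `url_terms` is made up for this example.

```python
import re

# Assumption: the tokenizer behaves like a regex word matcher, where "_"
# is part of a word and "-" is a separator. This is an analogy only.
WORD = re.compile(r"\w+")

def url_terms(slug: str) -> list[str]:
    """Split a URL slug into lowercase terms using the word matcher above."""
    return WORD.findall(slug.lower())

print(url_terms("daryl-kennedy"))  # ['daryl', 'kennedy']
print(url_terms("Daryl_kennedy"))  # ['daryl_kennedy']
```

Under that assumption the hyphenated slug matches pages containing both words, while the underscored slug only matches the single joined token, which would be consistent with the 2280 versus 3 result counts.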

What should our URL canonicalization practice be?

Re: SEO URL Canonicalization

PostPosted: Wed Sep 03, 2008 11:18 pm
by DJ
This thought is profound and a wonderful topic!

(No, not the fact that you are searching for Daryl Kennedy online!) Haha.

I'm wondering what happens if you do the same search using "-". Using dashes has become very popular and is recommended by many SEOs these days, and I'm wondering what that type of result reveals.

Great experiment sir!

Re: SEO URL Canonicalization

PostPosted: Wed Sep 03, 2008 11:28 pm
by C-Note
Excellent post!! And I am glad you did it.

I have been preaching the use of dashes for a while now.

Back in 2005 or so, Google admitted that dashes were treated as word separators. There was a rumour recently that underscores would get similar treatment, but I haven't seen enough evidence to support it, so I definitely recommend dashes.

Perhaps moving forward, team leads and PMs can ensure all our new sites are using dashes :P

Re: SEO URL Canonicalization

PostPosted: Thu Sep 04, 2008 12:24 am
by Prashant
First of all, it took me a while to understand what "canonicalization" meant. As far as I could gather from the wiki: "In computer science, canonicalization (abbreviated c14n, where 14 represents the number of letters between the C and the N) is a process for converting data that has more than one possible representation into a "standard" canonical representation. This can be done to compare different representations for equivalence, to count the number of distinct data structures, to improve the efficiency of various algorithms by eliminating repeated calculations, or to make it possible to impose a meaningful sorting order." I hope this is what was referred to... my favorite, dictionary.com, did not provide any results :(
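
Applied to URLs, that definition could look like the sketch below: several spellings of the same address are mapped to one form so they can be compared for equivalence. The rules here (drop "www.", lowercase the path, turn underscores into dashes, drop the trailing slash, port and fragment) are assumptions picked for illustration, not an official standard, and `canonical_url` and example.com are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url: str) -> str:
    """Map equivalent spellings of a URL to one form (illustrative rules only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""      # urlsplit already lowercases the hostname
    if host.startswith("www."):
        host = host[4:]              # assumed rule: prefer the bare domain
    # Assumed rules: lowercase path, dashes instead of underscores, no trailing slash.
    path = parts.path.lower().replace("_", "-").rstrip("/") or "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))

a = canonical_url("HTTP://WWW.Example.com/SEO_Tips/")
b = canonical_url("http://example.com/seo-tips")
print(a)       # http://example.com/seo-tips
print(a == b)  # True
```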

Anyway, coming to the point: I prefer using dashes, and I had said the same to DJ when starting on the TW site makeover. My preference has been greatly influenced by the similar practice followed by open source CMSs, namely WordPress. If you look it up, one of the main factors that made WordPress blogs and CMS sites a friend of Google is the SEO-friendly URL rewriting it offers, e.g. wordpress-is-seo-friendly
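
A rough sketch of that kind of slug generation is shown below. It is not WordPress's own code, just a simple stand-in that assumes ASCII titles; `slugify` is a name chosen for this example.

```python
import re

def slugify(title: str) -> str:
    """Turn a post title into a lowercase, dash-separated URL slug."""
    slug = title.lower()
    # Any run of characters other than a-z and 0-9 (spaces, underscores,
    # punctuation) collapses into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    return slug.strip("-")

print(slugify("WordPress is SEO friendly"))  # wordpress-is-seo-friendly
print(slugify("daryl_kennedy"))              # daryl-kennedy
```

Running existing underscored names through a function like this is one way to get consistent dashed URLs for new pages.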

Waiting for my man Rob to post his comments.

Re: SEO URL Canonicalization

PostPosted: Thu Sep 04, 2008 2:50 am
by oneelephantpickle

Re: SEO URL Canonicalization

PostPosted: Thu Sep 04, 2008 2:58 am
by Prashant
Just for reference, check out http://www.mattcutts.com/blog/dashes-vs-underscores/. I think this is one of the oldest posts covering the topic, from 2005, and it's still active :shock: ... the last comment has some good test results.

Re: SEO URL Canonicalization

PostPosted: Thu Sep 04, 2008 12:42 pm
by DJ
So there is our answer!

Dashes instead of underscores! Matt Cutts said it best here.

"That’s why I would always choose dashes instead of underscores. To answer a common question, Google doesn’t algorithmically penalize for dashes in the url. Of course I can only speak for Google, not other search engines. And bear in mind that if your domain looks like http://www.buy-cheap-viagra-online-whil ... g-porn.com, that may still attract attention for other reasons."

Re: SEO URL Canonicalization

PostPosted: Thu Sep 04, 2008 10:02 pm
by oneelephantpickle
That long, winding discussion shows how pedantic and hair-splitting we SEOs can get. Matt Cutts never intervenes and never backtracks in his support of dashes. I am beginning to think it's a source of great amusement for him to watch SEOs getting so worked up trying to establish a rule for what is perhaps a minor ranking weightage. Also, as he says elsewhere, it's a great source of ideas for him for the next level of algo tweaks.

That is because at the end of the indexing process all terms are tokenized, which means a sentence is not indexed as "buy cheap viagra online" but is broken into individual words, e.g. DOC1 -> buy, DOC1 -> cheap. This goes on for the entire corpus of documents in the collection until the search engine has built a dictionary. Tokenization thus strips any extra weightage a website may be getting for having a URL packed with keywords, whether they are separated by underscores or dashes. Once tokenized, the term-document matrix prepares the collection for boolean-type search: "A or B", "A and B". Once the ideal match for the query is retrieved, other algos such as co-occurrence check it for other textual factors.

What I am getting at is that by stripping the whole website down to only words, SEs may be neutralizing any advantage a website gets from a URL with underscores or a URL with dashes.
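
The tokenize-then-index idea in this post can be sketched roughly as follows. This is a toy inverted index with boolean AND/OR lookups, not how any real search engine is implemented. It assumes dashes and underscores are both treated as separators, which is exactly the "neutralizing" case speculated about above. The document ids and texts are made up.

```python
import re
from collections import defaultdict

def tokenize(text: str) -> list[str]:
    # Assumption: dashes and underscores are both separators here, so the
    # choice between them makes no difference to the resulting terms.
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search_and(index, a: str, b: str) -> set[str]:
    """Documents containing both terms ("A and B")."""
    return index.get(a, set()) & index.get(b, set())

def search_or(index, a: str, b: str) -> set[str]:
    """Documents containing either term ("A or B")."""
    return index.get(a, set()) | index.get(b, set())

docs = {
    "DOC1": "buy-cheap-shoes-online",
    "DOC2": "buy_cheap_tickets",
    "DOC3": "cheap flights online",
}
index = build_index(docs)
print(sorted(search_and(index, "buy", "cheap")))     # ['DOC1', 'DOC2']
print(sorted(search_or(index, "shoes", "flights")))  # ['DOC1', 'DOC3']
```

In this sketch DOC1 and DOC2 end up with the same kind of entries regardless of separator. Whether a real engine tokenizes URLs this way is the open question in this thread.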

Re: SEO URL Canonicalization

PostPosted: Tue Sep 09, 2008 12:49 am
by C-Note

Re: SEO URL Canonicalization

PostPosted: Tue Sep 09, 2008 12:51 am
by C-Note

Creating a Google-friendly URL structure

PostPosted: Wed Nov 05, 2008 7:11 am
by jay

Re: Google Adds Alternate & Hreflang Attributes

PostPosted: Mon Sep 13, 2010 10:06 pm
by jay
Google introduced new SEO-specific attributes for handling multilingual pages and content. The new attributes are rel="alternate" and hreflang="x".
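
As a small illustration of how these attributes are used, the sketch below builds one link tag per language version of a page. The helper name `hreflang_links`, the language codes and the example.com URLs are all made up for this example.

```python
from html import escape

def hreflang_links(alternates: dict[str, str]) -> list[str]:
    """Build one <link rel="alternate" hreflang="..."> tag per language version."""
    return [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        for lang, url in alternates.items()
    ]

# Hypothetical language versions of the same page.
for tag in hreflang_links({
    "en": "http://example.com/en/",
    "fr": "http://example.com/fr/",
}):
    print(tag)
```

The tags would go in the head of each language version of the page.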

Video search engine optimization

PostPosted: Mon Jan 17, 2011 6:17 am
by anneparker
This is something everyone should know, because the URL is how a website is recognized, and that is what brought me to learn about SEO. SEO is like giving technology a hand in promoting your business.

http://www.wizardmedia.net/