...few months ago, we deployed an XML-XSLT website that contains rich data content. Rumor has it that this approach does little for SEO. Since we don't accept a claim until we have tried it ourselves, we decided to deploy a version of a blog website....
Some additional considerations
Early in development, we learned that Google crawls both HTML and XML content but treats the two differently. To test this, we searched for indexed content from HTML-formatted and XML-formatted (with XSL, of course) documents, and only the HTML pages appeared on Google's search engine results page (SERP). XML documents do appear at some point, but only if the searched text is near an indexed link. So with an HTML content page, if a specific word appears anywhere on the page, then once the page is indexed, and assuming nobody else ranks for that keyword, a Google search will show your page with the keyword in bold. With XML/XSLT content, you need some luck for the page to be displayed (for example, by adding the keyword to some linking page).
With this in mind, we made some changes so that the XML is rendered using well-known HTML tags that affect SEO. We added a title node, a description, keywords.... The result so far: the page is indexed much like a normal HTML content page, but still not exactly like pages with HTML-formatted content.
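One way to confirm that the title, description, and keywords actually make it into the rendered output is to scan the transformed HTML for them. The sketch below is only an illustration: it assumes you already have the transformed page as a string (for example, saved from a browser), and the names `SeoTagCollector` and `missing_seo_tags` are made up for this example.

```python
from html.parser import HTMLParser


class SeoTagCollector(HTMLParser):
    """Collects the <title> text and named <meta> tags from an HTML string."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            # Store e.g. {"description": "...", "keywords": "..."}
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def missing_seo_tags(html, required=("description", "keywords")):
    """Return the names of SEO tags that are absent or empty in the page."""
    collector = SeoTagCollector()
    collector.feed(html)
    collector.close()
    missing = []
    if not collector.title.strip():
        missing.append("title")
    missing.extend(name for name in required if not collector.meta.get(name))
    return missing


# Hypothetical rendered output with no keywords meta tag.
sample = (
    "<html><head><title>My post</title>"
    '<meta name="description" content="A short summary"/>'
    "</head><body></body></html>"
)
print(missing_seo_tags(sample))  # ['keywords']
```

This only tells you the tags are present in the output; it says nothing about how a search engine will treat them.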
Compatibility Issues
We also experienced some compatibility issues. One showed up on the first-generation Galaxy Tab: its Google-based browser can't render XML-XSLT correctly and displays XML documents as if they were plain text, even when the server sends an XML content type.
Testing took some time as well, especially when making changes to complex XSL templates. When something goes wrong, Chrome just displays nothing, Firefox displays an XSLT error page, and IE is blank too, although the XSLT error code is visible in the console.
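Since a broken template can leave the browser showing nothing at all, a quick check outside the browser can catch the simplest class of mistakes first. The sketch below, using only Python's standard library, verifies that a stylesheet is well-formed XML and has an XSLT root element. It is an assumption-laden illustration: `check_stylesheet` is a made-up helper, and it will not catch XSLT-level errors such as a bad XPath expression or a missing named template.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by XSLT stylesheets.
XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def check_stylesheet(xsl_text):
    """Return None if the text parses as a stylesheet, else an error message."""
    try:
        root = ET.fromstring(xsl_text)
    except ET.ParseError as exc:
        # Includes the line and column reported by the parser.
        return f"not well-formed: {exc}"
    allowed = (f"{{{XSL_NS}}}stylesheet", f"{{{XSL_NS}}}transform")
    if root.tag not in allowed:
        return f"unexpected root element: {root.tag}"
    return None


# A tiny, hypothetical stylesheet used only for this example.
good = (
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:template match="/"><html/></xsl:template>'
    "</xsl:stylesheet>"
)
# Dropping a closing tag simulates a typo made while editing a template.
broken = good.replace("</xsl:template>", "")

print(check_stylesheet(good))
print(check_stylesheet(broken))
```

Running this before opening the page in Chrome or IE at least rules out a mismatched or unclosed tag as the cause of a blank screen.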
SERP
Our template follows a fixed format: a common template for the HTML, of which only the title, the meta tags (description, keywords), and the section holding the page contents are included on the SERP. From a content perspective, I think plain text and XML are both pure data, so I find it odd that Google doesn't cache it as completely as HTML-formatted content.
Resolution
We are going to revert to XHTML-Strict. We tried HTML5 and it works, but unfortunately there are still lots of old browsers out there. :(