XHTML-to-RSS Extractor service [trial-release]

URL of source XHTML:
(see also W3C XHTML validator)
Convertor (XSLT file):
example data: example.xhtml (bristol copy).

What is this?

This service uses a generic webdata transformation service (an XSLT server) to convert from a dialect of XHTML to the proposed RSS 1.0 channel format.

Goal: author in XHTML, syndicate in RSS.

Disclaimer for RSS/HTML authors: this service works, but it is not yet documented for a non-engineer audience. This note gives only a quick overview of the basic idea; a proper 'quick start' guide is clearly needed. We also need to expand the various acronyms and finalise the HTML representation.

This page shows how a new Web technology, XSLT (the XML "style sheet" mechanism), can be used by online services to convert between document types dynamically.

Specifically, we provide a Web form that you can use to turn certain kinds of HTML document into the proposed RSS 1.0 channel / syndication format. This approach is designed to free content authors from the technical detail of evolving formats such as RSS, WAP/WML, RDF etc. Instead of learning dozens of new acronyms, content creators can produce XHTML documents, and have software tools do the rest.
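As a rough illustration of what the form does, the sketch below builds the kind of GET request a form like this might send to an XSLT server: one URL for the source XHTML and one for the convertor stylesheet. The endpoint `http://example.org/xslt` and the parameter names `xmlfile` and `xslfile` are assumptions made for this example, not the documented interface of this service.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; substitute the real service URL.
SERVICE_URL = "http://example.org/xslt"


def build_request_url(source_url, stylesheet_url, service_url=SERVICE_URL):
    """Return a GET URL asking an XSLT server to apply a stylesheet to a document.

    The parameter names 'xmlfile' and 'xslfile' are assumed, not confirmed.
    """
    query = urlencode({"xmlfile": source_url, "xslfile": stylesheet_url})
    return f"{service_url}?{query}"


url = build_request_url(
    "http://example.org/news.xhtml",     # the XHTML page you authored
    "http://example.org/xhtml2rss.xsl",  # the convertor (XSLT file)
)
print(url)
```

Fetching the resulting URL would, under these assumptions, return the RSS produced by applying the stylesheet to the page.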

HTML format details

TODO: more work needed on this section!

This service only understands the "XHTML" variation of HTML. XHTML is the World Wide Web Consortium's current recommendation for HTML content creation. (@@point to tutorials here). Your XHTML document will need to contain a few 'marker' attributes to help the extraction service turn your file into RSS. This is perfectly normal in HTML authoring: we are using the same mechanism that was designed to ensure that HTML pages can be viewed on devices of all kinds (PCs, televisions, mobile phones). In this case, the "device" viewing the file is itself a software program, and the result will be another file designed for consumption by other machines.
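To make the 'marker' idea concrete, here is a minimal sketch of marker-based extraction. It is written in Python rather than XSLT and is not the stylesheet this service uses. The marker value `class="rss:item"` and the sample page are assumptions made for illustration; the actual markers are defined by the convertor stylesheet.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative page: the class="rss:item" marker is an assumed convention.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example news</title></head>
  <body>
    <div class="rss:item">
      <a href="http://example.org/story1">First story</a>
    </div>
    <div class="rss:item">
      <a href="http://example.org/story2">Second story</a>
    </div>
    <p>Not marked, so ignored by the extractor.</p>
  </body>
</html>"""


def extract_items(xhtml_text):
    """Collect a title and link from every <div> carrying the assumed marker."""
    root = ET.fromstring(xhtml_text)
    items = []
    for div in root.iter(f"{{{XHTML_NS}}}div"):
        if div.get("class") != "rss:item":
            continue  # unmarked content is left out of the channel
        link = div.find(f"{{{XHTML_NS}}}a")
        if link is not None:
            items.append({
                "title": (link.text or "").strip(),
                "link": link.get("href"),
            })
    return items


items = extract_items(SAMPLE)
for item in items:
    print(item["title"], item["link"])
```

Because the input must parse as XML, a page that is not well-formed XHTML fails at the parsing step, which is one reason to check a page with a validator before submitting it.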

Further Reading


Eric van der Vlist produced the example data and XSLT file. W3C runs (except when it's offline) a generic XSLT service. Dan Connolly has produced a number of nice examples of such 'semantic web screenscraping' (see RDF Interest archives for details).


maintained by: danbri