I’m working on a project where I am using Grails as a content delivery layer for an XML based content management system. The content management system has the ability to publish XHTML content with inline links and images; however the inline img and a tags reference the content as relative paths. I need the links to internal pages to be absolute from the server root (ie /sub/content.html rather than sub/content.html) and I am hosting all static content like images at Amazon S3 so I need the image links to be absolute to a different DNS name (ie http://media.mywebsite.com/images/image.jpg rather than images/image.jpg).
Instead of implementing some hacks in the CMS itself, I opted to do the transformation in Grails as the content was rendered allowing greater flexibility. I created a taglib for rendering the XHTML block that does a Groovy replaceAll with regular expressions. The tag gets called like this: <g:processXHTML content=”${content.mainBody.text()}”/> . My GSP has reference to a content variable that is an XMLSlurped xml document with a mainBody tag. Here is the code for the tag:
/**
* Given XHTML content will return the same content with any relative a tag hrefs prefixed with a /
* and any relative images prefixed with the mediaURL
*/
def processIngeniuxXHTML = {attrs ->
def content = attrs.content
def rootURL = grailsApplication.config.mediaURL
if (content) {
// first look for links within the content that are relative URLS and prefix with a slash so they are
// resolved from the root rather than the current context
def regex = /(< \s*a\s+[^>]*href\s*=\s*[\"'])(?!http)([^\"'>]+)[\"'>]/
def replace = { fullMatch, a, b ->
if (b[0] != "/"){
// for some reason it's not outputing the closing quote so I added it at then end explicitly
"${a}/${b}\""
} else
"${fullMatch}"
}
content = content.replaceAll(regex, replace)
// second if there are any images referenced with relative URLs, prefix with the DNS for the S3 bucket
regex = /(< \s*img\s+[^>]*src\s*=\s*[\"'])(?!http)([^\"'>]+)[\"'>]/
replace = { fullMatch, a, b ->
if (b[0] != "/"){
// for some reason it's not outputing the closing quote so I added it at then end explicitly
"${a}${rootURL}/${b}\""
} else
"${fullMatch}"
}
out < < content.replaceAll(regex, replace)
}
}
And here are the test conditions to demonstrate what’s expected:
void testProcessXHTML() {
ContentUtilTagLib tl = new ContentUtilTagLib()
def i = """<div><a href="Explore/test"/>"""
def o = """<div><a href="/Explore/test"/></div>"""
tl.processXHTML(content: i)
assertEquals o, out.toString()
out.getBuffer().setLength(0)
i = """<div><a alt="blah" href="Explore/test"/></div>"""
o = """<div><a alt="blah" href="/Explore/test"/></div>"""
tl.processXHTML(content: i)
assertEquals o, out.toString()
out.getBuffer().setLength(0)
i = """<div><a href="Explore/test"/><img width="100" src="myImage/a.jpg"/></div>"""
o = """<div><a href="/Explore/test"/><img width="100" src="http://stub/myImage/a.jpg"/></div>"""
tl.processXHTML(content: i)
assertEquals o, out.toString()
out.getBuffer().setLength(0)
i = """<div><a href="/Explore/test"/></div>"""
o = """<div><a href="/Explore/test"/></div>"""
tl.processXHTML(content: i)
assertEquals o, out.toString()
out.getBuffer().setLength(0)
i = """<div><a href="http://Explore/test"/><img width="100" src="http://anotherhost/myImage/a.jpg"/></div>"""
o = """<div><a href="http://Explore/test"/><img width="100" src="http://anotherhost/myImage/a.jpg"/></div>"""
tl.processXHTML(content: i)
assertEquals o, out.toString()
out.getBuffer().setLength(0)
}
More related to this project to come…








