In addition to being a valid Atom 1.0 feed, the import file must give each entry a unique self-referencing link: <link rel="self" type="application/atom+xml" href="post-id"/> The href does not have to resolve to anything real, it only has to be unique, so I used example.com in my export from Roller. The import process also rejects empty tags for the post or comment author's email, name or uri (according to the RNC schema, only email cannot be empty).
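The two requirements above can be checked before attempting an import. The following is a minimal sketch using only the Python standard library; the function name check_feed and the wording of the messages are my own, not part of Roller or blogger.com, and it only covers the two quirks described here, not full Atom validation.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"


def check_feed(xml_text):
    """Return a list of problems found in an Atom feed (hypothetical helper)."""
    problems = []
    root = ET.fromstring(xml_text)
    hrefs = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id", default="(no id)")
        # Every entry needs its own rel="self" link.
        self_links = [link for link in entry.findall(ATOM + "link")
                      if link.get("rel") == "self"]
        if not self_links:
            problems.append(f"{entry_id}: missing rel=self link")
        hrefs.extend(link.get("href") for link in self_links)
        # Empty <name>, <uri> or <email> children are rejected on import.
        for author in entry.findall(ATOM + "author"):
            for child in author:
                if not (child.text or "").strip():
                    tag = child.tag.replace(ATOM, "")
                    problems.append(f"{entry_id}: empty author <{tag}>")
    # The self links must be unique across the whole feed.
    for href, count in Counter(hrefs).items():
        if count > 1:
            problems.append(f"self link {href} used {count} times")
    return problems
```

Calling check_feed(open("export.xml", "rb").read()) on a saved export (the file name is just an example) returns an empty list when neither problem is present.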
The following is the template I used to export posts and comments from Apache Roller 4.0. It is based on an earlier export template for JRoller by Damien Bonvillain that output Atom 0.3, with updates for Roller 4.0, Atom 1.0 and blogger.com's undocumented quirks. The exported content still needs some post-processing: removing or filling in empty author child tags, checking that truncated comment titles or malformed content have not broken anything, and replacing relative references (which shouldn't be there in the first place). Generally, though, it works for short blogs. There seems to be a hard-coded limit of 100 posts in getRecentWeblogEntries, so the template needs rework to use a pager, plus an external script to fetch all the pages.
As with the original, paste the contents below into a new Roller template, then use the template to access your blog. If you have too many posts for blogger.com or the export script to handle, you could try using a pager in the export template, or break up the export file manually after extracting everything.
#set($entries = $model.weblog.getRecentWeblogEntries('', 100))
<?xml version="1.0" encoding='utf-8'?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0">
  <id>$model.weblog.id</id>
  <title>$utils.escapeXML($model.weblog.name)</title>
  <subtitle>$utils.escapeXML($model.weblog.description)</subtitle>
  <updated>$utils.formatIso8601Date($model.weblog.lastModified)</updated>
#foreach( $entry in $entries )
  <entry>
    <id>$entry.id</id>
    <title>$utils.escapeXML($entry.title)</title>
    <author>
      <name>$entry.creator.fullName</name>
    </author>
    <published>$utils.formatIso8601Date($entry.pubTime)</published>
    <updated>$utils.formatIso8601Date($entry.updateTime)</updated>
    <content type="html"><![CDATA[$entry.text]]></content>
    <category scheme="http://schemas.google.com/g/2005#kind" term="http://schemas.google.com/blogger/2008/kind#post"/>
    <link rel="self" type="application/atom+xml" href="http://example.com/$entry.id"/>
  </entry>
## use Atom threading extensions for comment annotation
#foreach( $comment in $entry.comments )
  <entry>
    <id>$comment.id</id>
    <title>$utils.escapeXML($utils.truncate($utils.removeHTML($comment.content), 40, 50, "..."))</title>
    <author>
      <name>$utils.escapeXML($comment.name)</name>
      <uri>$utils.escapeXML($comment.url)</uri>
      <email>$comment.email</email>
    </author>
    <published>$utils.formatIso8601Date($comment.postTime)</published>
    <updated>$utils.formatIso8601Date($comment.postTime)</updated>
    <content>$utils.escapeXML($utils.removeHTML($comment.content))</content>
    <thr:in-reply-to ref="$entry.id" type="application/atom+xml" href="$entry.permalink"/>
    <category scheme="http://schemas.google.com/g/2005#kind" term="http://schemas.google.com/blogger/2008/kind#comment"/>
## the self link must be unique per entry, so use the comment id here, not the post id
    <link rel="self" type="application/atom+xml" href="http://example.com/$comment.id"/>
  </entry>
#end
#end
</feed>
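The "removing or filling in empty author child tags" step mentioned earlier can be scripted rather than done by hand. This is a rough sketch using the Python standard library, not the script I actually used: the function name fix_authors and the "Anonymous" placeholder are assumptions made for illustration. It drops empty name, uri and email elements and, because Atom requires every author to have a name, puts a placeholder name back where none is left.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
THR_NS = "http://purl.org/syndication/thread/1.0"
ATOM = "{" + ATOM_NS + "}"

# Keep the familiar prefixes when the tree is written back out.
ET.register_namespace("", ATOM_NS)
ET.register_namespace("thr", THR_NS)


def fix_authors(xml_text, placeholder_name="Anonymous"):
    """Remove empty author children; placeholder_name is an assumed default."""
    root = ET.fromstring(xml_text)
    for author in root.iter(ATOM + "author"):
        # Iterate over a copy because children are removed inside the loop.
        for child in list(author):
            if not (child.text or "").strip():
                author.remove(child)
        # An Atom author must still contain a <name>.
        if author.find(ATOM + "name") is None:
            name = ET.Element(ATOM + "name")
            name.text = placeholder_name
            author.insert(0, name)
    return ET.tostring(root, encoding="unicode")
```

Note that ElementTree writes the CDATA sections back out as ordinary escaped text, which is equivalent XML. I have not confirmed how blogger.com treats the two forms, so it is worth trying the result on a small export first.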
2 comments:
You had the pager along my template :-D I'll take your modifications, I still haven't imported my old posts ^^;
Hi Damien,
There were a few bugs in the template in this post; see the next post, written after I had subjected it to more testing.