Question Migrating from forum with HTML posts

12 years 7 months ago #1 by DeeEmm
G'day all.

I'm in the middle of writing a migration script to migrate from Dolphin to JomSocial. For the forum component I am using Kunena (of course ;) ) but have hit a stumbling block with regard to post formatting.

Dolphin uses TinyMCE as a WYSIWYG editor and, after filtering the input, stores the posts in the database as HTML. Kunena escapes the HTML, so when a post is displayed the HTML tags appear as literal text instead of being rendered.
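
To illustrate the symptom described above, here is a minimal sketch that uses PHP's htmlspecialchars() as a stand-in for whatever escaping Kunena applies internally (that it behaves this way is an assumption; Kunena's own code is not shown here). The sample post content is made up.

```php
<?php
// Sample HTML as a TinyMCE-based forum might store it (made-up content).
$stored = '<p>Hello <strong>world</strong></p>';

// Escaping turns markup into entities, so a browser shows the tags as text.
$escaped = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// prints: &lt;p&gt;Hello &lt;strong&gt;world&lt;/strong&gt;&lt;/p&gt;
```

This is why migrated posts need converting to BBCode before import instead of being inserted as raw HTML.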



I started to write an HTML-to-Kunena BBCode parser using RegEx, but the HTML that TinyMCE generates is almost infinitely varied, which is a bit of a headfark to code because each possible variation needs to be handled. :blink: :side: :pinch:

Before I embark on spending hours writing an overly complicated parser, is there any way to (easily) allow the HTML to be displayed? Maybe there is another, better way?

I've read the posts relating to Kunena policy on HTML editors (interesting thread that one :silly: ) and can appreciate that what I am asking may be considered a security issue, but I would counter that the posts being migrated have already been filtered, and all subsequent posts will utilise the standard BB code.

Any help / comments / suggestions greatly appreciated.

TIA

/DM
12 years 7 months ago - 12 years 7 months ago #2 by DeeEmm
Got bored...

Decided to convert the main tags and discard anything else. Whilst this will lose any formatting that was contained within the post, I decided that for most people migrating this would probably not be an issue.

Might revisit this later.

Still open to suggestions if anyone has any ideas.
Code:
function html2kbb($html2kbb) {
    // convert hrefs (note: this prepends http:// to whatever the href contains)
    $html2kbb = preg_replace('/<a\s[^>]*href=\"([^\"]*)\"[^>]*>(.*)<\/a>/siU', '[url="http://$1" ]$1[/url]', $html2kbb);

    // convert smileys before images
    $html2kbb = str_replace('<img src="../plugins/tiny_mce/plugins/emotions/img/smiley-laughing.gif" border="0" alt="Laughing" title="Laughing" />', ':D', $html2kbb);
    $html2kbb = str_replace('<img src="../plugins/tiny_mce/plugins/emotions/img/smiley-embarassed.gif" border="0" alt="Embarassed" title="Embarassed" />', ':-[', $html2kbb);
    //...more to add ...

    // convert images
    $html2kbb = preg_replace('/<img\s[^>]*src=\"([^\"]*)\"[^>]*\/>/siU', '[img]$1[/img]', $html2kbb);

    // convert <p> + <br> tags
    $html2kbb = str_replace("<br>", "\r\n", $html2kbb);
    $html2kbb = str_replace("<br/>", "\r\n", $html2kbb);
    $html2kbb = str_replace("<br />", "\r\n", $html2kbb);
    $html2kbb = str_replace("<p>", "\r\n", $html2kbb);
    $html2kbb = str_replace("</p>", "\r\n", $html2kbb);

    // convert headings
    // (closing tags use double quotes so that \r\n becomes a real line break)
    $html2kbb = str_replace('<h2>', '[b]', $html2kbb);
    $html2kbb = str_replace('</h2>', "[/b]\r\n", $html2kbb);
    $html2kbb = str_replace('<h3>', '[b]', $html2kbb);
    $html2kbb = str_replace('</h3>', "[/b]\r\n", $html2kbb);
    $html2kbb = str_replace('<h4>', '[b]', $html2kbb);
    $html2kbb = str_replace('</h4>', "[/b]\r\n", $html2kbb);
    $html2kbb = str_replace('<h5>', '[b]', $html2kbb);
    $html2kbb = str_replace('</h5>', "[/b]\r\n", $html2kbb);
    $html2kbb = str_replace('<h6>', '[b]', $html2kbb);
    $html2kbb = str_replace('</h6>', "[/b]\r\n", $html2kbb);

    // convert bold and italic
    $html2kbb = str_replace('<strong>', '[b]', $html2kbb);
    $html2kbb = str_replace('</strong>', '[/b]', $html2kbb);
    $html2kbb = str_replace('<em>', '[i]', $html2kbb);
    $html2kbb = str_replace('</em>', '[/i]', $html2kbb);

    // strip everything else...
    $html2kbb = strip_tags($html2kbb);

    return $html2kbb;
}
Last edit: 12 years 7 months ago by DeeEmm. Reason: Daydreamin

12 years 7 months ago - 12 years 7 months ago #3 by Matias
If you use our converter as a base, there is already a class that converts some TinyMCE tags into bbcode. It does that by using DOM, which is much more reliable than regexps, which can fail on long input.

Check export.php parseHTML() and export_example.php (and phpbb2 & ccboard) in the models directory.
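
For readers who want a feel for the DOM approach, below is a rough sketch using PHP's built-in DOMDocument. It is not the importer's parseHTML(); the function names htmlToBbcode() and nodeToBbcode() are hypothetical, and the tag mapping and the [url=...] syntax are assumptions modelled on the tags handled earlier in this thread.

```php
<?php
// Hypothetical helper: parse an HTML fragment and return BBCode for a few common tags.
function htmlToBbcode(string $html): string
{
    $doc = new DOMDocument();
    // Suppress warnings about sloppy markup; the parser recovers on its own.
    $previous = libxml_use_internal_errors(true);
    // The meta tag hints UTF-8; the wrapper div gives us a single known root.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div>' . $html . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);
    return $root === null ? '' : trim(nodeToBbcode($root));
}

// Hypothetical helper: convert one node and its children recursively.
function nodeToBbcode(DOMNode $node): string
{
    if ($node instanceof DOMText) {
        return (string) $node->nodeValue;
    }
    if (!($node instanceof DOMElement)) {
        return ''; // comments and similar nodes carry no post text
    }

    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= nodeToBbcode($child);
    }

    switch (strtolower($node->nodeName)) {
        case 'a':
            $href = $node->getAttribute('href');
            return $href === '' ? $inner : '[url=' . $href . ']' . $inner . '[/url]';
        case 'img':
            return '[img]' . $node->getAttribute('src') . '[/img]';
        case 'br':
            return "\n";
        case 'p':
            return $inner . "\n";
        case 'h2': case 'h3': case 'h4': case 'h5': case 'h6':
            return '[b]' . $inner . "[/b]\n";
        case 'strong': case 'b':
            return '[b]' . $inner . '[/b]';
        case 'em': case 'i':
            return '[i]' . $inner . '[/i]';
        default:
            return $inner; // unknown tags: keep the text, drop the markup
    }
}
```

Because the parser normalises the markup first, attribute order, extra attributes and tag case stop mattering; for example, htmlToBbcode('<p>Hello <STRONG class="x">world</STRONG></p>') yields Hello [b]world[/b].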

PS. You can fork and freely modify our importer here:
github.com/Kunena/com_kunenaimporter-1.6

That also allows everyone to benefit from your work.
Last edit: 12 years 7 months ago by Matias.

12 years 7 months ago #4 by DeeEmm
G'day Matias,

Thanks for the reply.

You are correct - I have already found some limitations with the regexes, string length being one of them (not all posts are being converted).
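
One possible explanation for posts silently failing to convert (an assumption; the thread does not confirm the cause): preg_replace() returns null when PCRE hits an error such as its backtrack limit, and assigning that result back wipes the post. A minimal sketch of a guard follows; the wrapper name safePregReplace() is hypothetical.

```php
<?php
// Hypothetical wrapper: run preg_replace(), but keep the original text when PCRE fails.
// $error receives the preg_last_error() code (PREG_NO_ERROR on success).
function safePregReplace(string $pattern, string $replacement, string $subject, ?int &$error = null): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    $error = preg_last_error();

    if ($result === null) {
        // Failure (for example PREG_BACKTRACK_LIMIT_ERROR on long input):
        // return the untouched subject so the post is not lost.
        return $subject;
    }

    return $result;
}

// Usage: convert <br> variants and report posts that could not be processed.
$post = 'line one<br />line two';
$converted = safePregReplace('/<br\s*\/?>/i', "\n", $post, $code);
if ($code !== PREG_NO_ERROR) {
    error_log('PCRE error code ' . $code . ' while converting a post');
}
```

Logging the error code makes it possible to list exactly which posts were skipped and handle them separately.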

I will take a look into the converter and let you know how I get on.

Thanks for the tips.

Regards.

/DM
