WordPress Code Formatting
I finally got tired of dealing with the reformatting that WordPress does in its attempt to be “user friendly”. In general it does the right thing, but when you deal with code snippets inside <code> tags a lot, it can quickly become a problem.
I wanted to accomplish two things:
- Have whitespace matter
- Not have WordPress add extra line breaks or treat my angle brackets as HTML
1. Making whitespace matter is easily done with CSS
code {
    white-space: pre;
}
This CSS renders the text exactly as it appears in the source, including spaces and line breaks. That is just the right thing for code.
2. Getting WordPress to not muck with code blocks
This, on the other hand, requires coding a solution. Luckily someone has done the hard work for us. There is an existing plugin called Preserve Code Formatting that handles this. Basically, it looks through the HTML source of a post for <code> and <pre> blocks. When it finds those blocks, it removes all of the extra WordPress formatting and handles escaping HTML entity characters.
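To make the idea concrete, here is a minimal sketch of that kind of filter. It is written in Python purely for illustration and is not the plugin's actual PHP code; the function name `escape_code_blocks` and the regular expression are assumptions made for this example.

```python
import html
import re

# Matches <code>...</code> and <pre>...</pre> blocks, allowing attributes on
# the opening tag. DOTALL lets the body span multiple lines.
CODE_BLOCK = re.compile(
    r"(<(code|pre)\b[^>]*>)(.*?)(</\2>)",
    re.DOTALL | re.IGNORECASE,
)


def escape_code_blocks(post_html):
    """Escape &, < and > inside code/pre blocks and leave the rest untouched.

    Hypothetical helper for illustration; not the plugin's real API.
    """

    def _escape(match):
        opening, body, closing = match.group(1), match.group(3), match.group(4)
        # quote=False keeps quotation marks readable inside code samples.
        return opening + html.escape(body, quote=False) + closing

    return CODE_BLOCK.sub(_escape, post_html)


post = "<p>Example:</p><code>List<Address> addresses;</code>"
print(escape_code_blocks(post))
# prints: <p>Example:</p><code>List&lt;Address&gt; addresses;</code>
```

Note that this sketch would double-escape text that is already entity-encoded (an existing `&lt;` becomes `&amp;lt;`), which is one of the cases a real plugin has to guard against.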
The other thing I was running into was that WordPress was “closing” things that looked like HTML. I ran into this when I was writing code snippets that contained generics syntax.
I tracked that down to a writing setting in WordPress. Under Options -> Writing there is a checkbox that says: “WordPress should correct invalidly nested XHTML automatically”. When this option is enabled, WordPress will erroneously see certain things as HTML markup and try to create closing tags for them.
With this option selected, I would get:
List<address> addresses;
</address>
Instead of the correct output, which I got once I unselected the option:
List<Address> addresses;
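The mangled output is what you would expect from any tag balancer that cannot tell a generic type argument from an element. The following toy balancer, sketched in Python, reproduces the effect. It is not WordPress's actual balancing code; `naive_balance` and its matching rule are hypothetical and deliberately simplistic.

```python
import re

# Anything shaped like <name> or </name> is treated as a tag. A naive
# balancer cannot know that <Address> is a type argument, not an element.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")


def naive_balance(text):
    """Lowercase tag-like tokens and append closers for any left open.

    Hypothetical illustration only.
    """
    open_tags = []

    def _normalize(match):
        slash, name = match.group(1), match.group(2).lower()
        if slash:
            # A closing tag cancels the most recent matching opener.
            if open_tags and open_tags[-1] == name:
                open_tags.pop()
        else:
            open_tags.append(name)
        return "<" + slash + name + ">"

    balanced = TAG.sub(_normalize, text)
    # Close whatever is still open, innermost first.
    closers = "".join("\n</" + name + ">" for name in reversed(open_tags))
    return balanced + closers


print(naive_balance("List<Address> addresses;"))
# prints:
# List<address> addresses;
# </address>
```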
With the plugin in place, a bit of CSS, and one option turned off, I can now copy and paste code snippets into WordPress and not have to deal with formatting.
Next step…syntax highlighting.
Update:
The other thing I found is that the functions-formatting.php file has a function called ‘wpautop’. This function has a call that removes breaks from <pre> tags, so I copied that line and changed it to do the same thing for <code> tags.
// Same as the existing <pre> rule in wpautop(), applied to <code> blocks
$pee = preg_replace('!(<code.*?>)(.*?)</code>!ise', " stripslashes('$1') . stripslashes(clean_pre('$2')) . '</code>' ", $pee);



Brennan Stehling
Sep 26th 2006
For my Wordpress blog I simply turned off the wysiwyg editor and now hack the code as I like.
Making it smart enough to do syntax highlighting would be extremely useful. One approach would be to place your code into a code block with a class attribute denoting the language. Then I would activate some JavaScript to run through the code elements and highlight the keywords for the respective languages. That would be an excellent addition to one of the major JavaScript frameworks. And then auto-linking the import/using statements to the official online documentation would be even better.
Geoff Lane
Sep 26th 2006
That’s an interesting idea. You could import other files based on the class name to use as your template for syntax highlighting.
I’m gonna start working on it! :)
Naveen
Jul 4th 2007
Gr8 work :)
I am a PHP programmer. I have a requirement from my client that needs to be fixed.
He has wordpress-mu (1.2.1) installed, and several blog sites are running under it. This is his feedback:
“We’ve had issues posting HTML into wordpress. Wordpress wants to format the html after we post an article. Even if we only use the “Code” tab, the wordpress system will try to format what was pasted in. Here are some examples of the formatting:
formats out javascript tags
adds after each carriage return
changes — to –
Please look into creating a plugin that will bypass *all* the formatting when we post an article through the Code tab.”
Please suggest on this.
Thanks
Naveen
Geoff Lane
Jul 4th 2007
Naveen,
Check out the link in the article to the Preserve Code Formatting plugin.
Ryan
Jan 19th 2008
Thanks, that did the job perfectly :)