Abandon all hope you who read this, for this is where you will find how simple programming problems can turn out to be really hard. For our current project we wanted to implement a seemingly simple functionality – when the user posts a comment via a text area new lines should be converted to "<br />" in the resulting HTML, HTML tags should be encoded so our project would not be vulnerable to script injections and users would be able to post HTML in the comments and finally URLs should be detected and converted to links. Sounds simple? As soon as I was charged with this task I remembered reading a
post by the wise Jeff Atwood on his famous blog
Coding Horror about how hard URLs can be. The post deals with the issue of a closing parenthesis at the end of a URL and I immediately decided that I was going to ignore this issue like most systems do. However while working on this task I hit many more walls...