While preparing the last set of upgrades for the no-blog I stumbled upon some strange issues with different technologies. I reasoned that I might help some of you or some poor googling soul by posting about them here.
So the first think I did was create a bbcode-like syntax for writing html tags. In short I am replacing [] tags with <> HTML tags. That is needed because I HtmlEncode all my input. To do this I used this very elite regex:
"\[((<A-Z><A-Z0-9>*)\b[^\]]*)](.*?)\[/\2]"
to capture the bracket tags and replace them with equivalent HTML tags. I used the backreference groups to preserve the different portions of the tag. At first I was iterating through matches and constructing the result string but then
Peter Dimov came up with much more elite solution:
regex.Replace(arg, "[$1]$3[/$2]");
The first backreference group contains the tag together with all the attributes, the second contains the tag name and the third contains the value of the tag. I was thinking that I could use recursion with the iteration approach to handle nested tags but I needed only hyperlinks and already had much more (any HTML tag) so I abandoned the idea. One more cool thing about this regex – did you notice the lazy star?
Next I went on to build an RSS feed. RSS format turned out to be quite easy. You can build your feed in many ways and you will not have many problems using only + operator and strings. However I chose the most elite approach which is LINQ to XML. I used
this guide as a reference. The cool thing here is how you embed the LINQ query inside of the XElement constructor. The author of the guide calls this approach "functional construction" of XML. It looks like LISP, doesn't it? Here is the example for those of you that do not want to bother with opening the link:
XDocument document = new XDocument(
new XDeclaration("1.0", "utf-8", null),
new XElement("rss",
new XElement("channel",
new XElement("title", "Employees of Northwind Traders Inc."),
new XElement("link", "http://www.northwindtraders.com"),
new XElement("description", "Employees of Northwind Traders Inc. ordered by last name."),
from emp in context.Employees
orderby emp.LastName
select new XElement("item",
new XElement("title", emp.LastName + ", " + emp.FirstName),
new XElement("description", emp.Notes),
new XElement("link", "http://www.northwindtraders.com/employees.aspx?id=" + emp.EmployeeID)
)
),
new XAttribute("version", "2.0")));
This builds the whole feed with the content that comes from Northwind database. The query saves us the iterations that we would have to do to generate the items if we were using standard approach. I bet some of you will find this ugly but I can bet these same people think LISP is ugly. I feel sorry for you. Obviously you do not grok the beauty of LISP. Go ahead with your primitive, iterative approach. Even if I try to explain you will not grok it.
While building the LINQ query I incidentally used wrong property in some string concatenation. The concatenation went to the SQL Server and it resulted in SQL Exception. I did not need this concatenation anyway but I wondered why it is not working. The exception stated that NTEXT (the type of the field) does not support the "+" concatenation operator. That is OK but that would mean that the LINQ to SQL provider has a bug. When I dug deeper it turned out that NTEXT is obsolete and someday it is gonna be deprecated. In SQL Server 2005 you are supposed to use NVARCHAR(MAX) so I guess I am more than 3 years behind. While NTEXT is mapped to string as it is supposed to be it is not fully supported by the LINQ to SQL provider so some of the operators fail. These operators are fully present for NVARCHAR(MAX). Another data type that is to be deprecated is IMAGE which is to be replaced by VARBINARY(MAX). I did not have any idea that these types are to be replaced so I used them for large values as I was taught in the university. I had some adventures with migration. As it turns out it is not that hard to alter the data type of the column but if you want maximum performance you need to update the column with the same value. Otherwise NVARCHAR(MAX) will work like NTEXT for short text which is supposed to be slow. After a single update the new column will be working normally (i.e. faster).
The other problem I had was that the OutputCache directive for the ASP.NET page was not working for my feed. It is good that I decided to test the caching because otherwise I would probably never know. As it turned out after some investigation the OutputCache directive does not work if you use Response.End() within the page. Some sources claim that Response.Flush() has the same effect but I did not test it. I was exploring some other options like using HTTP Handler for the feed but manual caching seemed too much work (i.e. more than five lines of code, plus I am not sure if manual caching will do better than OutputCache directive). If someone knows a way to automatically cache HTTP Handler please let me know. Now I have an empty page for the RSS feed. It is annoying that there is validation warning because there are no html and body tags but I guess unless someone points me to a better way I will have to deal with it.
So that is what I did last night. What did you do? :)