Crawlable AJAX Content with ASP.NET Web Forms

   A few days ago I was asked how one can develop a page in ASP.NET Web Forms that fulfills the following requirements:

   1. The page should display a product.
   2. There should be an AJAX pager (next/previous or normal pages).
   3. The user should be able to copy the URL from the address bar and send it to other people, who should then see the same product.
   4. If the user pastes the link into Facebook, the right product information should be displayed.

   So how can we fulfill these requirements using good old Web Forms with update panels?

   We will start with a history lesson. As you probably know, the web is really broken, and because it is broken it is full of ugly hacks. One of these hacks is known as JavaScript (invented by Netscape) and another is known as AJAX (invented by Microsoft). The AJAX technique allowed the document... I mean the application to change the content on the screen without a page refresh. However, the document viewers... I mean the browsers had no way to change the URL without refreshing the page, except for the hash (#) part. The hash part of the URL is not sent to the server, so in order to use it the developer needs JavaScript that runs when the page loads, reads the hash and displays the appropriate content via a redirect or another AJAX call.

   To make matters worse, search engines and other crawlers cannot execute JavaScript, so AJAX content is invisible to them. To fix this Google introduced yet another hack. Their crawler reads hashes that start with an exclamation mark (#!) and converts them to a query string argument named "_escaped_fragment_". The developer should then accept that query string argument and serve a static version of the page. The hack was adopted by other companies, and this is how Facebook requests a page when a user posts a link. A better solution in the form of the History API is coming, but we need a solution today (and yesterday, when this hack was invented).
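   To make the "#!" to "_escaped_fragment_" conversion concrete, here is a rough sketch of what a crawler following the convention does to a URL before requesting it. The function name toEscapedFragmentUrl is made up for illustration, and the set of characters it escapes (%, #, &, + and space) is an assumption; real crawlers may escape a different set.

```javascript
// Hypothetical helper, not part of the page below: approximates how a crawler
// that follows the "#!" convention rewrites a URL before requesting it.
function toEscapedFragmentUrl(url) {
    var marker = url.indexOf("#!");
    if (marker === -1) {
        return url; // no "#!" hash, nothing to rewrite
    }
    var base = url.substring(0, marker);
    var fragment = url.substring(marker + 2);

    // Assumption: only a few special characters are percent-encoded.
    var escaped = fragment.replace(/[%#&+ ]/g, function (c) {
        return "%" + c.charCodeAt(0).toString(16).toUpperCase();
    });

    // Keep any existing query string and append the new argument to it.
    var separator;
    if (base.indexOf("?") === -1) {
        separator = "?";
    } else if (base.charAt(base.length - 1) === "?") {
        separator = "";
    } else {
        separator = "&";
    }
    return base + separator + "_escaped_fragment_=" + escaped;
}
```

   Under these assumptions a URL ending in "Default.aspx#!7" would be requested as "Default.aspx?_escaped_fragment_=7", and an existing query string would be kept in front of the new argument.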

   How do we implement this in Web Forms? Here is how:

   The markup:

<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="_Default"
   ViewStateMode="Disabled" %>

<!DOCTYPE html>
<html>
<head id="Head1" runat="server">
   <title>Web Forms Is So Cool! Just like StarCraft.</title>
</head>
<body>
   <form id="form1" runat="server">
   <asp:ScriptManager ID="ScriptManager1" runat="server">
   </asp:ScriptManager>
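   <script>
       // If the page was opened with a "#!" hash, redirect to the equivalent
       // normalized URL (?ID=...) so the server can render the right product.
       var pagePath = "<%= Request.Path %>" + "?ID=";
       var index = location.hash.indexOf("!");
       if (index > -1) {
           location = pagePath + location.hash.substring(index + 1);
       }

       // After every partial postback, copy the current product ID
       // from the hidden field into the hash using the "#!" convention.
       Sys.WebForms.PageRequestManager.getInstance().add_endRequest(
           function (sender, e) {
               var id = document.getElementById("hfProductID").value;
               location.hash = "#!" + id;
           });
   </script>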
   <div>
       Some header content generated at:
       <asp:Literal ID="ltDateTime" runat="server"></asp:Literal>
       <%--Some BRs because I don't know anything about CSS--%>
       <br />
       <br />
       <asp:UpdatePanel ID="UpdatePanel1" runat="server" UpdateMode="Conditional">
           <ContentTemplate>
               <asp:Label ID="lbProductInfo" runat="server"></asp:Label>
               <br />
               <asp:Image ID="img" runat="server"></asp:Image>
               <br />
               <asp:LinkButton ID="lbtnPrevious" runat="server" Text="Previous" OnClick="lbtnPrevious_Click"></asp:LinkButton>
               <asp:LinkButton ID="lbtnNext" runat="server" Text="Next" OnClick="lbtnNext_Click"></asp:LinkButton>
               <asp:HiddenField ID="hfProductID" runat="server" ClientIDMode="Static" />
           </ContentTemplate>
       </asp:UpdatePanel>
       <br />
       <br />
       Some footer content like a real website.
   </div>
   </form>
</body>
</html>

The code behind:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;

public partial class _Default : System.Web.UI.Page
{
   private Product[] Products { get; set; }
   protected void Page_Init(object sender, EventArgs e)
   {
       //Real OOP programmers aim for encapsulation
       Products = new[]
       {
           new Product { ID = 1, Title = "StarCraft II: Wings of Liberty", Image="sc2WoL.jpg" },
           new Product { ID = 2, Title="StarCraft II: Heart of the Swarm", Image="sc2HoS.jpg"},
           new Product { ID = 3, Title="StarCraft II: Legacy of the Void", Image="sc2LoV.jpg"}
       };
       //this is why I'm encapsulating my data in my source code.

       ltDateTime.Text = DateTime.Now.ToString();
   }

   protected void Page_Load(object sender, EventArgs e)
   {
       if (!IsPostBack)
       {
           int productID;
           if (Request["_escaped_fragment_"] != null && Int32.TryParse(Request["_escaped_fragment_"], out productID))
           {
               DisplayProductInfo(productID);
           }
           else if (Request["ID"] != null && Int32.TryParse(Request["ID"], out productID))
           {
               DisplayProductInfo(productID);
           }
           else
           {
               DisplayProductInfo(1);
           }
       }
   }

   protected void lbtnNext_Click(object sender, EventArgs e)
   {
       int productID;
       if (int.TryParse(hfProductID.Value, out productID))
       {
           DisplayProductInfo(productID + 1);
       }
   }

   protected void lbtnPrevious_Click(object sender, EventArgs e)
   {
       int productID;
       if (int.TryParse(hfProductID.Value, out productID))
       {
           DisplayProductInfo(productID - 1);
       }
   }

   private void DisplayProductInfo(int productID)
   {
       var product = Products.SingleOrDefault(p => p.ID == productID);
       if (product != null)
       {
           hfProductID.Value = productID.ToString();
           lbProductInfo.Text = product.Title;
           img.ImageUrl = product.Image;
           img.AlternateText = product.Title; //If you don't set the alt text you are a bad person!

           // Hide "Previous" on the first product and "Next" on the last one.
           lbtnPrevious.Visible = Products[0] != product;
           lbtnNext.Visible = Products.Last() != product;
       }
   }
}

public class Product
{
   public int ID { get; set; }
   public string Title { get; set; }
   public string Image { get; set; }
}


   I have created a page that displays a single product (picture and description) in an UpdatePanel. There are two LinkButtons for the next and the previous product. I have simulated a data store and paging with an array and some proof-of-concept code. Please do not copy this part.

   When the page is first requested it checks for the "_escaped_fragment_" query string argument and, if it is present, displays the product specified there. Otherwise it checks for a query string argument called ID and displays the product with that ID. If neither is present it displays the first product. The page has to check for "_escaped_fragment_" first because search engines do not remove the existing query string; if the normal ID argument took precedence, "_escaped_fragment_" would never be handled. This logic runs only on the initial GET request. When the user navigates via the LinkButtons the page posts back and the query string arguments are ignored.

   The next thing we need to do is change the URL when the product is changed by an AJAX request. To achieve this I have added a HiddenField to the page and I store the ID of the currently displayed product in it. I subscribe to the endRequest client-side event provided by the Microsoft AJAX framework, read the hidden field (which has just been updated because it is inside the UpdatePanel) and put its value in the hash using the "#!" convention. At this point I have done enough for the bots to be able to see my page.

   There is more work to be done. Remember when I said that the hash is not sent to the server? Without additional code the server has no way of knowing which product is currently displayed when the "next" or "previous" button is clicked. Good thing we have this hidden input on the page, whose value is posted to the server. I read that value in the LinkButtons' click handlers and use it to determine which product should be displayed next.

   The last thing I have done is check for a hash when the page first loads on the client and, if there is one, set the location property (i.e. redirect) to the equivalent URL without the hash. This way the URL is normalized. Another option would be to put the value from the hash in the hidden field and force an AJAX request, but that solution is more complex and would result in longer and more complicated URLs.
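   The normalization step can also be written as a small pure function, which makes it easier to see what happens for each kind of hash. This is only a sketch: normalizedUrlFor is a hypothetical name, and unlike the script in the page it accepts only hashes that start with "#!" and encodes the value before putting it in the query string.

```javascript
// Hypothetical helper: returns the normalized URL for a "#!" hash,
// or null when the hash does not follow the "#!" convention.
function normalizedUrlFor(pagePath, hash) {
    if (hash.indexOf("#!") !== 0) {
        return null; // empty hash or an ordinary in-page anchor
    }
    var value = hash.substring(2);
    return pagePath + "?ID=" + encodeURIComponent(value);
}

// In the browser it could be wired up roughly like this:
// var target = normalizedUrlFor(location.pathname, location.hash);
// if (target !== null) { location.replace(target); }
```

   Using location.replace instead of assigning to location is a design choice in this sketch: it keeps the hash URL out of the browser history, so the Back button does not bounce the user into the redirect again.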

   There it is: AJAX + unique URLs + crawlable pages (you can test by posting a URL with a hash on Facebook), and all of this in Web Forms with UpdatePanels, no ViewState or other things that the Web fascists may consider evil.

   I have uploaded a demo to demonstrate this technique in action. The demo may be removed at some point in the future.

   Some additional considerations:

   1. If you are a time traveler coming from the future where everyone uses a browser that supports the History API, this example will work for you as well, but instead of the hash you should modify the whole URL (via the History API), and you do not need to normalize the URL or read the "_escaped_fragment_". I am happy that Web Forms is still used in the future and that we have survived the MVC genocide.
   2. In this case the "_escaped_fragment_" is just a numeric ID. In the real world, however, you may need to pass more information in the hash. If so, remember to UrlDecode the fragment because it arrives encoded.
   3. The search engine can crawl URLs that users post elsewhere, but it cannot discover other products via the website itself because the LinkButtons are not real links. The simplest solution is to render actual links with normalized URLs and set "display:none" via CSS. This way the links are visible to the search engine but invisible to the user.
   4. I use next/previous LinkButtons here because it is simpler for this example, but the technique works with actual paging too. In that case you should store the page number instead of the ID in the hidden field and set it programmatically on postback.
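   For the time travelers from point 1, the History API variant might look roughly like the sketch below. The function name pushProductUrl is hypothetical, and the history object is passed in as a parameter only so the function can be tried outside a browser; in a page you would pass window.history.

```javascript
// Hypothetical sketch: replaces the "#!" hash update with a History API call.
// historyObject is expected to expose pushState(state, title, url),
// as window.history does in browsers that support the History API.
function pushProductUrl(historyObject, pagePath, productId) {
    var url = pagePath + "?ID=" + encodeURIComponent(String(productId));
    // Record the product ID as state so a popstate handler could restore it.
    historyObject.pushState({ productId: productId }, "", url);
    return url;
}

// Inside the endRequest handler this could be called roughly as:
// pushProductUrl(window.history, location.pathname,
//     document.getElementById("hfProductID").value);
```

   With this approach the address bar already shows the normalized ?ID= URL, so neither the client-side redirect nor the "_escaped_fragment_" check is needed. Handling the Back button (the popstate event) is left out of the sketch.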
Tags:   english programming 
Posted by:   Stilgar
03:36 28.11.2011

Comments:


Posted by   Guest (Unregistered)   on   19:45 14.07.2012

Posted by   Stilgar   on   00:29 15.07.2012

No, in this example it will crawl http://test.sietch.net/ajaxurl/Default.aspx?_escaped_fragment_=1 (no ID=) because the original URL in this particular example was
http://test.sietch.net/ajaxurl/Default.aspx?#!1
If it was
http://test.sietch.net/ajaxurl/Default.aspx?#!ID=1
Google would indeed crawl what you suggested.

Posted by   ehsan (Unregistered)   on   09:05 04.03.2014

tnx very good
