Firstly it should be noted that this is really just for public facing publishing sites. In other words we're talking about using just the WCM features of MOSS. You are not going to get your fully featured Intranet to be conformant in the current releases of SharePoint. The reason for this is that SharePoint generates out a lot of non W3C validating code. This is largely due to the richness of many SharePoint features like web parts.
Some of the techniques mentioned in this article are equally applicable to standard ASP.NET sites. So if you're totally unfamiliar with XHTML validation it may be worth having a look at the Microsoft technical article on Building ASP.NET 2.0 Web Sites Using Web Standards. It is also likely that some of these techniques will be unsupported by Microsoft; so use at your own risk.
Web.ConfigThe first step is to add the conformance configuration setting to your web.config file. This will force certain asp.net controls to only output attributes that comply with XHTML standards.
<xhtmlConformance mode="Strict" />
This is only really important if you are trying to conform to the strict standard. If the setting is omitted the default output for ASP.NET controls is XHTML transitional. I prefer to add the tag in any way to make it explicit. Unfortunately many SharePoint controls are still going to output non-compliant tags.
Master Page BasicsThe master page is where you are going to do the majority of the validation work. It really pays to start fresh with the Microsoft minimal master page. You are going to cause yourself a serious amount of pain if you try and tweak one of the out-of-the-box master pages. It is also best to start off using a page that is based on a blank page layout. This will help identify where any validation errors are actually coming from. The first thing to do with the master page is set the doctype to reflect the xhtmlconformance setting. For example if I'm using the strict standard my doctype will be:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
You can also add a few XHTML attributes to the html tag:
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
At this stage it is worth performing a W3C validation check to get a feel for the types of errors that need to be fixed.
My initial check yielded a horrifying 232 validation errors, with no content or navigation. Did I say this was going to be easy?
If you scroll through some of the errors you should notice that most of them are to do with invalid attributes. The majority of the errors come from only a few controls; namely the site actions menu and publishing console (referred to as Authoring Controls from here on).
Authoring ControlsThese controls aren't as big a problem as they first appear to be. We only really care about W3C compliance for anonymous (public) accessing users. As the authoring console is security trimmed when a normal public user accesses the site the authoring html won't be rendered at all. You can prove this by enabling anonymous access, logging out and then trying to view the page anonymously. Make sure you have checked in and published the master page and layout page or you won't be able to see the changes. The page will only display the welcome control displaying the 'Sign In' message. A quick validation check should show only a dozen or so validation errors. About half of these errors will have something to do with the site actions control. Although it has been removed from the page visually it is still rendering some HTML. There is an easy to solution to this problem; the SPSecurityTrimmedControl. This control allows blocks of content to only render when the user has specified a specified permission set. By wrapping the site action control inside the security trimming control and setting a required permission, the site action control is prevented from rendering any HTML at all.
<SharePoint:SPSecurityTrimmedControl PermissionsString="AddAndCustomizePages" runat="server">
<PublishingSiteAction:SiteActionMenu runat="server"/> </SharePoint:SPSecurityTrimmedControl>
Remove/clean non-compliant HTMLUpon making a closer examination of the remaining validation errors it becomes obvious that most of them are to do with poorly declared script tags. Unfortunately these script tags are usually generated by the HtmlForm control so it's not easy to override the output. The one technique that can be applied is to override the Page.Render method and do a bit of tag cleaning. This effectively lets us hijack the HTML rendering process and have a chance to add, modify or remove parts. It does involve writing a bit of inline code in the master page. Assuming that you have set your master page to allow inline code we can add some code similar to the following in the master page.
<script type="text/c#" runat="server">
protected override void Render(HtmlTextWriter writer)
// extract all html
System.IO.StringWriter str = new System.IO.StringWriter();
HtmlTextWriter wrt = new HtmlTextWriter(str);
// render html
string html = str.ToString();
// find all script tags
Regex scriptRegex = new Regex("<script[^>]*");
MatchCollection scriptMatches = scriptRegex.Matches(html);
// go through matches in reverse
for (int i = scriptMatches.Count - 1; i >= 0; i--)
// identify script tags with no type attribute
if (scriptMatches[i].ToString().IndexOf("type") < 0)
// add type attribute after script opening tag
// write the 'clean' html to the page
This code block uses some regex matching to find all script tags and add a type attribute to any tags that don't already have one. It is only a partial solution to the script tag problem, more code will need to be written to completely clean the tags.
Note that this code is crude and untested, but it should give you an idea of what can be done. It would be prudent to keep this kind of code to a bare minimum, so that performance is not affected. In other words you should only be using this method when you have no other option.
Meta TagsYou could extend the render code sample to remove all of your non-compliant code if you want. I would suggest that this is not the best way of doing things. Another technique is to replace standard SharePoint controls with your own custom built controls. A simple example of where this can be done is with the RobotsMetaTag. Using Lutz Roeder's Reflector I was able to extract the following (simplified) code for the RobotsMetaTag control:
public class RobotsMetaTag : SPControl
protected override void Render(HtmlTextWriter output)
output.Write("<META NAME=\"ROBOTS\" CONTENT=\"NOHTMLINDEX\"/>");
We can see that the META, NAME and CONTENT elements all violate the XHTML rule that stipulates all tags should be lower case. The offending line can be rewritten in a custom control as:
output.Write("<meta name=\"ROBOTS\" content=\"NOHTMLINDEX\"/>");
Now deploy the custom control to your SharePoint environment and add it to the master page in place of the RobotsMetaTag. Two more validation errors out of the way. I recommend using this approach wherever possible.
ContentBetween the security trimming control, render code and custom control methods it is now possible to fix all validation errors and eventually get a conforming web page. Great, we have just made a virtually blank web page comply. The next step is to add all the content and navigation back in. To achieve total compliance it will be necessary to create a custom control for many of the components on each page. The trick here is to make these controls as re-usable as possible.
The majority of the content in your site will come from page layouts and thankfully most of the field controls output XHTML compliant code out-of-the-box. When you come across a control that doesn't comply, again, you will need to make your own.
When it comes to navigation you have a couple of options. Either create custom navigation controls from scratch, or extend the SharePoint AspMenu control, as the source code has now been released for this control.