Preventing .aspx pages from being indexed by SharePoint

February 19, 2009

One of the issues with having an ASP.NET web application hosted within MOSS 2007 is that pages with the extension .aspx are indexed by default.

Well, being SharePoint, it’s not quite that simple, as the Search Visibility page reveals (Site Settings > Search Visibility):

[Screenshot: the Search Visibility settings page]

In a nutshell, unless the lists or libraries within the site have broken the permission inheritance from the site, .aspx pages will be indexed.

This is bad. ASP.NET web application pages generally don’t contain anything of value for searching. Worse, if a page requires a value to be passed via a query parameter, the search crawler will fail to parse the page and fill the crawl log with extraneous error reports.

Handily, as the screen shot above indicates, it is possible to turn off this behaviour. Unfortunately, this setting is made at the site level and therefore needs to be set for every site within the site collection.

A possible solution is to use code, as shown below:

// "web" is the SPWeb instance for the site being provisioned
web.ASPXPageIndexMode = WebASPXPageIndexMode.Never;
web.AllowAutomaticASPXPageIndexing = false;

This code snippet can be added to the site provisioning handler of a site definition, thus ensuring that all instances of the site will prevent indexing of .aspx pages.
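For sites that already exist, the same two properties could be set across a whole site collection from a small console application. What follows is only a sketch under a few assumptions: the site collection URL is a placeholder, the program runs on a SharePoint server with a reference to Microsoft.SharePoint.dll, and the call to web.Update() is an addition that is not part of the snippet above (changes to SPWeb properties are normally only persisted once Update() is called).

```csharp
using System;
using Microsoft.SharePoint;

class DisableAspxIndexing
{
    static void Main(string[] args)
    {
        // Hypothetical site collection URL; pass the real one as the first argument.
        string siteUrl = args.Length > 0 ? args[0] : "http://localhost";

        using (SPSite site = new SPSite(siteUrl))
        {
            // AllWebs enumerates every site (SPWeb) in the site collection.
            foreach (SPWeb web in site.AllWebs)
            {
                try
                {
                    web.ASPXPageIndexMode = WebASPXPageIndexMode.Never;
                    web.AllowAutomaticASPXPageIndexing = false;

                    // Assumption: Update() is needed to persist the change.
                    web.Update();
                    Console.WriteLine("Updated " + web.Url);
                }
                finally
                {
                    // Each SPWeb returned by AllWebs must be disposed by the caller.
                    web.Dispose();
                }
            }
        }
    }
}
```

The account running this needs sufficient rights on every site in the collection, otherwise AllWebs or Update() will throw part-way through.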

Update

Unfortunately, this doesn’t actually seem to work. The only method by which I can prevent the ASPX pages from being crawled is to create a Rule within the Search Settings for the Shared Service Provider to exclude all URLs ending in *.aspx.
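The rule can be created by hand in the Shared Service Provider's Search Settings, but it could probably also be scripted through the MOSS 2007 search administration object model. The following is a minimal sketch, not a tested recipe: the site URL and the rule path are placeholders, the exact wildcard pattern should be checked against whatever works in the Search Settings page, and the project needs references to Microsoft.SharePoint.dll, Microsoft.Office.Server.dll and Microsoft.Office.Server.Search.dll.

```csharp
using System;
using Microsoft.Office.Server.Search.Administration;
using Microsoft.SharePoint;

class ExcludeAspxFromCrawl
{
    static void Main(string[] args)
    {
        // Both values are hypothetical; substitute your own site and pattern.
        string siteUrl = "http://localhost";
        string rulePath = "http://localhost/*.aspx";

        // The search context identifies the Shared Service Provider serving this site.
        SearchContext context;
        using (SPSite site = new SPSite(siteUrl))
        {
            context = SearchContext.GetContext(site);
        }

        Content sspContent = new Content(context);
        CrawlRuleCollection rules = sspContent.CrawlRules;

        // Only create the exclusion rule if one with the same path is not already there.
        if (!rules.Exists(rulePath))
        {
            rules.Create(CrawlRuleType.ExclusionRule, rulePath);
            Console.WriteLine("Created exclusion rule for " + rulePath);
        }
    }
}
```

As with a rule created through the UI, it should only take effect on the next crawl, and pages that are already in the index may linger until a full crawl has run.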

If anyone can suggest why the settings made at the site level don’t seem to have any measurable effect, please leave a comment below…
