This is a tale of getting close to one of SharePoint’s holy grails but not quite…
Site collections are the most scalable SharePoint container and they offer lots of advantages over building site heirarchies with layers of sub-sites. However information in one site collection can’t be made visible to another site collection using out-of-the-box SharePoint components. This is a major inconvenience when trying to aggregate content on a portal for example.
Solution (almost)
The solution evolves from first using the SPSiteDataQuery class to run a query upon each site collection and then aggregating the results.
The example below queries all Calendar lists for events within a date range:
// for each site collection
var currentApp = SPContext.Current.Site.WebApplication;
foreach (SPSite site in currentApp.Sites)
{
var query = new SPSiteDataQuery()
{
RowLimit = 100,
Lists = @"<Lists ServerTemplate='106' />",
Webs = "<Webs Scope='SiteCollection' />",
Query =
String.Format(
@"<Where>
<And>
<Geq>
<FieldRef Name='EventDate' />
<Value Type='DateTime'>{0}</Value>
</Geq>
<Leq>
<FieldRef Name='EndDate' />
<Value Type='DateTime'>{1}</Value>
</Leq>
</And>
</Where>",
startDate.ToString("yyyy-MM-dd"), endDate.ToString("yyyy-MM-dd")),
ViewFields =
"<FieldRef Name='Title' />
<FieldRef Name='ID' />
<FieldRef Name='EventDate' />
<FieldRef Name='EndDate' />
<FieldRef Name='Location' />
<FieldRef Name='Description' />
<FieldRef Name='fAllDayEvent' />
<FieldRef Name='fRecurrence' />
<FieldRef Name='FileRef' />"
};
var results = site.RootWeb.GetSiteData(query);
// aggregate the results
...
}
This will work, however it clearly doesn’t scale well as it will query many sites and webs each time it is run.
The next step was to consider the CrossListQueryCache class which provides the ability to cache the results. In reality, apart from introducing the cache, this class doesn’t do much more than wrap the SPSiteDataQuery class and the call to SPWeb.GetSiteData().
Continuing the same example, swap out the SPSiteDataQuery with:
var query = new CrossListQueryInfo
{
UseCache = true,
RowLimit = 100,
Lists = @"<Lists ServerTemplate='106' />",
Webs = "<Webs Scope='SiteCollection' />",
Query =
String.Format(
@"<Where>
<And>
<Geq>
<FieldRef Name='EventDate' />
<Value Type='DateTime'>{0}</Value>
</Geq>
<Leq>
<FieldRef Name='EndDate' />
<Value Type='DateTime'>{1}</Value>
</Leq>
</And>
</Where>",
startDate.ToString("yyyy-MM-dd"), endDate.ToString("yyyy-MM-dd")),
ViewFields =
"<FieldRef Name='Title' />
<FieldRef Name='ID' />
<FieldRef Name='EventDate' />
<FieldRef Name='EndDate' />
<FieldRef Name='Location' />
<FieldRef Name='Description' />
<FieldRef Name='fAllDayEvent' />
<FieldRef Name='fRecurrence' />
<FieldRef Name='FileRef' />"
};
var cache = new CrossListQueryCache(query);
var results = cache.GetSiteData(site, site.RootWeb.ServerRelativeUrl);
Unfortunately attempting to use the cache for each site collection query causes problems – any attempt to query a site collection beyond the current one results in a error:
There is no Web named "/sites/WebSite".
I *think* this is due to the way that CrossListQueryCache class is written. When the UseCache property is set to true, the query reverts back to the current SharePoint context – as revealed by examining Microsoft.SharePoint.Publishing with .NET Reflector:
if (this.queryInfo.UseCache)
{
while (cachedArea != null)
{
this.selectedData =
cachedArea.GetCrossListQuery(query, SPContext.Current.Web);
goto Label_0022;
}
}
CommonUtilities.ConfirmNotNull(web, "web");
this.selectedData = web.GetSiteData(query);
goto Label_0022;
Aside: I know what you’re thinking – omigod, a goto statement.
So when the query is executed and set to use the cache, it attempts to pull data from the current site collection’s cache rather than the one being queried. Unfortunately, even setting the UseCache property to false doesn’t fix the issue.
Close but no cigar…
The net result is that a cross-site collection query is possible using SPSiteDataQuery but clearly this could have a serious performance impact with querying multiple site collections and sub-sites. It would be nice to be able to have the query results cached however this doesn’t seem possible using CrossSiteQueryCache.