Friday, December 23, 2011

SharePoint 2010 Migration

After a few months of preparation and testing, we finally upgraded our production portal from MOSS 2007 to SharePoint 2010 about two weeks ago. No major issues have been reported or found so far, and the overall feedback from end users is positive.

We used the database attach approach for our migration, i.e. install a new SharePoint 2010 farm and mount the old MOSS 2007 content databases. The migration process looks straightforward: run preupgradecheck and fix the issues it reports, run Test-SPContentDatabase and fix the issues it reports, and finally run Mount-SPContentDatabase. Fixing those issues wasn't too hard; it was just a matter of time.

We spent quite a lot of effort on content analysis and cleanup of our 60+ GB of content databases. In the end, all our custom WebParts, DLLs, user controls and some other SharePoint resources were repackaged into a single solution called "SP2010Migration", which builds one solution package for everything.



The SP2010Migration solution package was supposed to be deployed only once, and that’s all. It shouldn’t be used again unless we need to build a new farm from scratch. There’s no Feature in the solution package because a Feature might cause trouble in later deployments. So in the future we can still package any WebPart or component into a Feature without worrying about Feature conflicts.

One interesting thing is that all user controls loaded by SmartParts were still working fine after the migration. But we ran into some issues with the DataFormWebPart (DFWP). For example, relative links were wrong after the migration, and we had to write a console app to replace all "{@FileRef}" with "/{@FileRef}" inside each DFWP’s XSLT across the whole farm.
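The text replacement at the heart of that fix can be sketched in a few lines of Python. This is only an illustration of the string transformation, not the console app we actually ran: the function name and the sample XSLT fragment are made up, and the check for an existing leading slash is an added assumption so that the fix can be run twice without producing "//{@FileRef}". Reading and saving each DFWP's XSLT through the SharePoint object model is left out.

```python
import re

# Matches {@FileRef} only when it is NOT already preceded by a slash
# (assumption: makes the fix safe to re-run on already patched XSLT).
_FILEREF = re.compile(r"(?<!/)\{@FileRef\}")


def fix_fileref_links(xslt: str) -> tuple[str, int]:
    """Return the patched XSLT text and the number of replacements made."""
    return _FILEREF.subn("/{@FileRef}", xslt)


# Hypothetical XSLT fragment, for illustration only.
sample = '<a href="{@FileRef}"><xsl:value-of select="@Title"/></a>'
patched, count = fix_fileref_links(sample)
print(patched)
print(count)
```

Returning the replacement count makes it easy to log which web parts were actually changed and to skip saving the ones that were not.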

Another DFWP issue was more confusing: all we saw on the DFWP page was an error:

Unable to display this Web Part. To troubleshoot the problem, open this Web page in a Microsoft SharePoint Foundation-compatible HTML editor such as Microsoft SharePoint Designer. If the problem persists, contact your Web server administrator. Correlation ID:…

Using the Correlation ID we could easily find the error details in the ULS log:

Error while executing web part: System.StackOverflowException: Operation caused a stack overflow. At Microsoft.Xslt.NativeMethod.CheckForSufficientStack() at SyncToNavigator(XPathNavigator , XPathNavigator ) at …
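Looking up a Correlation ID comes down to filtering log lines for that ID. A minimal Python sketch of the idea follows, assuming the log entries are available as lines of text. The function name, the sample lines and the sample ID are all made up for illustration and are not real ULS output.

```python
from typing import Iterable, Iterator


def find_by_correlation(lines: Iterable[str], correlation_id: str) -> Iterator[str]:
    """Yield every log line that mentions the given correlation ID."""
    needle = correlation_id.strip().lower()
    for line in lines:
        # Case-insensitive match, since IDs may be copied in either case.
        if needle in line.lower():
            yield line.rstrip("\n")


# Made-up, tab-separated sample lines (hypothetical, not real log content).
sample_log = [
    "10:15:02\tHigh\tSome unrelated entry\t11111111-2222-3333-4444-555555555555\n",
    "10:15:03\tHigh\tError while executing web part\tAAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE\n",
    "10:15:04\tMedium\tAnother unrelated entry\t11111111-2222-3333-4444-555555555555\n",
]

matches = list(find_by_correlation(sample_log, "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"))
print(len(matches))
```

Because the function accepts any iterable of strings, an open file object can be passed in directly. The log folder and file encoding depend on the farm, so they are left to the caller.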

I tested the same settings locally and everything seemed to be okay. How could an out-of-the-box SharePoint WebPart get a “stack overflow” error in production when it had been working fine before the migration? It turned out that a time-out occurred internally during the XSL transform in the DFWP in the production environment. The time-out threshold is only 1 second, which means any transform that takes longer than 1 second causes the error. We have a big list on our production server, and the DFWP displays hundreds of rows in a big table, which caused the time-out.

That’s something new in SharePoint 2010, and an annoying move by Microsoft. It's great to introduce new stuff, but it's also important to keep the old stuff working, right? Why not just turn off that new “time-out” feature by default and give end users an option to set it?

The worst thing is that there's no way to change that 1-second time-out setting! Microsoft provided "three solutions" for this issue:

1.) Tune the XSL to get a shorter transform time.
2.) Don't use the DFWP; use another WebPart instead.
3.) Write code that inherits from the DFWP and build your own.

Following these suggestions, we finally got the DFWP back by tweaking its settings, e.g. fewer columns and a smaller page size. That's of course not an ideal way to solve the problem. We hope Microsoft will provide a better solution for this issue.

[2012-3 Update]: The time-out value of a DFWP is now configurable at the farm level with the latest SharePoint CU. Refer to this.