Don’t hack the SharePoint 2010 database

The Channel Islands recently suffered from a massive power failure. Well, in fact, it was 2.5 power failures in the space of three days (one on the Monday, again on Wednesday around 7pm, then again at about 9pm, about 5 minutes after the power had been restored.)

One of my clients suffered from a fairly extensive failure as a result – a couple of disks in their SAN were lost. This had serious consequences for the SQL Server that was hosting some SharePoint databases on top of it. We experienced lots of page corruptions, and, long story short, the SharePoint content database was left in a mangled state. And to make matters worse, in a “perfect storm” of both independent sets of backups from the previous night were also lost. (How this could have been allowed to happen is a topic for another day.) After exhausting all known options, we were left with a situation where we had a backup which was about 2 weeks old and half a database from which to try and recover some of the recent data. We then broke the biggest SharePoint rule: never go anywhere near the SharePoint database.

I’ve always known it’s a very bad idea to do anything directly to the SharePoint database. Needless to say, there are some small things you may consider doing. But this task was much bigger. We had about 15 tables that were missing data, and unfortunately, common sense departed (or desperation set in) when we hoped that manually copying data from one to the other would solve the issues. It didn’t. We started with the big ones – AllUserData, AllDocs, and AllDocstreams and tried to get as much in as possible. On the face of it, this went pretty well – miraculously, missing documents did start to appear in the document libraries. This seemed to validate our approach, so we carried on – EventLog, EventCache, NameValuePair, etc. But unbeknownst to us, the more we did, the worse things got.

For instance, our custom workflows seemed to be alive and well. But following the status of the workflow from the ListItem, took us to the workflow summary page that said no workflow was running. Tasks that should have been attached to the workflow weren’t there. But they were in the Tasks List. Completing a task seemingly did nothing.

It got worse. Items that should have been moved by the Content Organiser were getting stuck in the Drop Off Library. Errors in the logs were unhelpful – “could not route the document due to the state of the object.” Delving further, the routing rules all existed, but editing the properties showed something was up – the source content types weren’t available. Worse, navigating to Site Content Types was a frightening site – a blank white page. Custom content types and default content types (and fields) were missing. But they all existed in the database. At this point, desperation set in, and I used up one of our Partner’s calls in to Microsoft. This ultimately was futile, because once you’ve touched the content DB, Microsoft won’t support it.

Back to basics. We’ve restored the old backup, and now provisioning a fresh/standalone server to which we can restore the (broken) content DB. Fortunately, parts of it do work – the important bits being the document libraries – I can at least write some custom code to pull the content out, with some XML for the metadata, which can then be manually fed back in to the live server.

Moral of the story? Never, ever, ever touch the SharePoint DB.

1 Comment

Leave a Reply

Your email address will not be published.