The Long Path to Content Deployment


Recently I’ve had the pleasure of using the content deployment feature of MOSS between some of our SharePoint environments.

It was not a smooth ride and I believe that I have now met every single obstacle and bug in that feature. This posting is the result of several months’ worth of frustration.

We have three environments (Dev, Test and Prod), and we want to use content deployment to move content from the production servers to the development and test servers, simply to keep the test data current.

As a side note, you might want to read the Microsoft brief “Team-based Development in Microsoft Office SharePoint Server 2007” for another good reason to make content deployment one of your skills.

Well, without beating around the bush any more: the following are the steps needed to do a content deployment (with some minor, obvious exceptions), in order of appearance.

Setup path

Content deployment requires you to first set up a path and then one or more jobs for that path. Simply put, the path is the server-to-server connection configuration; the job is a specification of what to do and when.

I try to follow security best practices, so all my service accounts are just standard domain users; any special rights they might need are assigned by MOSS during the configuration wizard. In the same vein, my farm administrator accounts are standard domain users that are assigned extra permissions only through SharePoint.

 

The first problem with creating a path is an “Access denied” error on the path setup page (/_admin/DeploymentPath.aspx) when you select the source web application and site collection. (For the destination site collection you are required to provide an explicit user with sufficient rights.)

Figure 1: The “Access denied” message

It seems that your logged-in user needs to be able to read the IIS metabase when you select a web application. There are many ways to grant that access; I chose to add my farm administrator to the local IIS_WPG security group on the server. In my opinion the SharePoint team forgot to impersonate the call that reads the IIS metabase. Hopefully this will be fixed in a future service pack.

The second problem (on the very same page) occurs if you connect to another farm using SSL (which you should!): you get the exception “Requested registry access not allowed” when you submit the page.

After some tracing I learned that the server tries to store the SSL key for your destination server in the registry hive of the system user. That is the correct hive, but apparently the SharePoint configuration wizard tightened the security on the keys in question.

To get around this, make your user (the farm admin) a member of the local WSS_RESTRICTED_WPG security group and grant that group “full control” of HKEY_USERS\.DEFAULT\Software\Microsoft\SystemCertificates. You could also opt for granting your user direct access to the key.

To sum it up:

  1. Your logged in user should be a farm administrator
  2. It should be a member of either the local IIS_WPG or the WSS_WPG group
  3. Grant access to HKEY_USERS\.DEFAULT\Software\Microsoft\SystemCertificates
    1. Add user to WSS_RESTRICTED_WPG group
    2. Grant that group “full control” access to the key

That’s it! You should now be able to set up a path.

Setup Job

Next, set up the job to your liking; choose whatever options you need.

Run the job. If it succeeds then stop reading now and save a few minutes of your time. No? Carry on then…

Hotfix deployment

At this point I got the following exception:

User cannot be found.
at Microsoft.SharePoint.SPUserCollection.GetByID(Int32 id)
at Microsoft.SharePoint.SPWeb.get_Author()
at Microsoft.SharePoint.Deployment.WebSerializer.GetDataFromObjectModel(Object obj, SerializationInfo info, StreamingContext context)
at Microsoft.SharePoint.Deployment.DeploymentSerializationSurrogate.GetObjectData(Object obj, SerializationInfo info, StreamingContext context)
at Microsoft.SharePoint.Deployment.XmlFormatter.SerializeObject(Object obj, ISerializationSurrogate surrogate, String elementName, Boolean bNeedEnvelope)
at Microsoft.SharePoint.Deployment.XmlFormatter.Serialize(Stream serializationStream, Object topLevelObject)
at Microsoft.SharePoint.Deployment.ObjectSerializer.Serialize(DeploymentObject deployObject, Stream serializationStream)
at Microsoft.SharePoint.Deployment.SPExport.SerializeObjects()
at Microsoft.SharePoint.Deployment.SPExport.Run()

Looking like this:

Figure 2: Deployment error – no creator/owner of site

The exception occurs fairly quickly during the preparation phase. It obviously indicates that the creator of a (sub)site is not to be found in the SharePoint user database.

In my case it happened because the farm was originally deployed using a site collection backup/restore from another AD domain. The creators of various sites were therefore users of the original SharePoint farm, who are unknown in the new one (which is now my source). I suppose you might see this error if you have deleted some users as well.

There is nothing you can do about this error yourself; Microsoft, however, has a hotfix for it (which also solves a few other bugs). The hotfix number is 313183 and the knowledge base article to ask for is KB936867. At the time of writing this is a private hotfix that can only be obtained by contacting MS support. Sucks, but at least there is a way.

The hotfix solves a total of 11 bugs including one more in relation to Content Deployment entitled “Violation of PRIMARY KEY” – it seems I avoided one snag after all.

Feature problems

The next error in line occurred right after the last one, during the export phase. I received this very informative exception:

Failed to compare two elements in the array.
at System.Collections.Generic.ArraySortHelper`1.QuickSort[TValue](T[] keys, TValue[] values, Int32 left, Int32 right, IComparer`1 comparer)
at System.Collections.Generic.ArraySortHelper`1.QuickSort[TValue](T[] keys, TValue[] values, Int32 left, Int32 right, IComparer`1 comparer)
at System.Collections.Generic.ArraySortHelper`1.QuickSort[TValue](T[] keys, TValue[] values, Int32 left, Int32 right, IComparer`1 comparer)
at System.Collections.Generic.ArraySortHelper`1.Sort[TValue](T[] keys, TValue[] values, Int32 index, Int32 length, IComparer`1 comparer)
at System.Collections.Generic.ArraySortHelper`1.Sort(T[] items, Int32 index, Int32 length, IComparer`1 comparer)
at System.Array.Sort[T](T[] array, Int32 index, Int32 length, IComparer`1 comparer)
at System.Collections.Generic.List`1.Sort(Int32 index, Int32 count, IComparer`1 comparer)
at System.Collections.Generic.List`1.Sort(IComparer`1 comparer)
at Microsoft.SharePoint.Deployment.WebSerializer.GetDataFromObjectModel(Object obj, SerializationInfo info, StreamingContext context)
at Microsoft.SharePoint.Deployment.DeploymentSerializationSurrogate.GetObjectData(Object obj, SerializationInfo info, StreamingContext context)
at Microsoft.SharePoint.Deployment.XmlFormatter.SerializeObject(Object obj, ISerializationSurrogate surrogate, String elementName, Boolean bNeedEnvelope)
at Microsoft.SharePoint.Deployment.XmlFormatter.Serialize(Stream serializationStream, Object topLevelObject)
at Microsoft.SharePoint.Deployment.ObjectSerializer.Serialize(DeploymentObject deployObject, Stream serializationStream)
at Microsoft.SharePoint.Deployment.SPExport.SerializeObjects()
at Microsoft.SharePoint.Deployment.SPExport.Run()
*** Inner exception:
Object reference not set to an instance of an object.
at Microsoft.SharePoint.SPFeature.EnsureProperties()
at Microsoft.SharePoint.SPFeature.get_TimeActivated()
at Microsoft.SharePoint.Deployment.WebSerializer.ExportFeatureComparer.System.Collections.Generic.IComparer .Compare(ExportObject exportObject1, ExportObject exportObject2)
at System.Collections.Generic.ArraySortHelper`1.QuickSort[TValue](T[] keys, TValue[] values, Int32 left, Int32 right, IComparer`1 comparer)

Going to the source server, I tried to pinpoint the error by running “stsadm -o export …” on the base site collection URL (same error) and then on the first level of sub sites (all exported fine).

That command does exactly the same as a content deployment, just without the transfer and import parts on the destination end (you are responsible for the transfer yourself, and then you use the import command).
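
The narrowing-down I did by hand can be scripted. What follows is a minimal sketch, not something I ran at the time: it assumes stsadm.exe is on the PATH of the source server and that it signals failure through a non-zero exit code (if it does not, inspect the export log instead). The helper names and the file naming scheme are made up for illustration; only the `-url`, `-filename` and `-overwrite` parameters of the export operation are used.

```python
import subprocess
from pathlib import Path


def build_export_command(url, out_dir):
    """Build the stsadm export command line for a single site URL."""
    # Derive a file name from the URL; this naming scheme is only a convention of this sketch.
    safe_name = url.replace("://", "_").replace("/", "_").strip("_")
    filename = str(Path(out_dir) / (safe_name + ".cmp"))
    return ["stsadm", "-o", "export", "-url", url, "-filename", filename, "-overwrite"]


def find_failing_exports(urls, out_dir, run=subprocess.run):
    """Export each URL on its own and return the URLs for which the export failed."""
    failing = []
    for url in urls:
        result = run(build_export_command(url, out_dir))
        # Assumption: a non-zero exit code means the export failed.
        if result.returncode != 0:
            failing.append(url)
    return failing
```

Call `find_failing_exports` first with the site collection URL and then with the URLs of its sub sites. If only the top level fails, as it did for me, the problem sits at the site collection level.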

The exception basically means that some of the features activated at the site collection level no longer exist on disk. Their feature definition files have probably been deleted. This can easily occur if you delete some features from your solution packs without deactivating/reactivating all features on deployment (and who cares to do that?).

If you know exactly what features are the problem (might be more than you know) I suppose you might be able to reinstall, deactivate and then uninstall to fix the problem. You might also be able to create dummy features with the correct ids and then try to install, deactivate and uninstall.

I did neither; code had to be written 😉

What you need to do is:

  1. Go recursively through your web application / site collection / root web / web
  2. Each of these “parent” objects has a Features collection
  3. Each of the features in that collection should be examined
    1. If there is no feature.Definition (== null) then this is one of the faulty features. Simply remove it by executing parent.Features.Remove( id, true ). The force parameter is needed since the feature is not properly installed anymore, so you just remove it without any knowledge of deactivation event handlers etc.
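
In outline, the steps above look like the sketch below. It is written in Python against small stand-in classes so that the logic can be run anywhere; `Feature`, `Node` and `remove_orphaned_features` are made-up names and not the SharePoint object model. The real page does the same walk over the web application, site collection and web objects and calls parent.Features.Remove( id, true ) where the sketch removes an entry from a list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Feature:
    id: str
    definition: Optional[str]  # None models a feature whose definition is gone from disk


@dataclass
class Node:
    """Stand-in for a web application, site collection or web."""
    name: str
    features: List[Feature] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def remove_orphaned_features(node, dry_run=True):
    """Walk the tree depth-first and report (optionally remove) features without a definition."""
    found = []
    for feature in list(node.features):  # iterate over a copy, since we may remove entries
        if feature.definition is None:
            found.append((node.name, feature.id))
            if not dry_run:
                # Stand-in for the forced removal in the real object model.
                node.features.remove(feature)
    for child in node.children:
        found.extend(remove_orphaned_features(child, dry_run))
    return found
```

Running it with `dry_run=True` first gives you the list of faulty features per parent, which is what my page shows before anything is removed.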

I wrote an aspx page for this that is installed as a feature in the Central Administration site:

Figure 3: My page to list and possibly remove features from web app / site / web

Pretty cool 🙂

In due time, I will clean up the code and publish another article about this administration feature along with a few others that I’ve developed and since found indispensable.

Server Name Problem

Finally the export phase can be completed. The next problem is the transfer phase, which simply moves the exported files from the server assigned the task of performing the content deployment job to the destination server. The path specification contains the location of the Central Administration site on the destination server, along with the destination web application and site collection.

The source server tries to send the exported files directly to the destination server that handles content deployment jobs, which might or might not be the same server that runs the Central Administration site. It does so by resolving the FQDN (Fully Qualified Domain Name) of that server, which may very well be a problem for you.

If you deploy content between servers in separate network segments this won’t work out of the box, e.g. the source server can probably not find your destination server by the name “my_dest_server.my_domain”, which is only known within the immediate local AD domain of that server.

There’s no reason to think too deeply about these names. Just try a content deployment; if it fails during the transfer phase, it’ll report “The remote upload Web request failed” along with the name it’s trying to resolve. A similar event log entry is also created:


Event Type: Error
Event Source: Windows SharePoint Services 3
Event Category: Timer
Event ID: 6398
Date: 6/12/2007
Time: 9:48:07 PM
User: N/A
Computer:
Description:

The Execute method of job definition Microsoft.SharePoint.Publishing.Administration.ContentDeploymentJobDefinition (ID 2f94ff2b-2aa1-498b-96ba-649c2e75ada7) threw an exception. More information is included below.

The remote name could not be resolved: ‘servername.dev.local’

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

The fix is simple:

  1. Open your hosts file, usually located at c:\windows\system32\drivers\etc\hosts
  2. Add a new line (don’t delete existing lines): “qqq.xxx.yyy.zzz servername.dev.local”. Be sure to use the correct IP address and name for your destination server. Remember that it should be an IP address that works from the source server (in some cases an internal address, in others an external one). You can probably get away with specifying the same IP address you used for the Central Administration site in the path definition; if you used a DNS name there, just ping that name and grab the IP address.
  3. Save the file
  4. Retry your content deployment. There is no need to restart any services; the fix takes effect immediately.
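
If you have to repeat this on several servers, the check and the edit can be scripted. The following is a minimal sketch under a few assumptions: the function names are made up, the IP address in the usage note is a placeholder, and the sketch only produces the new file content, so writing it back to the hosts file (as an administrator) is left to you.

```python
import socket


def can_resolve(hostname):
    """Return True if this machine can resolve the given name."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def with_hosts_entry(hosts_text, ip_address, hostname):
    """Return the hosts file content with an entry for hostname, adding one if it is missing."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then split into columns
        if hostname.lower() in (name.lower() for name in fields[1:]):
            return hosts_text  # already mapped; leave the content untouched
    separator = "" if hosts_text.endswith("\n") or not hosts_text else "\n"
    return hosts_text + separator + f"{ip_address}\t{hostname}\n"
```

On the source server you would first call `can_resolve("servername.dev.local")` and, only if it returns False, pass the current hosts file content through `with_hosts_entry` with the address that works from that server (for example a placeholder such as 10.0.0.5).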

Feature problems – part II

We’re now in the import phase and almost through. During the import SharePoint will check whether all the activated features at the source site collection (and sub sites) are also available at the destination site collection.

If not, you’ll get an informative error, similar to: “Content deployment job ‘Remote import job for job with sourceID = 71bb6ada-762c-4e78-8bc3-2a105bbe5988’ failed.The exception thrown was ‘Microsoft.SharePoint.SPException’ : ‘Could not find Feature xxxxxx.’”

The solution is obvious – install the same solutions/features at the destination web application as on the source.

Specified Name is already in use

If you get an error similar to “Content deployment job ‘Remote import job for job with sourceID = 71bb6ada-762c-4e78-8bc3-2a105bbe5988’ failed.The exception thrown was ‘Microsoft.SharePoint.SPException’ : ‘The specified name is already in use. A list, survey, discussion board, or document library cannot have the same name as another list, survey, discussion board, or document library in this Web site. Use your browser’s Back button, and type a new name.’”

then you forgot to read the manual (OK, blogs), which specifies that the destination site collection should be a brand new, blank site collection. In my case I had tried to deploy to a newly created publishing site collection.

The Small Print

Finally your content deployment ought to be complete 🙂

If your site looks a bit strange, it’s probably because the master page settings weren’t copied, so you’ll have to assign the correct master page through the site settings. The same might be the case for the welcome page, though I haven’t confirmed it.

During the export every file of your site will be copied to the deployment package so if any files have gone missing you’ll get appropriate warnings during the export, but it’ll still complete.

This is actually a great way to detect if any aspx files have been deleted by accident, e.g. an AllItems.aspx page for a custom list might have gone missing if somebody changed the list definition (probably deployed within a feature).

Phew! At least it works for me now…

(Updated) Other bugs

This is a small list of other bugs I’ve heard/read about:

1. (Thanks Harry!) The CAB files are not always deleted and remain in the folder C:\WINDOWS\Temp\ContentDeployment. Regardless of the content deployment setting for the number of jobs to keep, they will not be deleted. Roll your own workaround and delete the files. You could schedule a job to delete everything older than 1 day (as I’m sure your job can complete in less time than that).

And of course: Move the directory location to a non-system drive

2. (Thanks Harry!) Specific sites within the site collection: when you select only the language variations but not the root, the root is always added to the scope after Test Job or Run Now (fix 937208 apparently solves the problem).

3. Some characters are mangled after deployment, specifically “&nbsp;” in HTML fields (probably a lot of others too). Update: This has been fixed with hotfix 938536 (private hotfix for now – sorry) 🙂

4. Insufficient disk space on the destination server is reported (on the source server) as:
Failed to read package file.
at Microsoft.SharePoint.Deployment.ImportDataFileManager.Uncompress(SPRequest request)
at Microsoft.SharePoint.Deployment.SPImport.Run()
*** Inner exception:
Failure writing to target file
at Microsoft.SharePoint.Library.SPRequest.ExtractFilesFromCabinet(String bstrTempDirectory, String bstrCabFileLocation)
at Microsoft.SharePoint.Deployment.ImportDataFileManager.<>c__DisplayClass2.<Uncompress>b__0()
at Microsoft.SharePoint.SPSecurity.CodeToRunElevatedWrapper(Object state)
at Microsoft.SharePoint.SPSecurity.<>c__DisplayClass4.<RunWithElevatedPrivileges>b__2()
at Microsoft.SharePoint.Utilities.SecurityContext.RunAsProcess(CodeToRunElevated secureCode)
at Microsoft.SharePoint.SPSecurity.RunWithElevatedPrivileges(WaitCallback secureCode, Object param)
at Microsoft.SharePoint.SPSecurity.RunWithElevatedPrivileges(CodeToRunElevated secureCode)
at Microsoft.SharePoint.Deployment.ImportDataFileManager.Uncompress(SPRequest request)

5. If you choose to copy “all” security information between servers that are not in the same domain, you might get the following error (copying only role definitions works fine):
A duplicate name “62c4fcbb-7ff7-4cc3-842e-17476b2e6219” was found.
at Microsoft.SharePoint.SPFieldCollection.AddFieldAsXmlInternal(String schemaXml, Boolean addToDefaultView, SPAddFieldOptions op)
at Microsoft.SharePoint.Deployment.ListSerializer.CreateOrUpdateField(SPList list, String fieldName, XmlNode fieldNode)
at Microsoft.SharePoint.Deployment.ListSerializer.UpdateListFields(SPList list, Dictionary`2 listMetaData)
at Microsoft.SharePoint.Deployment.ListSerializer.SetObjectData(Object obj, SerializationInfo info, StreamingContext context, ISurrogateSelector selector)
at Microsoft.SharePoint.Deployment.XmlFormatter.ParseObject(Type objectType, Boolean isChildObject)
at Microsoft.SharePoint.Deployment.XmlFormatter.DeserializeObject(Type objectType, Boolean isChildObject, DeploymentObject envelope)
at Microsoft.SharePoint.Deployment.XmlFormatter.Deserialize(Stream serializationStream)
at Microsoft.SharePoint.Deployment.ObjectSerializer.Deserialize(Stream serializationStream)
at Microsoft.SharePoint.Deployment.ImportObjectManager.ProcessObject(XmlReader xmlReader)
at Microsoft.SharePoint.Deployment.SPImport.DeserializeObjects()
at Microsoft.SharePoint.Deployment.SPImport.Run()
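
For the leftover CAB files in item 1, the “roll your own” cleanup can be as small as the sketch below. It is a minimal example, not the script I actually run: the function name is made up, it assumes the account running it may delete files in the folder, and it deletes every file older than the limit without looking at what it is, so point it only at the content deployment temp folder.

```python
import time
from pathlib import Path


def delete_old_files(folder, max_age_days=1.0, now=None):
    """Delete files below folder that were last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    deleted = []
    # Collect the paths first so that deleting does not interfere with the directory walk.
    for path in list(Path(folder).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

Scheduled once a day with the folder from item 1 (C:\WINDOWS\Temp\ContentDeployment, or wherever you moved it), this keeps the directory from filling up the drive.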

On hotfixes: Please note that hotfixes are cumulative and later numbers supersede earlier ones, so the game is to get the highest number of all 😉

(Updated) Conclusion (sort of)

Some of you have asked (in the comments) whether or not it all worked in the end. It actually did work! 🙂 All of the above steps were enough to solve our immediate problems with the content deployment between our three environments.

Great! So do we use it? No.

We really, really wanted to use this feature, and everything looked so good after our latest hotfix deployment that we planned to use it for the existing publishing site as well as for some additional upcoming site collections we’re developing.

One particularly annoying feature (sorry for the pun) is breaking the content deployment, and I have yet to find a solution for it. The new sites are based on custom site definitions and a number of features that create custom content types and site columns from a number of XML files. Some of these site columns are lookup columns, which cannot be created from XML files the same way as every other site column: they need to refer to an existing list by list id (in the XML file), but the list is assigned a dynamic id by the system when it’s created (by another feature that the site columns depend on).

To get around that problem, a feature activation handler is executed that creates the lookup site columns using custom code that finds the dynamic list id from the list name. That code is roughly based on code found on CodePlex. Some minor differences: I fixed some bugs with an internal/display name mix-up, reactivation problems and a missing “webid” (which had to be dynamic as well) in the constructed field. The missing “webid” caused content deployment to fail immediately. With that fixed, and for some reason I have yet to track down, it now fails if the content type using one of the lookup site columns is in use, i.e. a list item of that type exists anywhere in the site collection. For lookup site columns created through the web interface there are no problems. Bummer.

The choice we’re facing (unless I manage to solve the problem before long) is:

1. Either: we can define the content types through XML files

2. Or: we can use content deployment and create content types manually through the web interface (they will be copied as part of the content deployment)

We chose the first option, as content deployment at its current level of maturity seems too unstable. Furthermore, we decided to use only one strategy and therefore not to use content deployment for the first site collection that this article originally targeted (one where all content types were created through the web interface).

How do we do it now? We use “stsadm -o backup” and “stsadm -o restore” to deploy new versions of the site. It’s essentially a database backup, with all the benefits and drawbacks of such, and it’s very stable. Specifically: the history of all items is maintained; you get a messed-up user database where you’ll (eventually) find users from all environments; you need to explicitly set new ownership (to a valid user that can be found in the relevant environment’s AD); and you get the luxury of copied security permission sets and groups (which I might still build a tool to import/export).
