Sitecore Data Migrations with SPE

Many Sitecore solutions grow organically as business needs change. Sometimes with new requirements, existing data stored on Sitecore items needs to be reorganized. This can be a pain when the items you need to update are located in unknown areas of the content tree. Manual updates aren’t feasible after the content tree grows past a certain size.

I’ve been working through this problem for the past few days, and have found that writing Sitecore Powershell Extensions (SPE) scripts can be one of the best tools to facilitate these data migrations. Sitecore Powershell Extensions is installed as a Sitecore package on ContentManagement or Standalone servers. It comes with a built in library of a few critical methods that enable accessing and altering Sitecore data. The main ones that I’ve found useful for querying Sitecore items are Get-Item and Find-Item. Get-Item is fairly limited in the number of items it can return, and the query it can execute.

However, Find-Item allows you to write basically any query you want against an existing search index. In my case, all of the data I needed to identify the relevant content items was already present in the sitecore_master_index. My use case was copying data from a file field into a general link field, but you could easily update almost any field data using this approach.

I identified a few criteria for the items & item versions we wanted to update:

  • Items with a specific template id
  • The latest version of the item in the english language
  • The item needs to be located under /sitecore/content (we don’t want to update branch templates, standard values, etc)

A query like the one below allows us to query item versions matching those criteria, and store them in memory. The ContentSearch API will only return 1000 records at most in one query page, so you may need to utilize pagination to gather all of the relevant items.

[Sitecore.Data.ID]$templateId = "{templateIdHere}"

$criteria = @(
    @{Filter = "Contains"; Field = "_fullpath"; Value = "/sitecore/content/"},
    @{Filter = "Equals"; Field = "_template"; Value = $templateId.Guid.ToString("N").ToLower()},
    @{Filter = "Equals"; Field = "_latestversion"; Value = "true"},
    @{Filter = "Equals"; Field = "_language"; Value = "en"}
    
)
$props = @{
    Index = "sitecore_master_index"
    Criteria = $criteria
}

$searchHits = [System.Collections.ArrayList]@()

$pageSize = 250
$offset = 0
$continueQuerying = $true
while($continueQuerying) {
    $pagedItems = @(Find-Item @props -Skip $offset -First $pageSize)
    if($pagedItems) {
        $lastCount = $pagedItems.Count
        $offset += $lastCount
        $searchHits.AddRange($pagedItems) > $null
    } else {
        $continueQuerying = $false
    }
}

After gathering the relevant content search results in memory, you can perform whatever updates to each associated item, then save that data back to the master database.

foreach ($searchHit in $searchHits)
{
    [Sitecore.Data.Items.Item] $itemToUpdate = $searchHit.GetItem()

    // Existing fields that we want to extract the data from
    [Sitecore.Data.Fields.FileField] $existingFileField = $itemToUpdate.Fields["File"]
    [Sitecore.Data.Fields.Field] $existingLinkTextField = $itemToUpdate.Fields["File Link Text"]

    // New field that we want to write data to
    [Sitecore.Data.Fields.LinkField] $newLinkField = $itemToUpdate.Fields["New Link Field"]

    if($existingFileField.MediaItem -ne $null -and ![string]::IsNullOrEmpty($existingFileField.MediaItem.Id.ToString())){
        $linkText = If (![string]::IsNullOrEmpty($existingLinkTextField.Value)) { $existingLinkTextField.Value } Else { "" }
        $newLinkFieldValue = "<link text=`"$($linkText)`" linktype=`"media`" target=`"`" id=`"$($fileField.MediaItem.Id)`" />"
        if($newLinkField.Value -ne $newLinkFieldValue){
            $itemToUpdate.Editing.BeginEdit()
            $viewMoreLinkField.Value = $newLinkFieldValue
            $itemToUpdate.Editing.EndEdit()
        }
    }
}

It’s worth noting that you should always have a backup of your master database before performing this type of data migration. If you have any trouble running your scripts, you can always utilize Write-Output to print data to the screen, or set breakpoints and debug the code within the Sitecore Powershell Extensions script environment. Using this pattern, you can easily script out data migrations – once you’re done with it, you can just delete the script, and hopefully have a slightly cleaner database!

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *