Tuesday, June 17, 2014

Care When Packaging Item Buckets

If you ever have to create a package for items within an Item Bucket, take care to include all the parent folders along the item path.

If you only include the items themselves, the package carries no record that they lived in an Item Bucket, and the folder structure Sitecore creates to hold them on install will use the "System\Node" template.

Let's say you want to create a package with the following items. The "Courses" folder is your top-level Item Bucket...



The quickest and easiest way would be to go to the "58" folder, and use the "Add with subitems" button.



However, if the Sitecore instance you install the package on does not already have that exact Item Bucket folder structure, it will create the missing folders using the "System\Node" template.

I've ended up with folder structures that are a mix of Item Bucket folders and System\Node folders because of this. According to Sitecore support, this should be avoided because it could cause problems.

So to be safe, go through and add each Item Bucket folder in the chain, as well as any items you want to add. It's more tedious, but it will let you avoid having to Sync a bunch of System\Node folders after you've installed your package.









Monday, June 9, 2014

Mass Delete Through Serialization

Item Buckets in Sitecore 7 are great, but I've run into a problem with them. Deleting mass quantities of data through the UI can be slow. I have one bucket that has ~27k records of data underneath it that I want to clean out and re-import. Any time I try to delete this folder and its children, the site times out again and again.

I tried writing a quick method call to run the delete. I added a button that I could click from the Content Editor, which would make an async call to a method that deleted all the children of the folder. This worked with no time-out, but it took a very long time to loop through all the children and delete everything.
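For reference, a programmatic delete along those lines might look roughly like the sketch below. This is not the exact code I used: the class and method names are made up for illustration, and it assumes the code runs inside the Sitecore site with access to the master database.

```csharp
using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.SecurityModel;

public static class BucketCleanup
{
    // Hypothetical helper: deletes everything underneath the given folder item.
    public static void DeleteAllChildren(ID folderId)
    {
        Database master = Database.GetDatabase("master");
        Item folder = master.GetItem(folderId);
        if (folder == null)
        {
            return; // Folder not found, nothing to delete.
        }

        // Bypass item security checks for the duration of the delete.
        using (new SecurityDisabler())
        {
            // Deletes the child items (and their descendants) of the folder.
            folder.DeleteChildren();
        }
    }
}
```

Even when it is called from a background thread so the request doesn't time out, a call like this still has to work through every item, which is why it was so slow for ~27k records.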

Then I was introduced to a trick using the Serialization functionality inherent in Sitecore.

Sitecore has the ability to serialize content items into text files. This is essentially what TDS is doing behind the scenes: serializing the content items and managing the created files for you. These Serialization options can be found in the Developer ribbon.


You can also Revert Tree on a content item, which reads the serialized text files and builds the content item and its children accordingly. This is where the trick comes in.





For our example we have a content item called "Courses". The children of this item represent the ~27k pieces of Course data that I have imported and want to delete.






 Check that the content item has been serialized. You can check this by going to the path C:\inetpub\wwwroot\CCBC\Data\serialization\master\sitecore\content\Data\CCBC, and looking for the "Courses" folder and "Courses.item" serialized file. If it hasn't been serialized, then go back to the content item and click the "Serialize Item" button in the Developer ribbon.


If the entire tree has been serialized, you can go into the Courses folder and see more folders and files that represent the structure that Item Buckets build when items are added to them.
 

Go into the Courses folder and delete everything in there. You want this folder to be empty.

Now go back to the Courses item in the Content Editor. Click on "Revert Tree" in the Developer ribbon. You'll get a dialog box that might stay up for a few minutes while it does its thing.


Once it's done, refresh the Courses folder and you'll see that all of its children are gone!

In the end, this Serialization method took maybe 3-4 minutes to delete ~27k records. Doing it programmatically was taking 15-30+ minutes (not to mention code time, deploying, etc.). Deleting through the UI just wasn't happening at all.

There is one small issue with this, however. You'll notice that if you do a search on the Item Bucket folder, it will still return a result count. No actual items get returned, but Sitecore still *thinks* it has returned all the items that used to exist.



This will remain until you do a full index rebuild. Just doing a "Re-Index Tree" isn't enough for this one. This is a known issue according to Sitecore: the "Re-Index Tree" functionality only deals with new or updated records, so it won't clear deleted items from the index.
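If you'd rather kick off that full rebuild from code than from the Control Panel, a minimal sketch could look like this. It assumes the default master index name, "sitecore_master_index"; the class and method names are just placeholders.

```csharp
using Sitecore.ContentSearch;

public static class IndexMaintenance
{
    // Hypothetical helper: fully rebuilds the master index.
    public static void RebuildMasterIndex()
    {
        // Assumes the default index id; change it if your index is named differently.
        ISearchIndex index = ContentSearchManager.GetIndex("sitecore_master_index");

        // Rebuild() runs on the calling thread, so on a large index
        // this call can take a while to return.
        index.Rebuild();
    }
}
```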





Friday, June 6, 2014

Refresh Tree, Partial Re-Indexing

I've been doing a lot of work recently with large data loads, and I've been trying to find ways to optimize the time that my site has to spend rebuilding indexes.

I used to kick off a full index rebuild programmatically after each data import. Because of the large amounts of data and computed fields, each rebuild took a long time, which led me to use Hot Swappable Indexes.

An optimization that helped with this was to re-index only the new data that I had just added. You can call the method IndexCustodian.RefreshTree() on a folder, and only that folder and its children will be re-indexed.

The code for this...

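     // Namespaces used: Sitecore.Configuration, Sitecore.ContentSearch, Sitecore.Data, Sitecore.Data.Items
     var database = Sitecore.Data.Database.GetDatabase("master");
     // "CourseFolderId" is a custom setting holding the ID of the folder that received the import
     Item dataFolder = database.GetItem(new ID(Settings.GetSetting("CourseFolderId")));
     // Wrap the item so the ContentSearch API can index it
     SitecoreIndexableItem indexableFolder = new SitecoreIndexableItem(dataFolder);
     // Queue a re-index of the folder and its descendants only
     Sitecore.ContentSearch.Maintenance.IndexCustodian.RefreshTree(indexableFolder);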

This is essentially the same functionality as clicking the Re-Index Tree button in the Developer ribbon.


Doing this significantly reduced the time spent on re-indexing after an import.

One side note: the RefreshTree() method returns a collection of Jobs, which you can use to monitor the progress of the re-index if you want. You can read more about Jobs here.
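As a rough sketch of what that monitoring could look like: the class and method names below are hypothetical, and the one-second polling interval is an arbitrary choice. It blocks until every returned job has finished, so it should only be called from a background thread.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Maintenance;
using Sitecore.Data.Items;
using Sitecore.Diagnostics;
using Sitecore.Jobs;

public static class ReindexMonitor
{
    // Hypothetical helper: re-indexes a folder and waits for the jobs to finish.
    public static void RefreshAndWait(Item dataFolder)
    {
        // RefreshTree() hands back the jobs it queued.
        List<Job> jobs = IndexCustodian
            .RefreshTree(new SitecoreIndexableItem(dataFolder))
            .ToList();

        foreach (Job job in jobs)
        {
            // Poll each job until it reports that it is done.
            while (!job.IsDone)
            {
                Log.Info(
                    "Re-index job '" + job.Name + "' has processed " + job.Status.Processed + " items",
                    typeof(ReindexMonitor));
                Thread.Sleep(1000); // Arbitrary polling interval.
            }
        }
    }
}
```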

Tuesday, June 3, 2014

Hot Swappable Indexes with Sitecore 7

Recently I had a project that relied heavily on somewhat complex computed fields to index a large amount of data. Because of this, rebuilding the entire index was taking a substantial amount of time.

The problem was that our search pages for the site relied on these indexes to function. So any time a full index rebuild was kicked off, search would basically be unavailable for as long as the rebuild took.

Thankfully, I found out that Sitecore 7 has a new feature that lets me avoid this problem: an index type called SwitchOnRebuildLuceneIndex, which rebuilds the index in a second directory. While the index is building, the site runs off the first directory as normal. When the rebuild is finished, it swaps over to using the second, freshly updated index. No downtime in between.

This was really easy to set up, too. Just go to the configuration file Sitecore.ContentSearch.Lucene.Index.Master.config (and don't forget to repeat this for the Web index, and any other instances you have), and replace the following line...

<index id="sitecore_master_index" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">


with...

<index id="sitecore_master_index" type="Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex, Sitecore.ContentSearch.LuceneProvider">

That's it! Now, rebuild the index twice, and you should notice a new directory in your indexes folder.

I wish this were the default setting. It would have saved me a little headache.