Since 2005 I’ve been heavily involved in SharePoint consulting, development and administration. One of the strongest aspects of SharePoint is its document management capabilities.
When I started focussing on Microsoft CRM I was really amazed by the poor integration between CRM and SharePoint. I seriously wondered whether the development teams at Microsoft had ever spoken to each other. The functionality CRM provides to store documents on SharePoint is very primitive and doesn’t benefit from the strengths of SharePoint (you can read about the CRM/SP integration at http://msdn.microsoft.com/en-us/library/gg327818.aspx).
The document integration is based on storing documents inside folders within a document library. From a document management perspective this is a big No No!
By storing the documents inside folders, SharePoint gets crippled terribly. I’ll explain why.
From a functional perspective:
Within a Document Management System (DMS) documents are enriched by adding meta data. Meta data can be seen as a set of properties assigned to the document. Meta data gives a meaningful context to the document. Documents are presented and retrieved using meta data. * wow… that is a lot of meta data in a few sentences *
When a document is uploaded from CRM into SharePoint, the meta data aspect is ignored completely. A folder structure is added to the document, and that’s it. Within a SharePoint document library folders don’t exist physically; a folder is nothing but meta data. This is the fundamental difference between SharePoint and your file system. In a file system, folders do exist.
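To make the difference concrete, here is a minimal Python sketch. It is not SharePoint or CRM code; the `Document` class, the `find` helper and the property names (`Customer`, `DocumentType`, `Status`) are made up for illustration. It shows that a document carrying meta data can be retrieved by what it is, while a document that only has a folder path can only be found by where it is.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document record, for illustration only."""
    name: str
    folder: str = ""                           # all the context a folder-only upload provides
    meta: dict = field(default_factory=dict)   # properties a DMS would attach


def find(docs, **criteria):
    """Return the documents whose meta data matches every given property."""
    return [d for d in docs
            if all(d.meta.get(key) == value for key, value in criteria.items())]


docs = [
    # Enriched with meta data: can be found by type, status, customer, ...
    Document("quote.docx", folder="account/Contoso",
             meta={"Customer": "Contoso", "DocumentType": "Quote", "Status": "Final"}),
    # Folder only: invisible to any meta data based view or search
    Document("quote.docx", folder="account/Fabrikam"),
]

final_quotes = find(docs, DocumentType="Quote", Status="Final")
print([d.folder for d in final_quotes])
```

The second document is just as much a quote as the first one, but nothing in the system knows that.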
I believe folders are one of the main reasons Document Management Systems got invented in the first place. If you look at the image below, you’ll see why.
Questions arising: Which document is leading? Is it the version used by manufacturing or the version used by marketing?
If you use folders within a document library in SharePoint you will end up with multiple versions of documents as well. All you have done is move the problem from the file system to the document library in SharePoint.
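A small Python sketch of that situation, using invented folder and file names (they are assumptions, not taken from a real library): once the same file name lives in more than one folder, nothing tells you which copy is leading.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical library contents: the same spec was copied into two folders.
paths = [
    "Manufacturing/product-spec.docx",
    "Marketing/product-spec.docx",
    "Marketing/brochure.docx",
]


def copies_by_name(paths):
    """Group folder locations by file name and keep names stored more than once."""
    locations = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        locations[path.name].append(str(path.parent))
    return {name: folders for name, folders in locations.items() if len(folders) > 1}


duplicates = copies_by_name(paths)
print(duplicates)
```

With a single document and a meta data property such as a department or audience, there would be only one copy to maintain.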
Now you probably wonder why Microsoft came up with this solution… I suppose the reason is that it is almost fail-safe. By storing the document using just its name and specifying a folder structure, CRM uses the absolute core functionality of the document library within SharePoint. The core functionality is always there, no matter what.
To use meta data, CRM would need knowledge of the document library and the field definitions used inside the library (either directly specified or via content types).
From a technical perspective:
Like CRM, SharePoint limits the number of rows that can be retrieved in a single query. Microsoft implemented these limits to reduce the load on the system. As you know, both CRM and SharePoint use SQL Server to store data. When you use folders inside a document library and start querying, the query gets escalated into a table scan (this means that SQL Server has to read every single row in the table).
SharePoint will abort the query if there are more than 5000 rows (the default limit) in the table, so you won’t be able to retrieve your documents from SharePoint using a query.
Instead you need to know exactly where your document is located and retrieve it by specifying its URL. If you don’t use folders in SharePoint, the query can be resolved using one of the indexes.
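The behaviour described above can be imitated with a toy model in Python. To be clear, this is not how SharePoint is implemented and none of the names below (`ToyLibrary`, `ThresholdExceeded`, the `Customer` and `Folder` fields) come from its API; it only assumes a library that refuses any query that would have to scan more rows than the threshold, while still answering queries on an indexed field.

```python
THRESHOLD = 5000  # the default limit mentioned above


class ThresholdExceeded(Exception):
    """Raised when a query would have to scan more rows than allowed."""


class ToyLibrary:
    """Toy model of a document library; not SharePoint's real implementation."""

    def __init__(self, rows, indexed_field):
        self.rows = rows
        self.indexed_field = indexed_field
        # Build a simple index: field value -> rows carrying that value
        self.index = {}
        for row in rows:
            self.index.setdefault(row[indexed_field], []).append(row)

    def query(self, field_name, value):
        if field_name == self.indexed_field:
            # Resolved through the index, no full scan needed
            return self.index.get(value, [])
        # No index available: every row would have to be scanned
        if len(self.rows) > THRESHOLD:
            raise ThresholdExceeded(f"scan of {len(self.rows)} rows refused")
        return [r for r in self.rows if r.get(field_name) == value]


# 6000 hypothetical documents spread over 100 customers
rows = [{"Customer": f"Customer {i % 100}", "Folder": f"account/{i}"}
        for i in range(6000)]
library = ToyLibrary(rows, indexed_field="Customer")

indexed_hits = library.query("Customer", "Customer 7")

try:
    library.query("Folder", "account/7")
    blocked = False
except ThresholdExceeded:
    blocked = True

print(len(indexed_hits), blocked)
```

The query on the indexed field returns its documents even though the library holds more rows than the threshold; the query that cannot use an index is refused outright.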
In the coming weeks I’ll investigate this problem and see if I can find a solution.