Freeware - SharePointUpload
Note: Although the tool was designed around SharePoint's naming restrictions, it can also be used to extract everything described below to a normal folder (e.g. C:\Temp).
Windows SharePoint Services (WSS) v3.0 has a feature that lets you copy files directly to it using standard UNC paths. e.g. if your SharePoint document library path is http://sharepoint.mydomain/SDLCSearch/documents/RequirementDocuments/AllItems.aspx then you can upload documents directly by copying them into \\sharepoint.mydomain\SDLCSearch\documents\RequirementDocuments. To do this, the 'WebClient' service must be running on the PC from which you initiate the request.
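The URL-to-UNC mapping above can be sketched in a few lines of Python. This is only an illustration of the path rewrite described in the paragraph, not code from the tool (which is VB.NET); the function name library_url_to_unc is hypothetical, and it assumes the last URL segment is a view page such as AllItems.aspx that should be dropped.

```python
from urllib.parse import urlsplit, unquote


def library_url_to_unc(url: str) -> str:
    """Turn a document library view URL into the matching UNC folder path.

    Assumption: a trailing *.aspx segment is a view page, not a folder.
    """
    parts = urlsplit(url)
    # Split the URL path into folder names, decoding %20 and similar escapes.
    segments = [unquote(s) for s in parts.path.split("/") if s]
    if segments and segments[-1].lower().endswith(".aspx"):
        segments.pop()
    # UNC form: \\host\folder\subfolder
    return "\\\\" + "\\".join([parts.hostname] + segments)


if __name__ == "__main__":
    print(library_url_to_unc(
        "http://sharepoint.mydomain/SDLCSearch/documents/RequirementDocuments/AllItems.aspx"
    ))
```

Copying to the resulting path still depends on the 'WebClient' service running on the machine doing the copy.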
Below are some examples of SDLC data you might want to index, and how each could be copied to a SharePoint site for indexing.
- Visual Source Safe - Using VSS automation you can conduct a 'GetLatest' direct into the WSS Document Library UNC path. An advantage of this technique is that VSS only copies over files that have changed.
- E-mails - Using the Redemption Data Objects (RDO) you can extract e-mails from PST, Public Folders and Exchange Mailboxes into MSG, RTF or HTML format. These can be copied directly into the WSS Document Library.
- Quality Centre - Test Cases can be extracted as a single HTML document via a report, however for indexing purposes it is much more useful to have each test case as its own document. Using HTML parsing over the single Test Case HTML extract you can split the single HTML report into each Test Case.
- Windows SharePoint Lists / Wikis - Lists and Wikis on other sites that need to be included in indexing have to be pulled into the SDLC Searching site. Using WSS List Web Services you can run a query and extract each list item into a separate HTML document.
The SharepointUpload tool that is available from Gluegood can effectively do all of the above and has the following features :-
- Freeware - with source included.
- Command line driven - This allows the tool to be scheduled. The SharepointUpload tool has no UI. All activity is driven by a config.xml file, with errors and logs written to text files. A Debug entry in the Config.xml lets you see the actions being performed, and any errors are dumped into a \Trace\ folder. The program terminates on any exception.
- Honours WSS invalid file and path name conventions - See here for a more detailed list. Anything it doesn't like it replaces with '_' underscore, or trims to required length.
- Supports full reloads - sometimes you want to wipe all the content you previously uploaded. Setting a flag in the config.xml will do this by running the command cmd /c del /q /f /s "<SharepointUploadPath>"
- Has a 'High Water Mark' concept - so that a file that has already been extracted isn't re-extracted. The EMAIL extract only takes e-mails that have a 'SentOn' date greater than or equal to the 'High Water Mark' date. The WSSLIST extract only takes list items that have a 'Modified By' date greater than or equal to the 'High Water Mark' date. The date is reset in the Config.xml file after a successful run, except if you manually set it to '2000-Jan-01'; that value allows for re-runs without removing old content via the Full Reload feature.
- Has matching for e-mails - E-mail is a bit of a monster when extracting to file names. Not only do you have to worry about the filename, you also need to work out whether you have a new e-mail or one you have already extracted. SharepointUpload appends _1 for the 1st duplicate found, _2 for the 2nd duplicate, and so on. If you are re-running the program and it tries to write to an existing filename, it first compares the Subject and the SentOn Date/Time. If they are the same it overwrites the existing file; otherwise it creates a new file using the next available filename.
- Regular Expression filtering - Available on VSS (file and folder paths), and E-mails (folder restriction only)
- Automatic subfolder generation - To ensure that folders don't fill up with content and become unusable in WSS, each extract (except VSS) creates subfolders based on hard-coded settings. e.g. E-mail will create a subfolder based on the year and month (e.g. YYYYMM); WSSLIST and QC will create one for every 10,000 Ids, e.g. a folder for 10000 and a folder for 20000.
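Two of the features above, name clean-up and subfolder generation, can be sketched as follows. This is a minimal Python illustration rather than the tool's VB.NET code. The character set, the 128-character limit, and the rule that Ids round up to the next 10,000 are all assumptions; the tool's actual hard-coded values may differ, and every function name here is hypothetical.

```python
import re
from datetime import date

# Assumed set of characters WSS rejects in file names (the tool's list may differ).
INVALID_CHARS = re.compile(r'[~"#%&*:<>?/\\{|}]')


def sanitise_name(name: str, max_length: int = 128) -> str:
    """Replace disallowed characters with '_' and trim to max_length."""
    cleaned = INVALID_CHARS.sub("_", name).strip(". ")
    if len(cleaned) <= max_length:
        return cleaned
    stem, dot, ext = cleaned.rpartition(".")
    if not dot or len(ext) + 1 >= max_length:
        return cleaned[:max_length]
    # Trim the stem so the extension survives.
    return stem[: max_length - len(ext) - 1] + "." + ext


def email_subfolder(sent_on: date) -> str:
    """YYYYMM subfolder for an e-mail, e.g. 200703."""
    return sent_on.strftime("%Y%m")


def id_subfolder(item_id: int, bucket: int = 10_000) -> str:
    """Subfolder per 10,000 Ids; assumes Ids round up (1..10000 -> '10000')."""
    return str(-(-item_id // bucket) * bucket)


if __name__ == "__main__":
    print(sanitise_name('Req: "A&B"?.doc'))
    print(email_subfolder(date(2007, 3, 5)))
    print(id_subfolder(10001))
```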
Download. VB.NET source is included in download.
1. Copy and unzip the SharepointUpload.zip archive to a folder on your hard drive. Keep both the \Trace and \Log folders.
2. For Visual Source Safe (VSS) functionality if VSS 6.0 isn’t installed on the PC that you’ll be running SharePointUpload then copy the Visual Source Safe automation dlls SSUS.dll and SSAPI.DLL to your VSSMonitor folder. Register SSAPI.DLL.
3. For e-mail (EMAIL) functionality install the Developer Version of Redemption (tested against v4.4) - download and install as per directions from the site. Note: If Outlook isn't installed on the PC, either install Outlook or install ExchangeMapiCdo.exe, which is a dependency of the Redemption component.
4. Configure the config.xml file with your VSS, EMAIL, WSSLIST, WSSWIKI, QC, and DB settings. Samples are provided in the config.xml file.
5. Launch SharepointUpload with the command line parameter of the config.xml file path. e.g. SharepointUpload.exe config.xml or use the SharepointUpload.bat file which logs the output to a file based on current month and year.
Config file overview
... many patterns
Below are attributes common to all Patterns :-
- Name - Name of the pattern. e.g. "All Documents from VSS"
- SharepointDestination - Destination UNC path. The SharePoint path where documents should be uploaded.
- HighWaterMark - Sets the last date that the Pattern was run and provides filtering for next run against EMAIL and WSSLIST types.
- Debug - Sets level of verbose logging. True for verbose, false for totals only
- PatternType - Type of the pattern. See below list and additional parameters per type.
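The HighWaterMark attribute drives a simple date filter. The Python sketch below illustrates the rule as described on this page, not the tool's VB.NET code. The date format is taken from the '2000-Jan-01' example; treating "reset" as "set to the time the run started" is an assumption, as are the function names.

```python
from datetime import datetime

HWM_FORMAT = "%Y-%b-%d"                 # matches the '2000-Jan-01' style used in this page
RERUN_SENTINEL = datetime(2000, 1, 1)   # manually set value that is never advanced


def select_new(items, high_water_mark):
    """Keep (name, timestamp) pairs dated on or after the high water mark."""
    return [item for item in items if item[1] >= high_water_mark]


def next_high_water_mark(current, run_started):
    """Value to store after a successful run.

    Assumption: the mark moves to the run start time unless it was
    manually set to the sentinel date, which is left untouched.
    """
    return current if current == RERUN_SENTINEL else run_started


if __name__ == "__main__":
    mark = datetime.strptime("2007-Mar-01", HWM_FORMAT)
    mails = [
        ("old", datetime(2007, 2, 28, 17, 0)),
        ("edge", datetime(2007, 3, 1, 0, 0)),
        ("new", datetime(2007, 3, 2, 9, 30)),
    ]
    print([name for name, _ in select_new(mails, mark)])
```

Because the comparison is "greater than or equal to", an item dated exactly on the mark is extracted again on the next run; the e-mail matching rule then decides whether it overwrites the existing file.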
Each Pattern is based on PatternType
VSS (Visual Source Safe)
Visual Source Safe extract using Get Latest automation.
- SourcePath - location of VSS INI file
- Username - Username for VSS
- Password - Password for VSS
- PathTopLevel - Location to commence searching in the tree.
- RegExFilter - Restriction on what to extract based on folder and filename. Blank is all.
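The RegExFilter behaviour ("Blank is all") can be illustrated with a small Python sketch. Whether the tool matches anywhere in the path and whether it ignores case are assumptions made here; the function name and the sample VSS paths are hypothetical.

```python
import re


def matches_filter(path: str, regex_filter: str) -> bool:
    """Return True if the path should be extracted.

    A blank filter accepts everything; otherwise the pattern may match
    anywhere in the folder/file path (assumed, case-insensitive).
    """
    if not regex_filter:
        return True
    return re.search(regex_filter, path, re.IGNORECASE) is not None


if __name__ == "__main__":
    paths = ["$/Project/Docs/Spec.doc", "$/Project/Src/Main.vb"]
    # Keep only Word documents.
    print([p for p in paths if matches_filter(p, r"\.docx?$")])
```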
WSSLIST (Windows SharePoint Services List)
Extracts each list item to a simple HTML document. The name of the document is in the format of <ID> - <Title>.htm. Special OWS fields are excluded.
- SourcePath - location of URL to list. e.g. http://sharepoint.mydomain/site/Lists/NewReq/MyView.aspx.
- ListName - Name of the List. Use either name - "New Requirements" or GUID. See here to find the GUID of the list.
- ViewGUID - GUID of the view to use. See here to find the View GUID.
WSSWIKI (Windows SharePoint Services Wiki)
Extracts each Wiki item to a simple HTML document. The name of the document is in the format of <ID> - <Title>.htm. Special OWS fields are excluded.
- SourcePath - location of URL to AllPages view of the Wiki. e.g. http://sharepoint.mydomain/site/Wiki/Forms/AllPages.aspx.
- ListName - Name of the List. Use either name - "Wiki Pages" or GUID. See here to find the GUID of the list.
- ViewGUID - GUID of the view to use. See here to find the View GUID.
EMAIL
Extracts each e-mail (PST, Public Folder, or Exchange Mailbox) as MSG. The name of the e-mail is in the format of <subject>_<version>.msg.
- SourcePath - PST file path or Public Folder path (must start with \\Public Folders\..), or Exchange Mailbox Name
- Username - ExchangeMailboxName, need a base name
- RegExFilter - Matching of folder names. e.g. 2007 will extract for e-mails in the 2007 folder and subfolders. Blank is all.
- ExchangeServer - Location of the ExchangeServer
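The <subject>_<version>.msg naming, combined with the Subject and SentOn matching rule from the feature list, can be sketched as below. This is an illustrative Python version, not the tool's code. It assumes the first copy of a subject is written without a suffix and that the subject has already been cleaned of invalid characters; choose_filename and the in-memory index are hypothetical.

```python
from datetime import datetime


def choose_filename(subject, sent_on, existing):
    """Pick the .msg file name for an e-mail.

    existing maps file name -> (subject, sent_on) of e-mails already
    extracted. A name is reused (overwritten) only when both Subject
    and SentOn match; otherwise the next free _N suffix is taken.
    """
    version = 0
    while True:
        name = f"{subject}.msg" if version == 0 else f"{subject}_{version}.msg"
        stored = existing.get(name)
        if stored is None or stored == (subject, sent_on):
            return name
        version += 1


if __name__ == "__main__":
    index = {"Status.msg": ("Status", datetime(2007, 1, 1, 9, 0))}
    # Same e-mail on a re-run: overwrite the existing file.
    print(choose_filename("Status", datetime(2007, 1, 1, 9, 0), index))
    # Different e-mail with the same subject: first duplicate gets _1.
    print(choose_filename("Status", datetime(2007, 1, 8, 9, 0), index))
```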
QC (QualityCentre)
Extracts each QualityCentre test case as HTML. The name of the QC test case is in the format of TestCase <TestCaseId> - <TestCaseTitle>.htm. Note: This should work for QualityCentre v9.0; however, the extract isn't that intelligent and the HTML parsing may not work for your instance of QualityCentre.
- SourcePath - Path to QualityCentre TestCase report.
DB (Database)
I'm not even going to attempt to document this; read the code and adapt it to your own purpose. It uses DataSets rather than DataReaders to reduce locking contention.
** Legal **
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.