Improving the Stability of ECTS

ECTS, the external collaboration toolkit for sharepoint, has been a bit of  a thorn in my side.  While fulfilling a vital need at our institution, it is difficult to support, and contains many bugs that apparently are not going to get fixed.

However, I recently I managed to pull off a few minor tweaks for ECTS that should improve stability in our environment.

Fix 1:  ECTSEx

First off, I installed the EXTS Extensions available in Codeplex:

My thanks to David McWee for providing these accelerators to the ECTS management interfaces.  We are now able to pull up info on ECTS users considerably more easily.  These extensions also allow password reset information to go directly to the user for whom they are intended.  Good stuff.  Unfortunately, the interface is not behaving on all nodes of our SharePoint web farm, which brings us to our next fix…

Fix 2:  Load Balancer reconfiguration

We have been experiencing intermittent problems which suggests duplication of actions by the ECTS processes.  For example, a user may receive two account activation notices, each with different passwords.  A user may change a password, only to have the new password expired immediately, followed by prompting that your password has been reset by an administrator.

My assumption is that somehow requests to either the web front end servers of the backing AD LDS instance are being replicated, resulting in replicated account management operations.  To combat this, I have reconfigured our F5 load balancers as follows:

  1. For the Big-IP Virtual Server which handles traffic to “” (our ECTS front-end), I set the Persistence Profile to “none”.  This prevent users from continuing to use a LB node if a more preferred node is online.
  2. For the Big-IP pool connected to PartnerPoint, we activate “Priority Group Activation” with the “Less than 1” attribute.  This will cause clients to be routed to a different Pool member priority group if there are less than one servers remaining in the highest priority group.
  3. We then set the “sharepoint3” node to be in priority group 2, and “sharepoint2” to priority group 1.  Now when users connect to the virtual server, they are routed to priority group 2 (sharepoint3), and only go to sharepoint2 if sharepoint3 is down.
  4. Repeat these three steps for the AD LDS virtual server.  This limits LDAP traffic to sharepoint3, unless sharepoint 3 is down.
  5. Now we need to limit traffic to the ECTS admin interfaces to connect preferentially against sharepoint3 as well.  We do this with iRules.  Create a new iRule which will intercept traffic to the ECTS management pages, and route it to a different pool:
    when HTTP_REQUEST {
     if {[HTTP::uri] starts_with "/ECTS/"} {
     pool WSS_partner_admin
  6. We then create a Pool with priority group activation, as outlined in steps 1-3, above, and assign this iRule to the main SharePoint virtual server.

I am able to verify though packet captures that all requests to the ECTS admin pages, and all traffic to the ECTS front ends is being routed preferentially to “sharepoint3”.  So far, this seems to have helped.  If nothing else, the admin pages are now loading consistently, which was not true for us in the past.

Killing Process as a Scheduled Task

A client of mine recently configured a Windows Scheduled Task to kill unwanted/hung process on a nightly basis.  One could argue the merits of these tasks, but I just need to “get things done”, so moving on…

The problem with these tasks is that they return error codes if the task in question is not actually running.  This generates errors in the OS, and I get paged as a result.  What is going on here?

Looking at the scheduled task, we see that it consists to a single executable… “taskkill.exe /f /im [process]”.  If the process is running, it gets terminated with no error code.  If it is not running, we get rc=80.

I figured we could do better a bit better, and an hour of bashing my head against the limitations of cmd.exe later, we have a script:

@echo off
REM killProc.bat – J. Greg Mackinnon, 2010-04-09
REM uses "taskkill.exe" to terminate the process specified by the first input argument
REM PROVIDES: the return code of the "taskkill" process, or "100" if a process name was not provided as input
REM REQUIRES: "tasklist.exe", "taskkill.exe", cmd.exe running under Windows 2000 or later.

set exitCode=0
set proc=%1
if NOT DEFINED proc goto error

tasklist.exe /FI "IMAGENAME eq %proc%" 2>NUL | find /I "%proc%" >NUL
taskkill /f /im %proc%
set exitCode=%errorlevel%

goto end

echo You must provide a valid process image name as an argument to this script.
set exitCode=100
goto end

exit /B %exitCode%

Now we just create a scheduled task with the following syntax:

cmd.exe /c killProc.bat [processName.exe]

If the process is running, we get the return code from “taskkill.exe”.  If the process is not running we get rc=0.  If no process name was specified, we get rc=100.


Automatic Maintenance Mode after Windows Update

Since our original deployment of Microsoft Operations Manager 2005 (now System Center Operations Manager 2007 R2), we have struggled with handling of alerting during maintenance windows for our servers.  As many a SCOM admin can tell you, your management server can do a whole lot of squawking if you fail to set maintenance mode before a server reboot.

In the past, we configured notification blackout windows during scheduled Windows Update times.  This worked, sort of… While we did not get paged during system reboots, we none-the-less had a lot of alerts that needed to be closed out every time a system rebooted.  Why?  Because suppressing notification does not suppress alerting.  What we really needed to do was to put systems into maintenance mode before the Windows Update triggered reboot.  This always seemed to difficult to implement, so we never did it.  Scripting of Maintenance Mode requires that any account attempting to start MM have "admin" rights on the RMS.  We did not want to go there with distributed scripts, and so we were at an impasse.  The other big disadvantage to this approach was that it suppressed notifications for all systems in the infrastructure during scheduled maintenance windows, not just the ones that needed a reboot at that particular time.  If there were no Windows Update-induced reboots that evening, too bad.

But now, thanks to Group Policy Client-Side Extensions, PowerShell, various improvements in SCOM 2007 R2 Maintenance Mode, and the fine work of blogger and SCOM developer Derek Harkin, we have a workable solution to this problem.


  • GP Client-Side Extensions with item-level targeting (requires that CSE update be applied to any Server 2003 systems under management, and also use of a Vista-later system running a current Group Policy Management Console)
  • Derek Harkin’s Maintenance Mode Management Pack for SCOM 2007 R2:
  • Our additional VBScript "WUTriggerMaintMode.vbs" to create events triggered by Windows Updates
  • "eventtriggers.exe" on Server 2003 (should be included with the OS)
  • "schtasks.exe" on Server 2008 (definately included with the OS)
  • Windows Scripting Host to run VBScripts on all managed targets (should be there by default…)
  • PowerShell on the RMS (required to install SCOM 2007 R2 anyway…) 

How it Works:

  • Group Policy Preferences run every day to refresh the scripts and Scheduled Tasks on all managed systems.  "MaintModeTrigger2008.xml", "WUTriggerMaintMode.vbs" and "MaintMode.vbs" are distributed to all systems, and tasks are created which perform the following task:
    Watch the "SYSTEM" event log for event ID 22 from source "WindowsUpdateClient".  When this event is seen, run the "MaintMode.vbs script with the arguments "ON 20M"
  • The MaintMode.vbs script will create a WSH entry in the Application event log which will state that the system needs to enter Maintenance Mode for 20 minutes.
  • A trigger set by the Maintenance Mode management pack will detect this event, and run a PowerShell script on the Root Management Server (RMS) that will put the desired system into Maintenance Mode.


  1. Configure your Windows Update policy to delay at five minutes after update application before initiating system reboot.
  2. Install the Maintenance Mode management pack following the included directions.
  3. Create a Group Policy object which is linked to your managed servers for the purpose of configuring Windows Update-triggered Maintenance Mode settings.
  4. Copy the MaintMode.vbs script and, our "WUTriggerMaintMode.vbs", and custom XML file to all managed Windows servers using Group Policy Preferences.  I copied my scripts to “%SystemDrive%localscripts” on all managed system, but you could just plan to run the scripts off of a trusted, highly-available network share.
    NOTE: If you will be copying the files from a network share to a local directory using GP Preferences, then you must grant “Domain Computers” read/execute/traverse rights to the parent directory that contains your files.  If you will be executing the files directly from the file server, you need grant “Domain computers” rights to only the files themselves.
  5. Create a Scheduled Task preference targeted to Server 2003 and Server 2003 R2 systems.  This task will use cscript.exe to run "WUTriggerMaintMode.vbs".  I configured this task to run daily, but you could adjust the interval to suit your needs (weekly, monthly, on next reboot, run once, etc.)
  6. Create a Scheduled Task preference targeted to Server 2008 and Server 2008 R2 systems.  This taks will run schtasks.exe to create a scheduled task from the "MaintModeTrigger2008.xml" task specification file.  I used the command line:
    schtasks.exe /Create /RU SYSTEM /TN "WUTriggerMaintMode" /XML [pathToFile]MaintModeTrigger2008.xml /F

    Note:  I was unable to use the "Vista and later" version of the Scheduled Task preference, which is exposed when running GPMC on a Windows 7 client.  I am unclear why this did not work, but just to be safe you probably will need to use the "legacy" Scheduled Task tool.
    Again, you will need to set a schedule for this task that suits your environment.

  7. Force a Group Policy update on one of each OS type that you are managing to ensure that the GP Preferences are being distributed.
  8. Run the Scheduled Tasks that are created by the the policy to make sure that they work (at least on Server 2003, you can create synthetic events in the SYSTEM event log using "EVENTCREATE.exe".  You can set of the event triggers using this tool.).
  9. Enjoy the silence during your next system maintenance.

Required Files:

MaintModeTrigger2008.xml –

the file which creates a Server 2008 scheduled task that will run the MaintMode.vbs script after Windows Update signals that the server will be rebooted (an Event ID 22).


      <QueryList><Query Id="0" Path="System"><Select Path="System">*[System[Provider[@Name='Microsoft-Windows-WindowsUpdateClient'] and EventID=22]]</Select></Query></QueryList>
      %SystemDrive%localscriptsMaintMode.vbs ON 20M

WUTriggerMaintMode.vbs –

Uses "eventtriggers.exe" on Server 2003 systems to create an event trigger (and corresponding Scheduled Task) which will watch the system event log for Windows Update-triggered reboots (event ID 22) and run MaintMode.vbs when this happens.  By the way, I know that this is crumby code… it could/shold be updated to provide better error handling and return codes.


' WUTriggerMaintMode.vbs - Trigger Maintenance Mode based on pending Windows
' Update-triggered reboot, as seen in the System event log
Option Explicit

Dim bSetTrig
Dim fLog
Dim oShell, oExec, oExec2, oFile
Dim sLine, sSub, sSysDrive

Set oShell = WScript.CreateObject("WScript.Shell")
sSysDrive = oShell.ExpandEnvironmentStrings("%SystemDrive%")

' Create Log File
Set oFile = CreateObject("Scripting.FileSystemObject")
Set fLog = oFile.CreateTextFile(sSysDrive & "localscriptsWUTriggerMaintMo" _
	& "de.log", true)

' Query for existing triggers:
Set oExec = oShell.Exec("eventtriggers.exe /query")

' If Existing triggers previously created by this script are present,
' delete them:
Do While Not oExec.StdOut.AtEndOfStream
	sLine = oExec.StdOut.ReadLine
	if InStr(sLine,"Windows Update - Reboot") then
		fLog.WriteLine("Existing trigger detected")
		sSub = Left(LTrim(sLine), 1)
		fLog.WriteLine("Trigger ID is: " & sSub)
		set oExec2 = oShell.Exec("eventtriggers.exe /delete /tid " & sSub)
		Do while oExec2.Status = 0
			WScript.sleep 100
		fLog.WriteLine("Deletion of Event trigger attempted.")
	end if

' Create latest event trigger:
set oExec2 = oShell.Exec("eventtriggers.exe /create /tr ""Windows Update - R" _
	& "eboot Impending"" /l SYSTEM /eid 22 /tk ""cscript.exe %SystemDrive%l" _
	& "ocalscriptsMaintMode.vbs ON 20M"" /ru ""System"" ")
Do while oExec2.Status = 0
	WScript.sleep 100
fLog.WriteLine("Creation of event trigger attempted.")

SharePoint and Mirrored Databases

Un-fun thing about Windows SharePoint Services 3.0:  Support for mirrored databases.  I did my testing of SharePoint and our mirrored SQL 2008 servers using our SharePoint 2010 beta instance.  That worked great!  Unfortunately, WSS 3.0 does not have the same native support for mirroring.  Instead, we have to go though a bunch of rigmarole using SQL database aliases, some ugly SQL jobs to run the failovers (all T-SQL is ugly, if you are more used to object-oriented scripting languages), poorly documented SQL alerts to monitor mirrorg, and scheduled tasks to change the local aliases on the SharePoint web front ends. 

There is no single reference for this topic, either… just a bunch of incomplete pointers. Each one assumes that you are either an expert on SQL and know nothing about SharePoint, or that you are a SQL dunce who will have to ask the SQL DBA for help, and that you can comprehend only Sharepoint. What about the SharePoint admin who has to be the DBA because your company does not have one of those?

The most complete reference from Microsoft on SharePoint and SQL mirroring can be found here:
This is a much updated version of the doc originally published in the SharePoint team blog, and later updated and added to the TechNet Library:
This is a really poor document, though, since it spends a lot of time telling you how to set up mirroring, but only hints at how to make SharePoint deal with the new configuration.
There also is a published case study on SharePoint mirroring here:
This study is more fleshed out, but lacks details on how to create and configure SQL alerts.
Fortunately, there is a good ref on SQL alerts and database mirroring here:

While the MS guidance is good at telling you what needs to be done, it is short on the details of how to do it.  We know that we need to accomplish the following tasks to enable SharePoint database failover:

  1. On the SQL Servers, detect when any one SharePoint database has lost “Primary Mirror” status, and failover the remaining databases so that they are all running on the same host.
  2. On the SharePoint web front end servers, we need a script that will test the database server specified in the SQL Server aliases entry, and perform the following actions if it is not the current “Primary” mirror role holder:
    1. Change the alias to the current “Primary” mirror
    2. Cycle SharePoint services

Ideally, we would have the script created in step 1 also send a signal to the web front ends to perform the SQL Alias change and SharePoint services cycle.

Microsoft gives us a sample script for the database monitor and failover script (see bullt “1”, above), but they only hint the remaining scripts that will be required, and they do not teall you the best way to tie all the script together.

Here is what I have come up with so far…

T-SQL Script for Database Mirror Status Monitoring and Failover:

The following is adapted from the MS Sample script.  It has been modified to force failover of all mirrored database on the current server if any mirrored DB has failed over.

conditionalFailover.sql - J. Greg Mackinnon, 2010-04-09
SQL Script to detect condition where not all /synchronized/ mirrored databases
are in the primary role.
Script will force failover of all mirrored databases if above conditions are met.

use master;
--Declare and initialize variables
DECLARE @SyncDBCount int;
DECLARE @PrimaryDBCount int;
SET @SyncDBCount = 0
SET @PrimaryDBCount = 0

--Collect total number of mirrored databases
SET @SyncDBCount = (
  SELECT COUNT(ALL database_id)
  FROM sys.database_mirroring
  WHERE mirroring_state=4
PRINT N'Total Synchronized Mirrored Database Count: '
	+ (CAST(@SyncDBCount AS varchar(4)));

--Collect total number of databases holding the "primary" database role.
SET @PrimaryDBCount = (
  SELECT COUNT(ALL database_id) FROM sys.database_mirroring WHERE
 (mirroring_state=4 and mirroring_role=1)
PRINT N'Total Mirrored Databases holding the "Primary" state: '
	+ (CAST(@PrimaryDBCount AS varchar(4)));

-- Begin IF condition when not all synced mirror DBs are Primary.
IF ((@PrimaryDBCount > 0) and (@PrimaryDBCount < @SyncDBCount)) BEGIN
	Print N'Beginning Failover of mirrored databases...'
	FOR SELECT name FROM sys.databases WHERE database_id IN
		(SELECT database_id
			FROM sys.database_mirroring
			WHERE (mirroring_state=4 and mirroring_role=1))

		DECLARE @name varchar(255)
		DECLARE @cmd varchar(1000)
		OPEN i

		WHILE (@@fetch_status  -1) BEGIN
			IF (@@fetch_status  -2) BEGIN
				set @cmd = 'ALTER Database [' + @name + '] SET PARTNER FAILOVER;'
				PRINT N'Running command: ' + @cmd
				exec (@cmd)
			END --end if
		END --end while
END -- End IF conditional
	PRINT N'No failovers will be performed at this time.'

T-SQL Script to determine availability of a primary mirror for a given SQL Server and Database:

The following script can be run from a client system to determine if the named SQL Server and Database are online, and is the current “Primary” mirror holder.  It can be launched using the “SQLCMD.exe” utility, and requires the SQL Native Client to be present on the client system.  Syntax for launching is:

sqlcmd.exe -S [SQLHost] -i [scriptFile.sql] -v dbName=[dbName]

The script will return a text output stating that the database is in one of three possible states… “Primary”, “Secondary”, or “unknown”.  If the database or database server is not availble, then a SQLCMD error will be returned.  Here is the script:

Detect Mirroring status of a single database
Returns char with value Primary|Secondary|unknown
(detection of "noSync" has been removed as a primary with a downed partner
will also be out of sync, but that does not mean that we don't want to use
the primary)
Database to be queried can be set by changing the @DBname variable

use master;

-- Declare and initialize variables
Declare @DBname AS varchar(128);
Declare @mirState AS char(9);
Set @DBname = '$(dbname)';
-- End Declarations

-- Conditional logic to determine if @DBname is Primary or Secondary
IF (
	select mirroring_role from sys.database_mirroring where (
		database_id IN (select database_id from sys.databases where name = @DBname)
) = 1 set @mirState = 'Primary'
	BEGIN --NOTE:  THIS WILL NOT RETURN THE EXPECTED RESULT for a user in the "Public" role on the DB server.
		-- That's okay, because the "primary" switch does work as expected.
	IF (
	select mirroring_role from sys.database_mirroring where (
		database_id IN (select database_id from sys.databases where name = @DBname)
	) = 2 set @mirState = 'Secondary'
	ELSE set @mirState = 'unknown'
--End Primary/Secondary selection

-- Conditional logic to determine if database is Synchronized
/*IF (
	select mirroring_state from sys.database_mirroring where (
		database_id IN (
			select database_id from sys.databases where name = 'VMware_VCDB'
	) != 4 set @mirState = 'noSync'
--End Sync determination

-- Generate Output
Print N'Database ' + @DBname + N' has status: ' + @mirState;

 PowerShell Script to monitor validity of current DB alias, and update if no longer valid:

This script calls on the T-SQL script above to check on the status of the DB Alias, update it if necessary, and cycle SharePoint services on the web front end if the alias is updated. Be sure to update the script to use your SQL Host, Alias, and Database names before using in your environment.

#Invokes "sqlcmd" script to determine holder of current primary mirror server role for the SharePoint databases.
#Sets SQL aliases accordingly

Set-PSDebug -Strict

##### Start script variables: ###
##### End script variables: #####

##### Start script functions: ###

function mirState {
	#Determines the mirroring state of a specified database
	#REQUIRES: SQLCMD.exe in the path specified in the function,
	#	the "determinMirDBState" sql script, in the path specified.
	#	input strings "$server" (for the SQL server or instance), and "$database" (the database to test)
	#PROVIDES: returns a string with possible values Primary|notThere|stateUnknown|other
	param ([string]$server,[string]$database)

	#Build SQLCMD command:
	[string]$sqlCom = 'c:Program FilesMicrosoft SQL Server100ToolsBinnSQLCMD.EXE'
	[string]$sqlScript = "c:localscriptssqldetermineMirDBState.sql"
	#Invoke the command and capture to string $comOut
	[string]$comOut = & $sqlCom -S $server -i $sqlScript -v dbname=$database 2>&1

	if ($comOut.contains("Primary") -eq $true) { #if RearWindow is the primary role holder then...
	# Removed "Secondary" selection as this string will not be returned in a low-privilege environment.
	#} elseif ($comOut | Select-String "Secondary") {#if RearWindow is the secondary role holder then...
	#	[string]$rc="Secondary"
	# Removed "noSync" detection as a primary with a downed partner will be out of sync, but it is not actionable.
	#} elseif ($comOut | Select-String "noSync"){ #if the mirroed database is not in sync then...
	#	[string]$rc="noSync"

	} elseif ($comOut.contains("not accessible") -eq $true) { #SQL error occured, meaning server is in some way unavailable.
	} elseif (($comOut.contains("unknown") -or ($comOut.contains("Secondary"))) -eq $true) { #db state is anything other than primary
	} else { #something else unexpected.
	return $rc
} #end mirState

function getCurServer {
	#Determines current database alias setting for the string $DBAlias.
	#Requires: An input string $DBAlias.
	#	Assumes the existance of the alias in the registry,
	#	and that the alias is using the "DBMSSOCN" (TCP/IP) connection method.
	#Returns: A string containing the host name of the current database server for the alias
	param ([string]$dbAlias)

	[string]$curAliasSetting = (get-itemproperty 'HKLM:SOFTWAREMicrosoftMSSQLServerClientConnectTo').$dbAlias
	[string]$curServer = $curAliasSetting.Replace("DBMSSOCN,","")
	return $curServer

function setAlias {
	#Sets in the regustry the host name for the provided SQL server alias
	#REQUIRES the names of the SQL host and its alias
	#PROVIDES nothing
	param ([string]$dbHost,[string]$dbAlias)
	[string]$aliasSetting = "DBMSSOCN," + $dbHost
	Set-ItemProperty -Path 'HKLM:SOFTWAREMicrosoftMSSQLServerClientConnectTo' -Name $dbAlias -Value $aliasSetting

function cycleServices {
	#Restarts system services associated with SharePoint
	# This will expedite reconnection to the SharePoint databases in the event of a SQL mirror failover.
	& iisreset
	# We use the "if" conditional below to determine if the service is already running before attempting to restart it.
	if ((Get-Service -Name "spadmin").status -match "Running") {Restart-Service -Name "spadmin"}
	if ((Get-Service -Name "sptimerv3").status -match "Running") {Restart-Service -Name "sptimerv3"}
	if ((Get-Service -Name "spsearch").status -match "Running") {Restart-Service -Name "spsearch"}
	if ((Get-Service -Name "osearch").status -match "Running") {Restart-Service -Name "osearch"}
	Write-Output "SharePoint services cycled."
##### End script functions: #####

##### Start script body: ###
[string]$curServer = getCurServer -dbAlias $dbAlias
Write-Output "The database alias $dbAlias currently is set to: $curServer"

[string]$mirStateVar = mirState -server $curServer -database $dbName
Write-Host "Database $dbName on server $curServer is in state $mirStateVar."

if ($mirStateVar.contains("Primary")) {
	Write-Output "$curServer is already defined as the SQL Alias, and the server is functioning as expected.  Exiting."
} else {
	switch -regex ($curServer) {
		$db1 {[string]$newServer = $db2}
		$db2 {[string]$newServer = $db1}
	setAlias -dbHost $newServer -dbAlias $dbAlias
	Write-Output "Set alias $dbAlias to value $newServer."
	## Removed the "cycleServices function as the databases will get reconnected on their own.
	## You still can run "cycleServices if you want to expedite DB reconnection,
	## but keep in mind that this will cause all of the ASP.NET code to recompile.  Boo!
	#Write-Output "Cycling SharePoint services..."
	Write-Output "Done."
##### End script body: #####

Tying it all together

So now to make it all work togther…
The fastest thing to do is to implement the first T-SQL script as a SQL Server Agent job running on both SQL mirror servers. The jobs are configured to run once per minute, and have been staggered to run 30 seconds apart from each other. The script will not be triggered by databases in a “suspended”, we should be able to avoid job collision between the two servers. (I hope).

We then could implement the PowerShell script (and accompanying T-SQL script) as a scheduled task on the web front end servers. We need to run the script as an account in the PUBLIC role on the SQL servers, which can run as a batch job on the WFEs.
Easy, right? we shall see.