Wednesday, June 30, 2010

Ran into a problem with a SQL server where the disk got corrupted.

Microsoft SQL Server Manager would not start and instead returned the error:

    "Could not continue scan with NOLOCK due to data movement."

In the event log I found:

SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:3403; actual 44272:1262791920). It occurred during a read of page (1:3403) in database ID 32767 at offset 0x00000001a96000 in file 'E:\Microsoft SQL Server\MSSQL\data\mssqlsystemresource.mdf'.
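
The numbers in this message can be cross-checked against each other. SQL Server stores data in 8 KB (8192-byte) pages, so page N of a data file starts at byte offset N × 8192, and database ID 32767 is the Resource database that lives in mssqlsystemresource.mdf. Below is a minimal Python sketch of that check. It assumes the message format quoted above; the regular expression and the function name `parse_io_error` are illustrative only and not part of any SQL Server tool.

```python
import re

PAGE_SIZE = 8192  # SQL Server data pages are 8 KB

# Pattern written for the event-log message quoted above; other message
# variants may need adjusting.
PATTERN = re.compile(
    r"read of page \((\d+):(\d+)\) in database ID (\d+) "
    r"at offset (0x[0-9a-fA-F]+) in file '([^']+)'"
)

def parse_io_error(message):
    """Extract page, database ID, offset, and file from the error text."""
    m = PATTERN.search(message)
    if m is None:
        return None
    file_id, page_no, db_id, offset, path = m.groups()
    return {
        "file_id": int(file_id),
        "page": int(page_no),
        "database_id": int(db_id),
        "offset": int(offset, 16),
        "file": path,
    }

msg = (
    "It occurred during a read of page (1:3403) in database ID 32767 "
    "at offset 0x00000001a96000 in file "
    r"'E:\Microsoft SQL Server\MSSQL\data\mssqlsystemresource.mdf'."
)
info = parse_io_error(msg)
# The reported offset should equal page number times page size.
print(info["offset"] == info["page"] * PAGE_SIZE)  # True
```

Here 3403 × 8192 = 27,877,376 = 0x1a96000, so the offset in the log matches the expected page. It was the page ID stored on disk at that location that was garbage.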

There were also a couple of other databases that were messed up, IMS and SUSDB.  IMS is an internal database used by the media services folks for controlling the multimedia equipment.  SUSDB is a database used by Windows Server Update Services.

Restored 'E:\Microsoft SQL Server\MSSQL\data\mssqlsystemresource.mdf' from backups and was then able to get the Microsoft SQL Server Manager to start.

A useful command that I came across was:

C:\Program Files\Microsoft SQL Server\90\Tools\Binn\osql

which allowed me to:

ALTER DATABASE IMS SET EMERGENCY
ALTER DATABASE IMS SET SINGLE_USER
DBCC CHECKDB ('IMS', REPAIR_ALLOW_DATA_LOSS)
ALTER DATABASE IMS SET MULTI_USER
ALTER DATABASE IMS SET ONLINE

None of these fixed anything, but they did give me some more insight into the issue at hand.  The only way to fix it was simply to restore the .mdf files from backups.
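
If the same sequence has to be run against more than one database, it can be generated rather than retyped. The Python sketch below only reproduces the five statements listed above for a given database name and joins them with GO, the batch terminator osql uses. The function names are made up for illustration. Note that REPAIR_ALLOW_DATA_LOSS can discard data, so restoring from backup remains the safer route.

```python
def emergency_repair_batch(db_name):
    """Return the T-SQL statements from the post for one database."""
    # Bracket-quote the name so unusual database names stay valid T-SQL.
    quoted = "[" + db_name.replace("]", "]]") + "]"
    # Double any single quotes for the string literal passed to DBCC.
    literal = db_name.replace("'", "''")
    return [
        f"ALTER DATABASE {quoted} SET EMERGENCY",
        f"ALTER DATABASE {quoted} SET SINGLE_USER",
        f"DBCC CHECKDB ('{literal}', REPAIR_ALLOW_DATA_LOSS)",
        f"ALTER DATABASE {quoted} SET MULTI_USER",
        f"ALTER DATABASE {quoted} SET ONLINE",
    ]

def as_osql_script(statements):
    """Join statements with GO so each runs as its own batch in osql."""
    return "\n".join(s + "\nGO" for s in statements) + "\n"

print(as_osql_script(emergency_repair_batch("IMS")))
```

The output can be saved to a file (for example a hypothetical repair.sql) and fed to osql with its -i input-file option, using -E for a trusted connection.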

Update 7/27/2010

Wow, after years of never having a MSSQL database corruption, in less than a month I have had two. This time, however, the corruption was not in the system tables as above, but in a table that we simply use for logging. The error this time was "SQL Server detected a logical consistency-based I/O error: torn page". The commands listed above did work this time.

I think that the root cause may actually be an iSCSI issue with VMware ESX, as they had an update that fixed an issue that sounded a whole lot like what we are seeing. Keeping fingers crossed; otherwise my next certification might have to be in MSSQL administration.

Monday, June 21, 2010

Using MIT's Kerberos for Active Directory authentication has been challenging to say the least.

In Windows 7 and Windows Server 2008 R2, the DES cipher suites are disabled by default. Here is why this matters:

  • Windows 2000: maximum encryption type is DES
  • Windows 2003: maximum encryption type is DES
  • Windows 2003 R2: maximum encryption type is RC4, relationship defaults to DES
  • Windows 2008: maximum encryption type is AES, relationship defaults to RC4
So, when you set up a trust relationship between an Active Directory running on Server 2003 and the Kerberos realm, the trust uses DES. This means that you have two options for authenticating Windows 7 to the Kerberos realm: enable DES on the clients, or change the trust encryption type.

The "Configure encryption types allowed for Kerberos" policy setting is located in Computer Configuration\Security Settings\Local Policies\Security Options.  Initially, we enabled DES on the Windows 7 machines. Intermittently, or at GPO refresh intervals, a Windows 7 machine would quit using DES and start using AES. The result was that Windows 7 would authenticate with an Active Directory account, but not with a Kerberos account through the trust relationship, because it tried to use AES and failed.

The second option is to change the trust encryption type. You have to use ktpass from the Windows 2003 Resource Kit *Service Pack 2*, available from the Microsoft web site.

ktpass /MITRealmName UNIX.EXAMPLE.COM /TrustEncryp RC4
For Windows 2008, the same operation can be done with ksetup, which is installed by default.

ksetup /SetEncTypeAttr EXAMPLE.COM AES256-SHA1
Changing the trust encryption type has so far proven to be the better solution.



Update: 8/13/2010

The intermittent authentication issue no longer appears to have anything to do with the configured ciphers. It now appears to be related to the policies being downloaded to the machine. The machine and user preference files that get downloaded to the C:\Users\All Users\Microsoft\Group Policy\History directory on the local machine are intermittently getting corrupted. Under normal circumstances, a file should look like normal XML; however, opening a corrupted file in a text editor shows it to be just a bunch of NULL bytes. Removing the History directory and running gpupdate seems to work around the problem.
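
To spot the corruption without opening each file by hand, something like the following Python sketch could be run on an affected machine. It walks a directory tree and reports files that are non-empty but contain nothing except NUL (0x00) bytes. The function names are illustrative, and the History path is the one mentioned above; it may differ on other systems.

```python
import os

def is_all_nulls(path, chunk_size=65536):
    """Return True if the file is non-empty and contains only NUL bytes."""
    seen_data = False
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            seen_data = True
            # Stripping NULs leaves something only if other bytes exist.
            if chunk.strip(b"\x00"):
                return False
    return seen_data

def find_null_files(root):
    """Walk root and list files that consist entirely of NUL bytes."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if is_all_nulls(full):
                    bad.append(full)
            except OSError:
                # Unreadable files are skipped rather than reported.
                continue
    return sorted(bad)

if __name__ == "__main__":
    # Path taken from the post; adjust for the machine being checked.
    history = r"C:\Users\All Users\Microsoft\Group Policy\History"
    for path in find_null_files(history):
        print(path)
```

Any path it prints is a candidate for the workaround above: remove the History directory and run gpupdate.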