Friday, May 25, 2012

Backup duration statistics

Here is a simple query which can list backup duration statistics for all database, including the max, min, avg of the backup duration.  you can estimate the backup duration by this query when you plan your backup job



 SELECT
  DBNAME,
  backup_type,
  MAX(sizeMB) as MaxBackupFileSizeMB,
  Max(duration) as MaxDurationSEC,
  MAX(speedMBSEC) as MaxSpeedMBperSEC,
  MIN(sizeMB) as MinBackupFileSizeMB,
  MIN(duration) as MinDurationSEC,
  MIN(speedMBSEC) as MinSpeedMBperSEC,
  AVG(sizeMB) as AvgBackupFileSizeMB,
  AVG(duration) as AvgDurationSEC,
  AVG(speedMBSEC) as AvgSpeedMBperSEC
 FROM
  (
   SELECT
    backup_set_id,
    DBNAME,
    backup_type,
    SUM(sizeMB) sizeMB,
    MAX(duration) duration,
    SUM(sizeMB)/MAX(duration) speedMBSEC
   FROM (
    SELECT
       bs.backup_set_id,
       bs.database_name AS DBNAME,
       bs.backup_start_date, 
       bs.backup_finish_date,
       DATEDIFF(ss, bs.backup_start_date, bs.backup_finish_date) duration,
       CASE bs.type 
        WHEN 'D' THEN 'Full' 
        WHEN 'L' THEN 'Log' 
        WHEN 'I' THEN 'Diff'
       END AS backup_type, 
       bs.backup_size/1048576 sizeMB
    FROM   msdb.dbo.backupmediafamily  bmf
       INNER JOIN msdb.dbo.backupset bs ON bmf.media_set_id = bs.media_set_id 
    Where DATEDIFF(ss, bs.backup_start_date, bs.backup_finish_date) >= 1
    ) a
   GROUP BY
    backup_set_id,
    DBNAME,
    backup_type
   
  ) b
 GROUP BY  DBNAME,backup_type
 order by DBNAME asc, backup_type asc, AvgDurationSEC desc

Tuesday, May 15, 2012

Ad hoc query optimization in SQL Server

When ad hoc queries are executed in sql server, if it is executed without parameters, and it is simple,  SQL Server parameterizes the query internally to increase the possibility of matching it against an existing execution plan, that's called "Simple Parameterization"

For instance:

--clear plan cache first
DBCC FREEPROCCACHE

--run first ad hoc query
select * from dbo.Sales where SalesID= 567

--check query plan

WITH XMLNAMESPACES
(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
COALESCE(DB_NAME(p.dbid), p.query_plan.value('(//RelOp/OutputList/
ColumnReference/@Database)[1]',
'nvarchar(128)')) AS DatabaseName
,DB_NAME(p.dbid) + '.' + OBJECT_SCHEMA_NAME(p.objectid, p.dbid) + '.' +
OBJECT_NAME(p.objectid, p.dbid) AS ObjectName
,cp.objtype
,cp.cacheobjtype
,p.query_plan
,cp.UseCounts
,cp.plan_handle
,cp.size_in_bytes
,CAST('<?query --' + CHAR(13) + q.text + CHAR(13) + '--?>' AS xml)
AS SQLText
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q
ORDER BY DatabaseName, UseCounts DESC


you can find 2 rows below, one is the ad "hoc query" object, the other is the "
Prepared statement"

the "Prepared statement" is created by sql server Simple Parameterization. then if run a new ad hoc statment below

--run second ad hoc query
select * from dbo.Sales where SalesID= 123

This time there are 2 "ad hoc" objects, but only 1 "Prepared statement" object whose "UseCounts" is 2 now. Because of this simple parameterization, SQL Server recognizes that the following two statements generate essentially the same execution plan and reuses the first plan for the second statement.

Under the default behavior of simple parameterization, SQL Server parameterizes a relatively small class of queries. However, you can specify that all queries in a database be parameterized, subject to certain limitations, by setting the PARAMETERIZATION option of the ALTER DATABASE command to FORCED

ALTER DATABASE [dbname] set PARAMETERIZATION  FORCED

Forced Parameterization is not appropriate in all enviroments and scenarios. It is recommanded that you use it only for a very high volumne of concurrent queries, and when you are seeing hight CPU from a lot of compilation/recompilation.you can monitor the perfmon sql statistics below
• SQL Server: SQL Statistics: Batch Requests/sec
• SQL Server: SQL Statistics: SQL Compilations/sec
• SQL Server: SQL Statistics: SQL Recompilations/sec
Ideally, the ratio of SQL Recompilations/sec to Batch Requests/sec should be very low.

let's try another ad hoc statment below

--clear plan cache first
DBCC FREEPROCCACHE

--run third ad hoc query
select * from dbo.Sales where SalesID= 4444 or CustomerID=903 and CustomerID=17541

As this ad hoc statment is complicated, so simple parameterization doesn't work this time, there is only a "ad hoc" object, No "Prepared statement" object. Then we enable the "Forced Parameterization", and redo the test

 ALTER DATABASE [dbname] set PARAMETERIZATION FORCED
--clear plan cache first
DBCC FREEPROCCACHE

--run fourth ad hoc query
select * from dbo.Sales where SalesID= 4444 or CustomerID=903 and CustomerID=17541

 you can find the "Prepared statement" object created. if you run similiar statment below, the execution plan can be reused.

select * from dbo.Sales where SalesID= 333 or CustomerID=123 and CustomerID=345

Although we can eliminate the execution plan Recompilations of the ad hoc query by enable "Forced Parameterization", every time you run the ad hoc query, there is a new "ad hoc"object created, and it eat up your memory.   there is a system parameter "optimize for ad hoc workloads" which can save the memory usage of ad hoc statement. let run the testing below:

--enable the parameter
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'optimize for ad hoc workloads', 1
GO
RECONFIGURe with override
GO

--clear plan cache first
DBCC FREEPROCCACHE

--run fifth ad hoc query
select * from dbo.Sales where SalesID= 567

then check the plan cache, there is only "Compiled Plan Stub" object instead of "Compiled Plan".
So when this option is set to 1, the Database Engine stores a small compiled plan stub in the plan cache when a batch is compiled for the first time, instead of the full compiled plan. This helps to relieve memory pressure by not allowing the plan cache to become filled with compiled plans that are not reused.

let's run the same query again
--run sixth ad hoc query
select * from dbo.Sales where SalesID= 567

this time the "Compiled Plan Stub" become to "Compiled Plan". So when the batch is invoked (compiled or executed) again, the Database Engine compiles the batch, removes the compiled plan stub from the plan cache, and adds the full compiled plan to the plan cache.

Brett Hawton has a query which can help you determin if you need to use 'optimize for ad hoc workloads'
-- Do not run this TSQL until SQL Server has been running for at least 3 hours
SET NOCOUNT ON
SELECT objtype AS [Cache Store Type],
        COUNT_BIG(*) AS [Total Num Of Plans],
        SUM(CAST(size_in_bytes as decimal(14,2))) / 1048576 AS [Total Size In MB],
        AVG(usecounts) AS [All Plans - Ave Use Count],
        SUM(CAST((CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) as decimal(14,2)))/ 1048576 AS [Size in MB of plans with a Use count = 1],
        SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) AS [Number of of plans with a Use count = 1]
        FROM sys.dm_exec_cached_plans
        GROUP BY objtype
        ORDER BY [Size in MB of plans with a Use count = 1] DESC
DECLARE @AdHocSizeInMB decimal (14,2), @TotalSizeInMB decimal (14,2)
SELECT @AdHocSizeInMB = SUM(CAST((CASE WHEN usecounts = 1 AND LOWER(objtype) = 'adhoc' THEN size_in_bytes ELSE 0 END) as decimal(14,2))) / 1048576,
        @TotalSizeInMB = SUM (CAST (size_in_bytes as decimal (14,2))) / 1048576
        FROM sys.dm_exec_cached_plans
SELECT @AdHocSizeInMB as [Current memory occupied by adhoc plans only used once (MB)],
         @TotalSizeInMB as [Total cache plan size (MB)],
         CAST((@AdHocSizeInMB / @TotalSizeInMB) * 100 as decimal(14,2)) as [% of total cache plan occupied by adhoc plans only used once]
IF  @AdHocSizeInMB > 200 or ((@AdHocSizeInMB / @TotalSizeInMB) * 100) > 25  -- 200MB or > 25%
        SELECT 'Switch on Optimize for ad hoc workloads as it will make a significant difference' as [Recommendation]
ELSE
        SELECT 'Setting Optimize for ad hoc workloads will make little difference' as [Recommendation]
GO

Saturday, May 12, 2012

Use Powershell script to verify backup files

Before restoring backup, we always verify the backup file first, and run the 3 command below:

1. RESTORE HEADERONLY
    Returns a result set containing all the backup header information for all backup sets on a particular backup device.

2. RESTORE FILELISTONLY
    Returns a result set containing a list of the database and log files contained in the backup set.

3. RESTORE VERIFYONLY
    Verifies the backup but does not restore it, and checks to see that the backup set is complete and the entire backup is readable

sometimes we also want to keep that info for refererence later. If there is only one backup file, it is easy to handle. you can verify it from SSMS, or by sqlcmd. But if there are many backup files, it will be time consuming to verify them.

Here is a little Powershell Script which can help you verify the backup files, and export the results to file. For instance:
 in the folder below, we have 2 backup device, every backup device has 3 backup files



1. Run the powershell script first
2. Run verifyBackup following by the folder path, here is the sample:
verifyBackup "C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Backup\AWD"

then you will get the output like:

you can also get the detail report for each backup set in the same folder

some notes:
1. the current version only works on 1 backup set per device.
2. there are 3 parameters for verifyBackup function
         a) $backupPath: the bakcup file path
         b) $tempPath: the temp file path for "restore verifyonly". By default , it is same as $backupPath
         c) $sqlserver: by default, it use localhost as sql instance name

Powershell script:
##################################################################################
$global:backupsets = New-Object System.Collections.Hashtable
$global:restoreHeaderOnly = New-Object System.Collections.Hashtable
$global:restoreFilelistOnly = New-Object System.Collections.Hashtable
$global:restoreScripts= New-Object System.Collections.Hashtable
function global:run-sql (
 [String]$sql,
 [String]$server,
 [String]$database="master"
)
{
 $connectionString = "Server="+$server+";Database="+$database+";Trusted_Connection=yes"
 $conn = new-object System.Data.SqlClient.SqlConnection $connectionString
 $conn.Open()
 $comm = $conn.CreateCommand()
 $comm.CommandText = $sql
 $reader = $comm.ExecuteReader()
 while($reader.Read())
 {
  $row = new-object PSObject
  for($i = 0; $i -lt $reader.FieldCount; ++$i)
  {
   add-member -inputObject $row -memberType NoteProperty -name $reader.GetName($i) -value $reader.GetValue($i)
  }
  write-output $row
 }
 $reader.Close()
 $conn.Close()
}
function global:verifyBackup(
 [String]$backupPath,
 [String]$tempPath=$backupPath,
 [String]$sqlserver="localhost"
)
{
 $global:backupsets = New-Object System.Collections.Hashtable
 $global:restoreHeaderOnly = New-Object System.Collections.Hashtable
 $global:restoreFilelistOnly = New-Object System.Collections.Hashtable
 $global:restoreVerifyOnly = New-Object System.Collections.Hashtable
 $files=gci -Path $backupPath | where{$_.Extension -match "bak|trn"} | select name | sort-object name
 #Write-Output $files


 foreach($file in $files) {
  $filepath=$backupPath+"\" + $file.Name
  $sqlcmd="restore headeronly from disk='" + $filepath+"'"
  $headinfos=global:run-sql $sqlcmd $sqlserver

  foreach ($headinfo in $headinfos)
  {
   if ($headinfo){
    $key=$headinfo.DatabaseName #+"."+$headinfo.CheckpointLSN
    #Write-Output $key
    if ($global:backupsets.ContainsKey($key))
    {
     $global:backupsets.Item($key).add($filepath)
    }
    else
    {
     $temp=New-Object System.Collections.ArrayList
     $temp.add($filepath)
     $global:backupsets.Add($key, $temp)
    }
   }
  }
 }

 # run restore headeronly
 foreach ($fileKey in $global:backupsets.keys)

 {
  $sqlcmd="restore headeronly from "
  $backupfiles=$global:backupsets.Item($fileKey)
  foreach ($backupfile in $backupfiles)
  {
   $sqlcmd=$sqlcmd+"disk='"+$backupfile+"',"
  }
  if ($sqlcmd.EndsWith(","))
  {
   $sqlcmd=$sqlcmd.TrimEnd(",")
  }


  $restoreInfo=global:run-sql $sqlcmd $sqlserver
  $global:restoreHeaderOnly.Add($fileKey, $restoreInfo)

  $csvfile=$backupPath+"\"+$fileKey+"_headeronly.csv"
  $restoreInfo | Export-Csv -Path $csvfile

 }

 #run restore filelistonly
 foreach ($fileKey in $global:backupsets.keys)
 {
  $sqlcmd="RESTORE FILELISTONLY from "
  $backupfiles=$global:backupsets.Item($fileKey)
  foreach ($backupfile in $backupfiles)
  {
   $sqlcmd=$sqlcmd+"disk='"+$backupfile+"',"
  }
  if ($sqlcmd.EndsWith(","))
  {
   $sqlcmd=$sqlcmd.TrimEnd(",")
  }



  $restoreInfo=global:run-sql $sqlcmd $sqlserver
  $global:restoreFilelistOnly.Add($fileKey, $restoreInfo)

  $csvfile=$backupPath+"\"+$fileKey+"_filelistonly.csv"
  $restoreInfo | Export-Csv -Path $csvfile

 }

 #run restore verifyonly
 foreach ($fileKey in $global:backupsets.keys)
 {
  $sqlcmd="RESTORE VERIFYONLY from "
  $backupfiles=$global:backupsets.Item($fileKey)
  foreach ($backupfile in $backupfiles)
  {
   $sqlcmd=$sqlcmd+"disk='"+$backupfile+"',"
  }

  if ($sqlcmd.EndsWith(","))
  {
   $sqlcmd=$sqlcmd.TrimEnd(",")
  }

  $sqlcmd=$sqlcmd + " with "

  $fileList=$global:restoreFilelistOnly.Item($fileKey)

  foreach ($file in $fileList)
  {
   $logcalName=$file.LogicalName
   $physicalName=$file.PhysicalName | split-path -leaf
   $sqlcmd=$sqlcmd+ " move '"+$logcalName + "' to '" + $tempPath+"\"+ $physicalName+"',"
 
  }

  if ($sqlcmd.EndsWith(","))
  {
   $sqlcmd=$sqlcmd.TrimEnd(",")
  }
  $csvfile=$backupPath+"\"+$fileKey+"_verifyonly.txt"
  $cmd="sqlcmd -E -S " + $sqlserver + " -Q `" " + $sqlcmd + " `"  -o `""+ $csvfile+"`""
  invoke-expression $cmd
  $msg="Verify backup of database [" +$fileKey+ "] is done: `n"
  $msg+= gc $csvfile
  Write-Output $msg

  Add-Content $csvfile "`n"
  Add-Content $csvfile "$sqlcmd"
 }


}

Wednesday, May 9, 2012

Backup database in CMS

Sometime you know the database name which you want to backup, but you don't know which server it is on. Normally we will
1. run query on CMS to find out which server the database is on
2. then connect to the server and run backup query.

If you backup only a database, it might be easy to do the steps upper, but if you need to backup dozen of databases, it would take you some time.

Here is a script you can use to backup databases from CMS directly, we don't need to know which server it is on:
1. connect to CMS, Open the query below
2. edit the @type and dbname session with what you need, then run it

the script can create backup for all database you input, and create backup file under last bakcup file path or default backup path.

/******************************************************************
* Backup database in database farm, you can run it on the central management server
* 1. edit the script with the database list which you want to bakcup,
* 2. edit the @type with value "Full" or "Diff"
* the backup file will be created in the latest backup path. if the database don
* Type:  T-SQL
* Version:     1.00.0000
*
* Author: James Xu
* blog: http://jamessql.blogspot.com/
*
*
'*****************************/
DECLARE @dbName varchar(200)
DECLARE @type varchar(200)
DECLARE @path varchar(200)
DECLARE @sql varchar(max)
DECLARE @curtime varchar(30)
DECLARE @rc int
--edit the backup type : 'Full' - full backup, 'Diff' - Differential backup
SET @type='Full'
--SET @type='Diff'
--edit the db name here and add more as you need
DECLARE DB_cursor CURSOR FOR
SELECT 'RS' AS DBNAME
UNION
SELECT 'aaaaa' AS DBNAME
UNION
SELECT 'DataCollector' AS DBNAME
--edit the db name here and add more as you need
SET @curtime=replace(replace(replace(convert(varchar(30), GETDATE(), 120), '-', ''), ' ', ''), ':','')
OPEN DB_cursor;
FETCH NEXT FROM DB_cursor
INTO @dbName
WHILE (@@FETCH_STATUS = 0)
BEGIN
      if (exists(select * from sys.databases where name=@dbName))
      begin
            select top 1 @path=reverse(substring(reverse(physical_device_name),charindex('\',reverse(physical_device_name)),len(physical_device_name)))
            from  msdb..backupmediafamily bmf inner join msdb..backupset bs on bmf.media_set_id=bs.media_set_id
            inner join sys.databases db on bs.database_name=db.name
            inner join sys.database_mirroring dbm on db.database_id=dbm.database_id
            where bs.database_name =@dbName and bs.type='D' and isnull(dbm.mirroring_role,1) = 1  order by bs.backup_finish_date desc
           
            if (@path is null)
            begin
    exec @rc = master.dbo.xp_instance_regread   N'HKEY_LOCAL_MACHINE', N'Software\Microsoft\MSSQLServer\MSSQLServer',N'BackupDirectory', @path output
            end
           
            if (@path is not null)
            Begin
                  set @path=@path+@dbName+'_'+@type+'_'+@curtime+'.bak'
                  set @sql='BACKUP DATABASE ['+ @dbName +'] TO DISK='''
                  if (@type='Full')
     set @sql=@sql+@path+''' WITH NOFORMAT, NOINIT,  NAME = N'''+@dbname+'-Full Database Backup'', SKIP, NOREWIND, NOUNLOAD,  STATS = 10'
      else
     set @sql=@sql+@path+''' WITH DIFFERENTIAL, NOFORMAT, NOINIT,  NAME = N'''+@dbname+'-Differential Database Backup'', SKIP, NOREWIND, NOUNLOAD,  STATS = 10'
                 
                  EXEC(@sql)
                  PRINT 'DONE - ' + @SQL
            end
            else
            begin
    PRINT 'There is no existing backup for database ' + @dbname
            end
                 
      end
      FETCH NEXT FROM DB_cursor
      INTO @dbName
END
CLOSE DB_cursor;
DEALLOCATE DB_cursor;