Sunday, October 20, 2013

is count(1) faster than count(*)?

Recently I saw a post regarding the count(1) and count(*)

the post said "
You know that doing COUNT(*) on a table with a lot of columns and a lot of rows can take a lot of time and memory
"
so use count(1) instead of count(*).  Is count(1) really faster than count(*)? Is it true? let's do testing.

first, let's setup the testing db 

--Create DB
CREATE DATABASE [test]
 CONTAINMENT = NONE
 ON  PRIMARY 
( NAME = N'test', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012\MSSQL\DATA\test.mdf' , SIZE = 512000KB , FILEGROWTH = 102400KB )
 LOG ON 
( NAME = N'test_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012\MSSQL\DATA\test_log.ldf' , SIZE = 102400KB , FILEGROWTH = 102400KB )
GO
use [test]
go


1. Heap Table

--Create Heap table
CREATE SEQUENCE dbo.mysequence
      START WITH 1
      INCREMENT BY 1
GO
select NEXT VALUE FOR dbo.mysequence AS [ID],
a.name, b.create_date,
c.type_desc into mytable 
from sys.objects a, sys.objects b, sys.objects c
GO
--Save table page status
CREATE TABLE DBCCResult (
PageFID NVARCHAR(200),
PagePID NVARCHAR(200),
IAMFID NVARCHAR(200),
IAMPID NVARCHAR(200),
ObjectID NVARCHAR(200),
IndexID NVARCHAR(200),
PartitionNumber NVARCHAR(200),
PartitionID NVARCHAR(200),
iam_chain_type NVARCHAR(200),
PageType NVARCHAR(200),
IndexLevel NVARCHAR(200),
NextPageFID NVARCHAR(200),
NextPagePID NVARCHAR(200),
PrevPageFID NVARCHAR(200),
PrevPagePID NVARCHAR(200)
)
GO

INSERT INTO DBCCResult EXEC ('DBCC IND(test,mytable,-1) ')
GO
--with DBCC IND, we can know there are total 4747 pages in table mytable

a) test count(*)
--Clean Cache
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
GO
--test count(*)
SET STATISTICS IO ON
set STATISTICS time on
set STATISTICS profile on
GO
select count(*) from mytable
GO
set STATISTICS profile off
set STATISTICS time off
set STATISTICS io off
GO
--save all buffer pool page
select * into Buffer1
from sys.dm_os_buffer_descriptors where database_id=db_id('test')
GO
--check if all data pages have been read to buffer pool
select count(*)

from Buffer1 b inner join DBCCResult d on b.page_id=d.PagePID 

below is the execution plan and IO statistics
























and it shows all 4747 pages have been read to buffer pool

b) test count(1)
--Clean Cache
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
GO
--test count(1)
SET STATISTICS IO ON
set STATISTICS time on
set STATISTICS profile on
GO
select count(1) from mytable
GO
set STATISTICS profile off
set STATISTICS time off
set STATISTICS io off
GO
--save all buffer pool page
select * into Buffer2
from sys.dm_os_buffer_descriptors where database_id=db_id('test')
GO
--check if all data pages have been read to buffer pool
select count(*)
from Buffer2 b inner join DBCCResult d on b.page_id=d.PagePID

you will see the execution plan is the same as count(*), and so is IO statistics


























and you can see 4747 pages read to buffer pool as well.

2. Clustered Index Table

created clustered index on table

ALTER TABLE dbo.mytable ADD CONSTRAINT
PK_mytable PRIMARY KEY CLUSTERED 
(
ID
)  ON [PRIMARY]

GO

and rerun the test upper, I still got the same result against count(1) and count(*), they both use clustered index scan(table scan), and read the same page into buffer pool
























3. Heap + Non-Clustered Index

create the heap table with the script of first testing, then create non-clustered index below. 
CREATE NONCLUSTERED INDEX IX_mytable_date ON dbo.mytable
(
create_date
) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO


then rerun the testing with count(1) and count*), still no difference, same execution plan and io statistics. but we see index scan this time, which is expected.

























so based on my testing, there is no performance difference between them, if you see any difference between count(1) and count(*), please let me know , I am interesting on researching it :)

Tuesday, October 1, 2013

Enable Instant File Initialization to accelerate database restore

Today my colleague come to me and ask me why her database restore query was hang. She was going to restore a database with 200GB data file and 11GB log file.

She had run the restore command for about 15 minutes, but the restore process is still in 0%, it is not the normal situation she knows. I connected to the server and checked the running query status, the "restore database" command was waiting for the "ASYNC_IO_COMPLETION"









then I checked the disk performance, the database file is still in writing, 110MB/sec






before the database restore start writing data back to the data file, the data file needs to be initialized first. if instant file initialization is enabled, this step will be skipped for data file(only data file), looks like instant file initialization is not enabled on this server.

In order to prove it, I run the script from http://blogs.msdn.com/b/sql_pfe_blog/archive/2009/12/23/how-and-why-to-enable-instant-file-initialization.aspx

DBCC TRACEON(3004,3605,-1)
GO
CREATE DATABASE TestFileZero
GO
EXEC sp_readerrorlog
GO
DROP DATABASE TestFileZero
GO
DBCC TRACEOFF(3004,3605,-1)

in the error log file, I do see the
2013-10-01 18:31:05.56 spid104     Zeroing E:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\TestFileZero.mdf from page 0 to 131232 (0x0 to 0x40140000)

so the server is not enabled  instant file initialization. the data file is 200GB, disk write speed is 100-110MB/sec, we get : the file initialization will take about 33 minutes.

I told her just waited for half hour, then you would see the progress. and finally, she told me the restore estimate time started changing after 35 minutes :)

This case proves that enabling Instant File Initialization is really important for sql performance, including database restore.