Sunday, October 20, 2013

is count(1) faster than count(*)?

Recently I saw a post regarding the count(1) and count(*)

the post said "
You know that doing COUNT(*) on a table with a lot of columns and a lot of rows can take a lot of time and memory
"
so use count(1) instead of count(*).  Is count(1) really faster than count(*)? Is it true? let's do testing.

first, let's setup the testing db 

--Create DB
CREATE DATABASE [test]
 CONTAINMENT = NONE
 ON  PRIMARY 
( NAME = N'test', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012\MSSQL\DATA\test.mdf' , SIZE = 512000KB , FILEGROWTH = 102400KB )
 LOG ON 
( NAME = N'test_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL2012\MSSQL\DATA\test_log.ldf' , SIZE = 102400KB , FILEGROWTH = 102400KB )
GO
use [test]
go


1. Heap Table

--Create Heap table
CREATE SEQUENCE dbo.mysequence
      START WITH 1
      INCREMENT BY 1
GO
select NEXT VALUE FOR dbo.mysequence AS [ID],
a.name, b.create_date,
c.type_desc into mytable 
from sys.objects a, sys.objects b, sys.objects c
GO
--Save table page status
CREATE TABLE DBCCResult (
PageFID NVARCHAR(200),
PagePID NVARCHAR(200),
IAMFID NVARCHAR(200),
IAMPID NVARCHAR(200),
ObjectID NVARCHAR(200),
IndexID NVARCHAR(200),
PartitionNumber NVARCHAR(200),
PartitionID NVARCHAR(200),
iam_chain_type NVARCHAR(200),
PageType NVARCHAR(200),
IndexLevel NVARCHAR(200),
NextPageFID NVARCHAR(200),
NextPagePID NVARCHAR(200),
PrevPageFID NVARCHAR(200),
PrevPagePID NVARCHAR(200)
)
GO

INSERT INTO DBCCResult EXEC ('DBCC IND(test,mytable,-1) ')
GO
--with DBCC IND, we can know there are total 4747 pages in table mytable

a) test count(*)
--Clean Cache
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
GO
--test count(*)
SET STATISTICS IO ON
set STATISTICS time on
set STATISTICS profile on
GO
select count(*) from mytable
GO
set STATISTICS profile off
set STATISTICS time off
set STATISTICS io off
GO
--save all buffer pool page
select * into Buffer1
from sys.dm_os_buffer_descriptors where database_id=db_id('test')
GO
--check if all data pages have been read to buffer pool
select count(*)

from Buffer1 b inner join DBCCResult d on b.page_id=d.PagePID 

below is the execution plan and IO statistics
























and it shows all 4747 pages have been read to buffer pool

b) test count(1)
--Clean Cache
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
GO
--test count(1)
SET STATISTICS IO ON
set STATISTICS time on
set STATISTICS profile on
GO
select count(1) from mytable
GO
set STATISTICS profile off
set STATISTICS time off
set STATISTICS io off
GO
--save all buffer pool page
select * into Buffer2
from sys.dm_os_buffer_descriptors where database_id=db_id('test')
GO
--check if all data pages have been read to buffer pool
select count(*)
from Buffer2 b inner join DBCCResult d on b.page_id=d.PagePID

you will see the execution plan is the same as count(*), and so is IO statistics


























and you can see 4747 pages read to buffer pool as well.

2. Clustered Index Table

created clustered index on table

ALTER TABLE dbo.mytable ADD CONSTRAINT
PK_mytable PRIMARY KEY CLUSTERED 
(
ID
)  ON [PRIMARY]

GO

and rerun the test upper, I still got the same result against count(1) and count(*), they both use clustered index scan(table scan), and read the same page into buffer pool
























3. Heap + Non-Clustered Index

create the heap table with the script of first testing, then create non-clustered index below. 
CREATE NONCLUSTERED INDEX IX_mytable_date ON dbo.mytable
(
create_date
) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO


then rerun the testing with count(1) and count*), still no difference, same execution plan and io statistics. but we see index scan this time, which is expected.

























so based on my testing, there is no performance difference between them, if you see any difference between count(1) and count(*), please let me know , I am interesting on researching it :)

3 comments:

  1. Using count(1) is faster than count(*), because it saves having to press the Shift key. ;)

    But seriously, thanks for the demonstration. I've always suspected that to be the case, but it's nice to see the confirmation!

    ReplyDelete
  2. As far as I've heard, the only difference between the two is when there is column-level security on the table. Meaning COUNT(1) does not check for any, while COUNT(*) does enforce security per column. Never tested it myself. I'd rerun the tests you did with additional security on the table and see what happens. Doubt you'd see any difference without a large number of columns. But I could see security stopping the query entirely if you don't have privileges.

    ReplyDelete
  3. COUNT(1) and COUNT(*) are both blocked if the user is denied SELECT on any column in the table.

    USE YourDatabaseName
    GO
    SET NOCOUNT ON;
    GO
    CREATE TABLE dbo.TestColSecurity (Id INT, Name VARCHAR(100));
    GO
    INSERT INTO dbo.TestColSecurity
    (Id, Name)
    VALUES (1, 'Joe');
    GO
    CREATE LOGIN TestColSecurityUser WITH PASSWORD = 'hello', CHECK_POLICY = OFF;
    GO
    CREATE USER TestColSecurityUser FROM LOGIN TestColSecurityUser;
    GO
    GRANT SELECT ON dbo.TestColSecurity TO TestColSecurityUser;
    DENY SELECT ON dbo.TestColSecurity (Name) TO TestColSecurityUser;
    GO

    EXECUTE AS USER = 'TestColSecurityUser'
    GO
    PRINT 'COUNT(*) - error'
    SELECT COUNT(*) FROM TestColSecurity
    GO
    PRINT 'SELECT Name - error'
    SELECT Name FROM TestColSecurity
    GO
    PRINT 'COUNT(1) - error'
    SELECT COUNT(1) FROM TestColSecurity
    GO
    PRINT 'SELECT Id - works, no error'
    SELECT Id FROM TestColSecurity
    GO
    REVERT
    GO

    DROP TABLE dbo.TestColSecurity;
    DROP USER TestColSecurityUser;
    DROP LOGIN TestColSecurityUser;
    GO

    ReplyDelete