Tag Archives: SQL SERVER

SQL Server Problem Tombola: Giant Reads

A SQL Server is only as fast as you tune its workload. One of the things I try to do is tune the workload, i.e. the queries executing on my SQL Server, in order to get a better performing server. I start by looking for queries that do too many reads to retrieve a small result set. A problem, however, is that when working with vendor applications you might not be able to tune the application's queries. This limits your options, but there are still a lot of things you can do to optimize your workload.

Today we will do some testing with a giant read query and tune it in different ways using real-life scenarios.

First of all, we are going to create a database with two tables in it. We will be executing the scripts on SQL Server 2016, so some syntax might have to be changed when working with lower versions.

DROP DATABASE IF EXISTS BigRead
GO

CREATE DATABASE BigRead
GO

USE BigRead
GO

CREATE TABLE dbo.BigReadTable_1
(
    BID int identity(-2147483648,1),
    TheDate datetime,
    TheCategory varchar(20),
    TheLink int
)
GO

CREATE TABLE dbo.BigReadTable_2
(
    BID_2 int identity(-2147483648,1),
    TheDate datetime,
    TheCategory varchar(20),
    TheValue decimal(18,0)
)
GO

Once we have these tables, we will insert some data into them.

-- INSERT 1000 ROWS
SET NOCOUNT ON

insert into dbo.BigReadTable_2
select DATEADD(day,ABS(Checksum(NewID()) % 10),getdate()),
       'Category '+cast(ABS(Checksum(NewID()) % 10) as varchar),
       cast((rand()*100000) as decimal(18,0))
GO 1000

insert into dbo.BigReadTable_1
select DATEADD(day,ABS(Checksum(NewID()) % 10),getdate()),
       'Category '+cast(ABS(Checksum(NewID()) % 10) as varchar),
       (select top 1 BID_2 from dbo.BigReadTable_2 order by newid())
GO 1000

insert into dbo.BigReadTable_1
select TheDate,TheCategory,TheLink from dbo.BigReadTable_1
GO 9

insert into dbo.BigReadTable_2
select [TheDate],[TheCategory],[TheValue] FROM dbo.BigReadTable_2
GO 9

After inserting the rows we will add a clustered index for each table

CREATE CLUSTERED INDEX CI_1 ON dbo.BIGREADTABLE_2
(
    BID_2
)
GO

CREATE CLUSTERED INDEX CI_2 ON dbo.BIGREADTABLE_1
(
    TheLink
)

Now we are all set up to start our testing. The query we are going to use is the following

select *
from dbo.BigReadTable_2 C
inner join
(
    select AVG(TheValue) Average, TheCategory from dbo.BigReadTable_2
    group by TheCategory
) B on C.TheCategory = B.TheCategory
inner join
(
    select SUM(TheValue) Summer, TheCategory from dbo.BigReadTable_2
    GROUP BY TheCategory
) A on A.TheCategory = B.TheCategory
inner join
(
    select CAST(AVG(CAST(TheDate AS FLOAT)) AS DATETIME) AVGDate, TheLink
    FROM dbo.BigReadTable_1
    Group By TheLink
) D on D.TheLink = C.BID_2
OPTION(RECOMPILE)

This is the plan we get with this query at this moment

And these are the client statistics

Now what are the different levels of tuning possible for this query?

  1. We are able to edit code
  2. We are not able to edit code
  3. We are not able to edit code & cannot add indexes

We are able to edit code

As we can see, this query is not the most optimal at filtering out data. In this case our tuning might be quite simple: if you can agree with your business users to return only the fields they will actually use, you will already see a performance increase. Let's say for this example we only need the data from BigReadTable_2.

-- LEAVE OUT SOME FIELDS
select B.*,A.*,D.*,C.TheCategory from dbo.BigReadTable_2 C
inner join
(
    select AVG(TheValue) Average, TheCategory from dbo.BigReadTable_2
    group by TheCategory
) B on C.TheCategory = B.TheCategory
inner join
(
    select SUM(TheValue) Summer, TheCategory from dbo.BigReadTable_2
    GROUP BY TheCategory
) A on A.TheCategory = B.TheCategory
inner join
(
    select CAST(AVG(CAST(TheDate AS FLOAT)) AS DATETIME) AVGDate, TheLink
    FROM dbo.BigReadTable_1
    Group By TheLink
) D on D.TheLink = C.BID_2
OPTION(RECOMPILE)

Here are the results for the client statistics (the plan will stay the same).

We see that we are using fewer resources than when selecting all columns; without adding additional indexes or filtering data, we have reduced the number of bytes received by 25%. This shows how important it is to select only the fields you need instead of selecting everything.

There are many different ways of altering the code, but we will not cover all of these.

We are not able to edit the code

Now things are getting a little trickier: you no longer have control over your query. Good news, though, we can still add indexes. So what can we do for our query?

In SQL Server 2016 I'm using a cool new trick which can be used even in the Express edition. The trick is adding an empty non-clustered columnstore index to the table to benefit from batch mode (for more information check out Niko's blog at http://www.nikoport.com/, or check this post which explains the trick more thoroughly: http://sqlmag.com/sql-server/what-you-need-know-about-batch-mode-window-aggregate-operator-sql-server-2016-part-1).

Let’s add 2 non-clustered columnstores

-- No Possible Index, No Code edit
DROP INDEX IF EXISTS NCCI_PERF ON dbo.BigReadTable_2
GO
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_PERF ON dbo.BigReadTable_2
(
    BID_2, TheDate, TheCategory, TheValue
)
WHERE BID_2 = 1 and BID_2 = 2
GO

DROP INDEX IF EXISTS NCCI_PERF2 ON dbo.BigReadTable_1
GO
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_PERF2 ON dbo.BigReadTable_1
(
    BID, TheDate, TheCategory, TheLink
)
WHERE BID = 1 and BID = 2
GO

Here is what we get as a result for our client statistics

We get the same plan, but we are using batch mode which means processing 1000 rows at a time instead of 1 row at a time.

By adding these 2 empty indexes we are reducing our execution time by 50%! Great result I feel for adding something empty!

No Code Edit and No Additional Indexes allowed

This is just annoying: almost all tools have been taken from you! But you can still tune the query! Mostly these queries will have the highest cost on your server, so by altering your cost threshold for parallelism you might just force those queries into parallel execution, speeding them up!

Let’s check the cost of the plan:

The plan has a cost of 15, so let’s lower the value for my cost threshold for parallelism to 14 (current value = 50) and see what result we get.
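For reference, changing that setting looks roughly like this (a sketch; the value 14 mirrors this experiment, and since it is an instance-wide setting you should test it before touching a shared server):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 14;
RECONFIGURE;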

As you can see, we lowered the total execution time of our query by 30-40% by altering a server setting! If we look at the plan,

you can see that this specific query benefits from the parallel execution! This can, however, also have some downsides!

Conclusion

When dealing with big read queries, you have a few tricks up your sleeve when working with SQL Server. In most cases you have 3 different scenarios:
1. You can change the code
2. You can’t change the code but can add indexes
3. You can’t change the code and can’t add indexes

For the first scenario things are "easy": you have multiple ways of tuning your query, and only one was shown here. For the second scenario things get more complicated: you might be able to add an index to gain a seek, but a golden bullet for big reads does not exist. The columnstore trick does help in most cases, but you will have to evaluate this for yourself. In the last scenario things are the trickiest: you can alter your server settings, but this might come at a price. In the example given today the parallel plan was beneficial for this query, but it might slow down other processes or even overload your processor, slowing everything down.

Thank you for reading & stay tuned!

 

 


SQL Server Problem Tombola: Too Many Indexes

When auditing SQL Server databases, one of the more common problems I see is that people add too many indexes to their tables. This has a significant performance impact on your SQL Server: first of all, your data modifications will be slower, because every insert/update/delete has to maintain all indexes added to that specific table. But it also has an impact on your read performance. As we all know, SQL Server is pretty fast when it comes to reading data out of memory; however, when you have a lot of indexes, the data for those indexes also needs to be stored in memory, which increases the amount of memory needed to keep the full table in memory. If the data has to be fetched from disk, we will see a performance impact. So while indexes are created to increase read performance, adding too many of them might actually hurt your read performance when your system is under memory pressure.

To show you the actual effect of having too many indexes on your table, I have created a few tests.

  1. We are going to test how fast we can insert/update/delete data with different numbers of indexes on our table (1 index, 2 indexes, 4 indexes). The indexes will be the biggest kind, i.e. covering indexes.
  2. We are going to test how much extra memory we use for each index we add.

 

Inserts, Updates and Deletes

I have created a script which is going to insert 100 000 rows into a table

CREATE TABLE IndexMania
(
    IndexManiaID int identity(1,1),
    IndexColumn1 datetime,
    IndexColumn2 char(1000),
    IndexColumn3 char(500),
    IndexColumn4 decimal(18,2),
    IndexColumn5 int
)

You can see that I added a char(1000) & char(500) field to make sure I get a good amount of data to insert. If my data inserts are too small, SQL Server will make use of the lazy writer and not show any writes for the insert statement; by increasing the size of one row & importing 100 000 rows I bypass this issue. I am also clearing the SQL Server memory before every statement.
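Clearing the caches between statements can be done with something like the following (only on a test box, of course):

CHECKPOINT;              -- flush dirty pages to disk first
DBCC DROPCLEANBUFFERS;   -- empty the buffer pool
DBCC FREEPROCCACHE;      -- clear cached plans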

The indexes I created are the following

-- CLUSTERED INDEX ON ID
CREATE UNIQUE CLUSTERED INDEX [CI_IndexMania1] ON [dbo].[IndexMania]
(
    [IndexManiaID] ASC
)

-- CREATE COVERING DATE INDEX
CREATE NONCLUSTERED INDEX [NCI_1] ON [dbo].[IndexMania]
(
    [IndexColumn1] ASC
)
INCLUDE ([IndexManiaID],
    [IndexColumn2],
    [IndexColumn3],
    [IndexColumn4],
    [IndexColumn5])
GO

-- 2 Additional NCI
CREATE NONCLUSTERED INDEX [NCI_2] ON [dbo].[IndexMania]
(
    [IndexColumn2] ASC,
    [IndexColumn3] ASC
)
INCLUDE ([IndexManiaID],
    [IndexColumn1],
    [IndexColumn4],
    [IndexColumn5])

CREATE NONCLUSTERED INDEX [NCI_3] ON [dbo].[IndexMania]
(
    [IndexColumn4] ASC,
    [IndexColumn3] ASC
)
INCLUDE ([IndexManiaID],
    [IndexColumn1],
    [IndexColumn2],
    [IndexColumn5])

 

The results

HEAP TABLE

Amount of writes (about the same for delete, insert and update): total_logical_writes = 20020

Memory usage:

| Object     | Type       | Index | Index_Type | buffer_pages | buffer_mb |
|------------|------------|-------|------------|--------------|-----------|
| IndexMania | USER_TABLE |       | HEAP       | 20001        | 156       |

Time statistics:

INSERT
SQL Server Execution Times: CPU time = 688 ms, elapsed time = 1299 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 19 ms.

UPDATE
SQL Server Execution Times: CPU time = 407 ms, elapsed time = 623 ms.
SQL Server parse and compile time: CPU time = 15 ms, elapsed time = 20 ms.

DELETE
SQL Server Execution Times: CPU time = 468 ms, elapsed time = 1129 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

Clustered Index Table

Amount of writes (about the same for delete, insert and update): total_logical_writes = 20040

Memory usage:

| Object     | Type       | Index          | Index_Type | buffer_pages | buffer_mb |
|------------|------------|----------------|------------|--------------|-----------|
| IndexMania | USER_TABLE | CI_IndexMania1 | CLUSTERED  | 20082        | 156       |

Time statistics:

INSERT
SQL Server Execution Times: CPU time = 1171 ms, elapsed time = 2745 ms.
SQL Server parse and compile time: CPU time = 16 ms, elapsed time = 25 ms.

UPDATE
SQL Server Execution Times: CPU time = 297 ms, elapsed time = 550 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 20 ms.

DELETE
SQL Server Execution Times: CPU time = 734 ms, elapsed time = 3201 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

Clustered Index & 1 Covering Index

Amount of writes (about the same for delete, insert and update): total_logical_writes = 40694

Memory usage:

| Object     | Type       | Index          | Index_Type   | buffer_pages | buffer_mb |
|------------|------------|----------------|--------------|--------------|-----------|
| IndexMania | USER_TABLE | NCI_1          | NONCLUSTERED | 20131        | 157       |
| IndexMania | USER_TABLE | CI_IndexMania1 | CLUSTERED    | 20083        | 156       |
| TOTAL      |            |                |              | 40214        | 313       |

Time statistics:

INSERT
SQL Server Execution Times: CPU time = 1953 ms, elapsed time = 4575 ms.
SQL Server parse and compile time: CPU time = 16 ms, elapsed time = 32 ms.

UPDATE
SQL Server Execution Times: CPU time = 2375 ms, elapsed time = 5780 ms.
SQL Server parse and compile time: CPU time = 16 ms, elapsed time = 23 ms.

DELETE
SQL Server Execution Times: CPU time = 1718 ms, elapsed time = 7761 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

Clustered Index & 3 Covering Indexes

Amount of writes:

| Insert writes | Update writes | Delete writes |
|---------------|---------------|---------------|
| 124154        | 182542        | 125677        |

Memory usage:

| Object     | Type       | Index          | Index_Type   | buffer_pages | buffer_mb |
|------------|------------|----------------|--------------|--------------|-----------|
| IndexMania | USER_TABLE | NCI_2          | NONCLUSTERED | 40275        | 314       |
| IndexMania | USER_TABLE | NCI_3          | NONCLUSTERED | 23331        | 182       |
| IndexMania | USER_TABLE | NCI_1          | NONCLUSTERED | 20145        | 157       |
| IndexMania | USER_TABLE | CI_IndexMania1 | CLUSTERED    | 20083        | 156       |
| TOTAL      |            |                |              | 103834       | 809       |

Time statistics:

INSERT
SQL Server Execution Times: CPU time = 42468 ms, elapsed time = 47094 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 21 ms.

UPDATE
SQL Server Execution Times: CPU time = 108063 ms, elapsed time = 125766 ms.
SQL Server parse and compile time: CPU time = 10 ms, elapsed time = 10 ms.

DELETE
SQL Server Execution Times: CPU time = 54922 ms, elapsed time = 73162 ms.
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

Conclusion

As seen in our test, we can conclude that each added index has an impact on our write performance & on the amount of data stored in memory to facilitate the index.

Amount of writes

When inserting data into a heap table we needed about 20020 writes to insert 100 000 records, and there was almost no difference in the amount of writes when writing to a clustered index. When adding a covering index the amount of writes doubled, which is expected because we keep the data twice. When adding another two covering indexes we did about 4 times the original amount of writes. There are some discrepancies in the write counts, but this is because some of the queries had to spill to tempdb, which gave a larger amount of writes as the number of indexes increased. After multiple tests we can see that we need about 4 times the amount of writes purely for the indexes (delete, update & insert).
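The write numbers above can be reproduced from the plan cache with a query along these lines (a sketch; the LIKE filter on our test table is just an example):

SELECT TOP (10)
       qs.total_logical_writes,
       qs.execution_count,
       st.text AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%IndexMania%'   -- filter on our test table (adjust as needed)
ORDER BY qs.total_logical_writes DESC;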

Time consumed

When inserting data we saw that inserting into a heap table took on average about 1 second; adding a clustered index increased this to an average of 2-3 seconds. Adding 1 covering index increased it to 4-6 seconds, and with two additional covering indexes the time exploded to a whopping 45-50 seconds for the same workload.

Memory Used by object

After inserting 100 000 rows into the heap table we saw that we use 156 MB of memory to facilitate this table; when adding the clustered index this amount stays exactly the same. When adding 1 covering index this doubled, which means we are now keeping 2 copies of the data in memory. When adding the 2 additional covering indexes we see that we are keeping the same data 4 times in memory.
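If you want to check this on your own system, the buffer pages per index can be pulled from the buffer pool DMVs with a sketch like this (run it in the test database):

SELECT o.name         AS [Object],
       i.name         AS [Index],
       i.type_desc    AS [Index_Type],
       COUNT(*)       AS buffer_pages,
       COUNT(*) / 128 AS buffer_mb       -- 128 pages of 8 KB = 1 MB
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au ON au.allocation_unit_id = bd.allocation_unit_id
JOIN sys.partitions AS p ON p.hobt_id = au.container_id
JOIN sys.indexes AS i ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.objects AS o ON o.object_id = p.object_id
WHERE bd.database_id = DB_ID()
GROUP BY o.name, i.name, i.type_desc
ORDER BY buffer_pages DESC;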

General Conclusion

We should take care when adding additional indexes: this does not only have an impact on our insert/update/delete performance, it also impacts the amount of memory used for that specific table. We want to be able to serve all items out of our cache, which takes nanoseconds instead of the milliseconds needed to retrieve data from disk, so we have to keep in mind that every index we add might take additional space in the cache, pushing other objects out of the cache & slowing them down.

In today's example we were using covering indexes, which essentially say "duplicate my data". Keep in mind, however, that this post is not a plea to stop using indexes, not even covering indexes. Indexes will help your performance tremendously if used in the correct way. As a general recommendation I would say: try to keep your indexes as small as possible & only include the fields you need, to reduce the overhead of your index.

Thank you for reading and stay tuned!

 

 

 


SQL Server Problem Tombola: MAXDOP 1

As you all may know, many vendors force you to set the MAXDOP of your SQL Server to 1 (SAP, SharePoint, AX). This is by design for these applications, because another MAXDOP setting might interfere with the otherwise "performant" workload of the application. Today I would like to show you what the impact of setting your MAXDOP to 1 actually is on your server.

To start off I would like to give you a small introduction to what MAXDOP actually does. The MAXDOP setting limits the number of cores used by one query on your SQL Server instance (on SQL Server 2016 you can even do this at database level, thank god!). For example, if you set your MAXDOP to 2, a query gets the option to create a parallel plan that uses 2 cores. The MAXDOP setting works together with another instance-level setting: cost threshold for parallelism (default 5). The CTFP is the value at which your instance starts evaluating whether a parallel plan can be used, given a MAXDOP of 2/4/8/.... For example, if your CTFP is set to 60 and your query plan only has a cost of 10, your query will not be evaluated to run in parallel; if another query has a cost of 80, the query optimizer will check whether it is possible to run that query in parallel (depending on your MAXDOP setting).
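For reference, a minimal sketch of how you change both settings, instance-wide via sp_configure or MAXDOP per database on SQL Server 2016 (the database name and values are just examples):

-- instance level
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 2;
EXEC sp_configure 'cost threshold for parallelism', 6;
RECONFIGURE;

-- database level (SQL Server 2016 and up)
USE [SomeDatabase];   -- hypothetical database
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 2;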

To show you the impact on your system, I am going to run an OLTP workload & an OLAP workload against a virtual machine running on my laptop. The OLTP workload will do updates, inserts and deletes; the OLAP workload will run some read-intensive queries against the system. The OLTP queries will be small queries with a low cost; the OLAP queries will do more work and calculations, which gives them a higher cost.

The virtual machine will be configured with 4 cores

To run the workload I will be using SQLQueryStress, which can be found at https://github.com/ErikEJ/SqlQueryStress. It will use 1 thread and do 100 iterations for each of the two workloads.

I will be running all my tests against in-memory tables to reduce any disk overhead and purely look at the CPU usage.
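To give you an idea of the setup, a memory-optimized table is created roughly like this (a sketch with made-up names, not the exact workload tables I used):

-- add a memory-optimized filegroup and container (path/names are placeholders)
ALTER DATABASE WorkloadDB ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE WorkloadDB ADD FILE (NAME = 'imoltp_data', FILENAME = 'D:\Data\imoltp_data') TO FILEGROUP imoltp_fg;
GO
-- a simple memory-optimized table for the OLTP workload
CREATE TABLE dbo.OltpTest
(
    OltpTestID int IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
    SomeValue decimal(18,2),
    SomeDate datetime2
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);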

I will be running 5 tests with the following setup.

  1. MAXDOP 1, OLAP & OLTP together
  2. MAXDOP 2, COST THRESHOLD FOR PARALLELISM 1, OLAP & OLTP together
  3. MAXDOP 4, COST THRESHOLD FOR PARALLELISM 1, OLAP & OLTP together
  4. MAXDOP 2, COST THRESHOLD FOR PARALLELISM 6, OLAP & OLTP together
  5. MAXDOP 4, COST THRESHOLD FOR PARALLELISM 6, OLAP & OLTP together

I have altered my CTFP setting to 6 because then only my OLAP queries will be eligible for a parallel plan since they have a plan cost of 7.

Here are the results

| Workload | MAXDOP | Cost Threshold | CPU/iteration | Seconds/iteration | Elapsed time | Client seconds | Logical reads |
|----------|--------|----------------|---------------|-------------------|--------------|----------------|---------------|
| OLAP     | 1      | NONE           | 0,6654        | 1,0881            | 22,305       | 0,2113         | 21,4595       |
| OLTP     | 1      | NONE           | 0,2831        | 0,5072            | 27,17        | 0,2522         | NONE          |
| OLAP     | 2      | 1              | 0,6595        | 0,4862            | 25,263       | 0,2301         | 19,4194       |
| OLTP     | 2      | 1              | 0,2898        | 0,2495            | 19,1247      | 0,1479         | NONE          |
| OLAP     | 4      | 1              | 0,5921        | 0,4178            | 17,8526      | 0,1658         | 25,3333       |
| OLTP     | 4      | 1              | 0,1635        | 0,1648            | 13,1542      | 0,1046         | NONE          |
| OLAP     | 2      | 6              | 0,4739        | 0,3581            | 17,9613      | 0,1618         | 31,7011       |
| OLTP     | 2      | 6              | 0,1929        | 0,1869            | 12,7859      | 0,102          | NONE          |
| OLAP     | 4      | 6              | 0,4576        | 0,3861            | 21,2969      | 0,1918         | 32,5532       |
| OLTP     | 4      | 6              | 0,3335        | 0,3118            | 19,9687      | 0,174          | NONE          |

 

When looking at the full elapsed time, the following configurations were the fastest.

Shortest OLTP run: 12,7859 seconds (MAXDOP 2, cost threshold for parallelism 6)
Shortest OLAP run: 17,8526 seconds (MAXDOP 4, cost threshold for parallelism 1)
Shortest full run: 30,7472 seconds (MAXDOP 2, cost threshold for parallelism 6)

 

When looking at the CPU used for every iteration, the following configurations used the least amount of CPU.

Smallest OLTP CPU/iteration: 0,1929 (MAXDOP 2, cost threshold for parallelism 6)
Smallest OLAP CPU/iteration: 0,4739 (MAXDOP 2, cost threshold for parallelism 6)
Smallest full CPU/iteration: 0,6668 (MAXDOP 2, cost threshold for parallelism 6)

 

When looking at the number of seconds used by the client for every iteration, the following tests took the least amount of time.

Shortest OLTP client seconds/iteration: 0,1020 (MAXDOP 2, cost threshold for parallelism 6)
Shortest OLAP client seconds/iteration: 0,1618 (MAXDOP 2, cost threshold for parallelism 6)
Shortest full client seconds/iteration: 0,2638 (MAXDOP 2, cost threshold for parallelism 6)

 

When looking at the full elapsed time, the following configurations were the slowest.

Slowest OLTP run: 27,1700 seconds (MAXDOP 1, cost threshold for parallelism: none)
Slowest OLAP run: 25,2630 seconds (MAXDOP 2, cost threshold for parallelism 1)
Slowest full run: 49,4750 seconds (MAXDOP 1, cost threshold for parallelism: none)

 

When looking at the CPU usage for every iteration, the following configurations used the most CPU.

Largest OLTP CPU/iteration: 0,3335 (MAXDOP 4, cost threshold for parallelism 6)
Largest OLAP CPU/iteration: 0,6654 (MAXDOP 1, cost threshold for parallelism: none)
Largest full CPU/iteration: 0,9485 (MAXDOP 1, cost threshold for parallelism: none)

 

When looking at the number of seconds used by the client for every iteration, the following tests were the slowest.

Slowest OLTP client seconds/iteration: 0,2522 (MAXDOP 1, cost threshold for parallelism: none)
Slowest OLAP client seconds/iteration: 0,2301 (MAXDOP 2, cost threshold for parallelism 1)
Slowest full client seconds/iteration: 0,4635 (MAXDOP 1, cost threshold for parallelism: none)

 

Conclusion OLTP & OLAP

 

As we can see in the results, the MAXDOP 1 setting is overall the slowest in comparison to all other tests. This means that in most cases you will limit the throughput of your SQL Server if you set your MAXDOP to 1.

The fastest configuration for my laptop has been proven to be MAXDOP 2 with COST THRESHOLD FOR PARALLELISM 6 for this workload with this hardware configuration. This is however not an exact science; this has to be tested for every system.

The MAXDOP 1 run was the slowest because we are limiting our read-intensive workload to one core for every query it executes. This increases the time for every read query, which makes the whole workload the slowest.

The fastest configuration is the one where we allow the read-intensive workload to use 2 cores and limit the write/update/delete workload to 1 core. Because we are running both workloads together with 1 thread each, the read-intensive workload uses 2 cores and leaves the 2 other cores available for the OLTP workload, which gives an ideal throughput for this workload on this hardware.

What with Data warehouses?

Our data warehouse will primarily have a read-intensive, i.e. OLAP, workload, with one or more daily loads (inserts/updates/deletes). This means we should aim for the most throughput on our read queries. I have been to some clients who set the MAXDOP of their data warehouse to 1; this has a negative effect on the speed with which the queries execute. I will show you that now by running just the OLAP workload, without the OLTP workload interfering.

For this I will do 3 tests

  1. 100 Iterations 1 thread OLAP workload MAXDOP 1
  2. 100 Iterations 1 thread OLAP workload MAXDOP 2
  3. 100 Iterations 1 thread OLAP workload MAXDOP 4

These are the results on my laptop

| Workload | MAXDOP | CPU/iteration | Seconds/iteration | Elapsed time |
|----------|--------|---------------|-------------------|--------------|
| OLAP     | 1      | 0,5297        | 0,5353            | 27,5624      |
| OLAP     | 2      | 0,3703        | 0,2058            | 10,9843      |
| OLAP     | 4      | 0,3117        | 0,1769            | 9,3593       |

 

As you can see in the table above, the fastest way of doing read queries is giving the workload as many cores as possible, to get the largest number of reads done in the least amount of time.

MAXDOP 1 is the slowest run, with MAXDOP 2 being 17 seconds faster. There is not a lot of difference between the MAXDOP 4 and MAXDOP 2 runs because of hardware limitations.

Conclusion OLAP

Because of the read-oriented workload of a data warehouse, we should not set MAXDOP to 1. As seen in the test above, the more cores you make available, the more read throughput you get, which makes the queries in your DWH go faster. In these tests we were 60% faster when using multiple cores for our workload instead of one core.

What with pure write workload?

When you are just writing data (update/insert/delete), I would also like to know which MAXDOP setting is best. To test this, I have created 3 test scenarios.

  1. 100 Iterations 1 thread OLTP workload MAXDOP 1
  2. 100 Iterations 1 thread OLTP workload MAXDOP 2
  3. 100 Iterations 1 thread OLTP workload MAXDOP 4

These are the results on my laptop

| MAXDOP | CPU seconds | Actual seconds | Elapsed time | Client seconds |
|--------|-------------|----------------|--------------|----------------|
| 1      | 0,0013      | 0,0048         | 1,1576       | 0,0467         |
| 2      | 0,0037      | 0,0183         | 1,3125       | 0,0568         |
| 4      | 0,006       | 0,0305         | 1,2988       | 0,0579         |

 

Conclusion OLTP

When inserting/updating/deleting data we are fastest when running these queries in a serial way. When running these processes in parallel we increase the amount of CPU needed for the query, which increases the elapsed time. When writing to in-memory tables, as we do in this test, we see that in CPU time writing data is 50% faster with MAXDOP 1. This means that going serial is faster than going parallel for this kind of workload.

Overall Conclusion

We should use caution when choosing a MAXDOP setting; we can see that the two different workloads have different needs. The read-intensive workload will in most cases benefit from parallel execution, while the write-intensive workload will benefit from serial execution. Because most systems are neither read-only nor write-only, I generally think the way to go is to filter out the read-intensive, high-cost plans and allow those plans to use parallel execution, thus limiting the number of queries executed in parallel without removing the option altogether. This way we do not hold back our OLTP queries and we can still speed up the queries that really benefit from parallel execution. This means you should always use the MAXDOP setting in combination with your cost threshold for parallelism setting. There is, however, no golden bullet for these settings; they will always need to be tested against the workload of your system.

Thank you for reading and stay tuned!


Index maintenance on a VLDB: How to tame the beast!

When working with large data volumes, normal day-to-day maintenance becomes a more difficult task. In a previous post (https://swyssql.wordpress.com/2015/08/06/dbcc-how-to-check-the-uncheckable-dbcc-on-vldb/) I talked about how to handle DBCC CheckDB on a multi-terabyte database. Today I will tackle another issue I came across while dealing with multi-terabyte databases: index maintenance.

Like most of you, my absolute favorite solution when it comes to index maintenance is the Ola Hallengren solution. It is a perfectly written and solid maintenance solution, but I noticed that on VLDBs it will not get past the fragmentation-checking stage of your indexes.

So in order to tame the VLDB beast, I wrote my own script for handling index maintenance on a VLDB. The script can be found at https://www.dropbox.com/s/mmhlxjx7fhln2ky/IndexColStatMaintenance.sql?dl=0.

Now first things first: How does the script work?

When using other maintenance solutions, your maintenance starts all over every day, even if it did not reach all indexes during its maintenance window. To cope with this, the VLDB maintenance script creates a few tables which keep track of the progress of your index maintenance (checks & rebuilds/reorganizes), so it does not start over every day. By doing this we are able to check all indexes & rebuild all indexes; when the whole cycle has been done, it starts over again. What we actually do is divide the full index maintenance workload over a few days.

Another issue I was having is that when you rebuild an index which is larger than your TLOG, the TLOG will grow & fill up your disk, with all the problems that come with it. So the script has a built-in check which will skip the index if not enough space is available at that moment or in total. The skipped index is then reported in the reporting table, so that you can check which indexes are not getting rebuilt and try to rebuild them in another way.
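The space check itself is nothing fancy; roughly speaking it compares the size of the index with the free space in the transaction log, which you can get like this (a sketch, not the exact code of the script):

-- free space in the current database's transaction log (SQL Server 2012 and up)
SELECT total_log_size_in_bytes / 1024 / 1024 AS log_size_mb,
       (total_log_size_in_bytes - used_log_space_in_bytes) / 1024 / 1024 AS log_free_mb
FROM sys.dm_db_log_space_usage;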

The parameters

As with any script, I have defined some parameters which you have to fill in to create the index maintenance.

SET @ManagementDatabase = 'master'  --> Database in which objects have to be created

This will be the database which you will use to create the objects needed for the script to work.

SET @Database = 'WideWorldImporters'  --> Database which will be maintained

This will be the database on which the maintenance will be done.

SET @AmountOfMemory = 30  --> Percentage of memory to be used for online rebuild operations (0-100) (Enterprise only)

This is the amount of memory available for online rebuilds, to ensure you don't use all your memory for one rebuild.

SET @ColumnStatisticsMaintenance = 'Yes'  --> Do you want to maintain column statistics? (Yes/No)

If you also want to maintain the column statistics, you set this to Yes.

SET @MaxRunTime = 300  --> Number of minutes the maintenance can run each night (300 = 5 hours)

The number of minutes your script can run every night; after this amount of time no new statements will be launched.

SET @JobStartHour = 1  --> e.g. 1: the job will start at 01:00 (possible options 1 2 3 4 5 6 7 8 9 10 11 12 13 ... 24)

The start hour of your job; this is the hour at which the jobs created by the script will be scheduled.

SET @LockTimeout = 360  --> Set the lock timeout in seconds

The amount of time in seconds a lock can be held by the index rebuild/reorganize before the operation is cancelled.

SET @WithOnline = 'YES'  --> Rebuild indexes with the ONLINE rebuild option (YES or NO)

This gives you the choice whether or not to use online rebuilds (with Enterprise edition I would suggest YES, but I had one case where it was NO).

Objects that will be created

Tables

The ColumnStatistics table will contain the workload which will be done for all column statistics; the script uses this table to decide whether or not it should resample the statistics.

The IndexFragTable will contain all information the script needs to rebuild or reorganize your indexes & where it has to continue after the last run.

The IndexRebuildColumnStatLoggingTable is the most important table for you, because it will contain all information on failed rebuilds, error messages, and indexes which are not being checked.

Stored Procedures

The dbo.IndexStatsOptimize is the main stored procedure which will be executed to handle your index maintenance.

The dbo.UpdateIndexStatsTable will update your table, add new indexes & reset the list after a full cycle has been run.

Job


This job will execute the stored procedures. If you need to make alterations to your schedule, this is the place to go; and if you want to change your parameters you can do it here or by altering the dbo.IndexStatsOptimizer stored procedure.

Availability Groups

The script has been made AG-aware; the only thing you have to do is execute it on both nodes.

I made a mistake

If you made a mistake with your parameters, just re-execute the script; it will recreate all objects. If you made an error with the name of your database, you have to manually remove all the objects listed above.

I have had a lot of table changes in my database.

The script will check daily for new tables and indexes and add them to the list of indexes to be checked and rebuilt; it will also look for deleted tables and remove them from the list.

If you have any questions feel free to ask.

Thanks for reading and enjoy the script!

Stay tuned!


Distributed Availability Groups: Availability groupception!

In the all-new SQL Server 2016 we now have a feature which reminds me of the movie Inception with Leo DiCaprio & Tom Hardy. There you had dreams in dreams in dreams; for availability groups we now have an AG in an AG. This makes distributed availability groups a dream come true for geographically separated availability groups. Today's post will show you how to set up a distributed availability group and what the options are.

I started by setting up 3 nodes in a Hyper-V environment on my laptop and installing SQL Server 2016 Developer Edition on each of them. After these installations I added them to my local domain SQLSTIJN and installed a failover cluster with the 3 nodes.

Now I will start off with setting up my primary availability group on the SS2016 & SS2016P nodes.

CREATE AVAILABILITY GROUP [AG_Ception]
FOR DATABASE [AGTest]
REPLICA ON
N'SS2016\SS2016'
WITH (
    ENDPOINT_URL = N'TCP://SS2016.SQLSTIJN.COM:5022',
    FAILOVER_MODE = AUTOMATIC,
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    BACKUP_PRIORITY = 50,
    SECONDARY_ROLE(ALLOW_CONNECTIONS = NO),
    SEEDING_MODE = AUTOMATIC),
N'SS2016P\SS2016P'
WITH (
    ENDPOINT_URL = N'TCP://SS2016P.SQLSTIJN.COM:5022',
    FAILOVER_MODE = AUTOMATIC,
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    BACKUP_PRIORITY = 50,
    SECONDARY_ROLE(ALLOW_CONNECTIONS = NO),
    SEEDING_MODE = AUTOMATIC);

Because we are using automatic seeding, we still have to join the availability group on the secondary and grant it the right to create a database on the secondary replica. In my case I connect to the SS2016P node and execute the following commands.

ALTER AVAILABILITY GROUP [AG_Ception] JOIN;
ALTER AVAILABILITY GROUP [AG_Ception] GRANT CREATE ANY DATABASE;
GO

Now the database should be created on the secondary node and start synchronizing using automatic seeding (so easy without the backup/restore...). In case of errors, check your error log and make sure your folders are the same and exist, e.g. if you have d:\data on the primary and you have named the folder on the secondary s:\dataa, it will not work. So make sure all folders and drives exist and have the same name.

After you have executed all the commands, you should see something like this. This has been our normal setup so far; apart from the automatic seeding there is no difference from the SQL Server 2014 version.

Now we are going to create a listener for the first availability group.

USE [master]
GO
ALTER AVAILABILITY GROUP [AG_Ception]
ADD LISTENER N'AGCeption1'
(
WITH IP ((N'33.0.0.90', N'255.255.255.0')), PORT = 65000);
GO

After this we are going to create a single-node secondary availability group for the SS2016DR node. We are not going to specify a database this time, because we will again use the automatic seeding functionality of SQL Server 2016. We will, however, create it with SEEDING_MODE = AUTOMATIC to ensure it can seed after we join it through a distributed AG. IMPORTANT: use the same mirroring endpoint, otherwise after a local failover you will not be able to replicate.

USE [master]
GO
CREATE AVAILABILITY GROUP [AG_Ception2]
FOR REPLICA ON N'SS2016DR\SS2016DR'
WITH (
    ENDPOINT_URL = N'TCP://SS2016DR.SQLSTIJN.COM:5022',
    FAILOVER_MODE = MANUAL,
    AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
    BACKUP_PRIORITY = 50,
    PRIMARY_ROLE(ALLOW_CONNECTIONS = ALL),
    SECONDARY_ROLE(ALLOW_CONNECTIONS = NO),
    SEEDING_MODE = AUTOMATIC);
GO

Now we create a listener for the second AG_Ception2

USE [master]
GO
ALTER AVAILABILITY GROUP [AG_Ception2]
ADD LISTENER N'AGCeption2'
(
WITH IP ((N'33.0.0.91', N'255.255.255.0')), PORT = 65000);
GO

When we have configured both availability groups, it is finally time to add our distributed availability group. We will execute this on our current primary node SS2016\SS2016.

CREATE AVAILABILITY GROUP [DISTRICEPTION]
WITH (DISTRIBUTED)
AVAILABILITY GROUP ON
'AG_CEPTION'
WITH (
    LISTENER_URL = 'TCP://AGCeption1:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC
),
'AG_CEPTION2'
WITH (
    LISTENER_URL = 'TCP://AGCeption2:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC
);
GO

Notice that I use the endpoint port and not the listener port; this is needed, otherwise you will get errors.

Now I'm going to join the secondary availability group; I will execute this on SS2016DR\SS2016DR (the secondary AG).

Before doing this, check your endpoints; apparently mine was not created. Execute the following on the secondary node if you don't have an endpoint.

USE [master]
GO
CREATE ENDPOINT [Hadr_endpoint]
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
    FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE, ENCRYPTION = REQUIRED ALGORITHM AES)
GO

After this you can join your secondary availability group to the Distributed AG

ALTER AVAILABILITY GROUP [Distriception]
JOIN
AVAILABILITY GROUP ON
'AG_CEPTION'
WITH (
    LISTENER_URL = 'TCP://AGCeption1:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC
),
'AG_CEPTION2'
WITH (
    LISTENER_URL = 'TCP://AGCeption2:5022',
    AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
    FAILOVER_MODE = MANUAL,
    SEEDING_MODE = AUTOMATIC);
GO

After this you can check the error log and the databases on the secondary AG and see that automatic seeding has started; you now have a database in sync on the secondary AG.

If you see this, then you have successfully set up your Distributed Availability group.

One major downside I find in the current version is that you cannot create a listener on the distributed availability group & there is no synchronous mode, which means that the whole failover process is manual and will need some editing of DNS.

This is however a very promising feature and I hope to see it evolve!

Thanks for reading and stay tuned!


Query Store: Here is the bacon!

Whenever I go to a burger restaurant and want to order myself a nice and tasty burger, the first thing I do is look at the menu and ask myself the question: "Where is the bacon?". I did the same thing while looking at the menu of the new up-and-coming SQL Server 2016, and found the nice tasty bacon in the form of a feature called Query Store!

While looking at the menu of problems with poorly performing servers, one of the most common problems is vegetable-like code, or vegetarian indexes; bacon-like indexes & code are mostly nowhere to be found. If you want to troubleshoot this, you have to analyze the system for a long time, write a lot of code to capture query plans & how they change, and analyze the impact you make by rewriting the code to bacon code & adding steak indexes. Something else that happens a lot: when you visit a client they have just rebooted the system, and almost all the juice for troubleshooting your SQL Server performance is gone.

They fixed this in the new SQL Server 2016 with the Query Store! Analyzing your SQL Server query performance has become much easier, and query data is now persisted even when someone chooses to reboot your SQL Server instance!

How did they do this?

Well, the people at Microsoft decided to deliver an option for every SQL Server 2016 database called Query Store. When you enable QS on your database, SQL Server will capture data for every query which is executed against that database. I'll continue by showing you how to turn on Query Store; after that I'll show you how you can track a query with it!

Let's start off by creating a new database.

CREATE DATABASE [QueryStoreDatabase]
CONTAINMENT = NONE
ON PRIMARY
( NAME = N'QueryStoreDatabase',
  FILENAME = N'F:\DATA\MSSQL\Data\QueryStoreDatabase.mdf',
  SIZE = 131072KB, FILEGROWTH = 65536KB )
LOG ON
( NAME = N'QueryStoreDatabase_log',
  FILENAME = N'F:\DATA\MSSQL\Data\QueryStoreDatabase_log.ldf',
  SIZE = 65536KB, FILEGROWTH = 65536KB )
GO

As you execute this code, nothing special happens, you just create a basic database

If you right click your database and select properties, you can see a new tab there called Query Store

By default, SQL Server turns Query Store off. To enable it, you have to choose Read Write; this allows SQL Server to write query data into your Query Store. If you choose Read, you can only analyze the data; SQL Server will not capture new query data.

We continue by turning on Read Write. After this you can configure some other options for your QS: you can alter the interval at which it flushes data and the interval at which it aggregates your data. I will set this to 5 minutes for this blog post, to be able to show data after 5 minutes. I will leave the Query Store capture mode on All and leave size-based cleanup mode off. By default QS will keep 367 days of query data.

You can also give a max size; this is the amount of space in your database that will be made available to QS, which means that the data is stored inside your database files. In the left of the two graphs at the bottom you can see your database size and the size of your Query Store against it; in the right graph you can see the available space of Query Store and the amount used.
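If you prefer T-SQL over the GUI, the same settings can be applied roughly like this (option values match the ones chosen above; the max size is just an example):

ALTER DATABASE [QueryStoreDatabase]
SET QUERY_STORE = ON
(
    OPERATION_MODE = READ_WRITE,
    DATA_FLUSH_INTERVAL_SECONDS = 300,   -- flush Query Store data to disk every 5 minutes
    INTERVAL_LENGTH_MINUTES = 5,         -- statistics aggregation interval
    MAX_STORAGE_SIZE_MB = 100,           -- example max size
    QUERY_CAPTURE_MODE = ALL,
    SIZE_BASED_CLEANUP_MODE = OFF,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 367)
);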

Click OK and then continue by entering this code, this will create a table and put some data into it.

DROP TABLE IF EXISTS [dbo].[QueryStore] -- Cool New Syntax for IF EXISTS
GO

CREATE TABLE dbo.QueryStore
(
    QSID int identity(-2147483648,1),
    SomeText varchar(200),
    SomeCategory varchar(40)
)
GO

SET NOCOUNT ON;
DECLARE @Count int
SET @Count = 1
while @Count <= 10000
BEGIN
    IF @Count <= 2500
    BEGIN
        INSERT INTO dbo.QueryStore
        VALUES ('TEXT'+cast(@Count as varchar),'2500')
    END
    IF @Count > 2500 and @count <= 5000
    BEGIN
        INSERT INTO dbo.QueryStore
        VALUES ('TEXT'+cast(@Count as varchar),'5000')
    END
    IF @COUNT > 5000
    BEGIN
        INSERT INTO dbo.QueryStore
        VALUES ('TEXT'+cast(@Count as varchar),'10000')
    END
    SET @Count = @Count + 1
END
GO

INSERT INTO dbo.QueryStore
select SomeText,SomeCategory from dbo.QueryStore
GO 9

CREATE CLUSTERED INDEX [CI_ID] ON [dbo].[QueryStore]
(
    [QSID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

We just created a database with a table called QueryStore. Now let’s query that table and track it to see if we can improve its performance.

Execute the following query

USE [QueryStoreDatabase]
select SomeText from dbo.QueryStore where SomeCategory = '5000'
GO

And continue by checking your query store data

select query_id, query_sql_text from sys.query_store_query a
inner join sys.query_store_query_text b on a.query_text_id = b.query_text_id
where b.query_sql_text = '(@1 varchar(8000))SELECT [SomeText] FROM [dbo].[QueryStore] WHERE [SomeCategory]=@1'

Use the query_id to open the Tracked Queries report of Query Store; this is a new tab at database level.

Here you can see our query, which has a plan ID and a duration. We will now continue by adding an additional index to make the query run faster.

CREATE NONCLUSTERED COLUMNSTORE INDEX [NCI_ColumnStore] ON [dbo].[QueryStore]
(
    [QSID],
    [SomeText],
    [SomeCategory]
) WITH (DROP_EXISTING = OFF)
GO

After adding the index rerun the following query

select SomeText from dbo.QueryStore where SomeCategory = '5000'

And go back to the tracked queries!

As you can see, the columnstore version was a lot faster! This is now a basic toolset you can use in your SQL Server: you can easily track changes you make to a query and immediately see the impact.

This is a feature which has got me really excited for SQL Server 2016; this is a real game changer for query tuning!

Thank you all for reading & stay tuned!


Migration Gone Wrong! The Fallback Scenario

Today I'll be handling an issue I had with the migration of a SQL Server 2000 database to SQL Server 2012. About a week ago we migrated the database away from the old and grumpy, and more importantly unsupported, Windows Server 2003 machine with its SQL Server 2005 instance. All was well and we had no immediate issues when users started working with the application. The migration was seamless and we did not expect any future problems; the upgrade advisor did not give us any errors either!

But then lightning struck this morning: somebody who had been on holiday came back to work and started using the application on the migrated database. Apparently this user was a bit different from the other users who tested. This guy used the application to its full capacity, unravelling a waterfall of syntax errors & DB errors. The impact was uncontrollable, so there was only one option: we had to fall back to the old & abandoned SQL Server 2000 instance.

Now here starts the tricky part: as users had been using the migrated database in production for several days, they had uploaded a lot of data which had to be present when we fell back to the SQL Server 2000 instance. But as you all know, once a database is on SQL Server 2012, even in a lower compatibility mode, it will not restore on a version lower than the running version! It is also not possible to detach and attach the database, as changes have happened at system table level. Due to these restrictions there is only one option left: scripting out the database and exporting the data to the old SQL Server 2000 database.

I will now continue on explaining how I executed the fall back scenario.

First we will generate the scripts off the database

We choose to script out the entire database & all database objects! Watch out for triggers: these are not scripted out, you will have to add them manually later!

Click next to create the scripts and copy them to clipboard/file/new query window whatever has your preference

 

And click next, it will now script out your database.

 

Execute the script on the destination server; here it was a SQL Server 2005 instance. You might get some errors for adding members & altering the compatibility mode, depending on the SQL Server version you are working with.

You can alter those statements to statements that work on your version. Also, you might want to set your database owner to sa at this point.

Now that we have our database on our SQL Server 2005, we want to pump the data into it from the new old database.

Go to the database you want to export the data from and click export data.

Enter the information of your source database

Afterwards insert the information of the destination database

 

Then choose to copy data from table

In the next window select all your tables, and then check table by table by clicking edit mappings

 

Don't forget to enable identity insert on the tables!
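In the wizard this is a checkbox under Edit Mappings; in plain T-SQL the equivalent per table would be something like this (the table name is just an example):

SET IDENTITY_INSERT dbo.SomeTable ON;   -- allow explicit values in the identity column
-- ... load the rows for this table ...
SET IDENTITY_INSERT dbo.SomeTable OFF;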

Also, if you have foreign keys, don't forget to disable them when loading your data between the instances; you can easily do that with this command.

-- Disable all the constraints in the database
EXEC sp_msforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT all'

When you have done all the steps above you should be able to export your data to your SQL server 2000 database.

 

After you did the export don’t forget to re-enable your foreign keys

-- Enable all the constraints in the database
EXEC sp_msforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT all'

 

In this way we just kind of restored a SQL Server 2012 database on a SQL Server 2005 instance! So the fallback was successful: all data was present and the business was happy. Now we can start testing the application again for compatibility with SQL Server 2012, which will prove to be a long and tough process!

But the application is working again to its full extent; unfortunately it is on a SQL Server 2005 instance with a compatibility mode of SQL Server 2000 on Windows Server 2003 (unsupportedception).

 

Hope you enjoyed reading this blog and stay tuned!


THE SQL SERVER MISCONFIGURATION CHRONICLES: PART 3 Database Files

In the previous part of the SQL Server Misconfiguration Chronicles we handled the default instance memory configuration (https://swyssql.wordpress.com/2015/12/17/the-sql-server-misconfiguration-chronicles-part-2-instance-memory-settings/)

In the third part of the misconfiguration chronicles we will handle the database files. If you create a database, SQL Server will look at your model database and create the file structure as specified there. By default this means you get 1 data file with a size of 4 MB and 1 log file with a size of 1 MB. These files are able to grow automatically through auto growth, where the data file grows in steps of 1 MB and your log file increases by 10% with each growth.

Why is this a bad thing?

In this episode of the misconfiguration chronicles this is actually a multi-part question.

  1. Why is the data file configuration not optimal?

It all starts with another question: how big is your database going to be? If you are going to have a database which will only contain 7 MB of data, this data file setup will work for you, but when you have to scale your database this will not be a good configuration. Let's say you are going to have a database into which 4 GB of data will initially be loaded, and which might grow to 8 GB within the next year. In this situation the default data file setup is not optimal. If you leave the default settings on, your database will grow automatically about 4088 times just for your initial load, causing file fragmentation! This will make your IO calls a lot slower, because you have to access your data in a fragmented file. (For more information on fragmented files and how to solve them, the following post gives a perfect explanation of the problem and how to fix it: https://www.mssqltips.com/sqlservertip/3008/solving-sql-server-database-physical-file-fragmentation/.)
This fragmentation will definitely impact your performance and cause your SQL Server to perform slower, or even throw errors due to file system limitations. You can prevent this issue by sizing your database correctly from the start: for example, you can give your initial database file a size of 4096 MB with an auto growth setting of 1024 MB. This will minimize the fragmentation of your file.
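As a sketch, pre-sizing an existing database along those lines could look like this (database and logical file names are placeholders; check yours in sys.database_files):

ALTER DATABASE [YourDatabase]
MODIFY FILE (NAME = N'YourDatabase_Data', SIZE = 4096MB, FILEGROWTH = 1024MB);

ALTER DATABASE [YourDatabase]
MODIFY FILE (NAME = N'YourDatabase_Log', SIZE = 1024MB, FILEGROWTH = 512MB);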
Also important to remember is that SQL Server can use instant file initialization for its data files. This has to be set at the OS level: you have to grant the SQL Server service account the right to Perform Volume Maintenance Tasks. Why is instant file initialization a good thing? IFI will "reserve" the database file space of your auto growth without filling the space with zeros. This keeps your file available during an auto growth (without IFI your transactions become suspended while the file is growing).

Another thing you can do is use multiple files, but this is to enhance performance and give you flexibility in scaling your database. For more information on this I suggest you read Paul Randal's post on the performance impact of multiple data files (http://www.sqlskills.com/blogs/paul/benchmarking-do-multiple-data-files-make-a-difference/).

  2. Why is the log file configuration not optimal?

For physical file fragmentation, the explanation for the data file configuration also applies here; only the parts about multiple files and instant file initialization do not apply. As for multiple log files: SQL Server will only use one transaction log file. If you add another transaction log file, it will not be used unless the first file has a max size and/or is full. This is because SQL Server writes to the transaction log sequentially, transaction after transaction; it cannot spread this workload over different files.

The first difference with the data file setup is that the default here is an auto growth of 10%. This makes your file grow in very small portions at first, but later on 10% can become a very big growth. 10% of 1 MB is not that much and will cause heavy fragmentation if you have a large transaction running on your SQL Server. It is better to set an auto growth in MB according to the workload your SQL Server has to handle. I usually set the auto growth on log or data files to 64 MB, 128 MB, 256 MB, 512 MB or 1024 MB, depending on the size of the file and the workload.

The second and most important difference with data files is that log files use something called virtual log files, aka VLFs. When your log file grows, SQL Server adds additional virtual log files. (For more information on how many virtual log files will be created & more background on VLFs, you can read these great posts: http://www.sqlskills.com/blogs/kimberly/transaction-log-vlfs-too-many-or-too-few/ & http://www.sqlskills.com/blogs/paul/important-change-vlf-creation-algorithm-sql-server-2014/.)

Having too many VLFs will impact your transaction log backup & restore process; these will take longer and longer depending on the number of VLFs in your transaction log file. In extreme cases this might even impact your insert/update/delete performance (as explained on Brent Ozar's website: http://www.brentozar.com/blitz/high-virtual-log-file-vlf-count/).
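A quick way to see how many VLFs a log currently has is DBCC LOGINFO, which returns one row per VLF; on SQL Server 2016 SP2 and later there is also a DMV for it:

-- one row per virtual log file of the current database
DBCC LOGINFO;

-- SQL Server 2016 SP2+ alternative
SELECT COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(DB_ID());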

How do you solve this?

There is not a straight answer for this one, because it is a multi-part question and it all depends. We can say that the default settings will have to be altered 99% of the time. The configuration you have to choose always depends on what your size is and will be, and on what your workload will be, so unfortunately I cannot give you one answer here; it depends on factors which change the answer.

Thank you all for reading and happy reconfiguring!

Stay Tuned!


THE SQL SERVER MISCONFIGURATION CHRONICLES: PART 2 Instance Memory Settings

In the first part of the SQL Server Misconfiguration Chronicles we handled the database level issue of AutoClose & AutoShrink (https://swyssql.wordpress.com/2015/12/08/the-sql-server-misconfiguration-chronicles-part-1-database-autoclose-autoshrink/)

The second part of the misconfiguration chronicles handles the default memory settings of a freshly installed SQL Server instance. By default SQL Server configures the instance's maximum server memory setting to 2147483647 MB, which is about 2 petabytes.

Why is this a bad thing?

Let's say you have configured your SQL Server memory to be able to take 2 petabytes of data, but your server only has 64 GB of memory available. This means that SQL Server can take all the memory that exists on the server. But what happens to the other processes on your server? As you know, you also have to run an OS on your server, and maybe SSIS & SSAS & god knows what else. If SQL Server takes all the memory, it will be constantly fighting with the other processes for the free memory of the server, which results in memory pressure and, with that, high CPU consumption, because data has to be swapped between memory and disk constantly. The result is a badly performing server.

How do you solve this?

That answer is quite straightforward: lower the default memory setting of your SQL Server to a value which will not take all the memory of your server. For example, if you have a server with 12 GB of RAM, I would start off by giving 8 GB of RAM to SQL Server, leaving 4 GB for the OS & other processes. This will result in better performance of your server.

The following code allows you to alter the default max memory setting of your SQL Server.

----------------------------------------------------------------------
----------------------------------------------------------------------
-- Created By Stijn Wynants
-- SQL MaxMemoryFixer
-- Description: Execute Script to PRINT ALTER ServerMemory
--              Statements for SQL Server Instance
----------------------------------------------------------------------
----------------------------------------------------------------------
DECLARE @ServerMemory int
DECLARE @SM nvarchar(200)
DECLARE @CMD varchar(max)
DECLARE @MemoryForOS int
DECLARE @MemoryOtherProcesses int
----------------------------------------------------------------------
----------------------------------------------------------------------
-- SET YOUR MEMORY FOR SQL SERVER HERE !!!!
SET @ServerMemory = 14000         --> In MB
SET @MemoryForOS = 4096           --> In MB (Recommended minimum of 2048 MB, optimal 4096 MB; with more memory (64 GB, 256 GB, ...) leave about 10%)
SET @MemoryOtherProcesses = 0     --> In MB (If there are no other processes just set to 0)
----------------------------------------------------------------------
----------------------------------------------------------------------
SET @SM = cast(@ServerMemory as nvarchar)
DECLARE @FullSystemMemory as int
SET @FullSystemMemory = (select top 1 total_physical_memory_kb/1024 from sys.dm_os_sys_memory)
IF @ServerMemory <= (@FullSystemMemory - @MemoryForOS - @MemoryOtherProcesses)
BEGIN
SET @CMD = '
EXEC sp_configure ''show advanced option'', ''1'';
GO
RECONFIGURE
GO
EXEC sys.sp_configure N''max server memory (MB)'', '''+@SM+''';
GO
RECONFIGURE WITH OVERRIDE
GO
EXEC sp_configure ''show advanced option'', ''0'';
GO
RECONFIGURE
GO
'
END
ELSE
BEGIN
SET @CMD = 'There is only '+cast((@FullSystemMemory - 2048) as varchar)+' MB available on this server, you have tried to choose more memory than the server can deliver. Pick a lower value than '+@SM+' MB'
END
PRINT @CMD

Thank you all for reading and happy reconfiguring!

The next part will be on Data File Misconfiguration!

(https://swyssql.wordpress.com/2015/12/24/the-sql-server-misconfiguration-chronicles-part-3-database-files/)

Stay Tuned!


The SQL Server Misconfiguration Chronicles: Part 1 Database AutoClose & AutoShrink

In the SQL Server consulting world you see a lot of different environments, but mostly you see the same basic misconfigurations of the SQL Server environment.

The following blog posts will show you the most common issues with the configuration of SQL Server; they will tell you what the issue is, why it is a bad configuration, and how you can solve it.

The first post will be on two commonly misconfigured database options: AutoClose & AutoShrink.

What does Autoclose do to my database?

Autoclose will close your database when the last user who uses the database is disconnected.

Why is that a bad thing?

Every time a user connects to a database which is closed, it has to be opened again. This degrades performance because the connecting user has to wait until the database is open again before being able to access the data.

How do we solve this issue?

You can execute the following script, which will generate statements to set auto close to false for all databases on the instance.

----------------------------------------------------------------------
----------------------------------------------------------------------
--        Created By Stijn Wynants
--        SQL AutoClose Fixer
--        Description: Execute Script to PRINT ALTER STATEMENTS
--                     auto_close to false on all databases
----------------------------------------------------------------------
----------------------------------------------------------------------
DECLARE @CMD varchar(max)
DECLARE @Database varchar(200)
DECLARE AutoCloseFixer CURSOR
FOR
select name from sys.databases where database_id > 4 and is_auto_close_on = 1
OPEN AutoCloseFixer
FETCH NEXT FROM AutoCloseFixer INTO @Database
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        PRINT
'------------> DisableAutoClose on '+@Database+'
USE [master];
GO
ALTER DATABASE ['+@Database+'] SET AUTO_CLOSE OFF WITH NO_WAIT;
PRINT ''Disabled AutoClose on '+@Database+'''
'
    END TRY
    BEGIN CATCH
        SELECT ERROR_MESSAGE()
    END CATCH
    FETCH NEXT FROM AutoCloseFixer INTO @Database
END
CLOSE AutoCloseFixer
DEALLOCATE AutoCloseFixer

What does AutoShrink do to my Database?

It will remove the unused space from your database file and thus make your database file smaller.

Why is that a bad thing?

Because the auto shrink process removes all free space from your database file, there is no space left for new data. So the moment new data gets loaded, your database file has to grow again (this is handled by auto growth, which will be discussed in the next post). It will grow larger than the space you actually needed, which leaves some free space; when the auto shrink feature kicks in again, it removes that space again. This constant shrink-grow-shrink-grow cycle causes file fragmentation (at the system level) and uses a lot of resources. The shrink operation itself also uses a lot of resources, because it moves pages to other places in your database file, which takes CPU and IO & generates transaction log. But last but not least, when auto shrink services your database it will not only shrink it: because it moves the pages around, it has a devastating side effect. It will completely fragment all your database indexes, which will make your performance plummet.

How do we solve this issue?

You can execute the following script, which will generate statements to set auto shrink to false for all databases on the instance.

----------------------------------------------------------------------
----------------------------------------------------------------------
--        Created By Stijn Wynants
--        SQL AutoShrink Fixer
--        Description: Execute Script to PRINT ALTER STATEMENTS
--                     auto_shrink to false on all databases
----------------------------------------------------------------------
----------------------------------------------------------------------
DECLARE @CMD varchar(max)
DECLARE @Database varchar(200)
DECLARE AutoShrinkFixer CURSOR
FOR
select name from sys.databases where database_id > 4 and is_auto_shrink_on = 1
OPEN AutoShrinkFixer
FETCH NEXT FROM AutoShrinkFixer INTO @Database
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        PRINT
'------------> DisableAutoShrink on '+@Database+'
USE [master];
GO
ALTER DATABASE ['+@Database+'] SET AUTO_SHRINK OFF WITH NO_WAIT;
PRINT ''Disabled AutoShrink on '+@Database+'''
'
    END TRY
    BEGIN CATCH
        SELECT ERROR_MESSAGE()
    END CATCH
    FETCH NEXT FROM AutoShrinkFixer INTO @Database
END
CLOSE AutoShrinkFixer
DEALLOCATE AutoShrinkFixer

Thank you all for reading and happy reconfiguring!

The next part will be on default memory settings misconfiguration!
(https://swyssql.wordpress.com/2015/12/17/the-sql-server-misconfiguration-chronicles-part-2-instance-memory-settings/)

Stay Tuned!
