How is Fill Factor Impacting My Indexes?

The theme for this month’s T-SQL Tuesday is indexes, so it seemed like the perfect excuse to blog about a script I wrote to see what the fill factor chosen for an index actually does to the structure of that index. I have to give special thanks to Brent Ozar (Blog|Twitter) for taking the time to review the query and offer his thoughts on it. I have to admit that I was nervous to publish the script because I have not seen anything else like it and figured there must have been a reason for that.

For those that are unfamiliar, fill factor is an optional parameter that can be specified when adding or rebuilding an index. Specifying a fill factor tells SQL Server to leave a certain percentage of each data page open for future inserts in order to lessen the likelihood of page splits. A page split is what happens when SQL Server tries to add another row to a data page that does not have room for it. Most page splits involve taking half the rows on the page and putting them on a newly allocated page somewhere else in your data file, leaving sufficient room for the new row to be added to either page. If you are lucky enough that the row you are adding would be the last row on the page, then the existing page is left as is and the new row is added to the newly allocated page. Regardless of how the page splits, the new page is almost never anywhere near the other pages of the index it belongs to. That scattering of index pages means the disk heads have to move around a lot more, leading to poor read performance.

Now that we have talked about the problems that fill factor can help with, we should talk about the dark side. Yes, the dark side. Setting the fill factor to anything other than the default decreases the rows per page for that index, thereby increasing the number of pages that must be read. According to Books Online, the read performance penalty is twice the percentage of empty space the fill factor leaves on each page. This means that setting the fill factor to 50% will lead to twice as many reads to get the same data. Even a more reasonable number like 90% carries a 20% read penalty.

By now it should be clear that choosing the right fill factor for your indexes is one of the more important steps in creating an index, right behind picking the right key columns. The problem is knowing how to pick a good number, and here is where it gets tough because, like everything else: It Depends and It Changes. My method of setting fill factors is to calculate the rows per page of an index, then use the expected change in rows between reindex operations to figure out what percentage of rows needs to be left free per page. The exception to this process is an index on an ever-increasing value, like an identity column, where the fill factor is automatically 100.
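That method can be sketched as a quick back-of-the-napkin calculation. The numbers below are placeholders, and the math assumes inserts are spread evenly across the index, which is rarely exactly true:

```sql
DECLARE @avg_record_size_in_bytes    int,
        @row_count                   bigint,
        @rows_added_between_rebuilds bigint,
        @rows_per_page               int,
        @page_count                  bigint

-- Placeholder inputs; take real values from the query later in this post
SELECT  @avg_record_size_in_bytes    = 100,
        @row_count                   = 1000000,
        @rows_added_between_rebuilds = 100000

-- Rows that fit on a full page (2 bytes of row overhead per row)
SELECT  @rows_per_page = 8096 / (@avg_record_size_in_bytes + 2)
SELECT  @page_count    = CEILING(1.0 * @row_count / @rows_per_page)

-- Leave enough free rows per page to absorb the expected inserts
SELECT  100 - CEILING(100.0 * @rows_added_between_rebuilds / @page_count / @rows_per_page) AS suggested_fill_factor
```

With these inputs the math works out to a fill factor of 90; an index on an ever-increasing key would skip the calculation entirely and use 100.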

My process works very well for the “It Depends” part of setting a fill factor but completely ignores the “It Changes” part. Over time as tables get larger, the fill factor setting on a table needs to be adjusted down. I have also run into servers where the default fill factor has been set to a value other than 0 (same as 100%), creating a need to quickly identify indexes that could perform better. What I needed was a simple query that I could run that would very quickly give me an idea of where I can adjust fill factors to improve performance.

Here is that query:

SELECT      OBJECT_NAME(ips.object_id) AS table_name,
            ips.index_type_desc,
            ISNULL(i.name, ips.index_type_desc) AS index_name,
            ISNULL(REPLACE(RTRIM((  SELECT      c.name + CASE WHEN c.is_identity = 1 THEN ' (IDENTITY)' ELSE '' END + CASE WHEN ic.is_descending_key = 0 THEN '  ' ELSE ' DESC  ' END
                                    FROM        sys.index_columns ic
                                                    INNER JOIN sys.columns c
                                                          ON ic.object_id = c.object_id
                                                                AND ic.column_id = c.column_id
                                    WHERE       ic.object_id = ips.object_id
                                                          AND ic.index_id = ips.index_id
                                                                AND ic.is_included_column = 0
                                    ORDER BY    ic.key_ordinal
                                    FOR XML PATH(''))), '  ', ', '), ips.index_type_desc)  AS index_keys,
            ips.record_count,
            (ips.page_count / 128.0) AS space_used_in_MB,
            ips.avg_page_space_used_in_percent,
            CASE WHEN i.fill_factor = 0 THEN 100 ELSE i.fill_factor END AS fill_factor,
            8096 / (ips.max_record_size_in_bytes + 2.00) AS min_rows_per_page,
            8096 / (ips.avg_record_size_in_bytes + 2.00) AS avg_rows_per_page,
            8096 / (ips.min_record_size_in_bytes + 2.00) AS max_rows_per_page,
            8096 * ((100 - (CASE WHEN i.fill_factor = 0 THEN 100.00 ELSE i.fill_factor END)) / 100.00) / (ips.avg_record_size_in_bytes + 2.0000) AS defined_free_rows_per_page,
            8096 * ((100 - ips.avg_page_space_used_in_percent) / 100.00) / (ips.avg_record_size_in_bytes + 2) AS actual_free_rows_per_page,
            reads = ISNULL(ius.user_seeks, 0) + ISNULL(ius.user_scans, 0) + ISNULL(ius.user_lookups, 0),
            writes =  ISNULL(ius.user_updates, 0),
            1.00 * (ISNULL(ius.user_seeks, 0) + ISNULL(ius.user_scans, 0) + ISNULL(ius.user_lookups, 0)) / ISNULL(CASE WHEN ius.user_updates > 0 THEN ius.user_updates END, 1) AS reads_per_write
FROM        sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ips
            INNER JOIN sys.indexes i
                ON ips.object_id = i.object_id
                    AND ips.index_id = i.index_id
            LEFT OUTER JOIN sys.dm_db_index_usage_stats ius
                ON ius.database_id = DB_ID()
                    AND ips.object_id = ius.object_id
                        AND ips.index_id = ius.index_id
WHERE       ips.alloc_unit_type_desc != 'LOB_DATA'
ORDER BY    ips.index_type_desc,
            OBJECT_NAME(ips.object_id),
            (ips.page_count / 128.0)

The query should be very familiar to anyone that has looked at index fragmentation in SQL 2005 or newer. The same rules apply; the only difference is the columns being used. For larger databases, consider limiting the scan to a single table or even a single index. It is also a good idea to ignore smaller tables here. I leave it up to whoever runs the script to define a small table: for some that will be 100 pages, for others 500, but anything over 1,000 pages should probably be looked at.
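For example, to scope the same DMV call to a single table instead of the whole database, fill in the object_id parameter rather than passing NULL, and filter out small indexes while you are at it (the table name and page threshold here are placeholders):

```sql
SELECT  OBJECT_NAME(ips.object_id) AS table_name,
        ips.index_id,
        ips.avg_page_space_used_in_percent,
        (ips.page_count / 128.0) AS space_used_in_MB
FROM    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.MyBigTable'), NULL, NULL, 'SAMPLED') ips -- placeholder table
WHERE   ips.alloc_unit_type_desc != 'LOB_DATA'
            AND ips.page_count > 1000 -- pick your own definition of "small"
```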

The size calculations used in the query are based on the formulas found here: http://msdn.microsoft.com/en-us/library/ms178085(SQL.90).aspx, although the math is quite simple because the DMV accounts for things like null bitmaps and row version information.

I assume that everyone will come up with slightly different ways to use the query. I like to make 2 passes over the data, the first in the morning and the second after the end of the business day. My first pass through the results looks for indexes that have too small a fill factor. They are easy to find because their free rows per page numbers are less than 1. A value less than 1 means the fill factor needs to be changed, either to allow some free rows per page or to be more honest about the actual number of free rows per page. My second pass looks at the change over the course of the day. The best way to do the comparison is to paste both result sets into Excel and use formulas to look for differences. The second pass shows the indexes that have their fill factor set either too high or too low. The idea is to focus just as much on the indexes that show significant changes as on those that show no changes at all.

So there it is, a query to tell how good the current fill factor settings are.

To make sure that all users stay as happy as possible it is best to run the query the first time during an off-peak time so that impact can be safely gauged.

Please let me know if you run into any issues, have any ideas that would make this script better or just want to share how you are using it. As always scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.

A Stored Procedure to Move SSIS Packages Between Servers

Today’s post is one that I have been debating on whether to publish for a while. The purpose of the stored procedure I am sharing is to move SSIS packages stored via SQL Server storage from one SQL 2005 server to another in a way that can easily be invoked by any release management system that can call stored procedures. The part I have reservations about is that it uses linked servers. I almost never allow linked servers to be created on the servers I manage, mostly because they can be a security problem. Breaking the rules in this case was what was right for the particular problems I was trying to solve. Please consider whether you can implement this logic another way before using this stored procedure in your environment.

This stored procedure is not terribly complicated, so I will run through what it does fairly quickly. The first step is to get the folder_id of the destination folder for the package we want to copy. If more than one row comes back, an error is thrown because the procedure does not know which package to move. If the folder_id returned is null, an error is thrown because the destination folder does not exist. If the stored procedure makes it through those checks, the current version at the destination is deleted and the new version is copied over.

Here is the code:

CREATE PROCEDURE dbo.move_ssis_package @from_server_name varchar(256), @to_server_name varchar(256), @package_name sysname

AS

DECLARE @sql_command nvarchar(4000),
        @folder_id   uniqueidentifier

SELECT @sql_command = 'SELECT @folder_id = pf2.[folderid]
FROM [' + @from_server_name + '].[msdb].[dbo].[sysdtspackagefolders90] pf
        INNER JOIN [' + @from_server_name + '].[msdb].[dbo].[sysdtspackages90] p
            ON pf.folderid = p.folderid
        LEFT OUTER JOIN [' + @to_server_name + '].[msdb].[dbo].[sysdtspackagefolders90] pf2
            ON pf.[foldername] = pf2.[foldername]
WHERE p.name = @package_name'

EXEC sp_executesql @sql_command, N'@package_name sysname, @folder_id uniqueidentifier OUTPUT', @package_name = @package_name, @folder_id = @folder_id OUTPUT

IF @@ROWCOUNT > 1
 BEGIN
    RAISERROR ('This package exists in more than one location.', 16, 1)
    RETURN
 END

IF @folder_id IS NULL
 BEGIN
    RAISERROR ('SSIS Folder does not exist.', 16, 1)
    RETURN
 END

SELECT @sql_command = 'DELETE [' + @to_server_name + '].[msdb].[dbo].[sysdtspackages90]
WHERE name = @package_name'

EXEC sp_executesql @sql_command, N'@package_name sysname', @package_name = @package_name

SELECT @sql_command = 'INSERT [' + @to_server_name + '].[msdb].[dbo].[sysdtspackages90]
SELECT [name]
      ,[id]
      ,[description]
      ,[createdate]
      ,@folder_id AS [folderid]
      ,[ownersid]
      ,[packagedata]
      ,[packageformat]
      ,[packagetype]
      ,[vermajor]
      ,[verminor]
      ,[verbuild]
      ,[vercomments]
      ,[verid]
      ,[isencrypted]
      ,[readrolesid]
      ,[writerolesid]
FROM [' + @from_server_name + '].[msdb].[dbo].[sysdtspackages90]
WHERE name = @package_name'

EXEC sp_executesql @sql_command, N'@package_name sysname, @folder_id uniqueidentifier', @package_name = @package_name, @folder_id = @folder_id

Please let me know if you run into any issues, have any ideas that would make this stored procedure better or just want to share how you are using it. As always scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.

Great News! SSMS Tools Pack 1.9 is Coming Out!

Mladen Prajdic (Blog|Twitter) recently announced that the newest version of SSMS Tools Pack is coming out and I am excited.

Why am I excited?

Well, I am glad you asked.

I am excited because it will allow me to define my window colors in SQL Management Studio using regular expressions rather than having to define them each individually.

Why is that such a big deal?

I have 100s of servers and I am constantly adding new servers while decommissioning old ones. The sheer number of changes that would have to be made manually has always kept me from being able to take advantage of window coloring. Rather than have 100s of rules, I now have less than 10 regular expressions that cover all of my servers. Here, check it out:

SSMS Tools Pack Connection Coloring Options Window
Naming convention changed to protect the employed.

So, is coloring windows all the SSMS Tools Pack does?

No, not at all. The SSMS Tools Pack is a large suite of plug-ins for SQL Management Studio available as a free download. There are several features that I cannot live without. My favorite feature is that it can be configured to keep a journal of all queries you have run. This can be especially useful if you work in an environment where a sys admin can push updates that cause your machine to reboot whenever they feel like it. If you are not familiar with all of the features currently in the product then please go check out the list here.

If you have not tried out the SSMS Tools Pack then I highly suggest you give version 1.9 a whirl. I know I will.

How Can I Quickly Script Out Replication?

I recently finished taking down a bunch of servers that I was using to scale out my environment by forcing read only connections off of my main read-write servers. To make a long story short, hardware advances and the additional diagnostic information in SQL 2005 allowed me to consolidate to a few very powerful, reasonably well-tuned read-write servers. The consolidation of servers allowed me to save a ton of power and cooling along with some rack space and a good size chunk of SAN disk.

Taking down the servers means that I now have to update all of my environment diagrams, server configuration scripts and even a spreadsheet or two. Anyone who has ever done this before is cringing right now. One of the worst tasks is updating the replication scripts. I script my replication settings to a network share just in case I do something silly and need to revert to my last known good setup. The scripts can really save my bacon, but they are incredibly tedious to create. I have to go into Management Studio, right-click on each publication, select generate script, select script to file, then finally find the existing file for that database to add to or decide there is not one and start a new file. With the number of scripts I had to create, it would have easily taken 4, make that 8, hours with interruptions to get everything scripted.

Given that the whole process would have taken hours and probably would have gotten screwed up along the way, I decided to turn to PowerShell. Unfortunately, I did not have a script ready to go….WHHAAAT?…yeah I know..I don’t have a script for everything..so I threw the question out to Twitter. Aaron Nelson (Blog|Twitter) came back right away, pointing me toward SQL PowerShell Extensions (SQLPSX), and very quickly I had a working script. If you are not familiar with SQLPSX, please take some time to check it out. It really makes coding PowerShell for SQL Server fast. More importantly, if you are not part of the SQL community on Twitter, then get there first.

The actual script is not terribly complex. It takes a distribution server name and an output directory as parameters, then works through all of the publications on each of the servers that connect to the distribution server, scripting them out.

I have only run this script against a dedicated distribution server, but it should also work where the publisher is its own distributor.

I spent about 4 hours throwing the script together and generated all of the scripts I needed in a little over 1 minute.

With that, here is the script:

Update: Chad Miller (Blog|Twitter) showed how this script could take better advantage of the features of SQLPSX. His version of the script is available here: http://sev17.com/2010/08/quickly-script-out-replication-redux/

param ([string]$sqlServer, [string]$outputDirectory, [bool]$scriptPerPublication)

if ($sqlServer -eq "")
{
    $sqlserver = Read-Host -Prompt "Please provide a value for -sqlServer"
}

if ($outputDirectory -eq "")
{
    $outputDirectory = Read-Host -Prompt "Please provide a value for -outputDirectory"
}

function ScriptPublications
{
    param ([string]$sqlServer, [string] $outputDirectory, [bool] $scriptPerPublication)
   
    Import-Module Repl
   
    [string] $path =  "$outputDirectory\$((get-date).toString('yyyy-MMM-dd_HHmmss'))"
   
    New-Item $path -ItemType Directory | Out-Null
   
    foreach($publication in Get-ReplPublication $sqlServer)
    {
        [string] $fileName = "{0}\{1}.sql" -f $path,$publication.DatabaseName.Replace(" ", "")
        if($scriptPerPublication)
        {
            $fileName = "{0}\{1}_{2}.sql" -f $path,$publication.DatabaseName.Replace(" ", ""),$publication.Name.Replace(" ", "")
        }
        [string] $progressText = "Scripting {0} to {1}" -f $publication.Name.Replace(" ", ""),$fileName
        Write-Output $progressText
        $publication.Script([Microsoft.SqlServer.Replication.scriptoptions]::Creation `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludeArticles `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludePublisherSideSubscriptions `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludeCreateSnapshotAgent `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludeGo `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::EnableReplicationDB `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludePublicationAccesses `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludeCreateLogreaderAgent `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludeCreateQueuereaderAgent `
            -bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludeSubscriberSideSubscriptions) | Out-File $fileName -Append
    }
}

[Microsoft.SqlServer.Management.Common.ServerConnection] $serverConnection = new-object Microsoft.SqlServer.Management.Common.ServerConnection($sqlServer)
[Microsoft.SqlServer.Replication.ReplicationServer] $distributor = New-Object Microsoft.SqlServer.Replication.ReplicationServer($serverConnection);

foreach($distributionPublisher in $distributor.DistributionPublishers)
{
    if($distributionPublisher.PublisherType -eq "MSSQLSERVER")
    {
        [string] $path = $outputDirectory + "from_" + $distributionPublisher.Name.Replace("\", "_")
        ScriptPublications -sqlServer $distributionPublisher.Name -outputDirectory $path -scriptPerPublication $false
    }
}
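Invoking it is nothing fancy. The script file name below is made up, so use whatever you saved it as, and note that the trailing backslash on the output directory matters because the script concatenates paths:

```powershell
.\Script-Replication.ps1 -sqlServer 'DISTSQL01' -outputDirectory 'C:\ReplScripts\' -scriptPerPublication $false
```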

As usual, I hope you find this script helpful. Please let me know if you run into any issues with it or know a better way to do the same thing. Please keep in mind that scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.

What is a Good Way to Run CheckDB on a VLDB?

Introduction

Today’s script is one that I wrote based on the logic outlined in this post by Paul Randal (Blog|Twitter). This script is written for SQL 2000 but, as Paul notes, the logic will work on SQL 2005.

The Script

This stored procedure stays pretty true to the logic outlined in Paul’s post, so I will just cover the differences here. The first thing to notice is that the parameters passed into the procedure are the days of the week on which the different portions of the check should run, the maximum run time in minutes, and whether or not to print debug messages. The stored procedure then parses the input strings and runs DBCC CHECKALLOC and DBCC CHECKCATALOG if requested.

If tables should be checked today, a little more work is necessary. I use a utility database called DBADB to hold work tables for my custom scripts. The first part of performing a table check is to see if a work table already exists in that database. If it does not exist, it is created and loaded with a list of all tables in the database. After the table is loaded, the process loops through it, checking that the run time has not been exceeded and then running CHECKTABLE on each table. This continues until the table list has been processed or time runs out. If time runs out, the process picks up where it left off the next time the table check starts, making sure all tables are eventually checked before starting over again.

Here is the script:

CREATE PROCEDURE [dbo].[sp_dba_checkdb_vldb] @days_to_run_checkalloc varchar(15) = '1,4', @days_to_run_checkcatalog varchar(15) = '1', @days_to_run_checktable varchar(15) = '1,2,3,4,5,6,7', @max_minutes_to_run int = 360, @debug_flag bit = 0
AS

 BEGIN

    DECLARE @date_part_search_string    char(3),
            @start_time                 datetime,
            @sql_text                   nvarchar(4000),
            @current_object_id          int

    SELECT  @date_part_search_string = '%' + CAST(DATEPART(dw, GETDATE()) AS varchar) + '%',
            @start_time = GETDATE()

    IF PATINDEX(@date_part_search_string, @days_to_run_checkalloc) > 0
     BEGIN
        IF @debug_flag = 1
            PRINT 'DEBUG: Running DBCC CHECKALLOC'
        DBCC CHECKALLOC
     END

    IF PATINDEX(@date_part_search_string, @days_to_run_checkcatalog) > 0
     BEGIN
        IF @debug_flag = 1
            PRINT 'DEBUG: Running DBCC CHECKCATALOG'
        DBCC CHECKCATALOG
     END

    IF PATINDEX(@date_part_search_string, @days_to_run_checktable) > 0
     BEGIN
        DECLARE @control_table  varchar(500)

        SELECT  @control_table = DB_NAME() + '_dbcc_checktable_worklist'

        IF NOT EXISTS(SELECT * FROM [DBADB].[dbo].[sysobjects] WHERE name = @control_table)
         BEGIN
            SELECT  @sql_text = 'SELECT DISTINCT
                                        i.id,
                                        CAST(NULL AS datetime) AS run_date_time
                                INTO    [DBADB].[dbo].' + QUOTENAME(@control_table) + '
                                FROM    sysindexes i
                                        INNER JOIN sysobjects o
                                            ON i.id = o.id
                                WHERE   o.type != ''TF'''

            IF @debug_flag = 1
                PRINT 'DEBUG: Running sql command: [' + @sql_text + ']'

            EXEC sp_executesql  @sql_text

            SELECT  @sql_text = 'CREATE CLUSTERED INDEX IX_' + @control_table + '_id_run_date_time ON [DBADB].[dbo].' + QUOTENAME(@control_table) + ' (id, run_date_time)'

            IF @debug_flag = 1
                PRINT 'DEBUG: Running sql command: [' + @sql_text + ']'

            EXEC sp_executesql  @sql_text
         END

        SELECT  @sql_text = 'SELECT  TOP 1 @current_object_id = c.id
                             FROM    [DBADB].[dbo].[' + @control_table + '] c
                                     INNER JOIN sysobjects o
                                         ON c.id = o.id
                             WHERE   c.run_date_time IS NULL
                                         AND o.type != ''TF'''

        IF @debug_flag = 1
            PRINT 'DEBUG: Running sql command: [' + @sql_text + ']'

        EXEC sp_executesql @sql_text, N'@current_object_id int OUTPUT', @current_object_id = @current_object_id OUTPUT

        IF @debug_flag = 1
            PRINT 'DEBUG: @current_object_id = ' + ISNULL(CAST(@current_object_id AS varchar), 'NULL')

        WHILE   @current_object_id IS NOT NULL AND DATEADD(mi, @max_minutes_to_run, @start_time) > GETDATE()
         BEGIN
            IF @debug_flag = 1
                PRINT 'DEBUG: Running DBCC CHECKTABLE(' + CAST(@current_object_id AS varchar) + ')'

            DBCC CHECKTABLE(@current_object_id)

            SELECT  @sql_text = 'UPDATE  [DBADB].[dbo].[' + @control_table + ']
                                 SET     run_date_time = GETDATE()
                                 WHERE   id = @current_object_id'

            IF @debug_flag = 1
                PRINT 'DEBUG: Running sql command: [' + @sql_text + ']'

            EXEC sp_executesql @sql_text, N'@current_object_id int', @current_object_id = @current_object_id

            SELECT @current_object_id = NULL

            SELECT  @sql_text = 'SELECT  TOP 1 @current_object_id = c.id
                                 FROM    [DBADB].[dbo].[' + @control_table + '] c
                                         INNER JOIN sysobjects o
                                             ON c.id = o.id
                                 WHERE   c.run_date_time IS NULL
                                             AND o.type != ''TF'''

            IF @debug_flag = 1
                PRINT 'DEBUG: Running sql command: [' + @sql_text + ']'

            EXEC sp_executesql @sql_text, N'@current_object_id int OUTPUT', @current_object_id = @current_object_id OUTPUT

            IF @debug_flag = 1
                PRINT 'DEBUG: @current_object_id = ' + ISNULL(CAST(@current_object_id AS varchar), 'NULL')
         END

        IF @current_object_id IS NULL
         BEGIN
            PRINT 'Ran out of work to do...cleaning up and shutting down.'
            IF EXISTS(SELECT * FROM [DBADB].[dbo].[sysobjects] WHERE name = @control_table)
             BEGIN
                SELECT  @sql_text = 'DROP TABLE [DBADB].[dbo].' + QUOTENAME(@control_table)
                IF @debug_flag = 1
                    PRINT 'DEBUG: Running sql command: [' + @sql_text + ']'
                EXEC sp_executesql  @sql_text
             END
         END
        ELSE
            PRINT 'Ran out of time...shutting down.'
     END
 END
GO
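Called from a nightly SQL Agent job, it might look like the example below. The day numbers come from DATEPART(dw, GETDATE()), so what '1' means depends on your DATEFIRST setting; all values here are placeholders:

```sql
-- Allocation checks on days 1 and 4, catalog checks on day 1,
-- table checks every day, stop after 4 hours, debug output on
EXEC dbo.sp_dba_checkdb_vldb
     @days_to_run_checkalloc   = '1,4',
     @days_to_run_checkcatalog = '1',
     @days_to_run_checktable   = '1,2,3,4,5,6,7',
     @max_minutes_to_run       = 240,
     @debug_flag               = 1
```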

Conclusion

As usual, I hope you find this script helpful. Please let me know if you run into any issues with it or know a better way to do the same thing. Please keep in mind that scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.

How Do I Move SQL Database Files Around?

Introduction

Today’s script is one that I had not planned on blogging so soon but since Paul Randal just talked about moving SQL Server files around for TechNet Magazine, it seemed like a good time to break this one out.

The Script

This script is a little different in that it is a script that creates a script, a “turducken” script if you will. The idea here is to run this script and let it output to text then take those results, paste them into a new window, review the resulting script and maybe even run it.

The script starts out by getting the default data and log file locations from SQL Server by checking the registry, using a method learned by watching Profiler while checking the location with Management Studio. (I often comment out these lines to change to specific locations.) The script then begins building the string to output by creating the command to turn xp_cmdshell on. A lot of people, including me, have a policy against xp_cmdshell being turned on on their servers but in cases like this where it is turned on to be used and turned right back off I feel I can get away with it. The next step is to create alter database scripts to take the databases offline. Next, the alter database statements and DOS file move commands are created. The command to set the database online is then added and finally, xp_cmdshell is turned back off.

Updated 07/14/2010 to replace move commands with copy to make sure the files are still good before they are deleted. This does add manual cleanup but the trade off is not having to find out how good your last backup is. Thanks to Paul Randal (Blog|Twitter) and Buck Woody (Blog|Twitter) for pointing this out.

DECLARE @file_path  nvarchar(520),
        @log_path   nvarchar(520)

EXEC master.dbo.xp_instance_regread N'HKEY_LOCAL_MACHINE', N'Software\Microsoft\MSSQLServer\MSSQLServer', N'DefaultData', @file_path OUTPUT
EXEC master.dbo.xp_instance_regread N'HKEY_LOCAL_MACHINE', N'Software\Microsoft\MSSQLServer\MSSQLServer', N'DefaultLog', @log_path OUTPUT

SELECT  'EXEC sp_configure ''show advanced options'', ''1''
GO
RECONFIGURE
GO
EXEC sp_configure ''xp_cmdshell'', ''1''
GO
RECONFIGURE
GO'

SELECT 'ALTER DATABASE [' + DB_NAME(mf.database_id) + '] SET OFFLINE WITH ROLLBACK IMMEDIATE
GO
ALTER DATABASE [' + DB_NAME(mf.database_id) + '] MODIFY FILE (NAME = [' + mf.name + '], FILENAME = ''' + @file_path + '\' + mf.name + '.mdf'')
GO
ALTER DATABASE [' + DB_NAME(mf.database_id) + '] MODIFY FILE (NAME = [' + mf2.name + '], FILENAME = ''' + @log_path + '\' + mf2.name + '.ldf'')
GO
EXEC xp_cmdshell ''copy /Y "' + mf.physical_name + '" "' + @file_path + '\' + mf.name + '.mdf"''
GO
EXEC xp_cmdshell ''copy /Y "' + mf2.physical_name + '" "' + @log_path + '\' + mf2.name + '.ldf"''
GO
ALTER DATABASE [' + DB_NAME(mf.database_id) + '] SET ONLINE
GO

'
FROM    sys.master_files mf
        INNER JOIN sys.master_files mf2
            ON mf.database_id = mf2.database_id
WHERE   DB_NAME(mf.database_id) NOT IN ('master', 'model', 'msdb', 'tempdb')
            AND mf.type_desc = 'ROWS'
            AND mf.file_id = 1
            AND mf2.type_desc = 'LOG'
            AND (mf.physical_name != @file_path + '\' + mf.name + '.mdf' OR mf2.physical_name != @log_path + '\' + mf2.name + '.ldf')
ORDER BY mf.name

SELECT  'EXEC sp_configure ''xp_cmdshell'', ''0''
GO
RECONFIGURE
GO
EXEC sp_configure ''show advanced options'', ''0''
GO
RECONFIGURE
GO'

Would this Work for TempDB?

Yes and no. The resulting script can be used to alter the locations of the TempDB files, but SQL Server must be stopped and restarted for the move to take effect. There is also some manual cleanup in that the old TempDB files have to be deleted.
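For TempDB the generated commands boil down to something like the following. The logical names shown are the usual defaults, the paths are placeholders, and nothing actually moves until the service is restarted:

```sql
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'D:\SQLData\tempdb.mdf')
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'E:\SQLLogs\templog.ldf')
-- Restart SQL Server, verify the new files exist, then delete the old ones by hand
```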

Would this Work for System Databases other than TempDB?

No. There is a lot involved in moving a system database. Detailed instructions can be found here: http://msdn.microsoft.com/en-us/library/ms345408.aspx

Conclusion

As usual, I hope you find this script helpful. Please let me know if you run into any issues with it or know a better way to do the same thing. Please keep in mind that scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.

How Can I Tell if SQL Agent is Running via SQL?

Introduction

Today’s post is a continuation in my on-going effort to document all of the scripts I use to manage my environment. To date it has been a great exercise both in getting things written down and in getting feedback from others on better ways I could be doing things. To that end, here is a stored procedure that I use to tell if SQL Agent is running before trying to start a job programmatically.

The Script

This script opens with some trickery that I stole from Management Studio using Profiler. This is my new favorite way to determine if a server is a named instance and handle it accordingly. I will be going back to retrofit all of my scripts to use this method.

The next step is to declare a temporary table and pull in the results of xp_servicecontrol to be able to use them later. The xp_servicecontrol extended procedure is undocumented and unsupported but still seems like the best option in this case. Alternatives I have seen include using xp_cmdshell (NO!) or building a CLR stored procedure to call WMI and get this information.

Finally, the service status is selected into the output variable for programmatic use by a caller and the flag is checked to see if the service status should also be returned as a result set for debug reasons.

CREATE PROCEDURE [dbo].[sp_dba_GetSqlAgentStatus] @display_results bit = 0, @service_state varchar(100) = null OUTPUT

AS

 BEGIN

    DECLARE @agent_string   nvarchar(256)

    -- Named instances run the Agent service as SQLAgent$<instance>; default instances use SQLServerAgent
    IF (SERVERPROPERTY('INSTANCENAME') IS NOT NULL)
        SELECT @agent_string = N'SQLAgent$' + CONVERT(sysname, SERVERPROPERTY('INSTANCENAME'))
    ELSE
        SELECT @agent_string = N'SQLServerAgent'

    CREATE TABLE #service_state
    (
        current_service_state   varchar(50)
    )

    -- xp_servicecontrol is undocumented; capture its single-row result set for later use
    INSERT INTO #service_state
        EXEC master.dbo.xp_servicecontrol 'QueryState', @agent_string

    SELECT  @service_state = current_service_state
    FROM    #service_state

    DROP TABLE #service_state

    IF @display_results = 1
        SELECT @service_state AS current_service_state

 END
GO
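A typical call checks the output variable before starting a job programmatically. This is a sketch with a hypothetical job name; note that xp_servicecontrol reports states like "Running." with a trailing period, hence the LIKE:

```sql
DECLARE @service_state varchar(100)

EXEC dbo.sp_dba_GetSqlAgentStatus @display_results = 0, @service_state = @service_state OUTPUT

IF @service_state LIKE 'Running%'
    EXEC msdb.dbo.sp_start_job @job_name = N'NightlyMaintenance' -- hypothetical job name
ELSE
    RAISERROR('SQL Agent is not running; job not started.', 16, 1)
```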

Conclusion

As usual, I hope you find this script helpful. Please let me know if you run into any issues with it or know a better way to do the same thing. Please keep in mind that scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.

What is a Good Way to Get the SQL Server Log Directory?

Introduction

Today’s post is going to be a quick one to get back in the saddle again. I have been on an involuntary hiatus from blogging due to production issues. Things are getting better so it seems like time to get back to blogging.

The script for today is one that I use in all of my job creation scripts as well as in a script I have to go back and clean up job output file paths. It finds the Log directory on a SQL 2005 or SQL 2008 server.

The Script

This script is quite simple so I will not spend a ton of time here. It originally looked at the location of master, went up one level, and swapped "Log" in for "Data" in the directory name, which could break if master had been moved after installing SQL. Thanks to Argenis Fernandez (Blog|Twitter) for pointing out how inefficient that was: the script now just looks at the location of the SQL Server Error Log and uses the same directory. Warning: this relies on undocumented functionality, but the best things in SQL tend to be undocumented.

CREATE PROCEDURE [dbo].[sp_dba_GetJobOutputFileDirectory] @log_directory varchar(400) OUTPUT, @showResults BIT = 1

AS

 BEGIN

    -- Strip the file name from the error log path, leaving the Log directory (with a trailing backslash)
    SELECT  @log_directory = REPLACE(CAST(SERVERPROPERTY('ErrorLogFileName') AS nvarchar(512)), 'ERRORLOG', '')

    IF @showResults = 1
        SELECT @log_directory

 END
GO
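Calling it from a job creation script looks something like this; the appended file name is hypothetical:

```sql
DECLARE @log_directory varchar(400)

EXEC dbo.sp_dba_GetJobOutputFileDirectory @log_directory = @log_directory OUTPUT, @showResults = 0

-- @log_directory already ends in a backslash, so just append the file name
SELECT @log_directory + 'NightlyMaintenance.txt' AS output_file_name
```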

Conclusion

As usual, I hope you find this script helpful. Please let me know if you run into any issues with it or know a better way to do the same thing. Please keep in mind that scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.

What is a Good Way to Quickly Disable SQL Agent Job Schedules?

Introduction

I recently blogged about a Stored Procedure to Sequentially Run SQL Agent Jobs and have been meaning to follow up with this script because it is very helpful when converting from individual job schedules to sequential job schedules while still leaving a back out path. The question of how to disable SQL jobs came up on #SqlHelp yesterday, so I figured it was time to cover this script.

Background

When I looked at switching over to running my maintenance jobs in a sequential manner rather than fussing with individual job schedules it seemed like a daunting task. I had to cover a lot of ground fast, so doing it manually in Management Studio was out, and I needed to be able to roll back quickly, so a flexible script was definitely a must.

I decided to write a script to disable the job schedules on all of my maintenance jobs. I figured this was the easiest way to cover the most ground and if things went bad I could just change the script to turn the enabled flag back on and re-run it.

The Script

This script has two major parts: disable the job schedules, then notify SQL Agent of the change. The script begins by declaring a table variable to store the changed records, allowing SQL Agent to be notified of only the job schedules that have actually changed. (NOTE: I started out with a regular #temp table here but got column definition errors that went away when I switched to a table variable. I have not had time to research why.)

I could have just notified on all jobs but this invites the possibility of “fixing” the schedule on a job that had been broken by some other scripted update. Not wanting to take the chance of starting SQL Agent jobs that have not run in a while I opted to carefully limit the scope of the notifications.

Once I have the results it is as simple as opening a cursor and calling sp_sqlagent_notify for each job schedule that has been updated.

USE msdb

DECLARE @updated_schedules table
(
    schedule_id int,
    job_id      uniqueidentifier,
    job_name    sysname
)

UPDATE  ss
SET     ss.[enabled] = 0
OUTPUT  INSERTED.schedule_id,
        sj.job_id,
        sj.name
INTO    @updated_schedules
FROM    msdb.dbo.sysjobs sj
        INNER JOIN msdb.dbo.sysjobschedules sjs
            ON sj.job_id = sjs.job_id
        INNER JOIN msdb.dbo.sysschedules ss
            ON sjs.schedule_id = ss.schedule_id
WHERE   sj.name LIKE '<pattern>'
           
DECLARE @schedule_id    int,
        @job_id         uniqueidentifier,
        @job_name       sysname

DECLARE csr CURSOR FAST_FORWARD READ_ONLY
FOR
    SELECT  schedule_id,
            job_id,
            job_name
    FROM    @updated_schedules

OPEN csr

FETCH NEXT FROM csr INTO @schedule_id, @job_id, @job_name

WHILE @@FETCH_STATUS <> -1
 BEGIN
   
    EXECUTE msdb.dbo.sp_sqlagent_notify @op_type     = N'S',
                                        @job_id      = @job_id,
                                        @schedule_id = @schedule_id,
                                        @action_type = N'U'
   
    PRINT 'Called sp_sqlagent_notify for ' + @job_name
   
    FETCH NEXT FROM csr INTO @schedule_id, @job_id, @job_name
   
 END

CLOSE csr
DEALLOCATE csr
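The back out path I mentioned is the same script with the enabled flag flipped. Here is a sketch of just the UPDATE (the table variable declaration and the sp_sqlagent_notify loop stay the same, as does the `<pattern>` placeholder):

```sql
UPDATE  ss
SET     ss.[enabled] = 1    -- re-enable the schedules disabled above
OUTPUT  INSERTED.schedule_id,
        sj.job_id,
        sj.name
INTO    @updated_schedules
FROM    msdb.dbo.sysjobs sj
        INNER JOIN msdb.dbo.sysjobschedules sjs
            ON sj.job_id = sjs.job_id
        INNER JOIN msdb.dbo.sysschedules ss
            ON sjs.schedule_id = ss.schedule_id
WHERE   sj.name LIKE '<pattern>'
```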

Conclusion

As usual, I hope you find this script helpful. Please let me know if you run into any issues with it or know a better way to do the same thing. Please keep in mind that scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.

What's a Good Rule for Max Degree of Parallelism?

Introduction

I am one of those people that believe that anything worth doing is worth having a script to do it. Setting the Max Degree of Parallelism is no exception. In this post I will go through the script I use as well as the metrics I watch to make sure the setting is correct.

Reasoning

Max Degree of Parallelism (MAXDOP) is a setting that many people do not get right, and for good reason: it is tricky. The easiest scenario is a server with NUMA architecture; for NUMA servers the starting point is the number of cores per NUMA node. For other servers the answer is often "It Depends" followed by "It Changes Over Time". The good news is that the setting is fairly easy to tune by watching wait stats.

The rule of thumb I use for gauging whether or not my settings are correct is to make sure that CXPACKET waits are slightly higher than SOS_SCHEDULER_YIELD. My goal is to get them as close as possible to each other without SOS_SCHEDULER_YIELD waits exceeding CXPACKET waits. Depending on your workload and the amount of tuning you are doing you may need to evaluate your settings more often.
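A quick way to eyeball those two waits is a query against sys.dm_os_wait_stats; keep in mind the numbers are cumulative since the last restart (or since the stats were cleared):

```sql
SELECT  wait_type,
        wait_time_ms,
        waiting_tasks_count
FROM    sys.dm_os_wait_stats
WHERE   wait_type IN ('CXPACKET', 'SOS_SCHEDULER_YIELD')
ORDER BY wait_time_ms DESC
```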

The Script

This script is dirt simple, but do not be deceived: this is not a change to take lightly. It is a great starting point for a new server, but careful research should be done before running it on any existing production system. The basic logic is to enable 'show advanced options', then get the hyperthread_ratio from sys.dm_os_sys_info. This gives you the number of logical processors per physical socket, which is typically equal to the number of cores on a NUMA node. Keep in mind that hyperthreading is not the same as multiple cores, so I am not sure I would use this script on a hyperthreaded server. Finally, the script sets 'max degree of parallelism' to the hyperthread ratio and turns 'show advanced options' back off.

EXEC sys.sp_configure N'show advanced options', '1'
RECONFIGURE WITH OVERRIDE

DECLARE @hyperthread_ratio  INT

SELECT  @hyperthread_ratio = hyperthread_ratio
FROM    sys.dm_os_sys_info

EXEC sys.sp_configure N'max degree of parallelism', @hyperthread_ratio
RECONFIGURE WITH OVERRIDE

EXEC sys.sp_configure N'show advanced options', '0'
RECONFIGURE WITH OVERRIDE

Conclusion

As usual, I hope you find this script helpful. This script and the basic rules I have laid out work well for me in my environment. Test it carefully in yours. Please let me know if you run into any issues with it or know a better way to do the same thing. Please keep in mind that scripts from the internet are like Halloween candy, inspect before consumption. I offer no warranty beyond a sympathetic ear if you should run into any issues.