Clean up the SSIS database

SQL Server Integration Services (SSIS) is a very powerful service and set of tools for moving data into and out of your databases. Some companies have entire teams dedicated to writing SSIS packages (deployed as ISPAC project files) to run their ETL (Extract-Transform-Load) processes.

As part of your SSIS maintenance, there are a few things you need to keep an eye on:

  • Is your SSIS database in the simple or full recovery model? I keep mine in simple, but this is a bit of a personal preference: if I lose the database, I’d just redeploy my ISPACs.
  • Are your maintenance jobs running? This is really important! Your ISPACs generate all sorts of logging information, and the more packages you have, the more logging. Without the maintenance, your database will balloon out of control with all that logging.
  • How big is the retention window? You can configure how long logging data is kept, and I decrease it depending on the number of packages in the store and how often they run. Be aware that disabling the logging does not actually disable all logging! A few other things will still generate logging data.
  • Is the database huge anyway? Unfortunately, the maintenance job doesn’t do that good a job of keeping up with growth! A few things are not actually removed by the default maintenance job, so they grow anyway.
The default retention period is 1 whole year! That’s a bit excessive in my opinion…
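
If all you want is a smaller window, you can inspect and change the cleanup settings directly through the catalog. These are the documented catalog.catalog_properties view and catalog.configure_catalog procedure; the 30 days below is just an example value:

USE SSISDB
GO

-- Current cleanup settings
SELECT property_name,
       property_value
FROM catalog.catalog_properties
WHERE property_name IN ('OPERATION_CLEANUP_ENABLED', 'RETENTION_WINDOW');

-- Shrink the retention window to 30 days (pick a value that fits your volumes)
EXEC catalog.configure_catalog
    @property_name = N'RETENTION_WINDOW',
    @property_value = 30;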

So, as much as I try to keep up with the maintenance, sometimes there’s only one thing to be done: nuke the whole thing! Especially if the defaults were never changed.

I’m sure there are a variety of ways of cleaning up the catalog, but this one is my favorite so far. It removes all the logging information without any operational impact; at worst, I’ve had to stop the SSIS service long enough for the operation to complete. I don’t necessarily recommend shrinking the data file afterwards (unless you really need to!), because your logs are likely to grow back anyway, based on your day-to-day volumes. Credit for this beautiful script goes to Ibrahim Naji (home | twitter), and the original post is here.

Now for the script:

USE SSISDB
go

CREATE PROCEDURE [internal].[cleanup_server_retention_window_truncateall]
--WITH EXECUTE AS 'AllSchemaOwner'
AS 
   
----- STRUCTURE OF TABLES:

--[internal].[operations]
--  [internal].[executions]
--      [internal].[executable_statistics]
--      [internal].[execution_component_phases]
--      [internal].[execution_data_statistics]
--      [internal].[execution_data_taps]
--      [internal].[execution_parameter_values]
--      [internal].[execution_property_override_values]
--  [internal].[extended_operation_info]
--  [internal].[operation_messages] --(DATA HEAVY)
--      [internal].[event_messages] --(DATA HEAVY)
--          [internal].[event_message_context]
--  [internal].[operation_os_sys_info]
--  [internal].[operation_permissions]
--  [internal].[validations]

    SET NOCOUNT ON
   
    DECLARE @enable_clean_operation bit
    DECLARE @retention_window_length INT
   
    DECLARE @caller_name nvarchar(256)
    DECLARE @caller_sid  varbinary(85)
    DECLARE @operation_id BIGINT
   
    EXECUTE AS CALLER
        SET @caller_name =  SUSER_NAME()
        SET @caller_sid =   SUSER_SID()
    REVERT
         
   
    BEGIN TRY
        SELECT @enable_clean_operation = CONVERT(bit, property_value)
            FROM [catalog].[catalog_properties]
            WHERE property_name = 'OPERATION_CLEANUP_ENABLED'
       
        IF @enable_clean_operation = 1
        BEGIN
            SELECT @retention_window_length = CONVERT(INT,property_value)  
                FROM [catalog].[catalog_properties]
                WHERE property_name = 'RETENTION_WINDOW'
               
            IF @retention_window_length <= 0
            BEGIN
                RAISERROR(27163    ,16,1,'RETENTION_WINDOW')
            END
           
            INSERT INTO [internal].[operations] (
                [operation_type],  
                [created_time],
                [object_type],
                [object_id],
                [object_name],
                [STATUS],
                [start_time],
                [caller_sid],
                [caller_name]
                )
            VALUES (
                2,
                SYSDATETIMEOFFSET(),
                NULL,                    
                NULL,                    
                NULL,                    
                1,      
                SYSDATETIMEOFFSET(),
                @caller_sid,            
                @caller_name            
                )
            SET @operation_id = SCOPE_IDENTITY()

            -- Remove all [internal].[executions] dependencies
            TRUNCATE TABLE [internal].[executable_statistics]
            TRUNCATE TABLE [internal].[execution_component_phases]
            TRUNCATE TABLE [internal].[execution_data_statistics]
            TRUNCATE TABLE [internal].[execution_data_taps]
            TRUNCATE TABLE [internal].[execution_parameter_values]
            TRUNCATE TABLE [internal].[execution_property_override_values]


            -- Remove all [internal].[event_message_context] dependencies
            TRUNCATE TABLE [internal].[event_message_context]

            -- Remove all non-dependent tables
            TRUNCATE TABLE [internal].[operation_os_sys_info]
            TRUNCATE TABLE [internal].[operation_permissions]
            TRUNCATE TABLE [internal].[validations]
            TRUNCATE TABLE [internal].[extended_operation_info]

            -- Deal with [internal].[event_messages] and [internal].[operation_messages]
            ALTER TABLE [internal].[event_message_context] DROP CONSTRAINT [FK_EventMessageContext_EventMessageId_EventMessages]
       
            TRUNCATE TABLE [internal].[event_messages]
       
            ALTER TABLE [internal].[event_message_context]  WITH CHECK ADD  CONSTRAINT [FK_EventMessageContext_EventMessageId_EventMessages] FOREIGN KEY([event_message_id])
            REFERENCES [internal].[event_messages] ([event_message_id])
            ON DELETE CASCADE

            ALTER TABLE [internal].[event_messages] DROP CONSTRAINT [FK_EventMessages_OperationMessageId_OperationMessage]
       
            TRUNCATE TABLE [internal].[operation_messages]

            ALTER TABLE [internal].[event_messages]  WITH CHECK ADD  CONSTRAINT [FK_EventMessages_OperationMessageId_OperationMessage] FOREIGN KEY([event_message_id])
            REFERENCES [internal].[operation_messages] ([operation_message_id])
            ON DELETE CASCADE

            -- Deal with [internal].[executions]

            ALTER TABLE [internal].[executable_statistics] DROP CONSTRAINT [FK_ExecutableStatistics_ExecutionId_Executions]
            ALTER TABLE [internal].[execution_component_phases] DROP CONSTRAINT [FK_ExecCompPhases_ExecutionId_Executions]
            ALTER TABLE [internal].[execution_data_statistics] DROP CONSTRAINT [FK_ExecDataStat_ExecutionId_Executions]
            ALTER TABLE [internal].[execution_data_taps] DROP CONSTRAINT [FK_ExecDataTaps_ExecutionId_Executions]
            ALTER TABLE [internal].[execution_parameter_values] DROP CONSTRAINT [FK_ExecutionParameterValue_ExecutionId_Executions]
            ALTER TABLE [internal].[execution_property_override_values] DROP CONSTRAINT [FK_ExecutionPropertyOverrideValue_ExecutionId_Executions]

            TRUNCATE TABLE [internal].[executions]

            ALTER TABLE [internal].[execution_property_override_values]  WITH CHECK ADD  CONSTRAINT [FK_ExecutionPropertyOverrideValue_ExecutionId_Executions] FOREIGN KEY([execution_id])
            REFERENCES [internal].[executions] ([execution_id])
            ON DELETE CASCADE

            ALTER TABLE [internal].[execution_parameter_values]  WITH CHECK ADD  CONSTRAINT [FK_ExecutionParameterValue_ExecutionId_Executions] FOREIGN KEY([execution_id])
            REFERENCES [internal].[executions] ([execution_id])
            ON DELETE CASCADE

            ALTER TABLE [internal].[execution_data_taps]  WITH CHECK ADD  CONSTRAINT [FK_ExecDataTaps_ExecutionId_Executions] FOREIGN KEY([execution_id])
            REFERENCES [internal].[executions] ([execution_id])
            ON DELETE CASCADE

            ALTER TABLE [internal].[execution_data_statistics]  WITH CHECK ADD  CONSTRAINT [FK_ExecDataStat_ExecutionId_Executions] FOREIGN KEY([execution_id])
            REFERENCES [internal].[executions] ([execution_id])
            ON DELETE CASCADE
       
            ALTER TABLE [internal].[execution_component_phases]  WITH CHECK ADD  CONSTRAINT [FK_ExecCompPhases_ExecutionId_Executions] FOREIGN KEY([execution_id])
            REFERENCES [internal].[executions] ([execution_id])
            ON DELETE CASCADE
       
            ALTER TABLE [internal].[executable_statistics]  WITH CHECK ADD  CONSTRAINT [FK_ExecutableStatistics_ExecutionId_Executions] FOREIGN KEY([execution_id])
            REFERENCES [internal].[executions] ([execution_id])
            ON DELETE CASCADE
       

            -- Deal with [internal].[operations]
            DECLARE @deleted_ops TABLE(operation_id BIGINT, operation_type SMALLINT)

            DELETE --TOP (@delete_batch_size)
            FROM [internal].[operations]
            OUTPUT DELETED.operation_id, DELETED.operation_type INTO @deleted_ops
            WHERE operation_id != @operation_id

           
           
            DECLARE @execution_id BIGINT
            DECLARE @sqlString              nvarchar(1024)
            DECLARE @key_name               [internal].[adt_name]
            DECLARE @certificate_name       [internal].[adt_name]
           
           
            DECLARE execution_cursor CURSOR LOCAL FOR
                SELECT operation_id FROM @deleted_ops
                WHERE operation_type = 200
           
            OPEN execution_cursor
            FETCH NEXT FROM execution_cursor INTO @execution_id
           
            WHILE @@FETCH_STATUS = 0
            BEGIN
                SET @key_name = 'MS_Enckey_Exec_'+CONVERT(VARCHAR,@execution_id)
                SET @certificate_name = 'MS_Cert_Exec_'+CONVERT(VARCHAR,@execution_id)
                SET @sqlString = 'IF EXISTS (SELECT name FROM sys.symmetric_keys WHERE name = ''' + @key_name +''') '
                    +'DROP SYMMETRIC KEY '+ @key_name
                    EXECUTE sp_executesql @sqlString
                SET @sqlString = 'IF EXISTS (select name from sys.certificates WHERE name = ''' + @certificate_name +''') '
                    +'DROP CERTIFICATE '+ @certificate_name
                    EXECUTE sp_executesql @sqlString
                FETCH NEXT FROM execution_cursor INTO @execution_id
            END
            CLOSE execution_cursor
            DEALLOCATE execution_cursor

            END
    END TRY
    BEGIN CATCH
       
       
        IF (CURSOR_STATUS('local', 'execution_cursor') = 1
            OR CURSOR_STATUS('local', 'execution_cursor') = 0)
        BEGIN
            CLOSE execution_cursor
            DEALLOCATE execution_cursor            
        END
       
        UPDATE [internal].[operations]
            SET [STATUS] = 4,
            [end_time] = SYSDATETIMEOFFSET()
            WHERE [operation_id] = @operation_id;      
        THROW
    END CATCH
   
    RETURN 0

Now to execute the thing:

USE SSISDB
GO
EXEC [internal].[cleanup_server_retention_window_truncateall]
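
To confirm the cleanup actually freed the heavy hitters, a quick look at the two data-heavy tables with the standard sp_spaceused procedure should now show them at (or near) zero rows:

USE SSISDB
GO
EXEC sp_spaceused @objname = N'internal.operation_messages'
EXEC sp_spaceused @objname = N'internal.event_messages'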

Other links!

More logging fun!

Get your object sizes

Remove replication objects from your database

As part of your DBA duties, you’re probably restoring your production databases onto a test server. Then one day you notice that your log volume is getting pretty close to full! You take a look in the transaction log folder and notice that the log file is even bigger than the data file. You did remember to set the recovery model to simple, so why is the log file so big?

Naturally, you run DBCC SHRINKFILE against the log file, but the file doesn’t shrink. You take a peek at sys.databases, and the LOG_REUSE_WAIT_DESC column says “REPLICATION”.

“It’s waiting for replication? Odd, I haven’t set up replication on this test server!”
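
If you want to check this for yourself, the query is a one-liner (the database name below is just a placeholder):

SELECT name,
       log_reuse_wait_desc
FROM sys.databases
WHERE name = N'MyRestoredDb'; -- placeholder name; in this scenario it returns REPLICATION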

A bit of background

Internally, a database log file is divided into smaller blocks called virtual log files (VLFs). During the normal course of events, a VLF fills up with transactions, and certain flags control when it can be overwritten. The whole topic is super complex, so for a much more detailed explanation, read here.

When a database is set up as a replication publisher, one of the VLF “don’t overwrite me, bro!” flags is held in place until the replication log reader has had a chance to read the contents of the VLF and copy them into the distribution database (again, this is over-simplifying things).

In this case, the database was backed up while those flags were still in place, so the flags were restored onto the test server! Since the test server isn’t replicated, there’s nothing actively reading the VLFs and clearing the flags, so they are never allowed to be reused. This is why log_reuse_wait_desc is set to “REPLICATION”, and the result is that the log will just grow forever (or until you run out of space, whichever comes first).

The easiest fix is to run the command below. It’ll remove all the replication objects, flags, etc. from the database. After that, your shrink should work as expected:

USE master
EXEC sys.sp_removedbreplication @dbname = 'your database name', @type ='both'
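
Once the replication objects are gone, the shrink behaves as expected. A minimal sketch, using made-up database and logical file names (get the real logical name from sys.database_files):

USE MyRestoredDb; -- hypothetical database name
DBCC SHRINKFILE (N'MyRestoredDb_log', 1024); -- hypothetical logical log name; target size in MB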

Other links!

Write a backup report!

Shrink your log files painlessly

All the advice you’ve heard out there is true: it’s not a good idea to regularly shrink your files! If you find yourself in that situation, you probably have other problems to deal with, such as:

  • You might need more storage!
  • Data files are supposed to grow (in most cases)! If they’re growing faster than normal, however, you should find out why, and plan accordingly (and you do have growth baselines, right?)
  • Shrinks cause fragmentation!
  • Shrinks cause unnecessary I/O (because the files are just going to grow again).

Unfortunately, sometimes problems do happen, and now you have a fire to put out. While I have taken the time to size my volumes correctly, sometimes we’ll have a large data load, bulk update, etc. Over time, my log volume will be almost full, and one large update away from filling up. Not good!

Another situation I might run into is when I’m building a new test server using prod backups. The log file in that restore will be as big as it was when the backup took place, and the dev server may not have been sized as big as the prod server (or maybe it was, but as of two years ago), so now I can’t fit the log files!

For those situations, I wrote the script below. It’s nothing super fancy: it goes through each of the user databases (skipping master, model, msdb, tempdb and distribution) and issues a TRUNCATE_ONLY shrink against the log file. That instructs SQL Server to return any empty space already available at the end of the log file back to the operating system. If you set the parameter @ChangetoSimple to 1, it’ll change the recovery model to simple before it issues the TRUNCATE_ONLY. The ins and outs of log space usage are a topic for a different post, but for my purposes here, all it means is that it frees up space in the log file.

The big-letter caveat is: DO NOT SET THIS TO 1 IN PRODUCTION IF YOU DEPEND ON LOG BACKUPS! It will break your log backup chain, and you’ll need to start the chain again (with a full database backup). I usually only change things to simple in non-production environments.
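
And if you do flip a production database to simple by accident, the fix is to switch back to full and take a fresh full backup to restart the chain. A sketch with made-up names:

ALTER DATABASE MyProdDb SET RECOVERY FULL; -- hypothetical database name

BACKUP DATABASE MyProdDb
TO DISK = N'X:\Backups\MyProdDb_full.bak' -- hypothetical path
WITH INIT;
-- Log backups can resume from this point onwards

With that out of the way, here’s the script: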

USE master;

DECLARE @dbname      sysname,
        @logicalname sysname,
        @sql         VARCHAR(1000); -- long names can push the command past 200 characters

DECLARE @ChangetoSimple INT = 0;
DECLARE @simple VARCHAR(200) = ' ';

SELECT name,
       log_reuse_wait_desc
FROM sys.databases
ORDER BY name;

DECLARE list CURSOR FOR
SELECT DISTINCT
       DB_NAME(mf.database_id),
       mf.name
FROM sys.master_files mf
    INNER JOIN sys.databases db
        ON mf.database_id = db.database_id
WHERE DB_NAME(mf.database_id) NOT IN ( 'master', 'tempdb', 'distribution', 'msdb', 'model' )
      AND mf.type_desc = 'log';

OPEN list;

FETCH NEXT FROM list
INTO @dbname,
     @logicalname;
WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @ChangetoSimple = 1
            SET @simple = 'alter database [' + @dbname + '] set recovery simple; ';

        SET @sql
            = 'use [' + @dbname + ']; ' + @simple + ' DBCC shrinkfile (N''' + @logicalname
              + ''', 0, truncateonly);';

        BEGIN TRY
            PRINT @sql;
            EXEC (@sql);
        END TRY
        BEGIN CATCH
            SELECT ERROR_NUMBER(),
                   ERROR_SEVERITY(),
                   ERROR_MESSAGE();
        END CATCH;

        FETCH NEXT FROM list
        INTO @dbname,
             @logicalname;
    END;
CLOSE list;
DEALLOCATE list;

Useful links!

More troubleshooting:

Recover from disk full issues

Get the size of all your objects

I try to stay in touch with the internals of my databases, including the size of things. Growth patterns are a good indicator of where most of your database activity lies, and that in turn can help you locate activity hot-spots. For today’s post, I have a simple script that will help you get all your object sizes, including tables, heaps and indexes.

As an added bonus, it’ll also suggest which indexes are just giving you maintenance overhead. Every DML statement has to update the indexes too, so an index that is written to more often than it is read from is potentially pure overhead. To identify such candidates, I use the formula below:

( user_seeks + user_scans + user_lookups ) / ( user_updates )

The idea here is simple: if the combined seeks, scans and lookups add up to more than the updates, the index is probably earning its keep! For example, an index with 900 seeks, 80 scans, 20 lookups and 100 updates scores 10.0, a clear keeper. If you’re seeing more updates than reads, the index is probably just baggage, and you should consider tossing it overboard.

There are a few caveats to consider, however:

  • Index stats are not preserved during service restarts.
  • Index stats can also be manually flushed.
  • It is very possible that an index is used for just one report, which only runs once a year! And of course, that one index shaves several hours off the report’s execution time.
  • sys.dm_db_index_usage_stats only holds information for indexes that have been used since the last service start up.
  • sys.dm_db_index_usage_stats does not return information about memory-optimized indexes (see here).
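
Since the usage numbers only go back to the last restart, it’s worth checking how long the instance has actually been up before trusting them:

-- Usage stats in sys.dm_db_index_usage_stats reset at this point
SELECT sqlserver_start_time
FROM sys.dm_os_sys_info;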

And here’s the code itself:

SELECT 
    ds.name AS filegroupname,
    df.name AS filename,
    df.physical_name,
    SCHEMA_NAME(o.schema_id) AS schemaname,
    OBJECT_NAME(ps.object_id) AS objname,
    i.name AS indexname,
    o.create_date,
    i.index_id,
    ps.row_count,
    SUM(ps.used_page_count) / 128 AS used_mb,
    SUM(ps.reserved_page_count) / 128 AS reserved_mb,
    SUM(ps.in_row_data_page_count) / 128 AS in_row_mb,
    SUM(ps.in_row_used_page_count) / 128 AS row_used,
    SUM(ps.in_row_reserved_page_count) / 128 AS row_reserved,
    SUM(ps.lob_used_page_count) / 128 AS lob_mb,
    SUM(ps.lob_reserved_page_count) / 128 AS lob_reserved,
    us.user_updates AS [Total Writes],
    us.user_seeks + us.user_scans + us.user_lookups AS [Total Reads],
    CASE
        WHEN us.user_updates > 0 THEN
            CONVERT(decimal(6, 2), (us.user_seeks + us.user_scans + us.user_lookups) / (us.user_updates * 1.0))
        ELSE
            CONVERT(decimal(6, 2), (us.user_seeks + us.user_scans + us.user_lookups))
    END AS ReadToWriteRatio
FROM sys.dm_db_partition_stats ps
    left JOIN sys.objects o
        ON o.object_id = ps.object_id
    left JOIN sys.indexes i
        ON o.object_id = i.object_id
    left JOIN sys.data_spaces ds
        ON ds.data_space_id = i.data_space_id
    left JOIN sys.database_files df
        ON df.data_space_id = ds.data_space_id
           AND df.data_space_id = i.data_space_id
    left JOIN sys.dm_db_index_usage_stats us
        ON us.object_id = i.object_id
           AND us.index_id = i.index_id
GROUP BY SCHEMA_NAME(o.schema_id),
         OBJECT_NAME(ps.object_id),
         us.user_seeks + us.user_scans + us.user_lookups,
         ds.name,
         df.name,
         df.physical_name,
         i.name,
         o.create_date,
         i.index_id,
         ps.row_count,
         us.user_updates
ORDER BY objname,
         filegroupname,
         i.name;

For a fresh restore of a StackOverflow database (thanks, Brent!), you’ll get one row per index, with its size, reserved space and read/write counts. You can save the results into a table (along with a timestamp) so you can review your usage trends over time.
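
A minimal sketch of that trending idea, using a made-up history table name and capturing just the object sizes (extend the column list to taste):

-- Hypothetical history table, created on first run
IF OBJECT_ID('dbo.ObjectSizeHistory') IS NULL
    CREATE TABLE dbo.ObjectSizeHistory
    (
        captured_at datetime2 NOT NULL DEFAULT SYSDATETIME(),
        schema_name sysname NULL,
        object_name sysname NULL,
        used_mb     bigint  NULL
    );

INSERT INTO dbo.ObjectSizeHistory (schema_name, object_name, used_mb)
SELECT SCHEMA_NAME(o.schema_id),
       OBJECT_NAME(ps.object_id),
       SUM(ps.used_page_count) / 128
FROM sys.dm_db_partition_stats ps
    LEFT JOIN sys.objects o
        ON o.object_id = ps.object_id
GROUP BY SCHEMA_NAME(o.schema_id),
         OBJECT_NAME(ps.object_id);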

Other links!

sys.dm_db_index_physical_stats – get fragmentation values!

Where in the disk are my indexes? – similar to this, but without the usage values