SQLGeordie's Blog

Helping the SQL Server community……where I can!

Delayed Durability in the wild… — November 14, 2016

Background

We have recently been working on a large data migration project for one of our clients and I thought I would share how Delayed Durability helped us overcome a performance issue when the solution was moved to the client’s Development domain.

I won’t go into details of the project or the finer detail of our proposed solution as I have plans to put some more content together for that, but in short, the migration of the data was to be run by a (large) number of BIML-generated SSIS (Child) packages, one per table to be migrated, derived from a metadata-driven framework, with each stage being run by a master package and all of those run by a MasterOfMaster package.

To maximise throughput, utilise as much processing power as possible, reduce the time it would take to run the migration and control the flow, we built a series of sequence containers, each running its own collection of Child Packages. We built the framework in such a way that these could be run in parallel or linearly, and each master package could contain as many containers (no pun intended) of child packages as required. This also allowed us to handle the order that packages were run in, especially those with dependencies, whilst keeping the potential for parallelising (is that a word? No idea but I like it) the whole process as much as possible. Leaving the MaxConcurrentExecutables property at -1 meant we could push the processing to run up to 10 packages at once, as -1 allows the maximum number of concurrently running executables to equal the number of processors plus two and the Integration VM had 8 cores (Development had 4).

A small example of how the MasterOfMaster and a Master Package for a stage looked is shown below:

Each container number could have Parallel and/or Linear Containers and both must succeed before the next Container level can start.

NOTE that this is just an example representation, naming conventions shown do not reflect the actual solution.

Problem

During development and initial testing on our own hardware, we had the migration running at ~25 minutes for around 600 packages (i.e. tables) covering what we termed RawSource–>Source–>Staging, which was well within the performance requirements for the stage development was at and for what was initially set out. The rest of this blog post will hone in specifically on Source–>Staging only.

However, once we transferred the solution to the client’s development environment things took a turn for the worse. In our environment we were running VMs with 8 cores, 16GB RAM and SSDs. The client environment was running SQL Server 2016 Enterprise on VMware vSphere 5.5 with 8 vCPUs and 32GB RAM (for Integration; Development was half this), but the infrastructure team had done everything in their power to force all VMs onto the lower tier (i.e. slow disks) of their 3PAR SAN and throttle them in every way possible, just to make things more of a challenge. Even though the VMs themselves were throttled we were confident that we wouldn’t see too much of a performance impact, especially as this was only a subset of the processing to be done; we needed it to be quick because the overall runtime would only ever get longer.

How wrong we were. On the first run the processing (for Source–>Staging) took around 141 minutes. Yes, you read that right, a full 116 minutes longer than the whole process took on our hardware! Wowza, didn’t see that one coming. I won’t delve too much into the investigations as, again, that will be saved for another blog post, but essentially we were seeing a huge amount of the WRITELOG wait type since moving to the client environment. We believed the reason for this was the significant amount of parallel processing we were doing (running SSIS packages in parallel, all loading to the same DB), which the SAN didn’t seem able to handle. One other thing to note: because truncations are not flagged as errors in the OLE DB Destination fast load data access mode, some of the packages (those that weren’t a direct copy where we knew the schema was exactly the same) were run in non-fast load, i.e. row-by-row, which puts additional stress on the system as a whole.

I will be blogging at a later date regarding how we managed to get everything running in fast load and handle the truncation via automated testing instead.

Solution

Enter Delayed Durability.

I won’t enter into too much detail regarding what this is or how it specifically works as this has been blogged by many others (Paul Randal and Aaron Bertrand to name just a couple), but my favourite description of delayed durability comes from the MSDN blogs, where they refer to it as a “lazy commit“. Before you ask, yes, we understood the issues of implementing such a change, but the migration process was always a full drop and reload of the data, so we didn’t care if we lost anything as we could simply run the process again.

By setting delayed durability at the database level we were able to control which of the databases involved in the process had it enabled, without altering the BIML framework or the code itself to handle it at the transaction level. By simply applying this to the Source and Staging databases we reduced the processing time from 141 minutes to 59 minutes. This wasn’t exactly perfect, but shaving more than half the time off with one simple change and pushing the WRITELOG wait stat way, way down the list was a great start.
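
For reference, the database-level change is a one-liner per database. The names below are placeholders rather than the client’s actual databases, and FORCED means every transaction in that database gets the lazy commit behaviour (the per-transaction alternative, COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON), would have meant code changes):

--Force lazy commits for everything hitting these databases (names are placeholders)
ALTER DATABASE [Source]  SET DELAYED_DURABILITY = FORCED;
ALTER DATABASE [Staging] SET DELAYED_DURABILITY = FORCED;

--Revert to fully durable once the migration run has completed
ALTER DATABASE [Source]  SET DELAYED_DURABILITY = DISABLED;
ALTER DATABASE [Staging] SET DELAYED_DURABILITY = DISABLED;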

As a side note, we have since managed to get the processing down from ~59 mins to ~30 mins without changing the VM/hardware configuration, but I will leave that for another post.

Proof

When I first set out with this blog post it was only going to be a few paragraphs giving an insight into what we did; however, I thought that all this would be pointless without some visualisation of the processing both before and after.

Row-by-Row with no Delayed Durability

We needed to get a baseline, and where better to start than capturing the metrics through SentryOne. Using Adam Machanic’s sp_WhoIsActive we can see what I was talking about with the WRITELOG wait stat:

[Image: sp_WhoIsActive output showing the WRITELOG wait type]

Granted, the wait times themselves were relatively low, but they were apparent almost every time we hit F5 and, when running our wait stats scripts, WRITELOG was in the top 3. A sample of the processing indicating this wait stat can also be seen below:

[Image: wait stats output with WRITELOG near the top]
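
For context, a check along these lines (a rough sketch, not our actual monitoring scripts) is enough to show where WRITELOG sits in the overall waits:

--Top waits since the last restart / stats clear (simplified sketch)
SELECT TOP (10)
	wait_type,
	waiting_tasks_count,
	wait_time_ms,
	wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'BROKER_TASK_STOP') --trim a few benign waits
ORDER BY wait_time_ms DESC;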

As stated previously, the Source–>Staging process took 141 minutes overall, and the processing as captured by SentryOne PA is shown below:

[Image: SentryOne Performance Advisor overview of the full Source–>Staging process]

Row-by-Row with Delayed Durability

So when we ran the same process with Delayed Durability we could see straight away that the transactions/sec ramped up from ~7000 to ~12500. Top left shows without Delayed Durability, bottom left with Delayed Durability, and the right shows them side by side:

The overall process for Source–>Staging took only 59 minutes. I’ve tried to capture the before/after in the image below, the highlighted section being the process running with Delayed Durability forced:

[Image: SentryOne Performance Advisor overview, with the Delayed Durability run highlighted]

You can see from this the drastic increase in Transactions/sec and reduction in Log Flushes.

Two package execution time examples (trust me that they are the same package) show that with Delayed Durability the processing time was only around 43% (166sec down to 72sec, and 991sec down to 424sec) of that without Delayed Durability set. Apologies for the poor image quality….

To me that is a huge reduction for such a simple change!

Conclusion

So should you go out and apply this to all your production databases right this second? No, of course you shouldn’t. We applied this change to fix a very specific problem in an isolated environment and were willing to take the hit of losing data if the server crashed. Are you, or more importantly is your company, willing to lose that data? I’m taking an educated guess that the answer will be no, but for certain situations and environments this configuration could prove to be very useful.

Links

Why would you never use SSIS Fast Load…?

We all know that if you want SQL Server to push data into a table then you want to batch the inserts / use a bulk insert mechanism but is there a time when performance isn’t everything?

Background

Although it has its critics, SSIS is a very powerful tool for Extracting, Transforming and ultimately Loading data from and to various systems. I kind of have a love / hate relationship with SSIS, I love it but it seemingly hates me with a passion.

During a recent data migration project we had a series of packages using a stored procedure as the source and a SQL Server table as the destination. The OLE DB Destination task gives you a series of Data Access Mode options, each providing various additional configurations. I won’t delve into all of these, but have a look at the MSDN link provided at the end for further information.

The ones I want to concentrate on are:

  • Table or view
  • Table or view – Fast Load

In short, fast load does exactly what it says on the tin, it loads data fast! This is because it is optimised for bulk inserts which we all know SQL Server thrives on, it isn’t too keen on this row-by-row lark.

Problem

Now, I won’t be providing performance figures showing the difference between running a package in fast load compared to row-by-row; this has been done to death and it is pretty much a given (in most cases) that fast load will outperform row-by-row.

What I do want to bring to your attention is the difference between the two when it comes to redirecting error rows, specifically rows that are truncated. One of the beauties of SSIS is the ability to output rows that fail to import through the error pipeline and push them into an error table, for example. With fast load there is a downside to this: the whole batch will be output even if only 1 row fails. There are ways to handle this, and a tried and tested method is to push those rows into another OLE DB Destination where you can retry them in progressively smaller batches, or simply push that batch through in row-by-row mode to eventually output the 1 error row you want. Take a look at Marco Schreuder’s blog for how this can be done.

One of the issues we have experienced in the past is that any truncation of a column’s data in fast load will not force the package to fail. What? So a package can succeed when in fact the data itself could potentially not be complete!?! Yes, this is certainly the case, so let’s take a quick look at an example.

Truncation with Fast Load

Setup

I have provided a script to set up the tables we can test this with. I will attempt through SSIS to insert data which is both below and above 5 characters in length and show the output.

USE tempdb;
GO

DROP TABLE IF EXISTS dbo.TruncationTest;
DROP TABLE IF EXISTS dbo.TruncationTest_error;

CREATE TABLE dbo.TruncationTest
(
TruncationTestID INT IDENTITY(1,1),
TruncationTestDescription VARCHAR(5)
)
GO

CREATE TABLE dbo.TruncationTest_error
(
TruncationTestID INT,
TruncationTestDescription VARCHAR(1000) --Make sure we capture the full value
)
GO

This code will set up 2 tables, one for us to import into (TruncationTest) and another to capture any error rows that we will output (TruncationTest_error).

I set up a very quick and dirty SSIS package to run a simple select statement to output 3 rows and use the fast load data access mode:

SELECT  ('123') AS TruncationTestDescription UNION ALL
SELECT  ('12345') UNION ALL
SELECT  ('123456789');

The OLE DB Source Editor looks like this:

[Image: OLE DB Source Editor]

the OLE DB Destination data access mode:

[Image: OLE DB Destination Editor – fast load data access mode]

Finally, this is how the package looks:

[Image: data flow with the fast load destination, showing the truncation warning]

Note the truncation warning. This is easy to see when viewing a package in Visual Studio, not so easy to pick up when you are dynamically generating packages using BIML.

Let’s run it……

[Image: package execution succeeded with 3 rows inserted]

Great, 3 rows populated into the TruncationTest table, everything worked fine! So let’s check the data:

SELECT * FROM dbo.TruncationTest

[Image: query results showing the truncated value in row 3]

Eh? What happened there???? Where’s my ‘6789’ gone from row 3???

From this example you can see that the package succeeds without error and it looks as though all rows have migrated entirely, but by querying the data after the package has completed you can see that the description column has indeed been truncated.

Let’s try the same test but changing the Data Access Mode to non-fast load (ie. Row-By-Row)

Truncation with row-by-row

In this example you can see that the row with truncation is in fact pushed out to the error pipeline as you would hope and expect.

[Image: OLE DB Destination Editor – Table or view (row-by-row) data access mode]

[Image: package execution in row-by-row mode with the error row redirected]

We now have 3 rows being processed but one row pushed out to the error pipeline, which is what we would expect and hope for.

Let’s take a look at the output:

SELECT * FROM dbo.TruncationTest ORDER BY TruncationTestID
SELECT TruncationTestDescription FROM TruncationTest_error

[Image: query results – fast load rows (red) and row-by-row rows with the error row captured (green)]

The results highlighted in red are those from the fast load, in green are the results from the row-by-row indicating that the error row was piped out to the error table.

Solution(?)

You have a few different options here:

  1. Not really care and push the data through in fast load and suffer the consequences
  2. Run in row-by-row and suffer the performance hit
  3. Amend the OLE DB Source Output to be the same length as the destination column and redirect error rows from there.
  4. Probably loads of others involving conditional splits, derived columns and/or script tasks
  5. Apply option #1 and make sure that relevant (automated or otherwise) testing is applied

During the recent data migration project we were involved in we chose option #5. The reasons for this are:

  1. We wanted to keep the BIML framework, the code and the relevant mappings as simplistic as possible
  2. Performance was vital….
  3. …..but more importantly was the validity of the data we were migrating

We already had a series of automated tests set up for each package we were running and each table we were migrating, and we added to these a series of additional automated tests to check that no data was being truncated.
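
To give a rough idea of the kind of check involved (a simplified sketch using the demo tables above, not our actual test framework), comparing the longest value loaded against the longest value in the source is enough to flag silent truncation:

--Sketch of a truncation test using the demo data above
DECLARE @MaxSource INT, @MaxLoaded INT;

--Longest value in the source (here just the three demo values)
SELECT @MaxSource = MAX(LEN(v.TruncationTestDescription))
FROM (VALUES ('123'), ('12345'), ('123456789')) AS v (TruncationTestDescription);

--Longest value that actually made it into the destination
SELECT @MaxLoaded = MAX(LEN(t.TruncationTestDescription))
FROM dbo.TruncationTest AS t;

IF @MaxLoaded < @MaxSource
	RAISERROR('Truncation detected: loaded data is shorter than the source data.', 16, 1);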

NOTE: Option #4 was also a very valid choice for us but due to the nature of the mapping between source and destination this was not something that was easily viable to implement.

I will leave how we implemented these tests for another blog post 🙂

Conclusion

Taking a look at the error redirect in the OLE DB Destination we can clearly see that Truncation is greyed out and no option is provided, so I have to assume that it simply isn’t configurable here.

[Image: OLE DB Destination error output configuration with the Truncation column greyed out]

I used to have a link to an article which mentioned that truncation cannot be deemed an error in a bulk import operation via SSIS due to the mechanics of how it all works, but for the life of me I cannot find it :(. I am hoping someone who reads this will be able to provide me with it, but for now I will have to draw my own conclusions. The closest thing I can find is an answer from Koen Verbeeck (b|t) in an MSDN forum question where he states:

The only thing you get is a warning when designing the package.

You get truncation errors when you try to put data longer than the column width in the data flow buffer, i.e. at the source or at transformations, but not at the destination apparently.

What I still don’t understand is why in T-SQL you will get an error when trying to “bulk insert” (loose sense of the term……i.e. using an INSERT….SELECT) data that would be truncated, but SSIS does not. Hopefully someone far cleverer than me will be able to shed some light on this!
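
For comparison, the equivalent INSERT….SELECT against the demo table above fails straight away rather than silently truncating:

--T-SQL refuses to truncate and throws
--Msg 8152: String or binary data would be truncated.
INSERT INTO dbo.TruncationTest (TruncationTestDescription)
SELECT '123456789';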

The idea behind this blog post was not to focus too much on the importance of testing any data that is moved from one place to another, but I wanted to highlight how easy it is to believe that what you are migrating is all fine and dandy because the SSIS package told you so, when in actual fact you could be losing some very, very important data!!

You have been warned 😉

Links

Making sure your Triggers fire when they should — March 4, 2013

As some of you may be aware, triggers are not my favourite thing in the world, but like most things they do have their place.

Whilst onsite with one of my clients, one of their processes fires a trigger on insert which ultimately runs an SSRS subscription to email a report. All sounding fairly feasible so far. However, this process is also used as part of an overnight batch process, which runs a separate insert statement (actually another stored procedure in another job step) instead of the “onDemand” insert. Ok, still doesn’t sound like too much of an issue.

Now, they started experiencing occasional failures of this job during the day, with the error relating to the fact that the SSRS subscription job was being called when it was already running. Interesting; in theory this shouldn’t ever happen because the process ran the jobs either as part of the batch process or as the one-off onDemand.

Stepping through the process led me to an AFTER INSERT trigger. Upon opening it I spotted the issue straight away, something that, as I’ve found over the years as a consultant, a lot of DBAs and developers fail to understand (from MSDN):

These triggers fire when any valid event is fired, regardless of whether or not any table rows are affected. This is by design.

So, the issue was that step 3 ran a procedure which ultimately ran an insert statement for the onDemand insert, and step 4 ran a procedure to insert for the overnight batch process which, as it happens, didn’t have any records to insert but would still fire the trigger and run the SSRS subscription again! There are a number of ways to fix this, but I’ve tended to stick with a basic check of the “inserted” table for results and RETURN out if no records are there to process.

I’ve supplied a bit of test code below for people to try this out.

Let’s create a test table and an audit table:

USE tempdb
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TestTable]') AND type in (N'U'))
DROP TABLE [dbo].[TestTable]
GO
CREATE TABLE [dbo].[TestTable]
(
	TestTableID INT IDENTITY(1,1),
	TestTableDescr VARCHAR(20)
)
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[AuditTrigger]') AND type in (N'U'))
DROP TABLE [dbo].[AuditTrigger]
GO
CREATE TABLE [dbo].[AuditTrigger]
(
	AuditTriggerID INT IDENTITY(1,1),
	AuditTriggerDescr VARCHAR(20),
	DateCreated DATETIME
)
GO

INSERT INTO dbo.TestTable (TestTableDescr)
VALUES ('Test1'), ('Test2'), ('Test3');

SELECT * FROM dbo.TestTable;

Now let’s create the trigger with no checking:

USE [TempDB]
GO

IF  EXISTS (SELECT * FROM sys.triggers WHERE object_id = OBJECT_ID(N'[dbo].[trTestTable]'))
DROP TRIGGER [dbo].[trTestTable]
GO

CREATE TRIGGER [dbo].[trTestTable] ON [dbo].[TestTable]
   AFTER INSERT
AS
BEGIN

	--Log the fact the trigger fired
	INSERT INTO [dbo].[AuditTrigger] (AuditTriggerDescr, DateCreated)
	SELECT 'Trigger Fired', GETDATE()

END
GO

Test Inserting a record that exists:

--Valid Insert
INSERT INTO dbo.TestTable (TestTableDescr)
SELECT TestTableDescr
FROM dbo.TestTable
WHERE TestTableDescr = 'Test1';

SELECT  *
FROM    [dbo].[AuditTrigger];

Test Inserting a record that doesn’t exist:

--Not a Valid Insert
INSERT INTO dbo.TestTable (TestTableDescr)
SELECT TestTableDescr
FROM dbo.TestTable
WHERE TestTableDescr = 'Test4';

SELECT  *
FROM    [dbo].[AuditTrigger];

You’ll now see that there are 2 entries in the AuditTrigger table due to the fact that the trigger fired even though no records were actually valid to insert.

So, let’s amend the trigger to check for valid inserts:

USE [TempDB]
GO

IF  EXISTS (SELECT * FROM sys.triggers WHERE object_id = OBJECT_ID(N'[dbo].[trTestTable]'))
DROP TRIGGER [dbo].[trTestTable]
GO

CREATE TRIGGER [dbo].[trTestTable] ON [dbo].[TestTable]
   AFTER INSERT
AS
BEGIN

	--Check to see if any records were inserted
	IF NOT EXISTS (SELECT 1 FROM INSERTED)
		RETURN 

	--Log the fact the trigger fired
	INSERT INTO [dbo].[AuditTrigger] (AuditTriggerDescr, DateCreated)
	SELECT 'Trigger Fired', GETDATE()

END
GO

and test the inserts again:

Test Inserting a record that exists:

--Valid Insert
INSERT INTO dbo.TestTable (TestTableDescr)
SELECT TestTableDescr
FROM dbo.TestTable
WHERE TestTableDescr = 'Test2';

SELECT  *
FROM    [dbo].[AuditTrigger];

Test Inserting a record that doesn’t exist

--Not a Valid Insert
INSERT INTO dbo.TestTable (TestTableDescr)
SELECT TestTableDescr
FROM dbo.TestTable
WHERE TestTableDescr = 'Test4';

SELECT  *
FROM    [dbo].[AuditTrigger];

No record will have been inserted with the final insert statement!

Let’s clean up our tempdb:

USE [TempDB]
GO

--Clean up
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TestTable]') AND type in (N'U'))
DROP TABLE [dbo].[TestTable]
GO
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[AuditTrigger]') AND type in (N'U'))
DROP TABLE [dbo].[AuditTrigger]
GO

Hopefully this will help point out the misconception that triggers only fire when records are actually inserted 🙂

As per usual, I’d like to hear people’s thoughts/experiences on this topic.

DBCC CheckTable, Spatial Indexes and incorrect compatibility mode….. — November 19, 2012

Just a very quick blog today regarding an issue that has arisen with one of my clients. During Integration it became apparent that one table in particular was failing the weekly consistency checks, with the following error being output:

DBCC results for ‘sys.extended_index_1696529623_384000’.

There are 313423 rows in 1627 pages for object “sys.extended_index_1696529623_384000”.

DBCC results for ‘schema.Table’.

There are 312246 rows in 12192 pages for object “schema.Table”.

Msg 0, Level 11, State 0, Line 0

A severe error occurred on the current command.  The results, if any, should be discarded.

Msg 0, Level 20, State 0, Line 0

A severe error occurred on the current command.  The results, if any, should be discarded.

A bit of background. The server is running SQL Server 2008R2 SP1 CU2 and the database in question is still in compatibility 90 (SQL Server 2005). The table in question has a spatial index on a Geography column.
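
For reference, the weekly check that was failing boils down to something along these lines (object name anonymised, as in the error output above):

--The consistency check that was throwing the access violation
DBCC CHECKTABLE ('schema.Table') WITH ALL_ERRORMSGS, NO_INFOMSGS;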

So, how do we fix this? Well, there are a couple of options:

  1. Change the compatibility to 100 (a one-liner, shown below)
  2. Install SQL Server 2008R2 SP1 CU3…
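
For option 1, the change itself is along these lines (the database name is a placeholder):

--Move the database up to the SQL Server 2008 compatibility level
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 100;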

This is a documented issue (kb 2635827) and the fix can be found on Microsoft’s Support Pages.

FIX: Access violation when you run a DBCC CHECKDB command against a database that contains a table that has a spatial index in SQL Server 2008 or in SQL Server 2008 R2

As to which fix we deploy, well that’s for tomorrow’s fun and games 😉

Implementing a Data Warehouse with Microsoft SQL Server 2012 exam (70-463) – My Thoughts — September 13, 2012

Well, I finally got around to completing the MCSA aspect of the SQL Server 2012 Certification and I’m pleased to say I passed with flying colours. As some of you may be aware, I managed to nab and pass 3 of the Beta exams (70-461, 70-462 and 70-465) back in April and decided to see the MCSE through.

I really wasn’t sure how this exam was going to go as I’ve been working a lot recently with MDS 2012 and SSIS 2008; I revised the new 2012 features but went in with no real expectations. The exam consisted of 55 questions, ranging from multiple guess and “select the 3 things you’d do in order” to a new feature I’ve not seen before: a drag n drop facility on an SSIS control flow, which I thought was nifty.

The area I thought I’d struggle on was DQS but in fact I found that aspect relatively simple; the difficult area for me was the “select the 3 things you’d do in order” questions relating to the new Project Deployment area of SSIS 2012. I’ve done a fair bit of “tinkering” with this over the last few months but it’s obvious I’m not as prolific as I thought, as I found certain questions difficult to get my head around what the answers were suggesting. I obviously did ok in this area (according to the score sheet) but at the time I was sweating a bit.

Anyone wanting hints and tips, I obviously can’t go into detail but I’d definitely brush up on the new features of SSIS 2012!!! 

Oh, and anyone wanting to know, the pass mark is 700 – none of the Beta exams told you this and I know some have said it was actually 800……

Now onto 70-464 – Developing Microsoft SQL Server 2012 Databases, to complete the SQL Server 2012 MCSE certification!!!!

SSIS SCD vs MERGE Statement – Performance Comparison — July 3, 2012

I wouldn’t class myself as an expert in SSIS, but I certainly know my way around, and I came across something today which I thought I’d share. As with a lot of things there are “many ways to skin a cat”, none of which I’ll go into at the moment; what I will concentrate on is updating columns in a table where the data has changed in the source.

One of the projects I’m currently working on requires this very process, and when I set about doing so I created a T-SQL MERGE statement to do the business. However, the question was raised as to why I didn’t use SSIS’s built-in Slowly Changing Dimension (SCD) component. I didn’t really have an answer other than personal preference, so I decided to delve into it a bit further and compare the performance of each method.

As a test, I created a source and a target table, each with an ID and Name column:

USE TempDB;

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'dbo.iSource') AND type in (N'U'))
	DROP TABLE dbo.iSource;

CREATE TABLE dbo.iSource
(
   ID INT,
   Name varchar(100)
);

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'dbo.iTarget') AND type in (N'U'))
	DROP TABLE dbo.iTarget;
	
CREATE TABLE dbo.iTarget
(
   ID INT,
   Name varchar(100)
);

and populated them with some dummy data:

INSERT INTO dbo.iSource (ID,Name)
SELECT TOP 10000
ROW_NUMBER() OVER (ORDER BY t.object_id) AS rownumber
,'Name_'+convert(varchar(4),ROW_NUMBER() OVER (ORDER BY t.object_id))
FROM sys.tables t
CROSS JOIN sys.stats s;

INSERT INTO dbo.iTarget (ID,Name)
SELECT TOP 10000 
ROW_NUMBER() OVER (ORDER BY t.object_id DESC) AS rownumber --Done in descending order
,'Name_'+convert(varchar(4),ROW_NUMBER() OVER (ORDER BY t.object_id))
FROM sys.tables t
CROSS JOIN sys.stats s;

SELECT ID, Name FROM iSource;
SELECT ID, Name FROM iTarget;

So we now have a source and target table with different Names and we’ll look to update the iTarget table with the information coming from iSource.

Method 1 – MERGE Statement

MERGE dbo.iTarget AS target
	USING (
	SELECT ID, Name
	FROM dbo.iSource
	 ) AS  source (ID, Name)
		ON (target.ID = source.ID)
		WHEN MATCHED AND target.Name <> source.Name 
		THEN 
			UPDATE SET Name = source.Name
	 WHEN NOT MATCHED THEN 
		 INSERT (ID, Name)
		 VALUES (source.ID, source.Name); 

Running this method in SSMS for simplicity, Profiler output 2 rows (Batch Starting and Batch Completed), a CPU time of 125ms and a duration of 125ms, and it updated 6678 records. Top stuff, as expected.

Method 2 – SSIS SCD Component

I rebuilt the tables to put them back to where we started and set about creating the same thing with the SCD component, setting ID as the business key and Name as the changing attribute, and not setting inferred members. Below is a screen dump of the outcome of this:

BEFORE:

I cleared down the Profiler trace and ran the SSIS package, and the outcome is quite astounding.

DURING/AFTER:

The Profiler trace output 13456 rows, including 6678 queries like this:

exec sp_executesql N'SELECT [ID], [Name] FROM [dbo].[iTarget] WHERE ([ID]=@P1)',N'@P1 int',8

as well as 6678 rows of queries similar to this:

exec sp_execute 1,'Name_3304',3304

Total Duration of 37 seconds (yes that’s seconds not ms!!)…….and this is on a table of only ~7k rows!

Well I’ll be damned, the SCD component basically runs a cursor, looping through each record, checking for a match on ID and updating that record if one is found. I can’t actually believe that MS have built a component which performs in this way.

So, to answer the question asked, “why didn’t I use SSIS’s built-in Slowly Changing Dimension (SCD) component?”, I now have a definitive answer: it doesn’t perform!

I’m sure SCD has its place but for me, the requirements and the datasets I’m working on I think I’ll stick with MERGE for now….. 🙂

NOTE: This was done on SQL Server 2008R2 Developer Edition running on Windows 7 Ultimate, not sure if SQL Server 2012 has improved the SCD performance but I’ll leave that for another day.

It’s that time of year…..Exceptional DBA Awards 2012 — June 25, 2012

Being a 2011 finalist, I felt I should try and rally all those who truly are exceptional to get their nominations in quick sharp, as the closing date is getting close.

I was lucky enough to be nominated for this award last year and wasn’t going to follow it through as I felt I didn’t really stand a chance, but when I sat and thought about it: if someone is willing to think of you as being exceptional at what you do, enough so to nominate you, then why not, what’s the worst that can happen!!??!!

The level of talent out there is phenomenal and the 4 guys I was up against last year are up there with the best in the world. Don’t let that put you off though; I feel that this award is very much focused towards those in the USA and not many actually make it through to the finals from the UK (Kevan Riley Blog / Twitter and myself I think are the only two!), so I think we need to give a bigger push this year and try and get more than one finalist from the UK 🙂

If you haven’t been nominated by one of your peers then nominate yourself, there’s no rule saying you can’t and in fact Redgate encourage it.

Get entered, the questions answered and cross your fingers!

Good luck!!!!

Querying Microsoft SQL Server 2012 Beta exam (70-461 / 71-461) – My Thoughts — April 13, 2012

Well, I’ve now done the final SQL Server 2012 exam I managed to get a slot booked for. The Querying Microsoft SQL Server 2012 exam was never going to be my strongest subject, as I’m more of a DBA than a Developer, but I felt it went quite well.
The exam consisted of 55 questions, varying in structure from multiple guess to drag n drop. There were only about 5 or 6 questions I left comments about, relating to the content not being clear, typos or, in one instance, an actual mistake in the question, so all in all a better setup than the Administrator exam I took first off (70-462 / 71-462).

The biggest issue I found was down to my own fault: I didn’t revise the syntax of the new 2012 T-SQL functionality. Don’t get me wrong, I know I’ve got a lot of them right, but with some, although I knew the answer was down to 2 of the 4, I didn’t know the syntax well enough to be 100% certain as there was only 1 word different between them, which I’m a bit disappointed with……..but no-one to blame but myself 🙂

I’m still not sure whether the pass mark is 70% or 80% and hoping I’ve answered enough of the non-2012 questions correctly to scrape through.

As always, I’d be interested to hear other people’s thoughts on any of the 2012 exams they’ve taken so far…..

Designing Database Solutions for Microsoft SQL Server 2012 Beta exam (70-465 / 71-465) – My Thoughts — April 6, 2012

After sitting the Administering Microsoft SQL Server 2012 Databases Beta exam (71-462) on Monday, I was still a little disappointed with Microsoft’s approach to questioning for these exams, so I went into this exam with pretty much the same mindset: that the questions were going to be vague and in some cases completely wrong.

Much to my surprise, I found the questions in this exam far, far better. The exam itself was split into sections: 44 questions in total, 26 of them standard multiple choice, plus 5 further scenario-based sections, each with either 3 or 4 questions. Section one was much the same as the 71-462 exam, but I felt the questions were, in the majority, more concise and in my opinion gave enough information to make a valid judgement when answering. I did leave a few comments as there were a few questions that could do with a bit more work and had a couple of typos.

The scenario sections again provided enough information to select the relevant answers. The only criticism of these sections was question 2 of my second scenario, where there was a major typo in the answers which didn’t pry me away from the correct answer but does require sorting.

Now, the main thing that really got me with this exam was the amount of SQL Azure questions in section one. It was not mentioned as a skill measured, so that needs looking into in my opinion: either add it as a skill measured or remove it from the exam.

A much more enjoyable exam than the administration one, mainly due to the higher level of quality in the questioning resulting in far fewer comments being left, and for me, I love the scenario-based questions!!

As always, I’d be interested to hear other people’s thoughts on any of the 2012 exams they’ve taken so far…..

How to output from invoke-sqlcmd to Powershell variable — February 3, 2012

Sorry for another Powershell post, but I’ve been doing a lot of it recently and coming up with (what I think are) a few nifty tricks.

One of the issues I encountered recently was with Kerberos delegation whilst trying to automate Log Shipping. What I was trying to do was use an OPENROWSET query against the Primary and Secondary servers to obtain the Primary_id and Secondary_id to pass to the script to be run on the monitor server. However, seeing as the environment was not set up for Kerberos, I encountered the “double-hop” issue.

Enabling Kerberos delegation for the service account would be too high a risk without thorough testing, so that wasn’t an option in this instance. Instead, I decided to look into using invoke-sqlcmd against each of the servers to get the IDs required and pass them to the monitor script.

So how did I go about doing this, you ask? Well, it’s actually really simple. After a bit of googling I came across this blog by Allen White which gave me a starting block.

Firstly, you have to amend your TSQL script to SELECT the parameter you want to output and use within the rest of the script, something like this:

TSQL snippet to be ran against the Primary Server:

--Cut down version of the script for readability
EXEC @SP_Add_RetCode = master.dbo.sp_add_log_shipping_primary_database 
		@database = N'$(Database)' 
		...
		,@primary_id = @LS_PrimaryId OUTPUT --This is what we want
		,@overwrite = 1 
		,@ignoreremotemonitor = 1 

--Need to output this in order for powershell to take it and use it in the monitor script
SELECT @LS_PrimaryId as LS_PrimaryId 

Do the same for the script to run on the secondary server but obviously for the secondary_id 🙂

So, now you’ve setup the TSQL side of things, you need to then call these from Powershell and assign the output parameter to a Powershell variable like so:


$script = "LogShip_Primary.sql"
$PrimaryID = Invoke-Sqlcmd -InputFile $ScriptLocation$script -Variable Database=$DatabaseName, etc etc etc -ServerInstance $PrimaryServer 

$script = "LogShip_Secondary.sql" 
$SecondaryID = Invoke-Sqlcmd -InputFile $ScriptLocation$script -Variable Database=$DatabaseName, etc etc etc -ServerInstance $SecondaryServer

So, relatively simple. Basically you’re setting the output to a Powershell variable. Keeping things tidy, re-assign it to another variable; something to note is that the output is actually a DataTable object, so make sure you use the name of the alias you used in your last TSQL statement.


# Note: $PID is a read-only automatic variable in PowerShell (the current process ID), so use different names
$PriID = $PrimaryID.LS_PrimaryId
$SecID = $SecondaryID.LS_SecondaryId

Once this is done you can then use these in your script to run against the monitor server:


$script = "LogShip_Monitor.sql" 
Invoke-Sqlcmd -InputFile $ScriptLocation$script -Variable Database=$DatabaseName, etc etc etc, PrimaryID=$PriID, SecondaryID=$SecID -ServerInstance $MonitorServer

And there you have it, nice n simple! All you then have to do is wrap it in a foreach loop for the databases you want to set up and you have a nice and simple automated log shipping build script.

Obviously I’ve omitted a lot of the setup / checking of scripts etc from this post as I don’t want to be doing all the work for you!

Enjoy 🙂