SQLGeordie's Blog

Helping the SQL Server community……where i can!

Docker, Kubernetes and SQL Server on Linux series — June 20, 2018

Docker, Kubernetes and SQL Server on Linux series

DockerBuildShipRunAnywhere

I’ve been playing around with Docker (more specifically with SQL Server on Linux) the beginning of 2017 to see if indeed I can “Build, Ship, Run, Any App Anywhere” and will be putting together a series of how-to’s and issues experienced so you guys don’t spend the countless hours trying to investigate and work out what is going on when things do not work.

I’ll be adding links to this main page as and when new posts are added as a central point of reference.

Microsoft Ignite 2021 – Book of News — March 4, 2021

Microsoft Ignite 2021 – Book of News

Microsoft Ignite 2021 – Book of News

Microsoft Ignite 2021 is currently underway and if, like me, you can’t attend every session you would like to, well Microsoft have this covered with their Book of News.

There’s a ton of new stuff (very little for Data which is my primary focus) but this gives a great overview of all the latest information. Some of the highlights for me are:

Enjoy the rest of conference and if you miss any of it, enjoy the Book of News 😃

Should I split my SQL Server Drives on a SAN in 2021? — February 7, 2021

Should I split my SQL Server Drives on a SAN in 2021?

NOTE: This blog post references HPE as our example but is relevant to other storage vendors out there. I am in no way affiliated with HPE 😉

Back in the day, “when I was a lad“, the recommendation for SQL Server was to split your data, logs and tempdb files onto separate drives/luns to get the most out of your storage. Jump forward to 2021, is this still relevant and should I be splitting my SQL Server drives on to separate luns on new SAN storage? A question which is often asked not just by customers as well as their 3rd party managed service providers / hosting companies. This question can also be along the lines of, “Why can’t we just put everything on a C:\ because the backend is all on the same lun“. This is slightly different as they’re questioning the drive lettering more than creating separate luns but still relevant to this topic.

The majority (not all) of SQL Servers will have a SAN to host its storage and SANs these days are super duper quick, especially those that have tiered SSD or even fancier, flash storage. The technical differences between the older spinning rust and the new fancy dan flash storage is not something we’ll delve into as there’s plenty of other blogs out there and not really the scope of this blog post.

Each storage vendor will (should) provide their own documentation specific to how the SAN should be configured for SQL Server to get the best bang for your buck. Taking HPE as an example, they have pdf’s for their various offerings including 3PAR/Primera as well as Nimble. Although there are some slight differences, each of them suggest that you SHOULD split your drives onto separate volumes.

I won’t disect the documents in their entirety but some of the sections which will help with answering the question but these mostly relate to which performance policy to set for your data, logs and tempdb based on the workload (ie. OLTP / OLAP and size of files):

  1. Array Snapshots and remote replication
    • You may not (won’t) want tempdb as part of this due to its large rate of data change
  2. Volume Block Size
    • According to the documentation, depending on the workload, you may (or may not?) want 8kb for data and 4kb for logs as per their default policy
  3. Caching
  4. Deduplication
  5. Compression
  6. Number of disks available

To provide a great overview, below is a snippet from the HPE Nimble recommendations:

Storage Layout for SQL Server Volumes
In general, using multiple volumes for SQL Server deployments offers several advantages:
The ability to apply custom performance policy settings to specific volumes
• The ability to control cache use for volumes on adaptive flash arrays
• The choice of limiting replication of certain volumes, such as the tempdb volume
• A gain in performance for I/O-intensive databases, especially if the queue depth of a device is at risk of
becoming saturated
• The ability to group databases in separate volume collections and apply custom backup and replication
schedules
• The ability to quickly recover specific databases with volume-based restores
Before designing the storage layout, carefully consider the administrative overhead of managing multiple
volumes. The ideal solution provides a balance between optimization and administration.

Allocation Unit Size

Something that often comes up during these conversations is the configuration of the volume formatting. Regardless of the chosen Performance Policy and indeed volume block size, the default recommendation from HPE is to use 64kb for your data, logs and tempdb. This is a recommendation, only testing for your specific environment will truly give you the answer as to what allocation unit size to set.

Additional Information

Below are some further snippets from the HPE documentation regarding default configurations:

HPE 3PAR/Primera:

Nimble:

Comparison of SSD / tiering uses for SQL Server files:

Conclusion

Should you split your SQL Server drives in 2021? The Nimble documentation gives a sentence which sum’s it up very well:

The ideal solution provides a balance between optimization and administration.

Having everything thrown into a single pool will make life a lot easier for the SAN guy, splitting things up could lead to an administrative nightmare but may be required to get the best out of it for your workload.

What I will say is, review your storage vendor’s documentation / recommendations, compare it with what sort of environment / workload you have and if it fits your specific setup, configure it, test it and use it as a baseline to compare with other configurations.

If testing proves that their guide/recommendations do not provide optimal performance for you environment then it is perfectly fine to deviate from their suggested recommendations, as long as the admin overhead is manageable. All I would add to this is that whatever the configuration decision made is, make sure it is fully documented as to why this has been done as someone several years later will no doubt ask the question as to why 🙂

Docker in 12 Steps – YouTube — December 30, 2020

Docker in 12 Steps – YouTube

I have eventually got around to start tidying up my Youtube channel since I moved it and the process made a proper mess of everything 🤦‍♂️

Want to get started with Docker, containers and even using SQL Server on Linux in just 12 easy , hands on steps (short videos)? If you’ve moved onto this sentence then the answer to the previous question must have been YES!

Have a look, I’ve purposely kept them as short as possible, the total time for all 12 videos is less than 90 minutes so there really is no excuse 😉

SQLGeordie – Docker in 12 Steps

Microsoft Ignite 2020 – Book of News — September 24, 2020

Microsoft Ignite 2020 – Book of News

Microsoft Ignite 2020 is currently underway and if, like me, you can’t attend every session you would like to, well Microsoft have this covered with their Book of News.

There’s a ton of new stuff but this gives a great overview of all the latest information including:

Enjoy the conference and if you miss any of it, enjoy the Book of News 😃

SQL Server / Docker Desktop and WSL2 — August 5, 2020

SQL Server / Docker Desktop and WSL2

This isn’t anything ground breaking but something that is really awesome for those that run SQL Server on Linux using Docker Desktop for Windows, Windows 10 (SDK version 2004, Build 19041 or higher) now ships with Windows Subsystem for Linux (WSL) 2.

I won’t go too much into what this is as you can read the article in the links above but to summarise, this will improve the experience of docker on windows:

  • Improvements in resource consumption
  • Starting up docker daemon is significantly quicker (Docker says 10s as opposed to ~1min previously)
  • Avoid having to maintain both Linux and Windows build scripts
  • Improvements to file system sharing and boot time
  • Allows access to some cool new features for Docker Desktop users.

Some of these are improvements we’ve been crying out for over the last couple of years so in my opinion, they’re a very welcome addition.

In order to get started using WSL2, there’s a couple of steps you need to run through which I’ll try and show below with a few screen shots.

Downloads

Process

Before we can do anything, please make sure that you have the downloads from the link in the previous section

Install Docker Desktop

Nothing to report on here, fairly straight forward click, click click

nstalling Docker Desktop 23.0.4 (46911) 
Docker Desktop 2.3.0.4 
Installation succeeded 
Close

Once installed, depending on what the state of you machine is currently, you may see this message regarding installing WSL2:

Docker Desktop - Install WSL2 
WSL 2 is not installed 
Install WSL using this powershell script (in an administrative powershell) and 
restart your computer before using Docker Desktop: 
Enable-WindowsOptionalFeature -Online -FeatureName 
"Microsoft-Windows-Subsystem-Linuk') 
Use Hyper-V Stop Docker

Simply follow the information and enable the optional feature either using the powershell provided:

Enable-WindowsOptionalFeature -Online -FeatureName $("VirtualMachinePlatform", "Microsoft-Windows-Subsystem-Linux")
PS C: \WINDOWS\system32> Enable-WindowsOptiona1Feature 
-Online 
-FeatureName 
"Virtua1MachineP1atform", "Microsoft-Windows-Subsystem-Linux" 
Path 
nline 
RestartNeeded 
True 
False

Or you can go to the windows features and enable it manually:

Windows Features 
Turn Windows features on or off 
O 
To turn a feature on, select its check box. To turn a feature off, clear its 
check box. A filled box means that only pat of the feature is turned on. 
D TFTP Client 
Z] Virtual Machine Platform 
Windows Hypervisor Platform 
Windows Identity Foundation 3.5 
Z] Windows PowerSheII 2.0 
Windows Process Activation Service 
Windows Projected File System 
Windows Subsystem for Linux 
Windows TIFF Filter 
Work Folders Client 
Cancel

Once you have done this you will be prompted to install the Linux Kernel update package (See downloads). You can reboot before doing this (another reboot may be required after) but I managed to install it and just do a single reboot:

Docker Desktop - Install WSL 2 kernel update 
WSL 2 installation is incomplete. 
The WSL 2 Linux kernel is now installed using a separate MSI update package. 
Please click the link and follow the instructions to install the kernel update: 
https://aka.ms/ws12kernel. 
Press Restart after installing the Linux kernel.
Need to run the Linux kernel update and restart
Windows Subsystem for Linux Update Setup 
Welcome to the Windows Subsystem for Linux 
Update Setup Wizard 
The Setup Wizard will install Windovvs Subsystem for Linux 
Update on your computer. Click Next to contnue or Cancel 
to exit the Setup Wizard.
Run the update executable

Takes about 2 seconds to update

Windows Subsystem for Linux Update Setup 
Completed the Windows Subsystem for Linux 
Update Setup Wizard 
Click the Finish button to exit the Setup Wizard.
Update now complete

Depending on your setup, there may be a couple of additional steps if you follow this link:

Checking

Open the Docker settings and you should now see the option to use the WSL 2 based engine:

Select it and restart docker

If you wish to see what version of WSL you have then you can run the command below in an elevated command prompt:

wsl --list --verbose

I already have the Ubuntu distribution installed so I didn’t have to do this, you may need to install this:

If you click launch then the dstro will start:

Run SQL Server on Linux

docker run 	-e "ACCEPT_EULA=Y" `
				-e "SA_PASSWORD=P@ssword1" `
				--cpus="2" `
				--name SQLLinuxLocal -d `
				-p 1433:1433  `
        		<<Put Your Image Name here>>
docker ps -a

I can’t say too much about the various performance improvements so far but the daemon startup is certainly a lot quicker than previously. Perhaps some testing of the performance is in order and another blog post……. 😜

Links

Free DataSets – Kaggle — July 5, 2020

Free DataSets – Kaggle

Pretty much anything and everything that you would need for a SQL Server, Azure, AWS, GCP, Power BI etc demo dataset can be found on kaggle. The search facility is easy to use and you can also filter on the particular file type and license you want. I also like the “quicklook” option so you can quickly check to see if indeed it is the sort of data you want as opposed to having to download the data, open it and then realise it’s nothing like what you are after.

I never knew a lot about kaggle and to be honest, other than being able to get datasets from it, I never had a need to. Taken from wikipedia, “Kaggle, a subsidiary of Google LLC, is an online community of data scientists and machine learning practitioners. Kaggle allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.

It’s actually no surprise that they’re a subsidiary of Google when you look at most if not all of Google’s datasets, they all seem to link to kaggle….

I’ve included a couple of useful datasets that I’ve used in the past, hopefully you’ll find them useful also:

Back to dataset main page

Free DataSets – Google —

Free DataSets – Google

For most people on the planet with internet access, the first place to visit when looking for anything on the web is Google……….other search engines are available. Not only can you search using the standard search engine but they have various spin offs specific for searching datasets.

Google tracks dateset searches, allows you to explore these and provide the ability to download the data either direct from one of their charts or by providing back links to the site in which it came from. Chances are, kaggle will be one of them which is quite obvious when they are a subsidiary of Google LLC 🙂

There are various places within google to search the datasets, depending on what it is you’re looking for, below are some examples:

Back to dataset main page

Free DataSets…..for SQL Server, Azure, AWS, Power BI etc etc Demo Purposes —

Free DataSets…..for SQL Server, Azure, AWS, Power BI etc etc Demo Purposes

This is just a short overview page of various dataset sources I’ve used in the past for usage in my SQL Server, Azure, AWS and Power BI demo’s. Most if not all of these are free as I don’t like paying for stuff and there are some whereby you can generate your own data if there is something more specific you need……..just be careful as although its “randomly” generated, I have had issues where I pushed the files to github and as it happens, one of the email addresses randomly generated was that of an employee of a company which I’d never heard of. Pure chance (I believe) but something to be mindful of if that is your intention for use.


Kaggle

Pretty much anything and everything that you would need for a SQL Server, Azure, AWS, GCP, Power BI etc demo dataset can be found on kaggle. The search facility is easy to use and you can also filter on the particular file type and license you want. I also like the “quicklook” option so you can quickly check to see if indeed it is the sort of data you want as opposed to having to download the data, open it and then realise it’s nothing like what you are after.


Google

For most people on the planet with internet access, the first place to visit when looking for anything on the web is Google……….other search engines are available. Not only can you search using the standard search engine but they have various spin offs specific for searching datasets.


Power BI Sample DataSets

Not only datasets but also Power BI sample pbix files for anyone looking to get started with it.


GenerateData – Create your own

If there is a specific dataset you are looking for, with specific column names and data types then this is a useful tool. Data is restricted to 100 rows initially unless you sign up / donate but worth a look:


data.gov.uk

At the time of writing there are 54,846 datasets available on here. Some are a bit bizarre (See below) and are specific to the UK so unless you’re looking for that kind of thing, it may not be the place for you.

Other very specific examples (apart from that below) I’ve found are “NHS Bassetlaw CCG Expenditure over £25K” and “Planned road works on the HE road network“………..


Super Bizarre DataSets

I’ve not used any of these but saw this article via Brent Ozar’s weekly newsletter – https://thestacker.com/. Some of them sound pretty cool, others are a bit out there such as: “Abandoned Shopping Trolleys in Bristol Rivers” from data.gov.uk and “The Great British Toilet Map“…….

MSBuild 2020 – Build Book of News — May 20, 2020
How To Fix OBS Black Screen when recording SQL Server Session Demo’s? — April 2, 2020

How To Fix OBS Black Screen when recording SQL Server Session Demo’s?

Everyone loves a demo. When I say everyone, I mean I personally love seeing a demo in a session which is why 99% of any sessions I present will have at least one demo. This is a risky business, especially when dealing with Cloud stuff and conference wifi which is why I always record my demo’s in case they’re ever needed.

Those that know me know that I’m tight and don’t like spending money on stuff that I really need and tend to waste it on cars and watches instead so when it comes to recording I use the FREE opensource software Open Braoadcasting Software (OBS). This software is more than I’d ever need for recording and did I mention, it’s FREE!!

However, there has always been a bit of an issue with getting it to work. Especially for those fortunate to have a separate GPU. My laptop has a NVIDIA Quadro P1000 GPU, not the best out there but it came with the beast and does a job for me. This however is where the issue begins. When you have a separate GPU the software doesn’t know which one to use so when you fire up OBS and choose your Display Capture (ie. screen to record) you will see nothing but a black screen. You can see by the shock on my face that I’ve seen this before but updates seem to reset the changes you need to make (see further down):

Now, the fix used to be very simple, you would:

  1. Open the NVIDIA control panel
  2. Click on “Program Settings” tab
  3. Choose the Open Broadcaster Software in “Select a program to Customise”
  4. Change the preferred graphics processor to “Integrated Graphics”
  5. Click Apply and re-open OBS

However, recent updates to either windows, NVIDIA and/or OBS has meant that this alone no longer works and you will end up tearing your hair out as per image below

There is now an additional step you need to undertake which is to change the OBS exe to run in “Power Saving” mode.

To do this, navigate to Settings–>Graphics Settings and Browse to the OBS exe which can be found at “C:\Program Files\obs-studio\bin\64bit”. Once selected, click Options and choose “Power Saving” for the Graphics Specification as per image below:

Once you have saved this then the screen Capture should now show the screen you are wishing to record you demo’s on and we’re in a happy place :).

As to why this is now required, I really don’t know at the moment and I’m sure in a few months time there’ll be something else which stops it from working.

Hopefully this helps others and if nothing else, will act as a reference guide for when I have to do this again in a few months and will have forgotten the process!