Wednesday, June 1

Generate a GUID/newid() in SSIS 2008

Inexplicably, SSIS doesn't natively include a method for generating new GUIDs in the Derived Column Data Flow Transformation.

The workaround is to create a custom Script Component. In SSIS 2008, we have a choice of using Visual Basic or C# to write the script code; I'm going to demonstrate a C# script.

Let's say we have a Data Flow Source and a Data Flow Destination.
(you can click on these pictures to enlarge):

For the sake of this demo, the only difference between the source and the destination is that the destination has a GUID column, and there's no default constraint to add its own NEWID() on insert. So we need to generate the GUID in SSIS. The usual way we'd add a new column's worth of data is with the Derived Column transformation, but as I've mentioned (and as you've probably found if you're reading this) there isn't a new GUID option.
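To make the setup concrete, here's a sketch of what such a destination table might look like (the table and column names here are made up for illustration, not from the original demo): the GUID column has no DEFAULT NEWID() constraint, so SSIS has to supply the value itself.

```sql
-- Hypothetical destination table for this demo.
-- Note: no DEFAULT NEWID() on the GUID column,
-- so the insert must provide the value.
CREATE TABLE dbo.DemoDestination
(
    ID      int              NOT NULL,
    SomeData varchar(50)     NULL,
    RowGUID uniqueidentifier NOT NULL  -- no default constraint
);
```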

So we add a script component.

Choose the "Transformation" option, hit OK, and drag the Source's green arrow onto the Script Component.

Double click the script component, then "Inputs and Outputs", then expand Output 0 and Add Column. Call the column whatever you like, but I'm going to call it "SQLNinjaGUID" so you can see how it's referenced in the C# code. Change the data type of the column to unique identifier [DT_GUID].

Jump back to the script tab, make sure we're using Microsoft Visual C# 2008 as the ScriptLanguage, Edit Script, and replace ALL of the code with the following.

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Assign a fresh GUID to the new output column for every row
        Row.SQLNinjaGUID = System.Guid.NewGuid();
    }
}


Now when you drag the Script Component's green arrow onto your Data Flow Destination, you should see SQLNinjaGUID (or whatever you called your new column) pop up as an available Input Column in the destination's Mapping tab.

Thursday, May 26

Project Lucy

Quest Software, the kind people behind SQLServerPedia, are running an experiment in cloud-based performance analysis called Project Lucy.

The idea is that you can upload a SQL .trc (Profiler trace) file, and Project Lucy will do some analysis for you - analysing CPU time, durations, IO etc. There are histograms of statement durations, and you can easily filter the trace by picking from drop-downs.

All this is of course possible by uploading your trace file to a database table (or saving the trace output directly to a table) and writing your own SQL queries. However, Project Lucy's aim is to start mining the crowd sourced data to provide comparison between your trace and similar workloads uploaded by the rest of the community.
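The do-it-yourself approach mentioned above can be sketched like this (the file path, table name, and database are placeholders you'd swap for your own):

```sql
-- Load a Profiler trace file into a table for ad-hoc analysis
SELECT *
INTO   dbo.MyTrace
FROM   sys.fn_trace_gettable(N'C:\Traces\MyTrace.trc', DEFAULT);

-- Example: total and average duration per statement.
-- TextData is ntext, so cast it before grouping.
SELECT CAST(TextData AS nvarchar(max)) AS StatementText
     , COUNT(*)       AS Executions
     , SUM(Duration)  AS TotalDuration
     , AVG(Duration)  AS AvgDuration
FROM   dbo.MyTrace
WHERE  TextData IS NOT NULL
GROUP  BY CAST(TextData AS nvarchar(max))
ORDER  BY TotalDuration DESC;
```

Bear in mind that from SQL Server 2005 onwards, Duration in a trace is reported in microseconds, not milliseconds.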

It's an interesting project, and like SQLServerPedia it needs community support to succeed. It's useful now, but to reach its potential it needs feeding with data. I've started uploading traces over the last couple of days, and I'm finding that the analyses available thus far are useful enough to keep you interested while Quest work on extending the functionality. Each trace file you upload is its own "analysis", and all your analyses are saved for later appraisal.

Also, they're giving away a $50 Amazon voucher each day (to US residents only unfortunately), so that's a pretty good incentive to give it a try.

Wednesday, April 27

Get a list of all the Tables in a Database

Script follows, with little to no fanfare!

SELECT schemas.name + '.' + tables.name AS SchemaTableName
FROM sys.tables
INNER JOIN sys.schemas
    ON tables.schema_id = schemas.schema_id
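If you want a more portable version, a roughly equivalent query (a sketch I'm adding here, not from the original post) uses the ANSI-standard views instead of the SQL Server catalog views:

```sql
-- Portable alternative using INFORMATION_SCHEMA
SELECT TABLE_SCHEMA + '.' + TABLE_NAME AS SchemaTableName
FROM   INFORMATION_SCHEMA.TABLES
WHERE  TABLE_TYPE = 'BASE TABLE';  -- exclude views
```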

Friday, February 25

Report Permissions

A reader of this blog asked me this morning: "If you wanted to design a query to make a report for management that would show which AD Groups or users have access to which reports, how would you go about it."

This is what I quickly whipped up:

SELECT Users.UserName
     , Roles.RoleName
     , Catalog.Name AS ReportName
FROM [dbo].[Catalog]
INNER JOIN dbo.PolicyUserRole
    ON   [Catalog].PolicyID = PolicyUserRole.PolicyID
INNER JOIN dbo.Users
    ON   PolicyUserRole.UserID = Users.UserID
INNER JOIN dbo.Roles
    ON   PolicyUserRole.RoleID = Roles.RoleID

Can anyone see any problems with that?

Wednesday, February 9

Microsoft Certified Master - Database Structures

The MCM program is aimed at the uber-uber-uber SQL gurus, and from what I'm reading on various blogs it's no walk in the park for them.

The awesome thing that's popped up out of the newly revised program is that there are a bunch of free training material videos available.

While I'm not personally anywhere near thinking about attempting the MCM certification, I'm interested in hearing about what lies under the covers of SQL Server from people who know it inside out. The absolute nitty gritty: like learning about DNA and mitochondria as opposed to the more systemic/anatomical view that the MCITP type courses cover.

The first video in the series covers database structures: the ways that data is stored and managed on the disk. It's presented by Paul Randal, who spent 9 years working on the storage engine for SQL Server, so the info is really from the horse's mouth.

To paraphrase the summary of the video:

Records - the rows, or "slots" which make up our tables,
Pages - the 8kB chunks where the records live,
Extents - collections of 8 contiguous pages,
Allocation bitmaps - keep an eye on the extents, and;
IAM chains and allocation units - keep track of what's living where.
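If you want to poke at these structures yourself, the undocumented DBCC commands let you look at a table's pages directly (best done on a test database; the database name, table name, and page id below are just examples):

```sql
-- List the data and IAM pages allocated to a table
DBCC IND ('MyTestDB', 'dbo.MyTable', 1);  -- 1 = clustered index

-- Send DBCC PAGE output to the client instead of the error log
DBCC TRACEON (3604);

-- Dump a single page: database, file id, page id, print option 3
DBCC PAGE ('MyTestDB', 1, 153, 3);
```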

If you've a spare 42 minutes, check it out here.

Tuesday, February 8

Formatting SQL Code for Blogs

Up until recently, all the SQL code on this blog looked awful: completely lacking in formatting and unreadable (though still useful!). It looked like this:


Then I discovered The Simple-Talk Code Prettifier.

You pop your formatted SQL code in (copied out of SSMS or Visual Studio), choose the style of HTML (forums in my case), whether to correct indenting, the length of tabs etc., and voila: HTML code is produced which results in something real pretty like the below:


Thanks to the HOBT (Aaron Alton) for blogging about this tool better and sooner than I.

Monday, February 7

Importing an Excel Sheet to a Database Table.

I've always found the SQL Server Import and Export Wizard a great way to pull data out of Excel into SQL Server. Except for when it doesn't work. It can be a frustrating beast, with all kinds of little idiosyncratic things to consider.

However, it's easily avoided for basic imports from Excel. If you just want a table in a SQL database which looks and feels exactly like the source data, this command is your new friend:

SELECT *
INTO dbo.newTableName  -- choose your own destination table name
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0'
    , 'Excel 12.0;Database=excelFilePathName.xlsx'
    , [worksheetName$])

Works a charm, and avoids clicking through a GUI which tends to crash at the slightest problem.
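One thing worth knowing: ad-hoc OPENROWSET queries are disabled by default on many servers, so you may first need to enable them (this is standard sp_configure usage, and requires sysadmin rights):

```sql
-- Enable ad-hoc distributed queries (OPENROWSET / OPENDATASOURCE)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;
```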

If you have any "missing provider" problems, which may happen on new 64 bit SQL servers, download the provider from Microsoft Access Database Engine 2010 Redistributable.