Try-Catch-FAIL

Failure is inevitable.

My day as a data analyst

clock February 10, 2009 09:17 by author Matt

That's right, I have fallen from grace as lead developer and have been reduced to writing mundane SQL to generate spreadsheets for customers.  Fortunately this is just a temporary assignment, but it's a PITA nonetheless. 

I did learn a few useful things today though.  First, SQL sucks.  And second, SQL sucks. Third, if you have a really complicated query that's going to require a lot of nested tables and selects, you can really save your sanity by refactoring things into UDFs.  Yeah, it's a little dirty having UDF helpers for what amounts to an ad-hoc query, but it really makes life simpler.  Consider the following (greatly simplified) example:

Task: Create a report of projects, one line per project.  Each line should contain a single column with all the personnel involved, another with all the organizations involved, and a link to the project.

What we have: Projects are stored in a table.  There is a one-to-many relationship to a table that contains organizations.  Projects are also one-to-many related to documents.  Each document is one-to-many related to a table containing people.  Each document also has a URL.  A project can have a lot of URLs because of this.

The (first) problem: We want to aggregate some of the one-to-many rows into a single field in the SQL output.  Yeah, that would be easy/easier with MySQL and group_concat, but this isn't MySQL, it's SQL Server.  There are ways around it, but I found that it was easier to just create UDFs for exactly what I needed, like so:

   1: --------------------------------------------
   2: -- Gets a string of all the distinct funding
   3: -- organizations for an activity.
   4: --------------------------------------------
   5: ALTER function [dbo].GetFundingOrgs
   6: (
   7:     @activityID uniqueidentifier
   8: )
   9: returns varchar(1000)
  10: as
  11: begin
  12:  
  13: declare @orgs as varchar(1000)
  14:  
  15: SET @orgs = ''
  16:  
  17: SELECT @orgs = @orgs + PrimaryName + ', '
  18: FROM Organizations
  19: INNER JOIN ActivityOrganizationMetadata 
  20:     ON Organizations.OrganizationID = ActivityOrganizationMetadata.Organization
  21: WHERE Activity = @activityID
  22: AND Propogated = 0
  23: GROUP BY PrimaryName
  24:  
  25: --Remove the trailing ', '
  26: IF LEN(@orgs) > 0 
  27:     SELECT @orgs = substring(@orgs, 1, LEN(@orgs)-1)
  28:  
  29: return ( @orgs )
  30:  
  31: end

I could have done something similar using nested CTEs and other hideous garbage, but this works well enough.

The second problem: We want a *single* URL for each activity.  That means we can't just do a join between the projects and documents, because then we get multiple rows for each activity.  Again, UDFs to the rescue:

   1: --------------------------------------------
   2: -- Gets the original URL for the first 
   3: -- document in the activity.
   4: --------------------------------------------
   5: ALTER function [dbo].GetOriginalUrlForActivity
   6: (
   7:     @activityID uniqueidentifier
   8: )
   9: returns varchar(100)
  10: as
  11: begin
  12:  
  13: return
  14: (
  15:     SELECT TOP 1 Url FROM DocumentsToQueries
  16:     INNER JOIN Documents ON DocumentID = Document
  17:     WHERE ActivityID = @activityID
  18:     AND Url LIKE 'http%'
  19: )
  20:  
  21: end;

And like magic, my report works!

Hopefully tomorrow will be back to non-SQL work. 

Share or Bookmark this post…
  • del.icio.us
  • DotNetKicks
  • Digg
  • msdn Social
  • Reddit
  • StumbleUpon

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5


SQL Server Tip: sqlcmd variables

clock September 29, 2008 09:35 by author Matt

Here's a fun little tip you can use in your SQL scripts to make them scriptable via the sqlcmd utility while still supporting SSMS.  Say you want to make the name of your database configurable so you can script out the deployment of your database.  You could do that with a script like this:

   1: SET @dbName = '$(DBNAME)'
   2: ...
   3: EXEC sp_executesql 'CREATE DATABASE [' + @dbName + ']'
   4: ...


But what happens when someone decides to run your script in SSMS?  It will try to create a database named '$(DBNAME)'!  That's not even remotely what we want.

NOTE: Yeah, I know you can make SSMS execute the script in SQLCMD mode, but then it just dies with a 'variable not defined' error, which isn't what we want either.

What we'd like to do is have the script fall back to a default when run in SSMS.  That's easy to do, just add these two lines of code after your initial SET statement:

   1: if @dbName = '$' + '(DBNAME)'
   2:     SET @dbName = 'DefaultName'

All it's really doing is checking to see if @dbName is set to the string that would have been replaced by an actual value had the script been run via sqlcmd.  You do have to break it up into two parts though to prevent sqlcmd from performing substitution there, otherwise your sqlcmd script will always use the default name instead of whatever was passed in to it.
Share or Bookmark this post…
  • del.icio.us
  • DotNetKicks
  • Digg
  • msdn Social
  • Reddit
  • StumbleUpon

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5


About Matt

I am an overworked (and apparently overpaid) software developer with aspirations of acquiring a PhD in Computer Science. I started off coding in C over a decade ago.  Since then, I've migrated from C to C++ and branched out to C#, PHP, VB.NET, JavaScript, and worked with a wide assortment of other languages that I hope to never deal with again (I'm looking at you, COBOL). Oh, and yes, I've written some Java.  Does that make me a bad person?

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in  anyway.

© Copyright 2009

Sign in