The Nested Set Model++

This time we talk about adding a Depth field, and good ol’ CrUD ops – Create, Update, Delete.

Since my first post on this topic got a lot of attention and traction, I felt it appropriate to expand on the topic a bit, even if it’s been largely covered by other bloggers in the past.  I’ve also found it very useful to have a “depth” field, which isn’t canonically part of the model (hence the “++” in the title!), but is quite handy not only for display purposes (while you’re querying & testing the thing), but also for making certain “get” ops easier.  Sure, it adds a wee bit more to structural maintenance, but since that’s already the most complicated part of the model anyway, it’s hardly worth a second thought.  So let’s dive in!

The big topic last time was this operation of “move a subtree” — of course, sometimes you’re just moving one node, but only if it’s a leaf; otherwise you’re moving a node and all its descendants, so I’ve kept the procedure name MoveCatSubtree intact.  This time we’ll talk about good ol’ CrUD ops – Create, Update, Delete.  In my implementation, I chose to handle these with table triggers.  Some would argue in favor of stored-procs, and while that would seem “more consistent” with the precedent set, I’d counter with 2 points:

  1. To be really fool-proof, you’ll need to prevent ungoverned inserts/updates/deletes anyway; you could either do this with GRANT/DENY permissions, or triggers.  Permissions would be more complex because you’d still need your users to be able to exec the CrUD procs, so you’d end up using some convoluted security mechanisms that can be tricky to maintain over time.
  2. With triggers, we can allow the consumers of the data (apps, users) to continue to use “plain-ol’-TSQL” to access and manipulate the data, instead of having to remember stored-proc names and hunt for documentation on them.  (The exception being, of course, MoveCatSubtree, which, honestly, could be integrated into the insert trigger, but I’ll leave that as an exercise to the reader!)

Again, yes, we could easily do the same implementations in stored-proc form, and you’re welcome to fork my GitHub repo if you feel like exploring that.

Let’s outline the steps and draw some pictures.

1. Insert: Make a hole!

When we INSERT a node, we want to specify its parent and a name, and let the triggers do the rest!  We place it at the right of its siblings-to-be, and update the position values of all nodes to the right so that everything stays kosher.  This should sound familiar — it’s essentially that “make a gap” part of the subtree-move op.  In terms of depth, we just +1 to the parent’s.

nsm-cats-insert-gadget-under-stripes
Stripes is a breeder; Gadget comes in and makes Fluffy & children move over.

Also, for some reason, our cats reproduce asexually…
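For the curious, here’s roughly what that looks like in trigger form: a minimal sketch, assuming single-row inserts and the names from my repo (nsm.Cat, CatID, ParentID, PLeft, PRight, Depth); the real version needs multi-row handling & error checks.

-- Sketch of the INSERT trigger's "make a gap" logic (single-row inserts assumed)
CREATE TRIGGER nsm.trCatInsert ON nsm.Cat
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @ParentID int, @ParentRight int, @ParentDepth int;
    SELECT @ParentID = ParentID FROM inserted;
    SELECT @ParentRight = PRight, @ParentDepth = Depth
    FROM nsm.Cat WHERE CatID = @ParentID;

    -- Make a gap: shift everything at/right of the parent's PRight over by 2
    UPDATE nsm.Cat
    SET PLeft  = (CASE WHEN PLeft  >  @ParentRight THEN PLeft  + 2 ELSE PLeft  END)
      , PRight = (CASE WHEN PRight >= @ParentRight THEN PRight + 2 ELSE PRight END)
    WHERE PRight >= @ParentRight
      AND CatID NOT IN (SELECT CatID FROM inserted);

    -- Drop the new node into the gap, to the right of its new siblings,
    -- one level below its parent
    UPDATE cat
    SET PLeft = @ParentRight, PRight = @ParentRight + 1
      , Depth = @ParentDepth + 1
    FROM nsm.Cat cat
    JOIN inserted i ON i.CatID = cat.CatID;
END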


2. Delete: Think of the children!

Similarly, to DELETE a node, we want to “close the gap” left by said deleted node.  But what of the children?  We don’t want to leave any orphans behind!  So we “promote” the children of our deleted node to the level (depth) of their parent, sandwiching them in between the deleted node’s siblings (aka their former aunts/uncles!).  This is easier than it sounds.

nsm-cast-delete-fluffy
He killed Fluffy!

Fluffy is survived by his children, who are now for some reason his siblings, and are very confused by their sudden increase in age & status.
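Again in sketch form; note I’m using an INSTEAD OF trigger here so we can re-parent the children before the row actually disappears (assuming a self-referencing FK on ParentID), and assuming single-row deletes:

-- Sketch of the DELETE trigger's "promote the children & close the gap" logic
CREATE TRIGGER nsm.trCatDelete ON nsm.Cat
INSTEAD OF DELETE
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @CatID int, @Left int, @Right int, @ParentID int;
    SELECT @CatID = CatID, @Left = PLeft, @Right = PRight, @ParentID = ParentID
    FROM deleted;

    -- Re-parent the direct children to their former grandparent
    UPDATE nsm.Cat SET ParentID = @ParentID WHERE ParentID = @CatID;

    -- Now the node itself can actually go away
    DELETE nsm.Cat WHERE CatID = @CatID;

    -- Promote all descendants: one level shallower, shifted left by 1
    UPDATE nsm.Cat
    SET PLeft = PLeft - 1, PRight = PRight - 1, Depth = Depth - 1
    WHERE PLeft > @Left AND PRight < @Right;

    -- Close the remaining gap of 2 for everything to the right
    UPDATE nsm.Cat
    SET PLeft  = (CASE WHEN PLeft  > @Right THEN PLeft  - 2 ELSE PLeft  END)
      , PRight = (CASE WHEN PRight > @Right THEN PRight - 2 ELSE PRight END)
    WHERE PRight > @Right;
END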

3. Update: Rename; everything else is encapsulated.

Finally, we only allow UPDATEs on the Name, because everything else (position values, depth, parent) is structural, and encapsulated by our tree maintenance logic.  Moving a node or subtree?  MoveCatSubtree.  Swapping positions with another node?  SwapCatNode (TBD!).
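The guard for that can be as simple as an UPDATE trigger that checks which columns were touched.  A sketch (the maintenance logic itself would need a bypass, e.g. a session flag via CONTEXT_INFO, which I’m glossing over here):

-- Sketch: reject direct UPDATEs to the structural columns
CREATE TRIGGER nsm.trCatUpdate ON nsm.Cat
AFTER UPDATE
AS
BEGIN
    IF UPDATE(PLeft) OR UPDATE(PRight) OR UPDATE(ParentID) OR UPDATE(Depth)
    BEGIN
        RAISERROR('Structural columns are maintained by the tree logic; update Name only.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END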

4. Depth: Set it once, & encapsulate it!

Depth is pretty simple to add if you’ve already got a tree full of data.  We can use a recursive common table expression, or “rCTE”.  While normally these are frown-worthy (remember, recursion is not SQL’s strong suit), we’re only using it one time to populate an existing data-set, so we can keep on smiling.

;WITH CatTree AS
(
    -- Anchor: root node(s) start at depth 0
    SELECT CatID, ParentID, Name, PLeft, PRight, Depth = 0
    FROM nsm.Cat
    WHERE ParentID IS NULL
  UNION ALL
    -- Recurse: each child is 1 deeper than its parent
    SELECT cat.CatID, cat.ParentID, cat.Name
        , cat.PLeft, cat.PRight, Depth = tree.Depth + 1
    FROM CatTree tree
    JOIN nsm.Cat cat
        ON cat.ParentID = tree.CatID
)
UPDATE cat SET cat.Depth = CatTree.Depth
FROM CatTree
JOIN nsm.Cat cat
    ON cat.CatID = CatTree.CatID

The last order of business (for now) is to add Depth support to our MoveCatSubtree method.  As illustrated below, we have to move the subtree “up” or “down” in Depth depending on its new parent’s position relative to its old position.  The details are, of course, in the GitHub repo, but here’s a quick snippet of what that looks like: NodeNewDepth = /*NodeCurrent*/Depth + (@NewParentDepth - @SubtreeOldDepth) + 1  (where @SubtreeOldDepth is the depth of the top node of the moving subtree.)

nsm-cast-move-jack-to-mittens
Move Jack to under Mittens; I won’t repeat the Left/Right logic, just note the Depth logic.
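In UPDATE form, that boils down to something like this (a sketch; @SubtreeOldLeft & @SubtreeOldRight are my stand-ins for the moving subtree’s original bounds):

-- Shift the whole moving subtree up or down in Depth, all at once
UPDATE nsm.Cat
SET Depth = Depth + (@NewParentDepth - @SubtreeOldDepth) + 1
WHERE PLeft >= @SubtreeOldLeft
  AND PRight <= @SubtreeOldRight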


In a future little addendum, I’ll briefly go over the “get” queries and that TBD SwapCatNode method.  For now, enjoy the cats (again)!  Thanks for sticking around, I know it’s been a few more weeks than normal.

PS: A big thank-you to the dudes in the CodingBlocks #blogging Slack channel for their encouragement and motivation to get this done!  You guys rock.  Check out their blogs for some terrific content: http://dotnetcore.gaprogman.com/ , http://www.codeshare.co.uk/ , http://thereactionary.net/ .

Update:

I’d like to point future readers at two very informative articles for those interested in deep-diving down the hierarchical rabbit-hole: Aaron Bertrand, and Jeff Moden.  There are many more tweaks and enhancements that can be made to the “classical” Nested Set model, which those lucky Devs/DBAs who are in a position to actually [re]implement their hierarchies will want to read about and take advantage of.

The Nested Set Model

The #1 rule of the Nested Set Model is: FAST READs. The #2 rule of the Nested Set Model is: see #1

There are definitely several articles out there which cover the SQL implementation of the Nested Set Model, aka “modified preorder tree traversal” (which is more the name of the algorithm by which you traverse the tree, rather than the structure itself).  But I found it interesting enough, and more importantly, applicable enough to my job experience, that I feel it deserves some treatment.  Not the basic “how to”, but more an example of a particular operation and a specific pitfall to avoid. (Jump straight to the example diagrams.)

Now, we’re not going to debate about whether this model is “the best” representation of hierarchical data in an RDBMS (some argue that Closure Tables, aka “Ancestor Tables”, or some kind of hybrid approach is better, and I’d probably agree).  The fact is, sometimes (read: almost always) as a DBA/DBDev, you’re “stuck with” an existing database in a legacy application environment that you pretty much can’t change — or if you can, changes need to be small, incremental, and non-disruptive.

Okay, with that disclaimer out of the way, let’s dive in.  First things first:

The #1 rule of implementing the Nested Set Model is: FAST READs.

I can’t stress that enough.  Fast SELECTs.  Everything else pales in comparison.  In other words, we don’t care how long and painful and slow write operations are against this table (updates, inserts, deletes), as long as our SELECTs remain super speedy.  If that is not your use-case, consider a different model.

The #2 rule of the Nested Set Model is: see #1

Moving on…

The #3 rule is: encapsulate tree operations to maintain its integrity & structure.

Put another way, the #3 rule is that you should always operate on the tree (CrUD ops) using stored-procedures and/or triggers that encapsulate all the nitty-gritty details of maintaining the correct position values during said insert/update/delete operations.  Of course, somebody is responsible for writing those stored-procs.  Any volunteers?  Easy now, don’t raise your hands all at once!  Generally, this responsibility falls to the DBA(s) or DBDev(s).

The problem at-hand, in my current situation, was that of “moving a sub-tree”, i.e. taking a node and all its descendants, and moving it to place it under another “parent” node.  In some models, and/or in some languages, this is a simple recursive operation.  However, SQL is not spectacular at recursion — after all, we’re working in a relational engine — so let’s try to play to its strengths:

namely, SET-BASED operations!

A previous DBDev had written a stored-proc for just such an operation.  However, as (somewhat) expected, it was horribly slow, to the tune of hours of run-time.  This is not acceptable, even given the #1 rule stated above.

Well it turns out that most of it was pretty efficient, but the last step, in which they attempted to “fix” the left/right values in the entire table “just to make sure we didn’t leave any gaps”, was, frankly, quite silly.  Because the only “gaps” you create are created by the previous steps in the proc, and you know exactly how big that gap is (the width of the subtree you’re moving), and where it is, so you should be able to target that specific area of the tree and close the gap more intelligently, using some simple math. (addition and subtraction — the simplest math there is!)

Doing that improved the performance of the whole proc by a factor of 10.  That’s huge.  Or, “yuuuuge“.

So let’s get specific.  As you’ll see from my diagrams, the model actually is a hybrid, combining an Adjacency List (each record knows its “parent”) with a Nested Set (each record has a “left” & “right” position value).  We do this for two big reasons.  First, having the parent relationship along with the position values makes all that nasty book-keeping (rule #3) a bit easier to manage (and to check our work).  And second, because, conveniently, we can store the data from both models in one table.
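Concretely, the table might look something like this: a simplified sketch of the DDL (the GitHub repo has the real thing).

-- Hybrid: Adjacency List (ParentID) + Nested Set (PLeft/PRight) in one table
CREATE TABLE nsm.Cat
(
    CatID    int IDENTITY(1,1) NOT NULL PRIMARY KEY
  , ParentID int NULL REFERENCES nsm.Cat (CatID)  -- Adjacency List: who's my parent?
  , Name     varchar(100) NOT NULL
  , PLeft    int NOT NULL   -- Nested Set: left position value
  , PRight   int NOT NULL   -- Nested Set: right position value
  , Depth    int NULL       -- the "++" addition covered in the post above
);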

On to the examples!

First, we have our tree of Cats.

cat-tree-1
Or, as a coincidentally cute table alias, CatTree

Now, we want to move Jack & his children to become descendants of Mittens (Jack being the child, Smush & Smash being grandchildren).  So we start by “making a gap” of the subtree’s “width” (6, the distance between Jack’s PLeft and PRight inclusive of end-points).  We add that amount to all PRight values >= Mittens’ original PRight,  and add it to all PLeft values > Mittens’ PRight — see the blue #s in diagram below, and code here:

-- Make a gap of @SubtreeSize positions at the new parent's PRight
UPDATE Cats
SET PLeft = (CASE WHEN PLeft > @NewParentRight
             THEN PLeft + @SubtreeSize
             ELSE PLeft END)
  , PRight = (CASE WHEN PRight >= @NewParentRight
             THEN PRight + @SubtreeSize
             ELSE PRight END)
WHERE PRight >= @NewParentRight

The red values haven’t changed (yet) but are now wrong, so we’ll have to fix them next.  And of course the green values are the moved subtree’s new positions based on the new parent’s (Mittens) PLeft.

cat-tree-2
Jack is now Mittens’ child.

Finally, now that we’ve moved Jack & his children under Mittens, we need to “close the gaps” that we created at first, to make sure that the tree’s position values remain contiguous.  This isn’t as difficult as it sounds: if we’ve stored Jack’s original PRight value (10), we can use that as a cutoff to subtract the subtree width from higher position values and intelligently (and quickly) close the gaps we created before.  Again, code & diagram:

--Notice this looks very similar to the previous
--code snippet! (We're basically doing the reverse)
UPDATE Cats
SET PLeft = (CASE WHEN PLeft > @SubtreeOldRight
             THEN PLeft - @SubtreeSize
             ELSE PLeft END)
  , PRight = (CASE WHEN PRight >= @SubtreeOldRight
             THEN PRight - @SubtreeSize
             ELSE PRight END)
WHERE PRight >= @SubtreeOldRight

cat-tree-3
Red values indicate “closing the gap” that was created by removing the subtree of Jack. Blue values indicate the incidental gap closures for the rest of the tree (above and right). Green values, you’ll notice, are “reverted” (i.e. same as they were originally).

SQL-wise, this should translate pretty well.  I’ve posted the setup and stored-proc scripts to GitHub, so the discerning reader can review and offer feedback.  In theory, there’s probably a way to exclude the green reverted values from the first pass operation (gap-making) so that we don’t have to revert them (at gap-closing), but again, since we’re doing SQL set-based operations, it seems hardly worth the effort — i.e. the potential speed gain would be outweighed by the logical/maintenance complexity.


So what’s the lesson here?  Well hopefully, if you’re “stuck with” a SQL DB with a Nested Set Model table containing a hierarchical tree of data, you don’t have to completely re-invent the wheel and write your CrUD ops from scratch.  But if your predecessors didn’t plan for certain kinds of operations, and this “move a subtree to a new parent” happens to be one of those, this should help you (re)implement it efficiently.

I’d love to get some feedback on this.  Let me know if I’ve missed anything conceptually, if there are better ways or methods to doing any of this, or any other tips & tricks that folks might have for dealing with such data.  Leave me a comment!

[footnote 1]
The root of the problem, in this case, was simply taking the code from a slideshare presentation and copy-pasting it into the routine without analyzing its effectiveness and efficiency.  It proposed re-calculating the position values after a move, across the entire tree, by using a triple-cartesian-product (or cross-join) to “get the count of nodes to the left/right of each node” for every node, which should sound dirty even as you say it silently in your head, let alone attempt to write it in query form!

[footnote 2]
There’s a 3rd model that we could consider storing in the same table, called “Enumerated Path” or “Materialized Path” or “Breadcrumbs”, which may look good on paper and to your human eyeballs, but breaks down spectacularly when you start talking performance and scale — but to be fair, so do most of these models, eventually, in one way or another, which is why we’ve invented fantastic alternative technologies to address these problems… and frankly, if you’re using all 3 models at once, you’re #DoingItWrong, creating a veritable maintenance nightmare for yourself and everyone around you.  Note that the elusive 4th model, the Ancestor Table, requires (as the name would imply) another table — not an argument for or against anything, just an observation.

PS: Happy 2017!

Views & Mismatched Datatypes

If you’ve ever wondered “who would win in a fight” amongst SQL datatypes…

Allow me a short revisit to the previous topic, because this example came from “real life” in my actual job in a production environment, and I wanted to share.  And remember, take this with a grain of salt — there are very few absolutes, a DBA’s favorite answer is “it depends”, and even when we spout rules or decrees, there are usually some exceptions.

Now, let’s start at the top.  T-SQL is a strongly typed language.  SQL Server, like most RDBMSs, counts datatypes as a foundational element in its ecosystem.  Things like bit, int, datetime, varchar (aka string), etc.  We also have this concept of NULL.  Any element of any datatype can be NULL, which, coincidentally, means a bit field can actually be 3-valued instead of just binary; BUT, that’s not the topic of this post.  However, it does involve a bit field.  And, as the title says, a VIEW.

First, the larger problem, or “what these little oversights / mis-steps can lead to”, as blogged by someone much smarter than me: Jonathan Kehayias – Implicit Conversions.

Now the diagram:

category, categorygroup, view-category-by-group
In English, the view’s DisplayInMenu field is defined as “If Category’s IsRoot bit is false, always use False; otherwise, use the Group’s DisplayInMenu bit.”  In even plainer English, it means we only want to “Display this Category in the Menu if it’s a ‘Root’ Category and it’s a member of a Group that gets displayed in the Menu.”

Do you see the problem?  It’s not so obvious if you’re not thinking about datatypes, but look closely.  Yes, our view’s DisplayInMenu field is a different datatype than its base field (origin).  That’s because the result of the CASE expression that defines it is “promoted” to the int type, instead of remaining a bit.  The same would be true of a COALESCE or ISNULL expression.  This is an example of datatype precedence.  If you’ve ever wondered “who would win in a fight?” amongst the SQL datatypes, Microsoft has the answer right there in black & white.
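Here’s a stripped-down repro sketch (hypothetical simplified tables; the production schema is of course more involved):

-- The CASE mixes a bit column with the int literal 0, so datatype
-- precedence promotes the view's DisplayInMenu column to int
CREATE TABLE dbo.Category (CategoryID int, IsRoot bit, GroupID int);
CREATE TABLE dbo.CategoryGroup (GroupID int, DisplayInMenu bit);
GO
CREATE VIEW dbo.vwCategoryByGroup AS
SELECT c.CategoryID
     , DisplayInMenu = CASE WHEN c.IsRoot = 0 THEN 0
                            ELSE g.DisplayInMenu END
FROM dbo.Category c
JOIN dbo.CategoryGroup g ON g.GroupID = c.GroupID;
GO
-- The fix, if/when you're able to make it: keep the column a bit
--   DisplayInMenu = CAST(CASE WHEN c.IsRoot = 0 THEN 0
--                             ELSE g.DisplayInMenu END AS bit)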

This isn’t necessarily a problem, all by its lonesome.  So why did it bite us in the proverbial behind?  Two reasons.  First, query & usage patterns: vwCategoryByGroup.DisplayInMenu happens to be an important field in our queries, and usually it’s compared against a bit parameter or variable that’s passed down from the app’s lower tier.  Sometimes it’s even JOINed to another column, say, GroupMenuProperty.DisplayInMenu — which is, of course, a bit.  But because it’s an int in the view, SQL is doing extra work every time to implicitly convert those bits to ints so that each side of the comparison operation is of the same type.  And again, not always, but sometimes, this causes performance problems.

ms-sql-stop-making-me-work-so-hard
I’m tired! Can’t you see I’ve been implicitly converting these datatypes all day?

The second reason is, admittedly, a bit beyond my understanding, so I’ll just explain the symptoms and leave the technical details to the more inquisitive minds.  Basically, during a performance crisis, one of the measures we took was to “fix” the view by turning DisplayInMenu back into a bit.  However, we found that it literally broke the dependent .NET application sitting on top, which started throwing exceptions of the “invalid cast” flavor.  I believe it’s using Entity Framework.  Thanks to a helpful tip from the Brent Ozar Office Hours webcast, I found out that it’s because the EF entity mapped to this view had that field (property?) defined as an int, and in order to make it “see” the datatype change, the code itself would need to be changed, i.e. that property would need to be defined as a bit.  Yay learning!

the-more-you-know
Also, did you know that EF really isn’t all that bad? 😉

So, dear reader, be conscious of your decisions with datatypes, especially when it comes to views with superficial computed columns.  But more to the point, beware of implicit conversions and mis-matched datatypes.  They can be, at best, a source of technical debt; at worst, a silent killer.

Til next time!

Views – the Good, the Bad, & the Lazy

If your VIEW has a dependency tree more than 2 levels deep, you’re doing it wrong.

Let’s talk for a minute about views.  No, not like scenery.  Not like my family (they’re good peeps).  I mean SQL views – i.e. CREATE VIEW vwFooBar which combines commonly used fields (columns) from tables Foo and Bar into a single entity, which datavelopers (term borrowed from Richie Rump, who has an awesome name btw) can then use & abuse without having to know the details of and relationships between those two tables.

So far, this sounds like a great idea, right?  Encapsulation & reusability are great principles in IT/Dev land.  And I agree!  Views have their place, for sure.  But for some reason, I find myself constantly deconstructing them — refactoring complex queries & stored-procedures that use them — into their base tables (in the above example, that’s Foo and Bar).  And I find myself asking the proverbial WHY?  Why is this such a common misstep of devs & query writers, that the DBA has to spend such time “fixing” the issues this can cause?  Naturally, that train of thought spawned a blog post.

Let’s dive in!

Organization, Customer, Order: vwCustomerOrder
A very simple example of a 3-table view.

So, somebody created a nice high-level view for us that summarizes Customer, Organization, and Order data into a single entity that should prove pretty handy for things like reports, ad-hoc analysis, and maybe even a display-grid on a page somewhere.  And the happy developers don’t need to care about the foreign keys between the tables or how to JOIN them properly.

But!  Do we see a problem yet?  Yes, our vwCustomerOrder doesn’t include our Customer’s Email.  Well what if we need that in our code?  Do we go through change-management procedures and get the column added to the view?  Do we JOIN the view to the Customer table again just to get the Email?  Neither of those options is ideal.  The latter means that now we’re referencing the Customer table twice; the former involves refactoring, and is even more difficult if the view is an indexed view.

Okay, well let’s say we’ve added Email to the view, and it’s working fine.  Now let’s add another layer of complexity.  Say Customer is actually a VIEW, which consists of base-tables CustomerHeader and CustomerContact, where the latter stores a collection of contact entries for each Customer.  Now, vwCustomerOrder is thus a 2-level nested view.  These aren’t really super fun, but they’re not the most offensive thing in the database.  But again, what happens when we need something from CustomerContact that’s not already in our “master” top-level view vwCustomerOrder?  Maybe the primary Phone1, for example.  So to build that query, we now have to combine our 2-level nested view with a redundant table (i.e. a table that’s already in the view)!  But because it’s so terribly important to some reporting modules or some code bits, it becomes a permanent fixture of the schema.  And on it goes.

Expanded example from earlier:

organization-customer-contact-order-views
Because vwCustomer doesn’t include Phone2, if we built vwCustomerOrder on top of it, we’d have to JOIN again to CustomerContact, when we know it’s already included in vwCustomer. Thus, it’s smarter to build vwCustomerOrder from the base-tables and avoid that double-join (duplicate reference).
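In code, the flattened version looks something like this (a sketch with simplified keys & columns, based on the diagrams):

-- Reference each base table exactly once; no view-on-view nesting
CREATE VIEW dbo.vwCustomerOrder AS
SELECT o.OrderID, o.OrderDate
     , ch.CustomerID, ch.CustomerName
     , cc.Email, cc.Phone1, cc.Phone2
     , org.OrganizationName
FROM dbo.[Order] o
JOIN dbo.CustomerHeader ch ON ch.CustomerID = o.CustomerID
JOIN dbo.CustomerContact cc ON cc.CustomerID = ch.CustomerID
JOIN dbo.Organization org ON org.OrganizationID = ch.OrganizationID;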

This is the problem with VIEWs that are grown organically, re-actively, without careful attention to the underlying schema and dependencies.  And look, I get it.  Mature software systems DO evolve organically; iteration is the name of the game, and that goes for the database too, not just the app code.  But let’s agree that, like the boyscout principle espoused earlier in my ode to Clean Code, we can try to leave things a little better than how we found them.  DB refactoring is … not necessarily harder, but definitely different (than app code refactoring).  You probably have to jump thru more hoops to get the changes pushed to production.  Regardless, it’s worthwhile.  Why?  Glad you asked!

Nested views are icky for a couple reasons.  First, they’re more difficult to troubleshoot – not because they annoy the snot out of your DBA, but because they legitimately make index tuning more tedious.  Second, laziness begets more laziness – when you’ve got these views, you’re automatically tempted to simply use, re-use, & abuse them, instead of taking the time to analyze their effectiveness and refactor, replace, or reevaluate the problem.  Rampant ill-constructed views can lead to some serious technical debt.  And…

technical-debt-is-bad-mmkay

There’s another nugget of wisdom in software development that’s appropriate here: Less is More.  Your views should address a specific need or use-case, do it well, and call it a day.  Feature-creep is not a goal; views that try to be many things to many people ultimately fail at it, because you end up committing sins against set-theory & the relational model in favor of “oh just one more thing!”  While the purpose of views is ultimately abstraction, we should not forget the underlying parts — tables, indexes, statistics, schema relationships.  In fact, a good view can help us construct the right indexes and visualize the relationships.  Whereas a BAD view will obfuscate indexing needs and confuse (or even outright lie about) the base relationships.

Venn-diagram time.  We’ll use the circle areas to represent that “scope” of each view, meaning the base-tables that comprise it.  In a healthy schema, there will be mostly independent views with a few overlaps in small portions.  These views “do one thing well”, with little redundancy and hardly any nesting.  Conversely, in a haphazard ill-managed schema, there is lots of overlap and redundancy, multi-level nesting, and ugly colors (who the heck likes brown, anyway?).

views-family-good
Views in happy harmony & balanced dependence vs. independence.
views-family-bad
Views in sad land with ugly brown and black overlapping areas. (I was going to try for puce or burnt-umber but Paint is sadly not up to Crayola 64-pack standards.)

So, dear reader, what’s the point?  (I feel like I rhetorically ask that fairly often.)  Build your SQL views with targeted use-cases and clear purpose.  Avoid the laziness trap and the temptation to tack-on “one more thing”.  A view should be a concrete, concise, abstracted representation of the underlying tables & relationships.  Evaluate existing views for how well they meet the developer needs, and don’t be afraid to deprecate, drop, or rewrite bad code.  And above all, stop the insane nesting.

If your view has a dependency tree more than 2 levels deep, you’re doing it wrong.

Or, more visually…

ridiculous-swiss-army-knife
This is not a useful tool, despite the sheer volume of functionality.

That’s all for this week!  Thanks for reading, and apologies for the slightly longer delay between posts.

SQL Server for the IT Helpdesk, part 2

Part 2, as promised.  We were talking about the components of SQL Server.  So far we hit “the big ones”: the database engine sqlservr.exe, the data files (mdf, ldf), and the agent sqlagent.exe.

Thirdly, less of a “component” and more of a “tool-set”, the venerated SSMS – SQL Server Management Studio. (Formerly known as SQL Server Enterprise Manager, if you’ve been around for a long time.) SSMS is the “stock” application interface for SQL Server, the DBA’s basic swiss army knife – all of the essential functionality, a couple odd-ball modules here & there, and no “fluff”. It’s not the prettiest, nor the sexiest, but it gets the job done. It even has some basic monitoring capability; as an infrastructure person, you would most likely use it to check up on DB statuses, Agent Job runs, and perhaps the occasional “everything is slow!” panic moment by peeking at the Activity Monitor (right-click on a server instance).

swiss-army-knife
Just the essentials!

SSMS is powerful, and with great power comes great responsibility. Your DBA should be practicing good security principles and granting limited access to all but the most trusted, experienced team members, because that top tier access, sysadmin (or sa for short), is literally capable of destroying the server (maybe not physically, but logically!). SSMS is a client-side application, primarily – meaning that yes, while you do “get it for free” when you install SQL Server on your server (at least, until SQL Server 2016, when they finally broke it out to a standalone installer, praise the gods), you should really install it on your own workstation and use it from there.

Key takeaway: ask your DBA to install SSMS on your workstation, IF it’s appropriate (and trust me, they will tell you if it’s not); often, there are better monitoring tools and infrastructure-focused apps that can “talk to” the SQL servers without needing you to dive down deep into SSMS-land. But, in a pinch, it helps to know a few basic things about what it can do.

with-great-power-comes-great-responsibility
The original; don’t you forget it!

Finally, and less commonly, we have a collection of supplemental services that work within and around those two main components to provide additional functionality and support for things like bulk data interchange, reporting, data warehousing, and scale-out options. Again, the curious reader can easily find more detailed info at his/her favorite reference sites, so I’ll define these briefly and move on. They all start with “SS”, which of course stands for SQL Server. So, here we go.

  1. SSIS – Integration Services – bulk data import/export, ETL (extract/transform/load), and similar tasks.
  2. SSRS – Reporting Services – lets us build & manage reports, and provides a basic no-frills web interface from which any business user can view said reports to which they’re granted access.
  3. SSAS – Analysis Services – data warehousing, cubes, “analytics” – basically, we’re filling up disk-space with pre-defined roll-ups and drill-downs of our data for faster reporting queries & data analysis.

So as you can imagine, often those last two go hand-in-hand, in a mature environment that’s cognizant of its “business intelligence” (BI) needs. But that’s another topic way outside the scope of this post!

Key takeaway: if any of those SS*S components are a part of your environment, monitor them at the service/process level, and, just like with the Agent, have a procedure in place to address incidents.

So what have we learned? A SQL server machine runs one (or more, but hopefully/usually just one!) instances of SQL Server. We should primarily be aware of the engine & agent service processes, the locations of the mdf & ldf files, and the agent job schedules; and secondarily, if applicable, the other components (SS-whatever-S).

Oh, but wait! What about performance concerns?!? Glad you asked.

SQL Server EATS disk for breakfast, memory for lunch, and CPU for dinner. In that order.

om-nom-nom-nom-wait
I may have made a slight miscalculation..

So your dedicated SQL server machines better have FAST disks, LOTS of RAM, and a decently fast, multi-socket multi-core CPU. The disk speed helps keep transaction log throughput & data file read/write activity up-to-snuff, so faster == better. Memory is for holding the “hottest” data for fastest access so it doesn’t always have to go to disk for everything, so more == better. And CPU cores help it serve 100’s (or even 1000’s) of concurrent requests in a sane, efficient manner.

zoidberg-wants-moar
Because moar… and Zoidberg.

In reality, SQL Server performance is a huge topic; there are entire consulting companies dedicated to it, and tons of excellent blogs about it. You want a good DBA on your team who knows how to troubleshoot and address performance issues with your SQL servers. But when you’re tasked with helping spec-out a new physical box, or sizing a new VM, it doesn’t hurt to know what kind of demands you’ll need to meet.

Which brings us to our last key point…

Key takeaway: SQL Server is primarily a scale-UP technology, not scale-out. If you’re not OK with that, you might want to look into other DBMS’s.

scale-out-and-scale-up-illustrated
I have to give full credit to this guy because his slide really illustrates my point well.

That’s not to say you can’t scale-out with it; obviously, large corporations with mature ecosystems often have 100’s of instances in some resource-groups, but as I discussed earlier in What’s in a Name?, the applications which depend on those SQL servers are also architected with that scaling in mind, and it’s not like an “easy switch” you can make to an existing monolithic app.

Anyway, I hope this helps IT Helpdesk and SysAdmins/Engineers get a little more visibility into the SQL DBA’s world, and bit more understanding of how things work behind-the-scenes.

Thanks for reading, see you next time!

SQL Server for the IT Helpdesk

Know which servers are running SQL Server, monitor the services, and have an escalation procedure for failures.

Today’s post will be much less philosophical and (hopefully) much more practical. This is a topic that I’ll actually be presenting at work to my fellow IT team members, who are mostly all Helpdesk and Infrastructure people, with no Dev or database background. My goal is to explain the basics of SQL Server from an IT infrastructure perspective, and I hope that others find it useful as well as my team! So let’s get to it. Fair warning: lots of abbreviations ahead! But fear not, I will define them – I don’t want to assume any prior knowledge, other than basic IT skills & Windows familiarity.

admin-disabled-contact-admin
This might be a bad sign…

I also decided to break this up into 2 posts, because it got pretty lengthy.  So, here goes part 1!

SQL Server is, obviously, Microsoft’s relational database management system, or RDBMS. “Relational” means it’s based on the concept of relationships between entities (or objects, if you will; though mostly we mean tables). In layman’s terms, the relations can be expressed as “is a..”, “has a..”, or “has many..”; for example, a Customer object “has a” Contact-Info object, which probably “has many” Phone#s, Emails, etc. SQL Server is a transactional, fully ACID compliant database system – Atomicity, Consistency, Isolation, Durability. I won’t bore you with the computer science theory stuff; you can read all about it on Wikipedia or your favorite reference site, so let’s move on.

rdbms-family
One of the family!

There are a handful of components that make up SQL Server, and I’ll list them in what I think is the “most to least important” order, or at least most to least commonly used. Infrastructure-wise, at its core, SQL Server is basically an “application” and persistent data storage mechanism. It’s a set of processes (services) that run on your server and operate on specific sets/types of files, which we’ll cover in a bit.

First and foremost of said core components is the database engine. This is the DBA’s bread & butter – it’s the service process, sqlservr.exe, which handles all the queries and scripts, manipulates the data files & transaction logs, ensures referential integrity and all that other ACID-y goodness. Like any good service, it is best run under a dedicated service account, especially in a Windows domain environment. Otherwise, it will run as a built-in system account, like NT Service\MSSQLSERVER. As a service, it’s set to “Automatic” run-mode and it accepts a number of startup parameters which can help govern and tweak its behavior. Most of those details are outside the scope of this post, so go ask your friendly (or surly, depending on the day) DBA if you’re curious.

msdn-sql-core-components
It’s an older infographic, sir, but it checks out.

A bit of key terminology: each running database engine is what we call a SQL Server instance. Typically you dedicate one physical (or virtual) server to each instance; however, in rare cases (usually in the lower tier environments like dev/test), you will see multiple instances on one server – this is called instance stacking, and it’s not usually recommended by DBAs due to resource governance complications. You typically reference a SQL Server instance simply by the server name that it’s running on, because the “default instance”, while technically named MSSQLSERVER, does not require any special referencing. So if my SQL server box was named FooBar, I would connect to the SQL instance with the name FooBar. If you get into stacked instances, you have to start naming the other instances uniquely, and your connections now have to include them with a backslash, e.g. FooBar\SQL2, FooBar\SQL3, etc.

As I said, the engine manages the data files. We’re talking about a permanent data storage system, so that data has to “live” somewhere – namely, mdf and ldf files (and sometimes ndf too). The mdf (and ndf) files hold the “actual data”, i.e. the tables, rows, indexes, & other objects in each database. The ldf files represent the transaction logs, or “T-logs” for short, which, again, are a fundamental part of the RDBMS. They allow for and enforce those fancy ACID properties. All of these files are exclusively locked by the sqlservr.exe process, meaning, nothing else is allowed to touch or modify those files but SQL Server itself, while SQL Server is running – not your antivirus, not your backup software (SQL has its own backup methodology), not even Windows itself. The only way to interact with and manipulate those files, or rather, the data within those files, is through the engine.

cant-touch-this-permission-denied
You can look, but you can’t touch!

Key takeaway: know which servers are running SQL Server, monitor the sqlservr.exe service(s), and work with your DBA to ensure that your AV software excludes the locations of those data files, and that your backup strategy includes & works in conjunction with SQL-based backups.

In a close second, we have the SQL Server Agent, sometimes just “agent” for short. Think of this like Windows Task Scheduler on steroids, but specifically focused within the realm of SQL Server. Like the engine, it runs as a service process – sqlagent.exe – again, typically set to automatic startup and running under a dedicated service account. (There’s a small debate on whether it’s best to use the same service account as the engine, or a separate one; in the end it doesn’t really matter much, so it’s mostly at the discretion of your DBA team and/or the Active Directory admin.)

sql-agent-man
Shamelessly borrowed from the Database Whisperer

SQL Server Agent is basically free automation – we can use it to schedule all kinds of tedious operations and database maintenance tasks, anything from “run this query every day at 12pm because Mary in accounting runs her big report after lunch and it needs to have data from X, Y, & Z”, to running the hallowed “good care & feeding of indexes & stats” that any DBA will tell you is not only essential to decent database performance, but essential to life itself (or at least, your SQL server’s life). These sets of work are called Agent Jobs (or just “jobs”), which consist of one or more Job Steps. A job has one or more Schedules, which dictate when it runs – such as, again, daily 12pm, or “every Saturday & Sunday at 5pm”, or “every 15 minutes between 8am and 9pm”. Ideally, jobs should be configured to alert the DBA (or, in some cases, the broader IT team) when they fail, so that appropriate action can be taken.
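To give you a feel for it, here’s a hedged sketch of what creating that “Mary in accounting” job might look like in T-SQL (your DBA probably does this through the SSMS GUI, and all the names here are made up):

-- Sketch: create an Agent Job with one step and a daily-12pm schedule
USE msdb;
EXEC dbo.sp_add_job @job_name = N'Refresh Accounting Data';
EXEC dbo.sp_add_jobstep @job_name = N'Refresh Accounting Data'
    , @step_name = N'Run refresh query'
    , @subsystem = N'TSQL'
    , @command = N'EXEC ReportDB.dbo.RefreshAccountingData;';  -- hypothetical proc
EXEC dbo.sp_add_jobschedule @job_name = N'Refresh Accounting Data'
    , @name = N'Daily at 12pm'
    , @freq_type = 4               -- daily
    , @freq_interval = 1           -- every 1 day
    , @active_start_time = 120000; -- 12:00:00
EXEC dbo.sp_add_jobserver @job_name = N'Refresh Accounting Data';  -- local server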

Key takeaway: monitor the service sqlagent.exe and have an alert & escalation procedure in place for handling failed Agent Jobs.

database-automation
Automation is good, mmkay?

A quick sidebar about transaction logs.  SQL Server uses what’s called a “write-ahead log” methodology, which basically means that operations are literally written to the T-logs before they’re written to the actual data files (MDFs/NDFs).  As stated so eloquently in this blog:

[When] neglected, it can easily become a bottleneck to our SQL Server environment.

It’s the DBA’s job to ensure the T-logs are being properly backed up, maintained, and experiencing good throughput.  And I’ll have another blurb about performance at the end of part 2.  However, as an infrastructure person, it’s worth knowing two things: A) these aren’t your “traditional” logs, like error or event logs; these are a fundamental, core part of SQL Server.  And B) —

Key takeaway: The disks which house the transaction logs should be FAST.  Like, stupid-fast.  Srsly.  At least at sequential writes.

samsung-960pro-m2-ssd
Yes please, I’ll take several… just charge it to the CIO’s platinum card.

Edge Cases

“Oh, that’ll never happen!”

I have a confession. I enjoy edge cases. More than I probably should. I take an unhealthy amount of pleasure in finding those unexpected oddities and poking holes in the assumptions and “soft rules” that business users and even developers make about their processes, applications, and most importantly, their data.

But let’s back up a minute. What exactly is an edge case? Well, I’ll quote Wikipedia’s definition at you, then I’ll put my own spin on it.

An edge case is a problem or situation that occurs only at an extreme (maximum or minimum) operating parameter.

edge-cases-99-to-1
it’s in a pie chart, so it must be true…

In my mind, there are really two types of edge cases. The results and treatment are largely similar, so it’s not a terribly important distinction; but for conversation’s sake, here they are.

  1. Known: while we can (and usually do) identify these cases, they are typically thought of as ultra-rare and/or “too difficult to reproduce”; thus, the plan for handling them involves tedious one-off or ad-hoc procedures, which are often completely undocumented.
  2. Unknown: the saying goes, “You don’t know what you don’t know” – these cases don’t even cross our minds as a possibility, either due to ignorance or because they are [supposedly] contrary to our business rules or our [assumed/remembered] application logic.

Again, the end result is the same: panic, discord, technical debt, and wasted hours of remediation. So why do we do this to ourselves? Well, one common justification tends to be “Oh, that’ll never happen!”, and we sweep it under the rug. Then there’s the laziness, busy-ness / lack of time, pressure to deliver, gap in abilities or tool-sets, passing the buck, etc. We’re all guilty of at least two of these in any given week.

basic-technical-debt-legos

So let’s move on to the important part: What can we do about it? Largely, we can take the excuses & reasons and simply turn them on their heads. Take ownership, commit to learning, communicate with management, make time for planning and documentation, and once a path is laid out for remediation, actually do the work. It’s often not easy or pretty, but a little pain now beats a lot of pain later.

tech-debt-cute-line-graph

I know, easier said than done, right? :o)

Example time.

Let’s say our sales offices are closed on Sunday – this is our “operating assumption”. Therefore, no orders will be processed on Sundays – this is our “business rule”. So because of this, we’ve decided to do some ETL processing and produce a revenue report for the previous week. Now, we’re using some antiquated tooling, so our first batch of ETL, which takes the orders from the sales system and loads them into the bookkeeping system, runs from, say, 2am to about 5am. Then we have a second batch, which moves that bookkeeping data into the report-staging area, from 6am to about 7am. We need those hours of “buffer zones” because the ETL times are unpredictable. And finally, our reporting engine churns & burns at 8am. But along comes Overachieving Oliver, on a Sunday at 5:30am, and he’s processed a couple orders from the other day (so perhaps he’s really Underachieving Oliver, but he’s trying to make up for it, because he enjoys alliteration almost as much as I do).

lumberg-work-on-sunday

Woah nelly! What happened? Oliver’s sales didn’t make it into the report! Not only that, but they don’t even exist according to bookkeeping. But come Monday, if he tries to re-process those orders, he’s probably going to get an error because they’re already in the sales system. So now he’s gotta get IT involved – probably an analyst, a developer, and maybe even a DBA. Damn, that’s a lot of resources expended because an assumption and a rule were broken!

warning-assumptions-ahead

Here’s another one. An order consists of 1 or more order-lines, which each contain 1 or more of a given item (product). Let’s say we store these in our database in a table called OrderLines, and each line has a LineNumber. Now, we have an assumption that those LineNumbers are always sequential. It’s even a rule in our applications – maybe not all parts or all modules, but at least some, and enough to cause a fuss if there’s a gap in that sequence or if a line is moved around somehow without proper data dependency updates (which, by the way, are application-controlled, not database-controlled). Plus there’s some manager who depends on this old reporting metric that also breaks when those line numbers are out-of-whack. But this should never happen, right?

assumptions-blind-spots-blanchard

The operative word there being “should”. But apparently there was a bug in an “update order” routine that left a gap in the sequence. Or maybe the DBA was asked to delete something from an order post-mortem, and there’s no way within the app’s ecosystem to do it, so he had to write some queries to make it work. And guess what? Because the Dev team is super busy working on the hot new feature that everybody wants, it will be 2 weeks before they can circle back around to address that update-bug or add that utility function for line-deletion. So now the DBA’s writing a stored-proc to wrap in a scheduled job to “fix” those order-line sequences every night, to prevent that one app module from breaking and keep those management reports accurate. And, to quote my very first post ever, the DBA waxe[s] wroth.
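That nightly “fix”, for the record, boils down to a windowed renumbering; something like this sketch (hypothetical column names):

-- Sketch: close any gaps in LineNumber, per order, preserving existing order
;WITH Renumbered AS
(
    SELECT LineNumber
         , NewLineNumber = ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY LineNumber)
    FROM dbo.OrderLines
)
UPDATE Renumbered
SET LineNumber = NewLineNumber
WHERE LineNumber <> NewLineNumber;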

picard-why-we-cant-have-nice-things
Picard knows best..

So, prevention? Well, I’d probably start by locking down the order entry system during that off-limits window. We should also wire-up those big 3 processes so that there’s no need for indeterminate buffer-zones and inconsistent timing. And yeah, it could be a big change needing lots of team buy-in and approvals, but it’s worth the investment. Right? Right!

Hope you enjoyed reading! Until next time…

Drafted with StackEdit, finished with WordPress

Is Coding Style still a thing?

Make it readable, minimize scrolling.. provide context, clarify complexity.

I can hear it now.. the moans and groans, the wailing and gnashing of teeth.. “Another post about coding style / formatting? REALLY!? Why can’t we all just get along (er, read/write/paste code) in our own environment and let our IDE/toolset worry about the format?” I hear you, really I do. But you’re missing the point. The single most important trait of good code is readability. Readability should be platform-independent, i.e. it should not vary (too greatly) depending on your viewing medium or environment.

The ratio of time spent reading (code) versus writing is well over 10 to 1 … (therefore) making it easy to read makes it easier to write.

-Robert C. “Uncle Bob” Martin, Clean Code

And yet, all around the web, I continue to find terribly formatted, nigh-unreadable code samples. Specifically, SQL scripts. That’s me; that’s my focus, so that’s what I pay the most attention to. But I have no doubt that the same applies to most other languages that Devs & IT folks find themselves scouring the tubes for on a daily basis. There is just an excessive amount of bad, inconsistent, language-inappropriate formatting out there. And honestly, it’s not that surprising — after all, there are millions of people writing said code — but it’s still a source of annoyance.

Clean code always looks like it was written by someone who cares.

-Michael Feathers, Clean Code

Don’t you care about your fellow programmers?

Maybe it stems from my early days working in a small shop, where we all had to read each other’s code all the time, and we didn’t have these fancy automated toolsets that take the human eyeballs out of so much of the workflow/SDLC. So we agreed upon a simple set of coding style/layout rules that we could all follow, mostly in-line with Visual Studio’s default settings. And even though we occasionally had little tiffs about whether it was smarter to use tabs or spaces for indentation (protip: it’s tabs!), we all agreed it was helpful for doing code reviews and indoctrinating (er, onboarding) new team members.

My plea to you, dear reader, is simple. Be conscious of what you write and what you put out there for public consumption. Be aware of the fact that not everybody will have the exact same environment and tooling that you use, and that your code sample needs to be clean & structured enough to be viewed easily in various editors of various sizes. Not only that, but be mindful of your personal quirks & preferences showing through your code.

That last part is really difficult for me. I’m a bit of an anal-retentive bastard (er, a stickler) for SQL formatting, particularly when it doesn’t match “my way”. So that’s not the point of this post – I don’t mean to, nor should I ever, try to impose my own personal style nuances on a wide audience. (Sure, on my direct reports, but that’s why we hire underlings, no? To force our doctrine upon them? Oh, my bad…)


All I’m saying is, try to match the most widely accepted & prevalent style guidelines for your language, especially if it’s largely enforced by the IDE/environment-of-choice or platform-of-choice for said language. And I’m not talking about syntax-highlighting, color-coded keywords, and that stuff – those are things given to us by the wonderful IDEs we use. Mostly, I mean the layout stuff – your whitespace, your line breaks, your width vs. height ratio, etc.

For example, Visual Studio. We can all agree that this is the de-facto standard for C# coding projects, yes? Probably for other MS-stack-centric languages as well, but this is just an example. And in its default settings, certain style guides are set: curly braces on their own lines, cascading smart-indents, 4-space tab width, etc. Could we not publish C# code samples to StackOverflow that have 7 or 8 spaced tabs, curly braces all over the place (some on the same line as the header code or the closing statement of the block; some 3 lines down and spaced 17 tabs over because “woops, I forgot a closing paren. in a line above, I fixed it but I was too lazy to go down and fix the indentation on the subsequent lines!”)?  GRRR!

Always code as if the [person] who ends up maintaining your code will be a violent psychopath who knows where you live.

John Woods

Now, on to SQL. Specifically, TSQL (MS SQL). Okay, I will concede that the de-facto environment, SSMS, does NOT have what we’d call “great” default style/layout settings. It doesn’t even force you to use what little guidelines there are, because it’s not an IDE, so it doesn’t do smart auto-formatting (or what I’ve come to call “auto-format-fixing”, in VS) for you. And sure, there are add-ins like RedGate SQL Prompt, SSMSBoost, Poor Man’s T-SQL Formatter, etc. But even they can’t agree on a nice happy compromised set of defaults. So what do we do? Maybe just apply some good old fashioned common sense. (Also, if you’re using such an add-in, tweak it to match your standards – play with those settings, don’t be shy!)

I can’t tell you how frustrating it is to have to horizontally-scroll to read the tail-end of a long SQL CASE expression, only to encounter a list of 10+ literal values in an IN () expression in the WHERE clause, that are broken-out to 1-per-line! Or even worse, a BETWEEN expression where the AND <end-value> is on a brand new line – REALLY?? You couldn’t break-up the convoluted CASE expression, but you had to line-break in the middle of a BETWEEN? Come on.

youre-killin-me-smalls
You’re killin’ me, Smalls!

Or how about the classic JOIN debate? Do we put the right-hand table on the same line as the left-hand table? Do we also put the ON predicates in the same line? Do we sub-indent everything off the first FROM table? Use liberal tabs/spaces so that all the join-predicates line up way over on the right side of the page? Again, common sense.
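For what it’s worth, here’s one common-sense take (my preference, not gospel; hypothetical tables):

-- Each table on its own line; ON predicates indented under their JOIN
SELECT c.CustomerName
     , o.OrderDate
     , ol.LineNumber
FROM dbo.Customer c
JOIN dbo.[Order] o
    ON o.CustomerID = c.CustomerID
JOIN dbo.OrderLines ol
    ON ol.OrderID = o.OrderID
WHERE o.OrderDate >= '2017-01-01';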

Make it readable, minimize scrolling, call-out (emphasize) important elements/lines (tables, predicates, functional blocks, control-flow elements, variables). It’s not revolutionary; just stop putting so much crap on one line that in a non-word-wrapped editor, you’re forced to H-scroll for miles. (FYI, I try to turn word-wrap on in most of my editors, then at least if I have some crazy-wide code, I can still read it all.)

SWYMMWYS. Say what you mean, mean what you say. (“Swim wiz”? “Swimmies”? I don’t know, I usually like to phoneticize my acronyms, but this one may prove troublesome.)

swimmies
swimmies!

Modern languages are expressive enough that you can generally make your structure match your intended purpose and workflow. Make it obvious. And if you can’t, comment. Every language worth its salt supports comments of some kind – comment, meaning, a piece of text that is not compiled/run, but rather just displayed to the editor/viewer of the code. Comments need to provide context, convey intent, and clarify complexity.

So what’s my point? Basically, let’s all try to make our code samples a bit easier to read and digest, by using industry-standard/accepted best-practices in terms of layout & style.

Drafted with StackEdit, finished with WordPress.


PS: Just to eat my own words, here’s an example of my preference for how a somewhat complex FROM clause would look, if I was king of the auto-formatter world:

SQL-sample-dark-code
Hipster dark theme FTW!

Send comments, thumb-ups, and hate-mail to my Contact page. 😛

Header image: our husky Keira at 2.5 months, being a sleepy-head!

Mongos, anybody?

You’ve heard of this thing called MongoDB…

Unless you’ve been living under/in the proverbial rock/hole-in-the-ground, you’ve heard of this thing called MongoDB. It’s one of the biggest flavors of NoSQL database systems out there – specifically, it’s a “document DB”, which means it stores semi-structured data in the form of JSON documents, grouped into collections, grouped into (of course) DBs.

Now, I’m a Windows guy. SQL Server & Windows Server are my comfort zones. I like GUIs, but I’m ok with a command line too. PowerShell has become my friend (well, good acquaintance). But every once in a while, we have to step outside our comfort zones, no? So when I was told “hey, you’ve got to get a handle on managing these MongoDB instances that we’ve got running on servers so-and-so, as part of this ‘Imaging system’ application”… I thought “Hahaha…”, but what came out was “Sure thing!”  Something a bit like this:

hahahayeah

(courtesy of DBA Reactions)

So the first order of business was to decide on a MongoDB “GUI” – a Windows app that could at least connect-to and give me a visual overview of the running MongoDB instances (properly referred to as a mongod, stemming from the Linux term “daemon”, which on the Windows side is basically like a process or service). I tried both the “official” desktop app from the big org, MongoDB Compass, and a neat open-source tool called Robomongo.

And I actually like them both, for different reasons; most of which can probably be attributed to my lack of depth with the technology, but hey. Anyway, Compass is really nice in that it gives you this kind of statistical overview of the collections in your DBs, performing some basic aggregates on a 10% or 1k sample of the collection documents to give you a graphical 40-thousand-foot view of the data. But where it breaks down for me is that little “query” bar, which is just an empty pair of curly-braces. It’s only for “selecting” (finding, querying); no other operations to see here. So we can’t manipulate our data with it, but it’s definitely great for viewing!


Whereas with Robomongo, I can right-click on the DBs/collections and do very simple things like “show documents”, “show statistics”, etc. And it actually writes the equivalent mongo shell command for me to poke at; say, to inject more logic to the find to get something specific or to write a new command or two as I read thru the docs and learn things like aggregates, indexes, and whatnot. Being a shell, it allows us to write or update data as well as read it.

But alas, GUIs will only take you so far. This is a tech born & bred on the command-line, so to really dig into it, I needed to let go of the mouse. And the mongo shell was actually pretty straightforward! A couple visits to their docs pages & some visual verification by using the GUI tools to check my work, and things were starting to come together. And even for a complete Javascript noob like me, the command syntax was straightforward enough. Soon I was configuring a replset and mongodump/mongorestore-ing and re-syncing a replica after purging some old stale collections.

Even though it’s a CLI/Linux flavored technology, it works perfectly fine in Windows… Except for one thing.  So, you install it as a service, and you typically start & stop services using net start & net stop, and as long as your service’s CLI arguments are all correct, you should be good — in theory! Trouble is, the service tends not to stop gracefully. So I found that, instead, the following command was more useful: mongo admin --eval "db.shutdownServer()". This uses the actual mongo shell to send the native shutdown command to the mongod, instead of relying on the Windows services gymnastics to do it correctly.

It just goes to show, dear reader, that you’ve got to get your hands dirty and try out new technology before dismissing it as “not my job” or “somebody else’s problem”.

PS: Nope, that’s not Compass’s logo or anybody else’s; I made it meself, with good old Paint.NET!

Drafted with StackEdit, finished with WordPress.