a quick tip for passing default values to an SSRS multi-value parameter during report drill-thru.
Inspired by some StackOverflow-ing and this particular answer.
Aka: “How do I pass/assign a default set of values to a multi-value parameter in SSRS?”
We often have an SSRS report that needs to “drill thru” to another report, usually to go from a “high level view” to a “lower level” or more detailed view, or sometimes just to lead the user down the path we’ve chosen as the head analyst / BI architect. And part of that report navigation involves initializing (set to defaults) the value(s) of the sub-report’s parameters, including multi-value parameters (which I’ll now refer to as mvp, even though, I know, it’s a ridiculously overused acronym). These are basically arrays, but are often represented as simply comma-delimited strings, so it’s easy to forget their true nature.
Let’s fabricate an example. In our Sales Summary report, we have an element (a textbox, image, placeholder, or whatnot) that, when clicked, should drill-thru to Sales by Person. Said next report requires a multi-select parameter (another term for the mvp) to have 1 or more values selected, otherwise it doesn’t render. We’ll call this parameter SelectedNames, with value-label pairings 1=Bob, 2=Alice, 3=Mary. When we drill-thru to this by-Person report, we want it to initially show (have selected by default) all available people.
So how do we do this? In the properties of the “clickable” element on Sales Summary, say it’s a text-box named GoToDetails, we go to the Action tab. We choose the “Go to report” action, select/specify the target report, Sales by Person, and then add the parameters we want to pass to it. For comparison, I’m going to pass a “regular” (single value) parameter called @ReportDate, as well as the mvpSelectedNames. Here’s what that all looks like, in picture form.
The single parameter pass-thru is, as you’d expect, very simple. But for our mvp, we need to use the expression-builder, that little fx button stylized to look like your old high school math class days. Here’s what that function looks like:
And presto!, we have converted a comma-delimited list into an array to pass into our Sales by Person report’s SelectedNames multi-value parameter. Now it will initially render with all 3 selected people as desired.
So there you have it, a quick tip for passing default values to an SSRS multi-value parameter during report drill-thru. But what if you wanted to pass the selected values of one mvp down to another? Or do some clever on-the-fly mapping (conversion) from one to the next? Well, stay tuned! I’ll write about that next time. =)
Thanks for reading! For a lot more on SSRS and multi-value parameters, check out these articles: @sqlchick, @mssqltips, and @msdn.
What could possibly go wrong? As it turns out, plenty.
Faithful reader(s), it’s been a while! I’ve been busy preparing for some big transitions. I’m also getting better at MDX queries, tweaking SSAS-based reports to more accurately reflect the business rules. But enough about that, on with the post!
In which we doubt the SAN
A storage area network (SAN) is a management & administration solution, not a performance solution.
-someone wiser than me
SANs are wonderful technology. They inspire all kinds of geekery and are purported to solve all your storage woes. But there’s a catch: they’re expensive. Not just as a capital expense, but in maintenance and licensing costs. And if you ever want to upgrade it, like add some more drives to a particular tier/pool — fuhgeddaboudit.
So what do we do with SQL on a SAN? Well, it has tiers, right? Slower storage with huge capacity, faster storage with less, etc. We put the data files (heavy random read workload, typically) on the pool optimized for that kind of I/O pattern. We put the TLog files (heavy sequential write workload) on the pool best suited for that. And what about good ol’ TempDB? Its access pattern is fairly unique — random writes and reads, and frequent overwrites, which means it could potentially wear out your typical prosumer SSD relatively quickly. But we’re not complete cheapskates, we’ll buy enterprise class SSDs, no?
So we go read some stuff and figure, hey, sounds like a great idea, right? Put TempDB on a local SSD, or better yet, a pair of SSDs in RAID-0 for pure performance (because this is a cluster, we’ve got HA already). We’ll reduce the load on the SAN I/O channels and make our overworked TempDB happier with lower latency and better throughput. Right?
In which we discover what could possibly go wrong.
Once the new drive(s) is(are) installed and “presented” to Windows (that’s my SysAdmin’s term), it’s fairly trivial to do the SQL configuration change — it does of course require a SQL service restart (or cluster failover). Code example, assuming your new drive is ‘T’:
alter database tempdb
modify file (name=tempdev, filename='T:\tempdb.mdf')
alter database tempdb
modify file (name=tempdb2, filename='T:\tempdb2.ndf')
You do of course have multiple TempDB data files, yes? Good.
Should we put templog (TempDB’s transaction log) on the same drive as the TempDB data files, or put it on the same storage pool as the regular DBs’ TLogs? As usual, “it depends” — ask your favorite SQL gurus and do some testing.
Back on topic
We’ve made the change, we’ve done the cluster failover. TempDB is now running on our spankin’ new SSD. So we start monitoring performance metrics. Things like file I/O stats (from SQL DMV sys.dm_io_virtual_file_stats), latency and waits (from our monitoring tools), and good ol’ PerfMon.
But wait, what’s this? I/O stalls are higher? Write latency is higher?!? Perfmon agrees?
Write latency on the TempDB files was over 10x higher than it was when they were on the SAN (the performance tier, to be clear). The file_stats DMV showed large increases in I/O stalls. Sad-trombone.
In which we have several theories
Then ensued various conversations and brainstorms among my colleagues.
Someone check the firmware/drivers!
Maybe it’s got the wrong block-size.
Well, it’s only 6Gbps SAS… maybe we should’ve sprung for the 12Gbps.
The write latencies went up by a factor of 10. I don’t think an improvement by a factor of 2 is going to win you any trophies.
Why didn’t we get an NVMe or M.2 one?
Because the damn blades don’t have those slots, goober.
Another interesting observation, and potentially the silver lining. Overall instance waits (wait stats), according to our monitoring tool, went down. That’s good news, right? Maybe. Does application performance & user experience corroborate it? Possibly! We’ll be observing the patient for another week or so.
Let’s turn to the community again to see what others have experience.
Oh wait, it might not be such a fantastic idea after all.
And by “we” I mean “me”. Being the DBA and the primary proponent of the SSD addition, because I knew our workloads were very TempDB-heavy, I had to hang-tail and admit that the SAN gods won this round.
But wait, what about the fact that our wait stats are down? What about app/user experience? Valid arguments, I agree. That’s why we’re still observing. But I’m not optimistic, given the follow-up links above. We may utilize local SSDs for something else (index filegroups?) — but if those write latencies don’t improve, I’m concerned that it won’t help anybody.
In which I ask for your help
Yes, you! If you have ideas on what we did wrong, what we’re missing, or any other advice about getting the most “bang for the buck” out of a direct attached SSD on a converged-infrastructure Cisco UCS blade server platform with a VNX SAN, by all means, drop me a line. I’m all ears.
..while there are likely even better ways to do this in the long-run, this quick & easy approach was sufficient to save me time and effort..
At this point in my career, I’m not managing massive environments, so I don’t feel the need (nor have the expertise) to use a large scale solution like DSC or SCCM. But I’ve had to install SQL Server a few times, so I figured it’s worth at least scripting out a standard pre-configured installation, so that A) I don’t need click through a GUI ‘wizard’ hearkening back to the ’90s, and B) the SysAdmins can “fire and forget” (read: stop bugging me about it).
Thus, I’m attempting to K.I.S.S., while making it configurable & repeatable. There are some limitations of this approach, as alluded above. It’s not “massively scalable” (scaleable? scale-able?) because:
The PoSh script still needs to be deployed locally to the server in question
The installer config (.ini) also lives locally (though it probably could be a UNC path, it’s just a file after all)
The script prompts you for the service account (SQL engine, Agent) credentials and the sa password using the Read-Host -AsSecureStringmethod cmdlet, so some meatbag still has to type those in. This is because we don’t have an enterprise pwd/secret-management system where I could, say, ask it for a service account credential set and tell it to embed that securely in a script without it actually being visible to me. So, while yes, they’re kept in a “vault”, it’s not query-able by anything else, so an admin still needs to copy & paste them into whatever configuration screen he’s working with at the time. Not ideal, I know, but we work with what we’ve got.
PS: Yeah, yeah, “don’t use sa, rename it or disable it; or use Windows Auth only!”. Rage, howl, fire & brimstone. I’m not going to argue about it; we can save that for another post. This environment dictates that its used during setup and then disabled later, so that’s beyond the scope of the installer config.
So yes, while there are likely even better ways to do this in the long-run, this quick & easy approach was sufficient to save me time and effort for the occasions when a new SQL box/VM needs to be spun-up.
A primer on SQL cmd-prompt installation & its arguments
And finally, twothings that I attempted to understand but ultimately failed to implement, because (apparently, at least to me), PowerShell remote-ing is a P.I.T.A.
First we need an .ini file to work with. You could either create it from scratch, or take it from an existing SQL box’s “Setup Bootstrap” folder. Example path C:\Program Files\Microsoft SQL Server\130\Setup Bootstrap\Log\20170801_073414\ConfigurationFile.ini — indicating this was an install done on 8/1/2017 at 7:34am. Right above that, at simply C:\Program Files\Microsoft SQL Server\130\Setup Bootstrap\Log\, you’ll see a Summary.txt file, which can actually come in handy while you’re testing these unattended installs and wanting to see why it failed.
The first link above, from MSFT Docs, does a pretty nice job of telling you all the things that make up this config file. You get to bypass the TOS prompt, enter service account details, specify drive letters (paths) for default data/log file locations & tempdb, slipstream update packages (UpdateSource), and even more advanced stuff like AG settings and whatnot. My example will be a simple standalone instance using the default name, so I’ll be sticking with the basics.
We can then use this file in the ConfigurationFile argument of setup.exe from the SQL Server install media. To put a little more color on that: the .ini file is really just a collection of command-line arguments to setup.exe; you could also list them all out in-line, but that would be tedious and silly. Here’s a couple major selling points of creating your own config file:
Slipstream updates (SP’s, CU’s), instead of having it go out to MSFT update servers (or *aghast* sticking with the original RTM bits, you heathen you!)
Specify drive letters / default file locations: sure, this may be considered old-hat if you’re running super slick storage, but I still find it makes management a bit easier if I know where my MDFs, LDFs, TempDB, & backups will always be.
Take advantage of 2016’s better TempDB setup options (# files, size & growth)
We will, however, keep a couple arguments out of the .ini file and instead throw them into the ArgumentList from the calling PowerShell script. Speaking of, here’s what the PowerShell script needs to do:
Prompt the operator (SysAdmin or DBA) for the SQL & Agent service account credentials, and (optionally) the sa pwd (if using it).
Fetch our install media from the central network share where we store such things (server & office ISOs, for example).
Mount said ISO to our virtual disc drive.
Run its setup.exe with the following arguments:
The config .ini file
The service & sa accounts
After it’s done, un-mount (dismount) the ISO.
Then the DBA can connect to the brand-spankin’-new running SQL instance and do other post-setup configurations as desired (i.e. set max-memory, maxDOP/CTFP, etc). And sure, those could also be done in PowerShell (thanks in no small part to the awesometeam at DbaTools), I chose not to do so in this case.
As the bloggers say, “that’s left as an exercise to the reader”.
Plus, they’re never quite as deterministic as we’d like them to be — they depend on the server’s compute resources, i.e. memory size & CPU cores, as well as estimated workload & environment tier, so it’s often a gamble in “how correct” your initial settings will be anyway. Still, anything is better than the defaults, so configure-away!
Here are the Gists I’ve created to go along with this post. If I’ve made a mistake, or if you, dear reader, have a suggestion, we can incorporate them into the gist without me having to go back and edit the blog post!
I’d love to get feedback on how you would improve this, what you might do differently, etc. Drop me a comment or a tweet!
How it lost its original clustering key is a perfect topic for this month’s T-SQL Tuesday!
As previously referenced, we had a half-billion row table (which we’ll “round up” to a billion, just for the sake of argument) that needed an index. A clustering key, to be exact. How it lost its original clustering key isn’t that interesting is a perfect topic for this month’s T-SQL Tuesday (we planned to replace it, before realizing how long it would take, and had to cancel the job after it was removed but before it could finish adding the new one). Anybody can drop an index; it takes mere milliseconds. But to create (or rebuild) an index, the SQL engine needs to touch every single row that index will include, and of course for a clustered index, that’s errverybody.
When planning an index replacement for a billion-row table, don’t just schedule the job as “drop old index, then create new index“, and expect nothing to go wrong.
Don’t try to index a billion-row table all at once. Use a smarter methodology. Unless you can literally afford to have that table be offline for several hours (depending on your hardware, and assuming you’re on Standard Edition.
Of course, some of you crazy kids with Expensive Edition are scoffing and thinking “Oh that’s nothin’!”, with your billions of rows and online index rebuilds. Well you can go back to your fancy Always Encrypted data and your terabytes of RAM and just pretend I said “trillion” instead of “billion” — maybe that’ll fit your scale a bit better. But for the rest of us mere mortals…
Anyway, since we’re peons, we can’t just go create the billion-row index without taking the table offline. So we need to come up with a way to minimize that downtime for the table, and avoid causing excessive blocking or I/O overload on the instance. As I said before, SSIS to the rescue!
I call this the “setup, dump, & swap”. Essentially we need to create an empty copy of the table, with the desired index(es), dump all the data into it, and then swap it in. There are couple ways you can do this, but it boils down to the same basic premise: It’s “better” (probably not in terms of speed, but definitely in terms of efficiency and overhead) to fill this new copy of the table & its indexes, than it is to build the desired index on the existing table.
So here’s an outline:
Script out the table (SSMS, right-click, script table, create to… or, if you use oneofthesecoolextensions, F12 or similar “get definition” shortcut) — say, if the original is MyTable, script-create & replace MyTable with MyTableCopy
Here’s a little room for choice. You could create the table in the same schema with a new name; or, you could create the table in a different schema, with the same name or another name. This will determine how you do the “swap” toward the end.
In the first case, we’d use sp_rename
In the 2nd, we’d use alter schema transfer
Both are essentially meta-data changes, but the latter could be potentially take a hair longer just because it pulls more strings; whereas the former requires taking care of dependencies ahead of time so you don’t break a schema-bound view or orphan a foreign key.
Add the desired index(es), e.g. create clustered index CX_MyTableCopy_Datestamp_ThingName on dbo.MyTableCopy (Datestamp, ThingName)
Build the SSIS task to copy the data from MyTable to MyTableCopy
Schedule said task via SQL Agent.
Use the previous tip to monitor its progress and estimate time to completion!
Once done, prepare to swap!
Again, use sp_rename, and if needed, alter schema transfer.
Likely, this will involve several renames – the constraints and other indexes can’t be named the same either, so get all that stuff renamed with an _old suffix first, then you can swap the actual tables.
Clean up after yourself and drop the old table once you verify everything’s working well and all dependencies are accounted for.
Let’s back up a bit. Why are we doing it this way? More importantly, what are the general use-cases for this kind of thing? Well as I said, it’s “better” in certain ways, than simply creating the index on the “live” T.O.U.S. It avoids locking said live table, and it has the potential to put less stress on the database and less synchronous I/O against its primary files. Finally, and most importantly, as a bulk insert operation, with configurable batch sizing, the SSIS task will put exponentially less load on the transaction log. Whereas, with the regular inline index creation, it could easily fill up the Tlog and will definitely cause performance issues.
Furthermore, swapping tables, or in more advanced cases, partitions, is a fantastic way to move massive amounts of data around with minimal impact. It also happens to be quite helpful when we have to deal with indexing such massive amounts of data.
In my case, this was actually an archive of 2016 audit-trail data, so if I’d have chose, I could have taken it out of the partitioned view it was part of, and dealt with it being offline for several hours. But because I’m a sucker for a good problem, and because I wanted to be able to semi-accurately monitor the progress to let the boss-man know when it might be done, I chose to take this route instead. It’ll come in handy again soon, I’m sure — we have several other jumbo-tron tables laying about that may need some index tuning and/or partitioning.
So, happy swapping, and don’t let those T.O.U.S.‘s get the best of you!
PS: thanks to @SQLDoubleG for hosting, and thanks in advance to the rest of the community being lenient of my “submission” being a re-purposed existing post with a few extra blurbs thrown in to bring it on-topic! =D
We think the total operation will take several hours, possibly days. So we want to be able to check up on it.
While I try to compose my resources and samples for the previoustwo teasers, I found this to be a particularly interesting problem and thought I’d share my solution.
We have a very large table, half a billion rows. It needs to be copied (well actually, indexed, but that’s another part of the discussion). We know that standard TSQL would suck for this, even if we wrote a batch-loop proc for it (sure, it might be less blocking, but it’ll take forever). Plus, we might be going between two different servers (well, not in my case, but you very well could be, and in fact that’s probably the more likely case, given the implications of such a large operation). SSIS to the rescue!
Now, again, we’re lazy, so we use the Import/Export Data wizard (a component of SSIS) to build the .dtsx package. Then we can schedule it via a SQL Agent Job.
Sure, we could build it with Visual Studio and/or BIML and be super-awesome, but where’s the fun lazy in that?
Based on some preliminary estimates of the throughput using SSIS for a small sub-set of the data, we think the total operation will take several hours, possibly days. So we want to be able to check up on it — to see its status & elapsed time, and get an estimate of time remaining. Sound good?
Two assumptions going into this, to keep the example simple.
We know the Agent Job’s scheduled start time, and it does start on-schedule.
We’re only dealing with 2 tables — 1 source, 1 destination. Furthermore, they’re both on the same server instance; or if not, we can query a linked-server connection once to get the “source” table size-stats, because they won’t change.
(Okay that was slightly more than 2, but again, simple. One can imagine expanding this to multiple tables & multiple instances, but then at some point you’re going to need a “witness” that can talk to all the disparate sources of data and conglomerate those bits & pieces together, and then you have to ask yourself “is it really worth it, or can I just give a SWAG & move on with my day?”)
Before I move on: helpful SOanswer that reminded me how surprisingly-not-that-difficult (and-still-pretty-efficient) it is to convert a time interval (seconds, in this case) in a “human friendly format” like Days.hh:mm:ss. Yay.
One key component of any kind of “check-up” or general monitoring solution is, you need it to be light weight, i.e. have low overhead. We don’t want our method of monitoring the process to add some non-trivial extra load to that process (or the servers doing said processing). So let’s avoidCOUNT(*) shall we? Unfortunately, the top Google results don’t readily point to this, but MS actually gave us a handy-dandy built-in way of measuring this, sys.sp_spaceused. Yay again.
Granted, it doesn’t get you “up to the millisecond” accuracy, but it’s truly close enough for what we’re doing; and frankly, it’s your only sane option when asking the question “how many rows?” of a TOUS.
So we’re going to use the output of that system proc, specifically the rows column, to measure how much data has been bulk-copied into our destination table. We’ll also measure the source table, once, as stated, because (we’re assuming!) that won’t change.
Finally, we’ll use those row-counts, with some simple tricks & nonsense — basic math, dateadd/datediff — to calculate a percent-complete, elapsed time, estimated total time, and ETA (estimated time of arrival completion).
This is a work-in-progress, but I thought I’d put it out there because it may be interesting to others who use the tool.
This is a “teaser” in that I’m not done writing the content yet, but I wanted to get it out there as “something I’m working on”. The problem at-hand is that we have a table trigger which is constantly being disabled/enabled by a stored-proc , which in turn controls the only acceptable method of updating the columns that would otherwise be forbidden from update by said trigger. Clear as mud? K. Basically there’s a schema change (“drift”) of a certain specific type that I want DLM Dashboard to always ignore on this DB.
SELECT TOP 100 * FROM RedGate.SQLLighthouse.DDL_Events de
ORDER BY de.PostTime
SELECT TOP 100 * FROM RedGate.SQLLighthouse.DDL_Events de
WHERE de.appname NOT IN ('SQLServerCEIP', 'Spotlight Diagnostic Server (Monitoring)')
AND de.appname NOT LIKE 'SQLAgent%'
ORDER BY de.PostTime DESC
ALTER TABLE SQLLighthouse.DDL_Events DROP CONSTRAINT PK__DDL_Even__3213E83FB62EF9A3;
ALTER TABLE SQLLighthouse.DDL_Events ADD CONSTRAINT PK_DDL_Events PRIMARY KEY NONCLUSTERED (id);
CREATE CLUSTERED INDEX CX_DDL_Events_PostTime ON SQLLighthouse.DDL_Events
CREATE INDEX IX_DDL_Events_SPID ON RedGate.SQLLighthouse.DDL_Events
INCLUDE (transaction_id, options, nestlevel);
CREATE INDEX IX_DDL_Events_AppName ON RedGate.SQLLighthouse.DDL_Events
CREATE PRIMARY XML INDEX XI_DDL_Events_EventData ON SQLLighthouse.DDL_Events
CREATE XML INDEX XI_DDL_Events_EvenData_PATH ON SQLLighthouse.DDL_Events
USING XML INDEX XI_DDL_Events_EventData
CREATE XML INDEX XI_DDL_Events_EvenData_PROPERTY ON SQLLighthouse.DDL_Events
USING XML INDEX XI_DDL_Events_EventData
Every big enterprise product has them. I just happen to pick on SQL because it’s my area of expertise.
This week’s topic was triggered by an amazing quote from Adam Machanic (b|t) in the sqlcommunity Slack group, where he was trying to help someone troubleshoot a slow linked-server query:
That default, I consider to have been checked into the SQL Server codebase directly by Satan.
He’s referring, in this case, to the default option of remote proc transaction promotion, or “Enable Promotion of Distributed Transactions for RPC” in the GUI, which is set to Trueby default. Admittedly, linked-servers are a dicey area of the technology and not everybody needs them, but when you do need to deal with them, there are some caveats to consider and a few potholes to avoid.
I won’t go into gory details, but the mile-high gist of it is that you should probably change a couple of the default linked-server settings when you’re going from MSSQL to MSSQL (which most of the time, you are): rpc and rpc out both to true, and the above dude to false. The reasons behind that are subtle, but it boils down to usage patterns: If you’re purely running basic SELECT statements against the remote server, you can leave this stuff alone; but if you want to do cool stuff like exec MyLinkedServer.RemoteDB.sys.sp_executesql '--some awesome dynamic-sql', you’ll want to change these. (That last bit comes in especially handy if you’re building a dynamic query string, then running it against the remote server to bring in the results to a #temptable to further massage/mangle/munge said data.)
Even though you probably really shouldn’t be doing that in the database (that’s what web server farms are for!).
So, what are some other “stupid defaults” in SQL Server?
Every big enterprise product has them. I just happen to pick on SQL because it’s my area of expertise. And it’s not even just “defaults”; there are some options which shouldn’t even be a thing — they should be completely and irrevocably in love with Edward removed from the product. Yet, because the RDBMS tech space is infamously conservative and slow-to-change (the accepted euphemism is “mature“), these options and defaults have remained in the product despite our best attempts to convince MS that they’re heinous.
1. Parallelism settings (server/instance level)
Your servers have many-core CPUs, right? And you want SQL to utilize those cores to the best of its ability, distributing the many users’ workloads fairly amongst them, yes? Damn right, you paid $3k or more per core in freaking licensing costs! “OK”, says SQL Server, “I’ll use all available CPUs for any query with a ‘cost’ over ‘5’“. (To give context here, in case you’re not aware, ‘5’ is a LOW number; most OLTP workload queries are in the double to triple digits). “But wait!”, you protest, “I have more than 1 user, obviously, and I don’t want their horrible queries bringing all CPUs to their knees and forcing the 50 other user queries to wait their turn!”
Hardly seems fair, no? But those are the defaults, have been for over a decade, despite how much hardware has evolved since. Just for completeness, I am talking about the following 2 options, which you can view by right-clicking on the instance in SSMS, or by running sys.sp_configure. They are max degree of parallelism and cost threshold for parallelism, respectively (# cores to use, and how expensive the query should be to “go parallel” — as opposed to serial/single-threaded, which is perfectly fine for those itty-bitty-teeny-tiny queries that are so quick that 1 CPU core is more than enough to handle them). We commonly abbreviate the former as maxDOP; the latter is less commonly abbreviated, but because I’m a sucker for acronyms, I like to call it CTFP, or maybe costFP if I’m feeling verbose.
And if you have somewhere between 9 and 15 CPU cores, don’t ask me, because you’re running some.. interestinghardware. Figure it out yourself, and get that gun out of my face!
OK, I know 12 cores is probably not an unreasonable config, so.. extrapolating my logic above, divide by.. 3? Yeah let’s go with that.
2. Auto Close (database option)
It’s at the top of the list in the GUI under Database Properties -> Options. Yet nobody knows why you would ever enable it. I once heard a community member theorize that it might have been put into the product back when it was considered a viable “local persistence option for Windows CE/Mobile apps” and it would help w/ resource management on said mobile device by freeing up resources when the DB wasn’t in-use. Well, we all know how well that product line did in the market (hint: poorly). There are so manybetteroptions for localized data stores in mobile dev, MS isn’t even a blip in the conversation. (Again, talking local data persistence.)
If we’re talking cloud, MS is a big part of that conversation — Azure is amazing, and a solid competitor to AWS & GCP.
Anyway, if you ever find a SQL DB with the auto_close option enabled, find the person responsible, and slap them with a trout.
3. Server Max Memory
Last one for today. This is something that’s not completely heinous, but could use a LOT more built-in intelligence during the installation process so that DBAs & SysAdmins didn’t need to think about it so much. SQL Server will, by default, sets its max-memory to some-odd-billion-MBs (technically it’s the max value of a 32-bit int, which the more geeky among you have probably memorized), which is of course some-odd-million-GBs, which is more than even the most bleeding-edge servers have to date. Which is fine in theory — you paid a crap-ton of money for this system, it might as well use up all the RAM that it can to perform to its potential, right?
Right. Until you realize that “Oh wait, it’s running inside an OS” (whether that’s Windows or Linux, thanks to 2016 & 2017 product versions) — that that OS needs some RAM too, to keep itself running! (Cue the Linux zealots with their “it’s so much more lightweight than Windoze, haha!!1” — yeah, well you still need some memory space, don’a ya?)
Here’s what I’d like to see, in my ideal world. During SQL Server installation, it would detect how much RAM is on the system, subtract about 10% or 4-8 GB, leave that for the OS, and use the resulting number as its limit. Boom, done, nobody has to think about configuring it and checking off another checkbox on their setup checklist.
But noooo… The vaunted MSSQL engineers can built all sorts of amazing things into the product like QueryStore, Adaptive Query Processing, and The Artist Formerly Known as Hekaton, but heaven forbid we get a little more intelligence in the installer. It got a lot better with 2016 when it let you configure tempDB reasonably correctly (multiple files, different locations, etc), but there’s still a LOT that could use some lurv.
Do you have a favorite “stupid default” or “horrible setting” related to SQL Server or any other technology that you work with? Share in the comments!
MDX is not SQL. It may look like it has SELECT/FROM/WHERE clauses, but god help you if you start drawing parallels to your standard TSQL query.
This is an exercise I had to go through recently, because A) the reports in question were deployed in SSRS but used an SSAS backing, i.e. cubes, and the source queries (MDX) were not stored in source-control, and B) I don’t write MDX queries.
Run Profiler or XEvents against the SSAS server
set to capture “Query Begin” events only, with “Event Subtype = 0” (for MDX query)
optionally, set filter on NTUserName to the dedicated SSRS account (if you have it set up that way)
Run the SSRS report(s) that you want to dive into
For each event in the Trace, copy-paste the MDX query to a new MDX editor window
SSRS parameter substitution happens via some XML at the bottom; but in MDX, the parameters are standard @params like in T-SQL. So we need to manually substitute our parameter values.
2 blocks of XML: the “Parameters”, and the “PropertyList” — delete the latter.
In the former, text-replace & for simply & .
Side-bar: You’ll notice that the MDX parameters are usually inside STRTOMEMBER() or STRTOSET(), which are built-in MDX functions that do exactly what they sound like — parse a string into a dimension’s attribute’s member or set of members. That’s why they’ll usually have at least 3 .‘s (dots) — Dimension.Attribute.&MemberValue, for example. I’m grossly oversimplifying that because it’s beyond the scope of this post, but read the docs if you need more gritty details.
For each parameter node:
Copy/cut the <Value> node content (I like to ‘cut’ because it helps me keep track of which ones I’ve done already)
Find-and-Replace @ParameterName with that Value node’s content, surrounded in single-quotes
Example: we have parameter @ReportDate (in MDX), corresponding to <Parameter><Name>ReportDate</Name> in XML, with <Value xsi:type="xsd:string">[Some Dimension].[Some Attribute].[Some sub-attribute].&[2017-05-01T00:00:00]</Value> — where that last bit is a standard SQL datetime literal.
So you replace @ReportDate with '[Some Dimension].[Some Attribute].[Some sub-attribute].&[2017-05-01T00:00:00]' .
Delete the XML block.
Boom, now you have a valid MDX query that you can run and view results.
Why do this? Well, it can help you learn MDX from a working example, instead of from super-basic dummy examples. That’s not always a good learning style — you should still learn the fundamentals of MDX and why it’s so very different from SQL. Especially if you’ll be responsible for writing and maintaining more than a few MDX queries. But, in a pinch, if you need to start somewhere, and possibly all that the MDX / overlaying report needs is a slight tweak, this may be enough to get you going.
Profiler can still be a useful tool, despite some people’s attempts to kill it.
MDX is not SQL. It may look like it has select, from, and where clauses, but god help you if you start drawing parallels to your standard TSQL query.
SSRS does parameter-passing in an odd way.
SSAS & MDX are fascinating and I need to learn more about them!
A typical part of a DBA’s work-week might involve the occasional DB user-role-membership management, so I hope this helps the lone-wolf DBAs out there and/or the developers who need to know what to ask for…
Just a brief post on adding/removing users (database level users) to/from roles (database level roles). It’s relevant because several shops are still stuck supporting at least a few 2008 (or hopefully, 2008R2) instances, and there is a key difference between those and newer (2012 & up) versions in the “preferred” method of doing this security task.
There are reams of documentation and books and articles written about SQL security in general. That is beyond the scope of this post (and indeed, beyond the scope of any single blog, unless you’re an SME on the subject!). But a typical part of a DBA’s work-week might involve the occasional DB user-role-membership management, so I hope this helps the lone-wolf DBAs out there and/or the developers who need to know what to ask for, when they’re planning/deploying a new app against their SQL DB(s).
The “old” method involves calling system stored-procedures, sp_addrolemember and sp_droprolemember, in which you pass the role-name and username. The “new” method, supported starting with SQL 2012, is to use the command-phrases ALTER ROLE [role] ADD MEMBER [user], and ALTER ROLE [role] DROP MEMBER [user].
The latter is more ‘standard‘, while the former is more ‘Microsoft-y‘. I couldn’t easily find whether it’s part of the official ANSI standard or not… that’s an exercise for the reader. What I find very interesting is that Azure’s data warehouse offerings require the old method. Of course, hopefully in a DW setting you’re not messing with security nearly as much as a typical OLTP system, but… yeah.
Does that mean those Azure services are built on top of older SQL engine versions? Possibly. MSFT isn’t too open about the deep internals of such tech, but neither is any other cloud vendor, so we can’t really ask them such a question and expect anything more than a blank-stare. But it is curious, no?
Syntax examples: Let’s add the user foo to the database Bard, in the db_datareader built-in role. Then we’ll remove him. (Or her, I guess; “foo” is a pretty gender-neutral name.) Creating said user is easy, so I’ll start with that, and it’s the same in all supported versions. You need a server-level login to link it to; if you don’t have one, I’ll show you how to create it first.
Create server-level login:
--preferably, you create a login for an existing AD/Windows account:
CREATE LOGIN [yourdomain\foo] FROM WINDOWS;
--or, you can just create a SQL login (not connected to domain/Windows/ActiveDirectory; also less secure, as discussed here and here)
CREATE LOGIN [foo] WITH PASSWORD = 'foobar';
Create database-level user:
--if you made the domain/Windows login:
CREATE USER [foo] FOR LOGIN [yourdomain\foo];
--or, if you just made the SQL login:
CREATE USER [foo] FOR LOGIN [foo];
ALTER ROLE db_datareader ADD MEMBER [foo];
Check (see above)
Remove user from role:
ALTER ROLE db_datareader DROP MEMBER [foo];
Notice that, because the “old way” is simply executing sys-sp’s, we can actually run it from any database context. Whereas the “new way” requires you to connect to the database in question.
Note: I am in no way shape or form responsible for you screwing up your database or SQL instance, nor for you getting yelled at by your DBA or security admin or any other form of verbal assault you may incur as a result of running these commands. But since you need server-admin & database-owner equivalent permissions anyway, you’re probably one of those people already, so you’ll just end up yelling at yourself.
Cleanup (just so you don’t muddy your instance/DB up with a silly example user):
DROP USER [foo];
DROP LOGIN [foo];
If you have any questions, feel free to reach out to me!
Enough with the pitchforks; this is Test/QA. Here, I talk about 3 gotchas.
Today we’re going to talk about SQL Server instance stacking.
Right, in production. I’m talking about DEV/TEST environments.
Settle down. If your server is set up correctly and has the resources you want it to have, and you divide your resources up per instance in a few very simple ways, it’s fine. Enough with the pitchforks, the wailing and gnashing of teeth.
Okay, now that that’s out of the way…
Remember our cute little DEV server? So, the way he’s set up is, he’s got 3 SQL Server instances on him, each with its own dedicated SSD, and another dedicated SSD just for tempdbs. Ideally, we’d have a separate SSD for each instance’s tempdb, but sadly, motherboards with 3 M.2 or NVMe slots aren’t (weren’t?) in production at the time, at least not for desktop class systems. But I digress.
This is called instance stacking. And yes, it’s a big no-no in production. Mostly because performance troubleshooting is a pain in the arse. But also because it’s more difficult to divvy-up resources like RAM and I/O & network throughput channels than one would like. But it’s super simple to set up — you simply run the SQL Server installer 3x, each time creating a unique instance name. Then, at the end of it, your SQL instances are addressable by MachineName\InstanceName, e.g. SQLDEV\Foo, SQLDEV\Bar, etc.
Now the time came to create a “QA” environment. Which, like DEV, didn’t need to be very performant (that’s a made-up word that consultants like to use, but it’s generally accepted in our industry so we go with it), and so, since we had some hardware laying around from a recent “up-gration” (upgrade-migration… okay, now I’m being ridiculous), we said “let’s use that thing!”. It was a 2-node cluster setup with shared DAS storage. For the uninitiated, DAS is Direct Attached Storage, i.e. an array of disks that you can directly attach to 1 or more servers using whatever interconnect is available on the endpoints (usually SAS, serial-attached SCSI – which is one of most fun acronyms to pronounce in IT: “scuzzy”). DAS is not to be confused with a SAN, Storage Area Network, which is a super fancy storage array with performance tiers and snapshot technology and de-duplication and all that hotness.
The interesting thing with a cluster is, when you install SQL Server instances, you can’t actually use the same “MachineName” for the 3 different “InstanceName”s. Because in a cluster, the former is actually the “VirtualServerName”, which must be unique per clustered instance, in order to properly configure cluster resources, storage pools, and networks.
The reason this is interesting, is that it contrasts with stacked instance setup on a standalone server (non-clustered). So if you compared our DEV and QA setups side-by-side, it’s a bit odd-ball: instead of SQLDEV\Inst1, SQLDEV\Inst2, etc., we have instance names like SQLQA1\Inst1, SQLQA2\Inst2, etc. That makes the ol’ “find and replace” in config files a bit harder. But, at the end of the day, it’s all just names!
Another interesting “gotcha” revolves around SQL 2008R2, which I know shouldn’t be on the short-list of versions to spin up, but unfortunately, a legacy ERP system demands it. Well, it only happened to me with the 2008R2 instance installation, not the 2016’s, but that’s not to say it couldn’t happen with others. Anyway, after installation, SQL Agent was not working; it wasn’t coming up as a cluster resource. Basically, exactly what was outlined in this timely & detailed article at mssqltips. I won’t restate the fix instructions here, just give it a read! I do want to clarify something though.
In part of the fix, we use the handy-dandy PowerShell cmdlet Add-ClusterResourceDependency . In its basic form, it requires 2 arguments, Resource and Provider. To someone who’s not a cluster expert, this terminology might be a bit confusing. Resource in this case is the SQL Server Agent, while Provider is SQL Server itself. But we’re adding a Dependency, right? Which depends on which? Well, we know that Agent depends on the engine, so, Resource depends on Provider. Yes, I know, that’s what the article tells you to do — I just like to understand why.
Finally, there’s the question of divvying-up resources to the stacked clustered instances. Now, in a standard cluster, you’ve got your active node and your passive node. But if we’re stacking instances, we might as well split the SQL instances up and take advantage of the compute resources on both nodes. (Storage is still shared; this is a cluster, after all!) The CPUs are no problem — however many instances are stacked on a node, they’ll share the CPU cores pretty cooperatively. Memory is a bit of a different story. We want to take advantage of all the available RAM in the cluster, but…
As you know, you can configure each SQL instance to use a set amount of max. server memory. So let’s say each cluster node has 32GB RAM, and we’re stacking 4 SQL instances total (to keep the math easy!). If we split them up among the nodes at 2 each, each instance can use 16GB. But if for some reason a node goes down, and all 4 instances move to 1 node, now they’re fighting for that 32GB! So we should reduce their max-memory settings to 8GB each, instead of 16. But we don’t want to do this manually! Fortunately Aaron Betrand has an excellent blog post on the subject, with some useful ideas about how to do this dynamically & automatically. The only issue I have with it is that it requires the linked-servers to use a highly privileged account (sysadmin or maybe serveradmin role) to be able to set that max-server-memory setting. But wait, remember what we said at the beginning? This ain’t production! Who cares about security? (That’s facetious, sort of — in reality, yes, we don’t care as much about security in lower environments, but we should still care a little!)
That concludes this week’s adventure! Thanks for reading.