In light of https://natej.dev/2021/08/13/new-beginnings/ , which you should also read if you care about personal celebrations of life and love. I also have a bit of good news to celebrate at work. I’m finally not the LONE DBA!!
Look ma, I haz underlings!
Also, if you’re at all into the collectibles market, our company is hiring. And our tech org is growing FAST. Like, blazingly fast. Come check us out and apply. We’re super nice folks. (Except this Nate the DBA character, he gets ornery.) =P
I have almost zero down time anymore so don’t expect an impressive amount of content out of me, but I am getting back into writing and teaching, so I hope to do at least a few TSQL-Tuesdays over the rest of this year.
Build a solid CS/IT foundation, build your soft-skills, and then learn the latest & greatest tech stack. Or the mature enterprise stuff. There’s plenty of room for both!
Our benevolentoverlord prompted us (months ago; I know, I’m always late to the party) to write about what we do in our day-to-day jobs as SQL Server professionals, aka #SQLCareer. The actual idea is to do 4 posts in a row, for 4 working days (either adjacently or the same day each week for 4 weeks).
Sticklers, us? Nope, never.
Without any kind of future guarantees, here is my first.
Emails, tickets, and alerts, oh my! Today was actually pretty quiet on the monitoring front — no major performance issues, no interruptions, and no job failures. That one that occasionally takes 8+ hours did okay last night. More often than not, I have to kill it if I find it’s been running into the work-day. That should be automated. /backlogged
Re-wrote some analyst’s data change script to be more “friendly” and set-based, instead relying on identity values — which is important when they can differ between Dev & Production environments. Deployed a few change requests, including that one.
On the side of less tedium, I fleshed-out more of the plan to virtualize our remaining bare-metal SQL servers. The usual ‘gotchas’ have to be discussed with the Server Admins — reserving compute resources, preventing the ‘noisy neighbor’ problem, and having enough storage to do the migration.
Yes, that means about 2x the storage as they’re currently using. Got a problem with that? No? Good. 😉
Finally, I worked on some code for querying the SSRS ReportServer database to obtain report metadata (including stored-proc usage, parameters, and user activity). The core concepts came from Ted Stathakis‘s SQLSaturdaysession on SSRS 2016. Hopefully, some of that code will be coming to GitHub soon.
“Delight” being sarcastic. No really, I love my job, but sometimes there are just some things that make you go ‘grr arg’.
First up, developing and finalizing some “data cleanup effort” reports. These are things that show the business users “what’s wrong with the data” and give them some guidance on how to “fix it”. Now, because there is no easy “management GUI” for this data, and because actual changes need to go through change-control, it’s not going to be a cake-walk. But given the right inputs and some domain knowledge, they can make the decisions as to what those changes should be, and then hand them off to the BA’s (business analysts). Usually in Excel form.
Next we have the ‘grr’ part. This is where I spent about 40 minutes trying to explain to the 3rd-party software vendor exactly where and how to integrate with our core data systems. Most of the info should not be ‘new’ to them, but since I never know which support tech will be assigned to the issue, it’s best to err on the side of verbosity. To make things more confusing, the folks on my end, who have been interfacing with the vendor thus far, aren’t intimately familiar with the underlying technology bits, so the conversation gets a little dicey before we all arrive on the same page.
Fortunately, since I had their attention, I was able to sneak in a suggestion to upgrade the back-end database for the software (which we self-host, but is essentially managed by them & their software) to something from this decade. MySQL 5.0 is old, guys, mmkay?
A developer needs a new table & some initial data-load to production; it’s passed QA. Great! They wrote the query a little weird, but nothing ol’ SQLPrompt can’t fix.
Commit outstanding stored-proc code & RDL files (SSRS reports) to source control. Even if I’m the only one working on them, I’d rather they be safe & track-able.
Ruh-roh. A developer set off some proc in a Dev db, it’s been running for over 30 minutes… and then they cancelled it! But wait, it’s stuck in rollback. Yes, kids, rolling-back a transaction can take even longer than the transaction itself, due to rollbacks being single-threaded.
Now, since the user’s hitting of the ‘Cancel’ button (aka “Stop” in SSMS) does not wholly and entirely cause a kill SPID command, I do that for them (since only sysadmin or processadmin can kill sessions). Then I run ye oldekill 64 with statusonly (64 happened to be the SPID) to check on its rollback status. Good news, everyone! It’s at… 0%.
And it stays at 0%. For fifteen minutes.
I move on to other tasks of course, since there’s nothing more I can do (short of pulling the plug on the server, which is never a good idea (unless you’re testing your Disaster Recovery Hope plan!). We also create a ticket to fix this stored-proc so that it doesn’t “take forever”, even if it has to churn through millions of records.
Finally, the statusonly check starts moving. It’s about 45% rolled back, with about “300 seconds remaining”. This, as anybody who’s been a DBA will tell you, is in “Microsoft time”. You know those progress-bars you typically get in Windows while something is happening? They’ll start off fast, make it to around 90-some percent, tell you there’s about 1 minute remaining, then take at least five minutes to “finish that last little bit”? Yeah, that’s a “Microsoft minute”. Patent-pending.
But fortunately for us, it does finish in just about the time it promised, maybe a bit longer. Just in time for the Dev to test what they wanted to test before going home for the day. And just in time for me to have my evening not ruined by a runaway transaction.
PS: Anybody who’s ever worked with SQL, including yours truly, has done this (set off a long transaction and/or tried to rollback a long transaction) at least once in their career. If you ask a seasoned DBA or DB-Dev, and they say “No, I’ve never done that!”, they’re a dirty rotten liar. =D
Are You Not Entertained?
If the life of a DBA sounds like fun to you, get in touch with your local IT recruiter and find out what the job market is like! You’ll hear all sorts of doom & gloom about how “the Cloud is taking over everything” and “the DBA is dead; long live DevSecDataOpsUnicorns!”.
No, I joke. Yes, some of the fundamental responsibilities of the DBA are shifting to the cloud vendors, to IaaS/PaaS services, but at the end of the day, being in IT is about learningtech and working with people. Build a solid CS/IT foundation, build your soft-skills, and then learn the latest & greatest tech stack. Or the mature enterprise stuff. There’s plenty of room for both!
This is partly inspired by the recent #StackOverflowPodcast episode in which Jon Skeet and the some of the Stack Overflow women talk about the Feminist movement and what it’s means to them.
So I’m not going to even “go there” in terms of the movement in general, where I stand, or anything really deep, because it’s a huge iceberg of a topic and I can’t do anywhere near justice to it. Even the discussion above was pretty basic & high-level, but they give lots of links in the show-notes to truly deep-dive into, if you feel like it.
One burning question, essentially, boils down to this:
Why is there still a striking disproportion of females in the Data Professionals career space (and, more broadly, the IT/Dev space in general)?
And like so many questions I ask here, I won’t actually have an answer for you (that would be too easy, wouldn’t it?). Instead, I’ll simply reflect on my own (very limited) experience in working with women in tech, and hopefully get your thoughts in the comments as well!
In a very small software shop, there were actually no women for a time. But shortly before I was hired, they brought on a husband & wife team – two developers who were excellent at what they did, and had, surprisingly, broken the stereotype of “spouses can never work together”. She was so easy to work with, and I think my experience here helped me at least start to understand some of the nuances around “women in tech” and what it all meant. Basically, she wanted to be treated just like “one of the guys” — which, I understand now, is, in itself, an anti-feminist phrase, but back then I wouldn’t have known — and it reflects the culture of the time, which is that this was a team of mostly male developers and we were still “finding our way” on the long trail of equality in the workplace.
So what this meant, in practical terms, was a couple things:
No bias for job-assignments of things like documentation, task management, or communication-centric tasks. While yes, she was actually quite good at these, and would later become the de-facto PM and Scrum-master for us, it was understood (and probably stated) that this was not “because she was female”, this was because she was the best at it. But again, is that specifically because she’s a woman? I don’t think so.
Addressing the group as “you guys” was perfectly acceptable.
Pay was equal – at least between the equivalent roles & seniority levels (e.g. her & her spouse). You don’t typically share or discuss salaries among peers, but we knew, and our bookkeeper ensured, that this was true. Because if it wasn’t, someone would’ve had some words about it.
Also, there were a few positive aspects of the culture that helped make it more equality-apparent, which I think were just byproducts of the quality of people hired. We didn’t do “dirty jokes” or have sexist (even unintentionally) discussions, nor did we engage in gossip or any kind of “just the guys” activities. We really just did the work and kept it professional, and any time we were outside the office together, it was almost always shop-talk. I think that’s the nature of a startup — you really don’t have time for anything else, any of the “fluff” or crud that spawns from idle hands & minds.
But it wasn’t all roses & sunshine.
A New Female Developer Candidate
After that dev moved on, we knew we had to replace her. And the company workload was pivoting a bit, so our candidate criteria weren’t the same as those of her position. But putting it that way makes it sound like we were either specifically looking for someone different, or that we had moved somebody else into her position and now had a completely different role to fill. Neither is the case, really; with a startup that’s organically growing and shifting, you don’t get the luxury of well-defined roles. You basically need what the business and the team needs at the time, and that becomes your reality, until it’s not, and your team pivots again to fill the new mold, learning & growing along the way.
So anyway, we were hiring for a somewhat nebulous developer position. And one of the candidates we saw was female. We did not end up hiring her — unfortunately, in my opinion. That’s not to say the candidate we did hire was bad; he was great too, that’s just not relevant here. After her interview, the discussions we had were interesting. And I think it would have been greatly beneficial if the previous dev (the woman I talked about above & who had left recently) could have been present to offer her insight into the hiring process; but she, understandably, was not available & already busy with her new stuff.
This new candidate had a good deal of “embedded systems programming” background, which was interesting because it was not at all what our software was about, but in hindsight, probably could have proved valuable in making our SDLC processes leaner & more efficient. She also had great general tech skills and was a quick learner. But ultimately the reasons not to hire came down to the dissimilarity of background vs our product, AND her personality as we perceived it — in a word, she was “nervous” and “not confident”.
This is a big failure, in terms of equality/feminism.
And as I said, this is all purely hindsight. None of us realized at the time what we actually meant. But that’s no excuse, just history. So let’s unpack that a bit. Or, in other words…
Of course she was nervous! She was A) in an interview, and B) surrounded by men. It’s not like we said anything or acted in a way that was actually misogynistic; we’d like to think we’d learned how to be open & equality-centric enough that anybody would feel welcome and able to talk about their experience and why they’re a good fit for the job & the company. We didn’t even have much of a “culture” to speak of — it’s not like we were a big enough team to even have cliques or established norms, we just kinda discussed our work, did the work, collaborated on the work, and went home at the end of the day to our families/friends. However, in the same breath, we DID have a “culture”, in the sense that we were a small tight-knit team (while in the office) with a set of personalities that, so far, worked very well together; and on a small team, personality-compatibility is important.
Anyway, here’s the crux. We didn’t even recognize that what we were saying was, underneath, an anti-equality statement:
She should have been more self-confident than the average male candidate, in an interview, in order to meet our expectations.
Now obviously, if you ask the hiring manager (aka owner/CEO/president/founder) of the company and the HR person, they’ll agree (rightfully so) that she was not hired due to the gap in technical experience & the fact that her skills did not fit with what we needed. And this is true, as I said before; we were doing web-based software in ASP.NET, with a SQL back-end, and none of those were at the top of her skill-set. So I’m not self-flagellating for us passing on her as a candidate. (Again, the person we did hire worked out just fine.)
I’m acknowledging, and apologizing for, the fact that we elevated her (completely understandable) personality/disposition to an artificially high importance in our discussion about the candidate.
That, I think, is what an equality-minded leader would try to make sure we avoid next time. If she had had very similar experience and skills to the next candidate, we should have certainly hired her. And if I were to retroactively predict the past-future (breaking all kinds of grammatical rules in the process), had she been hired, she’d have “come out of her shell” and gotten along swimmingly with the team & the work.
But again, this is all ancient history by now; it’s not like we can go back and change our decisions. We just need to be conscious of them and learn from them.
It’s always hard to say goodbye to a colleague, especially someone who’s so central and ingrained in the company lore and holds so much of the “tribal knowledge”. Hell, I was that guy just a couple years ago.
So now I’ve seen a couple such old-hats move on from my current team, and seeing both sides of the proverbial torch-passing is interesting. There’s definitely some very common, very important things that we should always do.
Documentation, documentation, and more documentation.
Indeed. Also, finishing critical tasks, handing off in-flight projects, re-assigning tickets, talking to managers, prepping teammates for the work overflow, and cleaning out that huge buildup of clutter that you’ve collected over the years. Virtual or physical… often both!
Unsurprisingly, where we all seem to differ widely is the human aspects. Breaking the news, saying goodbyes, doing those last-minute get-togethers and send-offs. What do those last few weeks and days look like? For some, it’s just business-as-usual up to the last minute — they’re literally so busy they have little other choice. That’s how it was with the helpdesk manager we parted with last year. I used some of the time to put together documentation and thank-you letters, which I hope ended up being helpful. Database diagrams were printed and taped. Wikis were written.
But the main thing is to make sure you exchange contact info and stay in touch. It gives the team a sense of comfort, knowing they can reach back out when those random questions that nobody’s thought about for several months resurface.
I’ve learned a lot from those folks that took the time to pass on their knowledge and made the effort to keep in contact. And I appreciate them for that! Today I’ll thank one of my exiting managers; she knows who she is. She taught me a lot about our internal application stacks, integration and interop, company culture, tribal knowledge, and not standing for anybody’s BS, including my own. Good luck with consulting, stay in touch, and kick some butt!
That’s all for this week. I promise I’ll work on that “database collation problems” post soon… :o)
I now present to you, a recent adventure in building & configuring a SSRS 2016 / VS2015 project to replace our aging SSRS 2008R2 / VS2008 setup. In trying to make things ‘better’, I wanted to centralize the storage of the Data Sources. I found a tip on StackOverflow with a neat (albeit hacky) idea on how to do this. So I started diving into it. Here’s how that went.
I have a bunch of published reports on a ‘real’ report server (SSRS instance), which were created off-the-cuff using Report Builder 3.0 (no source-control connection or anything). So the first thing I had to do was go and download each RDL (thanks, Microsoft… ugh!), to my local VS2015 project folder, so I could bring them into my SSRS project. I didn’t bother trying to download/copy the Data Sources (they’re basically just connection-strings, and anyway you can’t download them from Report Manager), so I settled for re-creating those in VS.
Then it was time to set up the solution & projects. I prefer to organize my reports in context-relevant folders, like “Operations”, “Marketing”, etc. In order to do this sensibly, within the VS solution, you need to create a Project for each major fold you’ll have. Makes sense; this is how the old solution was constructed too. Here’s where I tried to improve things: I created a “Datasources” project, which would house just the shared Data Sources. Here’s where I created my new data-sources, setting them up with saved SQL auth (login & pwd). Sure, maybe not the “best practice” but it keeps things simple — my SSRS user only ever has read-only db access, and the pwd is managed well enough to keep auditors happy.
Thus, my plan (in following the SO tip) was to have all the other projects’ Data Sources be pointers to these shared Data Source files (RDS’s). Sounds good in theory, right?
Starting to Implement
Well. I started small with just one project, one report, and one data source. I went into my Nate test project, did “add existing item” into the Reports folder, and browsed to pick the rdl that I had just downloaded from my live SSRS server. I then went to my Datasources project, copied the rds that I needed, pasted it into Nate test project’s Shared Data Sources folder.
Now at this point there are 2 copies of this file in my solution folder. We don’t want that. So I opened up trusty ol’ Notepad++ on the Nate test.rptproj file, and edited the XML node under <DataSources><ProjectItem><Name>My-Source.rds</Name><FullPath>My-Source.rds</FullPath></ProjectItem></DataSources> , changing the FullPath node value to <FullPath>..\Datasources\My-Source.rds</FullPath>. I then deleted the physical copy of the file at \MySolution\Nate test\My-Source.rds , so that only the shared one physically remains in the filesystem (\MySolution\Datasources\My-Source.rds).
Another way to accomplish this, you may read, is to right-click Shared Data Sources within your project, e.g. Nate test, and say “Add existing item..”, and go select the global shared RDL from your master Datasources project-folder. However, this still results in a physical copy of the file within Nate test folder! So in either case you’ll end up doing some editing & deletion.
With me so far? Okay, here’s where it gets interesting. I could successfully build and deploy this report to my SSRS server, verifying that it showed my “new thing” (I added a text-box that simply said “This is the new hotness!”). But I could not preview the report in VS! Sad panda. And of course, the most vague generic error message in the world:
An error occurred during local report processing
An error occurred during report processing
More Googling led to a somewhat random tip in the vein of “Go to the report properties, Open the report’s Data Source, and re-select the Shared Data Source from the dropdown”. Combine with “Edit the Shared Data Source, ensure your saved login/pwd is filled in, and Save it”. Presto! Previewing now worked.
But why is this? Well, as it turns out, it’s not all that surprising. The RDLs are stored without the saved login/pwd, because of course they’re just plain-text XML files. So to “protect you”, MSFT doesn’t store your SQL auth credentials, even if (and now matter how many times) you click “Save my password” on the connection-properties window (connection-string builder).
Great. Well, I did that once, for Nate test project. Let’s see what happens with another project (aka report folder), say Test again. So I add an existing RDL, I copy-paste the Data Source from the global shared Datasources project, manually edit the rptproj in Notepad++, delete the copied rdl, etc. I then try to preview the report… and behold, the same maddeningly vague error message!
“Well duh!” you may think to yourself, “the credentials still aren’t stored anywhere un-encrypted!”. Yes, dear read, you are correct. But do I want to right-click on this project’s Data Sources, edit it, fill in the credentials, save, rinse, repeat, for every single project?! A resounding NO!
So what should I do? I could go type in the credentials to the actual connection-strings contained within the ‘master’ rds files. Of course then they’re in plain-text for all to see… but wait, do I care? Another no. Let’s go edit some connection-strings!
Why did MSFT choose to name the XML node ConnectString in their rds schema instead of ConnectionString? Don’t ask me!
Here’s the first gotcha: My reports (RDLs) need to be re-pointed at the correct Shared Datasource, because currently, as they were downloaded from my ‘real’ SSRS server — which had my desired top-level folder layout of “Datasources”, “Nate test”, “Marketing”, etc. — their Data Sources xml node points at \Datasources\MyDatasource.rdl. This was the correct path on the SSRS server and still will be the correct path when all is said & done. BUT, to support local debugging/previewing, the build output has to make a copy of the rds in the Project’s bin\Debug folder, which it will happily do, and then just as happily proclaim Could not find a part of the path 'C:\Users\Nate\Documents\Visual Studio 2015\Projects\MyReportSolution\Nate test\bin\Debug\Datasources\MyDatasource.rdl'. — because DUH, the build system didn’t make you a ‘Datasources’ sub-folder under the Debug folder, why would it!? So by either manually editing the rdl file to remove the \Datasources\ path from the node, or by clicking on the Report’s Data Sources node in the Report Data pane and re-pointing it at the local Shard Data Source (which again, is itself a pointer to the globally shared Datasources!), you can fix this issue.
In the rdl XML, that looks like this (using strikethru to show what you remove):
And there’s another gotcha: After we do all this, if we open the rds from within VS solution-explorer, it still won’t “think” it saved the authentication! But the connection-string will be clearly visible with the plain password. Here’s a couple pictures of what that looks like.
Oh, by the way, did I mention how many times VS crashed on me while doing these gymnastics? At least a few.
So, what have we learned? Well, for one, this is a crappy situation born of poor in-product support. I should be able to configure Solution-level shared Data Sources, use them in as many Projects (within said Solution) as I want, and have VS configuration management support them; bonus points for doing so with saved & encrypted credentials. Ideally, when we check this into source-control, we’d check in the “DEV” environment flavor connection-configs. Then, when the reports get deployed to the “PROD” SSRS server, the same globally shared Data Sources are already present (and they don’t get over-written, thankfully by default!), configured by the DBA with prod credentials, and nobody in the development pipeline needs to know said credentials. Yay?
But alas. We live in an imperfect world. I also live in a world where I’m both the Report Developer, the Report Manager, and the Production DBA. So guess what I get to do? That’s right, not care about the connection-string security!
Okay, yes, I care a little. Depending on my mood. But not enough to turn my whole dang world upside-down trying to design and implement a robust multi-tiered-environment solution for one measly SSRS repository. Kudos to you if you do — you’re a better man than I.
Or “better woman”. But I’m not a woman. So I guess the correct phrasing would be “You’re a better woman than I am a man.” No, that’s not right. “You’re a better person than I am?” Sure, I guess that works.
Create the solution & projects
Create the global shared Datasources project, and create your Data Sources (RDS‘s)
Import your Reports (RDL‘s) to each requisite report-project
Manually edit your rptproj to point the data-sources at the central shared path
Manually edit your RDL‘s to remove previously established paths to the data-sources
Configure your rptproj‘s to output to the correct folders on the report server — this should be done by default, i.e. VS has the “intelligence” to guess these for you, but just double-check.
And now to go automate all this with PowerShell… right after this beer.
Anyway. Until next time, folks! If I’ve made any mistakes, missteps, or otherwise offending your technical sensibilities, and/or if you know of a better way to accomplish what I’m after, I’d love to hear from you in the comments! =)
The problem isn’t so much that the role is vaguely defined. Although, depending on the size of the IT org and the tech stack, it can vary widely from a jack-of-all (DB Dev, report writer, production ops, the works) to a highly specialized performance tuner who works with 7 other teammates, each of whom has a unique surgical specialty in the data platform. But that’s not the problem — well, not the main problem. It is a problem in the sense that the business folks, especially HR, are notoriously and astonishingly ignorant of what a DBA or related role actually involves. But you get past that once you start talking to the tech leads and IT directors.
No, the problem is that, like most higher level technical roles, you don’t really know how a candidate is going to function in it (the role) without actually seeing him or her.. IN it. Do they keep a cool head when production goes down? Do they have a solid plan of attack for the dreaded “everything is slow!” complaint-storm? Do they have a good handle on HA & DR architecture & implementation? Can you rely on them to actually practice and follow thru with those strategies? Can they be a continuous learner and keep abreast of new developments while still tempering that with wisdom & maturity, applying the correct tools to the proper problems? Do try add value to the team and organization by both teaching and learning from others?
These are truly difficult, complex questions that are nearly impossible to deeply assess and fully answer during an interview process. Largely because the only real evidence of their answers lies in actual experience. Sure, a cert here or an MVP there definitely helps your case. But at any rate, we try our best to chip away at the boulder.
Pivoting to a more positive note, I’ll share some of the better questions that I’ve experienced during my career so far.
Some good examples.
How would you design and build a data copy/sync process across/between tiered environments, say DEV-QA-PROD?
Really great question. This is a common problem is small-to-medium enterprises with legacy systems where DevOps hasn’t quite reached down to the depths of the internal application stacks and people are still dealing with “refresh cycles” on the order of months, quarters, or even years. You can approach it purely from a tooling perspective, but that’s not the whole picture. Thus, it calls for some thought and team-culture ideas as well as “knowing the nerd-knobs”.
We have a complex process flow that involves a lot of stored procedures, say 50 total. Some of these are non-sequential, meaning they can be executed in arbitrary order, while others need to be sequenced with each other in “blocks”. This is a vendor product, so ultimately, the customer gets to decide the schedule and order of execution of this flow. The solution needs to be maintainable by field engineers. How would you handle this?
Woah. Talk about diving down a rabbit-hole. This is interesting in the sense that it exposes a bit of the architecture and some of the potential pain-points that the team is hoping to solve, while leaving enough room for improvement and experimentation by the hopeful candidate. More to the point, it’s just an example of a more general technique, which to me is very effective: taking an architectural problem that actually comes from the “real world” (the company/team that’s interviewing) and asking for the candidate’s ideas on how to solve it. You don’t need to get in-the-weeds super-detailed about it, but outlining your ideas helps indicate how you think about complex challenges and shows what kind of value-add you would bring to the team.
And finally, a perennial favorite:
Tell me about a time you broke production, and more importantly, how you addressed and resolved it.
So many stories from the trenches involve downtime and mistakes, it’s good to ‘bond’ over them. It helps bring the egos back down to earth, and reminds us that we’re all just meatbags, making technology to do our bidding, occasionally to our own regret. It shows the candidate’s “pressure cooker” mentality, or at least, what they tell you about it.
If you’re a DBA, Dev, or IT pro, help your managers better understand your team’s needs when it comes to hiring. Get involved in the job description write-ups and screening process questionnaires. Barge your way into those ivory towers, if you have to — or you’ll regret the time you waste on candidates who really belong in a different role than the one you’re after.
If you’re a manager, PLEASE LISTEN to your reports and tech leads. They know what makes a good team member, they’ve been doing it for a long time. Don’t dismiss their advice or block them from being part of the hiring process — yes, they are busy, and yes, they can be crotchety, but their input is highly valuable toward bringing in effective & productive talent.
That’s all folks!
PS: I know I missed the “deadline” by about an hour..ish. I blame DST. Heck, it’s still Tuesday for the majority of the Western hemisphere. I’m not biased, but I write in English, so… ya know. Take it as you will. Now excuse me while I go hide from the blog-police in my ASCII-bunker.
..while there are likely even better ways to do this in the long-run, this quick & easy approach was sufficient to save me time and effort..
At this point in my career, I’m not managing massive environments, so I don’t feel the need (nor have the expertise) to use a large scale solution like DSC or SCCM. But I’ve had to install SQL Server a few times, so I figured it’s worth at least scripting out a standard pre-configured installation, so that A) I don’t need click through a GUI ‘wizard’ hearkening back to the ’90s, and B) the SysAdmins can “fire and forget” (read: stop bugging me about it).
Thus, I’m attempting to K.I.S.S., while making it configurable & repeatable. There are some limitations of this approach, as alluded above. It’s not “massively scalable” (scaleable? scale-able?) because:
The PoSh script still needs to be deployed locally to the server in question
The installer config (.ini) also lives locally (though it probably could be a UNC path, it’s just a file after all)
The script prompts you for the service account (SQL engine, Agent) credentials and the sa password using the Read-Host -AsSecureStringmethod cmdlet, so some meatbag still has to type those in. This is because we don’t have an enterprise pwd/secret-management system where I could, say, ask it for a service account credential set and tell it to embed that securely in a script without it actually being visible to me. So, while yes, they’re kept in a “vault”, it’s not query-able by anything else, so an admin still needs to copy & paste them into whatever configuration screen he’s working with at the time. Not ideal, I know, but we work with what we’ve got.
PS: Yeah, yeah, “don’t use sa, rename it or disable it; or use Windows Auth only!”. Rage, howl, fire & brimstone. I’m not going to argue about it; we can save that for another post. This environment dictates that its used during setup and then disabled later, so that’s beyond the scope of the installer config.
So yes, while there are likely even better ways to do this in the long-run, this quick & easy approach was sufficient to save me time and effort for the occasions when a new SQL box/VM needs to be spun-up.
A primer on SQL cmd-prompt installation & its arguments
And finally, twothings that I attempted to understand but ultimately failed to implement, because (apparently, at least to me), PowerShell remote-ing is a P.I.T.A.
First we need an .ini file to work with. You could either create it from scratch, or take it from an existing SQL box’s “Setup Bootstrap” folder. Example path C:\Program Files\Microsoft SQL Server\130\Setup Bootstrap\Log\20170801_073414\ConfigurationFile.ini — indicating this was an install done on 8/1/2017 at 7:34am. Right above that, at simply C:\Program Files\Microsoft SQL Server\130\Setup Bootstrap\Log\, you’ll see a Summary.txt file, which can actually come in handy while you’re testing these unattended installs and wanting to see why it failed.
The first link above, from MSFT Docs, does a pretty nice job of telling you all the things that make up this config file. You get to bypass the TOS prompt, enter service account details, specify drive letters (paths) for default data/log file locations & tempdb, slipstream update packages (UpdateSource), and even more advanced stuff like AG settings and whatnot. My example will be a simple standalone instance using the default name, so I’ll be sticking with the basics.
We can then use this file in the ConfigurationFile argument of setup.exe from the SQL Server install media. To put a little more color on that: the .ini file is really just a collection of command-line arguments to setup.exe; you could also list them all out in-line, but that would be tedious and silly. Here’s a couple major selling points of creating your own config file:
Slipstream updates (SP’s, CU’s), instead of having it go out to MSFT update servers (or *aghast* sticking with the original RTM bits, you heathen you!)
Specify drive letters / default file locations: sure, this may be considered old-hat if you’re running super slick storage, but I still find it makes management a bit easier if I know where my MDFs, LDFs, TempDB, & backups will always be.
Take advantage of 2016’s better TempDB setup options (# files, size & growth)
We will, however, keep a couple arguments out of the .ini file and instead throw them into the ArgumentList from the calling PowerShell script. Speaking of, here’s what the PowerShell script needs to do:
Prompt the operator (SysAdmin or DBA) for the SQL & Agent service account credentials, and (optionally) the sa pwd (if using it).
Fetch our install media from the central network share where we store such things (server & office ISOs, for example).
Mount said ISO to our virtual disc drive.
Run its setup.exe with the following arguments:
The config .ini file
The service & sa accounts
After it’s done, un-mount (dismount) the ISO.
Then the DBA can connect to the brand-spankin’-new running SQL instance and do other post-setup configurations as desired (i.e. set max-memory, maxDOP/CTFP, etc). And sure, those could also be done in PowerShell (thanks in no small part to the awesometeam at DbaTools), I chose not to do so in this case.
As the bloggers say, “that’s left as an exercise to the reader”.
Plus, they’re never quite as deterministic as we’d like them to be — they depend on the server’s compute resources, i.e. memory size & CPU cores, as well as estimated workload & environment tier, so it’s often a gamble in “how correct” your initial settings will be anyway. Still, anything is better than the defaults, so configure-away!
Here are the Gists I’ve created to go along with this post. If I’ve made a mistake, or if you, dear reader, have a suggestion, we can incorporate them into the gist without me having to go back and edit the blog post!
I’d love to get feedback on how you would improve this, what you might do differently, etc. Drop me a comment or a tweet!
; There are THREE static placeholders you need to replace/type when you use this:
; 1. <YOUR UPDATES/PATCHES FOLDER>, which is for slipstreaming whatever service-pack(s) and cumulative-update(s).
; 2-3. <YOUR DOMAIN> and <YOUR DBA GROUP>: for specifying domain account(s) which are given SQL sysadmin ('sa') rights,
; hence, usually your DBA group. You could move that part out to the calling script instead, if you wanted,
; but I chose to leave it here because it's not likely to change much within an environment.
; required first line
; Specifies a Setup work flow, like INSTALL, UNINSTALL, or UPGRADE. This is a required parameter.
; Specifies that SQL Server Setup should not display the privacy statement when ran from the command line.
; Use the /ENU parameter to install the English version of SQL Server on your localized Windows operating system.
; Setup will run silently, logging to files in the setup-boostrap directory
; Specify whether SQL Server Setup should discover and include product updates. The valid values are True and False or 1 and 0. By default SQL Server Setup will include updates that are found.
; If this parameter is provided, then this computer will use Microsoft Update to check for updates.
; Specifies features to install, uninstall, or upgrade. The list of top-level features include SQL, AS, RS, IS, MDS, and Tools. The SQL feature will install the Database Engine, Replication, Full-Text, and Data Quality Services (DQS) server. The Tools feature will install shared components.
; Specify the location where SQL Server Setup will obtain product updates. The valid values are "MU" to search Microsoft Update, a valid folder path, a relative path such as .\MyUpdates or a UNC share. By default SQL Server Setup will search Microsoft Update or a Windows Update service through the Window Server Update Services.
UpdateSource="<YOUR UPDATES/PATCHES FOLDER>"
; Displays the command line parameters usage
; Specifies that Setup should install into WOW64. This command line argument is not supported on an IA64 or a 32-bit system.
; Specify a default or named instance. MSSQLSERVER is the default instance for non-Express editions and SQLExpress for Express editions. This parameter is required when installing the SQL Server Database Engine (SQL), Analysis Services (AS), or Reporting Services (RS).
; Specify the root installation directory for shared components. This directory remains unchanged after shared components are already installed.
; Level to enable FILESTREAM feature at (0, 1, 2 or 3).
; Specifies a Windows collation or an SQL collation to use for the Database Engine.
; Set to "True" to enable instant file initialization for SQL Server service. If enabled, Setup will grant Perform Volume Maintenance Task privilege to the Database Engine Service SID. This may lead to information disclosure as it could allow deleted content to be accessed by an unauthorized principal.
; Windows account(s) to provision as SQL Server system administrators.
When a date range, or time period, is specified in SQL, it’s easiest, clearest, and most concise, to use a “greater-than-or-equal-to Period-Start, and less-than Next-Period-Start” logic. Mathematically speaking, we are defining the range as closed on the left, open on the right.
This is a bit rant-y, but… indulge me. I’ve been writing/refactoring a lot of old reporting queries. And, like most reports, they deal with dates and datetimes — as parameters, boundaries, or where/join predicates. I also got way too intense with a recent SSC post (Sql Server Central), which fueled the fire even more.
SQL Server is very good at handling temporal datatypes and calculations against them. We’ve got functions like dateadd, datediff, dateparts, datatypes datetime2 and datetimeoffset, date, time, etc. It supports all sorts of format conversions if you need to display them in various ways.
..even though that should be left to the presentation layer!
Here’s the issue. Well, there are several issues, but we only have time for a few.
Here’s the first problem
Report users don’t understand the “end of a time period” problem. I don’t have a good name for it; others might call it the “Day plus one” problem or the “Less than date” problem. What do I mean by this? Well, let’s back up a bit, to DBA Commandment #6, “Thou shalt not use between with datetimes.” In order to first understand the issue, we have to understand why this is a commandment.
When a date range, or time period, is specified in SQL, it’s easiest, clearest, and most concise, to specify it like so: @TheDate >= @StartOfPeriod and @TheDate < @StartOfNextPeriod. Mathematically speaking, we’re defining the range as “closed on the left, open on the right”. In other words, Min <= X < Max.
The reason we do this with datetimes is found right there in the name of the datatype — it has (or can have) a time component!
Let’s talk examples
Say you’d like to report on the month of March 2017. How do you determine if your data-points (stored as datetime or, hopefully, datetime2) are within that period, that month? Well sure, you could write where month(MyDateColumn) = 3 and year(myDateColumn) = 2017 …
NO. That is horrible, don’t do that.
It’s not SARGable and renders your index on that column useless. (You do have an index on it, don’t you? No? Make one!) Okay, let’s stick with something SARGable. How about MyDateColumn between '20170301' and '2017-03-31T23:59:55.999'? (You did read this post about using culture-neutral datetime literals right?) But wait! If your data is a datetime, it’s not actually that precise — your literal gets rounded up to 20170401 and you’re now including dates from April 1st (at midnight)!
Many developers and report-writers assume that the values in their data will never be within the typical “1 second before midnight” or “1/300th of a second before midnight” escape window of your “3/31/2017 23:59:59.997” bounding value. But can you guarantee that? Didn’t think so. Worse, if you use the .999 fraction as given in the 2nd example, you’re either “more” or “less” correct, and nobody can actually tell you which way that pendulum swings because it depends on the statistical likelihood of your data having actual literal “midnight” values vs. realistic (millisecond-y, aka “continuous”) values. Sure, if you’re storing just a date, these things become a lot less complicated and more predictable.
But then why aren’t you storing it as an actual date, not a datetime!?
So what’s the right answer?
As I said, “greater than or equal to ‘Start’, and less than ‘End'”, where ‘End’ is the day after the end of the period, at midnight (no later!). Hence, MyDateColumn >= '20170301' and MyDateColumn < '20170401'. Simple, yes?
But wait, there’s more!
I mentioned “date-pickers” in the title. When it comes to UX, date-pickers are a sore subject, and rightly so — it’s difficult to truly “get it right”. On a “desktop-ish” device (i.e. something with a keyboard), it may be easiest on the user to give them a simple text-box which can handle various formats and interpret them intelligently — this is what SSRS does. But on mobile devices, you often see those “spinner” controls, which is a pain in the arse when you have to select, say, your birth date and the “Year” spinner starts at 2017. #StopIt
I mean, I’m not that old, but spinning thru a few decades is still slower than just typing 4 digits on my keyboard — especially if your input-box is smart enough to flip my keyboard into “numeric only” mode.
Another seemingly popular date-picker UX is the “calendar control”. Oh gawd. It’s horrible! Clicking thru pages and pages of months to find and click (tap?) on an itty bitty day box, only to realize “Oh crap, that was the wrong year… ok let me go back.. click, click, tap..” ad-nauseum.
The point here is, use the type of date-picker that’s right for the context. If it’s meant to be a date within a few days/weeks of today, past/future — OK, spinner or calendar is probably fine. If it’s a birth date or something that could reasonably be several years in the past or future, just give me a damn box. (Heck, I’ll take a series of 3 boxes, M/D/Y or Y/M/D, as long as they’re labeled and don’t break when I omit the leading-zero from a single-digit month #!) If there’s extra pre-validation logic that “blocks out” certain dates (think bill-payer calendars or Disneyland annual-pass blackout-days), that probably needs to be a calendar too.
..just make sure it’s responsive on a mobile device.
And in all cases, pass that “ending date” to your SQL queries in a consistent, logical, sensible manner. For reporting, where the smallest increment of a period is 1 day, that probably means automagically “adding 1 day” to their given end-date, because the end-user tends to think in those terms. I.e. if I say “show me my bank activity from 1/1/2017 to 1/31/2017”, I really mean “through theend of the month“, i.e. the end of the day of 1/31. So your query is going to end up wanting the end-date parameter to be 2/1/2017, because it’s using the correct & consistent “greater than or equal to start, and less than start-of-next” logic.
I know it’s not easy to explain to business folks, and it’s not easy to implement correctly. But it’s important. The >= & < logic is clear, concise, and can be used consistently regardless of underlying datatype. You just need to adjust your presentation layer (whether that’s SSRS parameters or a .NET date-picker) to convey their intent to the user, whether that’s “show/enter the last day of the month, but translate to the next day to feed to the query/proc.”, or “make them enter the next-day (day after the end of the month/period) and understand the ‘less than’ logic.” I’m more inclined to the first, but it depends on your audience.
We need to talk about this stuff – candidly, openly, broadly, deeply.
More accurately, to legacy enterprise management.
Let’s say the following directive comes down from on-high: “Hey, our CEO wants us to provide better financial metrics reports and a dashboard that management can see to show real-time stats about the company.”
I mean… Sure! Yay, digital transformation, modernization, mobile friendly, all that good stuff!!
So, I have some thoughts on this, because I’ve seen the current state of things in small-medium enterprise, and am anxious to help improve that state to provide better value to the business. To misquote Dennis Miller, I don’t mean to on a rant here, but…
First topic: Reality Check
It starts at the top, with a couple realizations:
Data is ever-growing.
We need to get smarter about managing its growth, including archiving/retention schemes, data warehousing, etc.
This involves compliance regulations and operational resources.
We need to ensure compliance with biz standards and data shelf-life.
We need to automate as much as possible to avoid over-burdening our human resources (and to some extent our servers too).
For example, you can’t expect the same response-time for a query into 10-year-old financial data as you do for 1-year-old data.
Traditional SSRS (SQL Server Reporting Services) is an operational time-sink.
We spend way too much time assigning access, creating redundant “on demand” reports, and making seldom-used email subscriptions.
We’re probably running on an old version, say 2008R2
Vast improvements have come to the MS Data/BI platforms in the last decade and we need to take advantage of them.
It’s not mobile-friendly at all; it’s not even modern-browser friendly, as some of its UX elements are still explicitly functional in Internet Explorer
We tacked-on some 3rd-party application to attempt to bring some data-warehouse functionality into the environment, but only 1 person “knows it intimately” and is comfortable developing new reports with it.
Our ERP system, in its current state/version, is a tangled mess, to the eyes of a DBA & query-writer/report-writer.
We’ve bolted-on so much customization and special-configuration that it’s not suitable for stock/canned reports from the vendor, even if we upgraded to a version of the app that had a decent reporting engine.
We can’t even decide on very basic things like “What is a ‘unit‘ of production?”, or “What are the different areas/groupings we break-out for revenue metrics?”
Ok sure, maybe we can agree on what those groupings are, but we can’t even get a consensus on what we call them!
Second topic: Single Source of Truth
We need to agree on a standard, documented, official set of business rules that answer such questions as “how do we measure revenue?”, “what are our different sub-orgs/departments/groupings for how we report on revenue?”, “what is ‘production output’ and how do we measure it?”, “how do we calculate bonuses for this group of employees?”, etc. More than that, we need to agree on naming things – we need a common, consistent nomenclature and understanding of what it means when someone says “N# Units”, “Department X” or “Order Aging” or “Membership Level” or “Bonus Type Y”.
And even more than that, we need to map those concepts to concrete, documented rule-sets that are manifested in the data somehow (from the simplest example, a “look-up table” or “reference table”, to the complex examples like a “data mart” or “analysis cube” or “ETL process”). This concept is sometimes called a “data dictionary”, which kinda belies its complexity, because it’s really more of a “data encyclopedia” – it needs to document what, how, why, & when.
What our concepts/terms/data-points mean, how they’re used, why they’re useful, & when they should be used.
Third topic: KISS and KPI’s
Management reports need to be simple. Yes, there are power-users who want the detail, and there are auditors who in fact require the detail. But your average C-level (or even P/VP-level) exec doesn’t care about that stuff – they want very simple answers to deceivingly simple (i.e. can be very complex under-the-hood) questions, like “How much money did we make this quarter for department X?”, or “What kind of productivity bonus do I give to group Y?”. But that’s just the beginning – that’s descriptive analytics. What they really want, but are sometimes too afraid to ask, are more powerful questions, like “How much money can I expect to make in market Z or state XX?”, “What are our expected new loyalty program memberships, and how much will they profit us?” — predictive analytics. (And we’re not even going to touch prescriptive analytics yet, because you’re not ready for that.)
KISS means we need to try our best to hide the nitty-gritty details and “under the hood” logic/calculations from the end-user or report audience. But, that means fully knowing and understanding those details and rules and logic flows so that we can implement them!
KPI is Key Performance Metric. That’s the golden nugget, the one piece of information that the manager/report-viewer ultimately is after, the thing that makes them go “Got it! That’s the answer I was looking for!”, so they can make their business-decision and move on with their day. These aren’t necessarily just single numbers (like an overall revenue figure); they can be pie-charts, bar-graphs, a clear & concise grid, or whatever makes the most sense for the business-problem/business-decision at hand.
This all sounds fantastic, right? So what’s the catch?
Fourth topic: Time & Effort
Time is money, which is resources, which is people, learning, training, developing, implementing, testing, validating… rinse, repeat. You don’t put that all on the shoulders of a lone DBA; that life-cycle touches many different disciplines and team members – managers, business users, accounting folks, marketing people, analysts, developers, testers, operational leads, and yes, of course, all of IT infrastructure (helpdesk, engineering, DBA). And you don’t just buy a box off the shelf at your local software retailer and say “look, we’re gonna implement Tableu!”, wave a magical IT wand, and call it day.
Now we, as technologists, are more than willing to learn and educate ourselves, but…
There needs to be a matching dedication from the business to that effort, and to the platform(s) that is/are chosen.
That means, in concrete terms, a few things:
Training budget & resources
Conferences, courses thru online training providers, cross-team collaboration.
Product & technology investment
Upgrades, net-new products, whatever is needed.
Time allowances & agreements
Dedicated scheduling where the “daily grind” operations take a back-seat and we can focus on the new stuff.
Support from SME’s
The ability to call-out to a qualified expert when critical questions or roadblocks arise.
Can be contractors, consultants, service-providers, or platform-providers. The point is, you only use them if you need them, so you keep the cost relatively low.
That’s if you’re dedicated to in-house team/ability build-up. If you want to outsource, you have a different set of challenges:
Contractors are expensive!
Their requirements are exceedingly rigid.
They’re likely to scoff (yes, even outright laugh) at the quagmire of data & logic & rules that we’ve created and/or want to build into our “magical reporting stack”.
They’ll still require that same product/tech investment.
No contractor is going to accept your old legacy SSRS instance as a baseline for building a modern, responsive, effect reporting system. The first thing they’ll say is “upgrade that, & come back to us.”
Likewise for your legacy ERP system – sure, it’s a little less obsolete, and there are probably plenty of shops running it & developing on it, but good luck getting new-hire contractors to embrace it; at best, they’ll begrudge it; at worst, they’ll charge exorbitant fees for having to work on such an old platform.
Technical debt is their worst enemy.
Like it or not, like most decades-old enterprises, we have technical debt up the wazoo.
Contractors won’t work in a debt-heavy environment; they’ll insist you “fix the debt” and come back to them in a few months/years when it’s all happy & pretty & green.
Technical debt is our enemy, too, but at least we “own” it – i.e. we’re aware of it and we have ideas on how to fix it, if/when we ever get the time.
It’s like our city roads: at least we know where the potholes are, and how to avoid them.
My point, from this rambling and probably way too lengthy post, is this: We need to talk about this stuff. Yes, Mr. Manager, I know you already said that. Let me embellish:
We need to talk about this stuff, candidly, openly, broadly, deeply, cross-functionally(made-up phrase #2), even company-wide.
Because, while the end-goal is deceptively simple (“We want report dashboards!”), the underlying systems are complex, with lots of moving parts, requiring lots of knowledge (both domain/biz and tech), and lots of management (compliance, governance, automation, visibility/monitoring).
It’s not just a technology challenge. It’s a people challenge. It’s a cultural challenge. It’s an organizational challenge.
It’s a challenge that, when faced, met, and overcome, can lead to spectacular growth and success for all involved!
(And that’s my attempt to end this rant on a positive note. Enjoy!)
PS: No, I’m not happy about WordPress’s inability to understand the ‘style’ attribute of a simple <ol> tag, but I tried… so apologies if the outlines are not intuitive because each level is just another set of numbers, instead of Word-style outlining like 1.. a.. i.. etc. Grr arg!
Unless you’ve been living under/in the proverbial rock/hole-in-the-ground, you’ve heard of this thing called MongoDB. It’s one of the biggest flavors of NoSQL database systems out there – specifically, it’s a “document DB”, which means it stores semi-structured data in the form of JSON documents, grouped into collections, grouped into (of course) DBs.
Now, I’m a Windows guy. SQL Server & Windows Server are my comfort zones. I like GUIs, but I’m ok with a command line too. PowerShell has become my friend good acquaintance. But every once in a while, we have to step outside our comfort zones, no? So when I was told “hey, you’ve got to get a handle on managing these MongoDB instances that we’ve got running on servers so-and-so, as part of this ‘Imaging system’ application”… I thought “Hahaha…”, but what came out was “Sure thing!” Something a bit like this:
So the first order of business was to decide on a MongoDB “GUI” – a Windows app that could at least connect-to and give me a visual overview of the running MongoDB instances (properly referred to as a mongod, stemming from the Linux term “daemon”, which on the Windows side is basically like a process or service). I tried both the “official” desktop app from the big org, MongoDB Compass, and a neat open-source tool called Robomongo.
And I actually like them both, for different reasons; most of which can probably be attributed to my lack of depth with the technology, but hey. Anyway, Compass is really nice in that it gives you this kind of statistical overview of the collections in your DBs, performing some basic aggregates on a 10% or 1k sample of the collection documents to give you a graphical 40-thousand-foot view of the data. But where it breaks down for me is that little “query” bar, which is just an empty pair of curly-braces. It’s only for “selecting” (finding, querying); no other operations to see here. So we can’t manipulate our data with it, but it’s definitely great for viewing!
Whereas with Robomongo, I can right-click on the DBs/collections and do very simple things like “show documents”, “show statistics”, etc. And it actually writes the equivalent mongo shell command for me to poke at; say, to inject more logic to the find to get something specific or to write a new command or two as I read thru the docs and learn things like aggregates, indexes, and whatnot. Being a shell, it allows us to write or update data as well as read it.
Even though it’s a CLI/Linux flavored technology, it works perfectly fine in Windows… Except for one thing. So, you install it as a service, and you typically start & stop services using net start & net start and as long as your service’s CLI arguments are all correct, you should be good — in theory! Trouble is, the service tends not to stop gracefully. So I found that, instead, the following command was more useful: mongo admin --eval "shutdownServer()". This uses the actual mongo shell to send the native shutdown command to the mongod, instead of relying on the Windows services gymnastics to do it correctly.
It just goes to show, dear reader, that you’ve got to get your hands dirty and try out new technology before dismissing it as “not my job” or “somebody else’s problem”.
PS: Nope, that’s not Compass’s logo or anybody else’s; I made it meself, with good old Paint.NET!