Wednesday, August 19, 2009

Querying Active Directory vs SQL Server

Warning: This is a meandering rant.  Drink plenty of caffeine to avoid crashing into a guard rail.

I’ve had a few friends/co-workers hit me up about this subject recently.  For what reason – I don’t know.  Why me?  I’m only guessing it’s something to do with my extensive work gluing together web services and web apps using SQL Server, Active Directory and plumbing in things like WMI/SWBEM/WQL and so on.  My background is more in software development than in systems engineering, so I look at something like Active Directory and immediately want to take it apart and see what makes it tick.  Kerberos, KCC topology, links and replication processes, DFSR, events and logs, WMI and ADSI, and LDP.  Everything in the software world looks like Legos to me.  Microsoft really doesn’t get the credit they deserve for, as Paul Thurrott says: building a business of building platforms.  They are the king of platforms.  And nobody goes to the extent they do to document their numerous platforms.

Where was I?  Oh yeah, sorry.  I was blabbering about myself, as if anyone cares.

So, the issue is what are the concerns and what are the trade-offs between querying Active Directory directly as opposed to using SQL Server as a middle man (between the data and the consuming service or application).  The answer is (drum roll please):  it depends.

For most general information gathering, it’s fine to query AD directly.  The AD schema and various tentacles into it are actually tuned more for read/fetch operations than for write/modify operations.  That’s because the nature of AD is authentication and look-up.  Both of those chores involve far more reading than writing.  Don’t believe me?  Just think about how often you logon to you network as opposed to how often you change your password or some property within your user account.  Think about how often you boot up a computer as opposed to how often you join it to the domain.  That’s one of the reasons Microsoft didn’t push harder to move the AD partition suite into a relational database.  Some argued for doing that, but really, the cost factor just doesn’t offset the gains in performance or capability.

That said (yes, all of that… phew!) there are times when it makes sense to introduce a data store in the middle.  This can be XML, SQL, or just about anything.  Why?  Because maybe you want to maintain a faster performing cache with a subset of the overall data store.  You can dynamically filter LDAP, WMI, and so on, sure.  But sometimes it helps to at least tape things together a bit.  An example?  Ok.  One project I worked on involved pulling aggregate information about employees from several data stores, including AD.  The end result needed to be able to instantly display employee data, as well as AD and Exchange data, and things about the computer that employee uses.  Rather than having the web app reach out and query all of the domain controllers (to get the last logon data, for example), I used a SQL process to fetch and update that periodically and that allowed the web application to fetch the end result much quicker and with more reliability.

A relational database is also great for storing historical information in ways that are often cumbersome to do with native platform APIs.  One example is monitoring AD replication status.  A project I worked on involved running REPADMIN, NETDIAG and DCDIAG, on each of the clients domain controllers and feeding parsed results into a SQL database on a recurring schedule.  This is very similar to what MOM/SCOM does obviously, but you can do much of the same yourself with a little elbow grease (and much cheaper).  The EVENTTRIGGERS command also helps in this regard.  When collected into one data store like this, it is much more efficient to analyze it, query it and run secondary processes from it (alerts, reports, automated diagnostics and repair, etc.).

As I said, this article is really a frame of thought about how to approach these two avenues, more than giving discrete examples.  I may post some real examples if anyone is interested.  Some of them might take some serious prep work to shoehorn into a blog post.

No comments: