Week In Review : 06-15-2014

It’s time for another exciting edition of WIR! This week was filled with updates! We rolled updates to our Domain Controllers, and one of them took nearly two hours to come back from a reboot! Normally not a big deal, but when you’re 30mi away…a little stressful! I also rebuilt my work laptop this week. Earlier this year I had done something stupid with an external drive and wound up with Windows installed on Partition 2, on a disk with just one partition! Needless to say, rebooting my laptop didn’t happen all that often!

Speaking of Active Directory Domains, we are moving ever closer to having just one domain on campus. The internal private Edwards domain went away this week! It’s always just a little nerve-wracking when running through dcpromo to remove stuff, but it went well. It didn’t appear to leave any unsightly metadata floating around AD!

Also spent a fair amount of time talking with the guys at Edwards to go over how they image machines. They routinely call us to have a workstation DNS entry removed, and needless to say it’s a little annoying. They ought to be able to do this themselves, but since it’s not their DNS they don’t have rights. Not to mention the way they do their imaging is a little different.

This is how it goes: a user is up for a new computer. In an effort to minimize the inconvenience this can sometimes be, they image the new computer, load their software, and finally join it to the domain. This last part is what gets them: they tack a “-1” onto the new workstation name. Normally not a big deal, but the next part is where it gets hairy.

The new workstation is delivered to the user, the old workstation is unjoined from the domain, the new computer is renamed to the old computer name…and boom. Sometimes this works (they say) but I can’t imagine how. So, the first comment was: hey, how about using service tags or MAC addresses to identify these machines uniquely, then you will never get hit with this issue. Nope, they like usernames as computer names, since it makes it easy to correlate user to workstation. Apparently it’s too difficult to track that down in SCCM? Not likely, but oh well.

So, what to do? Well, we could just have them call every time, but that’s a hassle, not to mention there’s no code involved! My solution: create an Orchestrator Runbook that is provided a computer name. With that information it scrubs AD and removes the DNS entry as well. This Runbook would run in the context of a service account that has rights to do this. They would simply log in to it with their admin account, we would use their group information to verify that the computer they want removed lives in their OU, and then remove it and the DNS entry. If it doesn’t live in their OU, it fails. Sounds elegant to me 😉
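The "does it live in their OU" check is the part that keeps this safe, so here is a rough sketch of how it might look. Test-ComputerInAllowedOU and the example distinguished names are made up for illustration; they are not part of any existing Runbook, and the real version would also need to map the caller's group membership to the OUs they own.

```powershell
# Hypothetical helper: returns $true when a computer's distinguishedName
# sits underneath one of the OUs the caller is allowed to manage.
function Test-ComputerInAllowedOU {
    param(
        [Parameter(Mandatory = $true)] [string]   $ComputerDN,
        [Parameter(Mandatory = $true)] [string[]] $AllowedOU
    )
    foreach ($ou in $AllowedOU) {
        # An object under the OU has a DN ending in ",<OU DN>".
        # The leading comma stops OU=Edwards from matching OU=NotEdwards.
        if ($ComputerDN.EndsWith(",$ou", [System.StringComparison]::OrdinalIgnoreCase)) {
            return $true
        }
    }
    return $false
}

# Usage inside the Runbook might look like this (requires the ActiveDirectory module;
# the OU below is a placeholder):
# $dn = (Get-ADComputer -Identity $ComputerName).DistinguishedName
# if (-not (Test-ComputerInAllowedOU -ComputerDN $dn -AllowedOU 'OU=Edwards,DC=example,DC=edu')) {
#     throw "Computer $ComputerName is not in an OU you manage."
# }
```

Comparing on the DN suffix keeps the check to a single string operation, at the cost of assuming the DN comes back in its normal unescaped form.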

A final solution, which will take much longer to implement, will be an appliance from BlueCat that sits between AD DNS and Proteus DNS. This appliance will use the Proteus web service and the MS RPC to translate information between AD and DNS. This will get us to a very similar place as my Runbook idea, but the one advantage is this will also get us to a place where we can pull our AD DNS out of the public facing DNS, effectively hiding thousands of servers and workstations.

Another fun one that happened, you can’t push the ops client to a Domain Controller using SCCM Client Push. If someone tells you they can, they are lying to your face! I’m going to write up a post, but the short of it is, Client Push relies on a local administrator to work, how do you do that on a Domain Controller?

OH! I also polished off my SQL PowerShell, so I’ll write about that as well. It works pretty well, created some new functions to let me more accurately find SQL Instances, still don’t have a good way to talk to the WID but it’s kicking around in the back of my head.

I also broke Active Directory Certificate Services..see you next week!

Oh, I suppose we should talk about that? So, I’ve been slowly pulling servers out of the old Ops servers and bringing them over to the new. Doing pretty well, 230+ servers in the new and growing, and under 50 in the old. The Domain Controllers got pulled in this week as well as the Certificate servers.

So, I’m working through the alerts, tuning Ops so I only hear what I need to. So, I started getting alerts about ADCS (Active Directory Certificate Services) and started working on that issue. I was seeing errors about the CRL Distribution Point being offline.

As part of the troubleshooting I had already decided to stand up a vhost to hold CRLs among other things, so I reconfigured the CA to use that. After restarting the service as prompted by Windows, Certificate Services failed to start. The net result here was that the CRLs were out of date and just needed to be published and then copied to the web location.

The only bit left here is to automate both the publishing and the copying of the files over to the web server. Of course this seems well suited to creating a PowerShell solution, check back later for that!

See you next week!

Week In Review : 06/08/2014

Still a lot of programming this week, but like I said before, I think that is more the norm than not anymore. We did some interesting Active Directory stuff this week. We had a handful of servers get their AD objects deleted at some point, and we found out about it at the beginning of this week. Now my guess is these were deleted about 3mo ago, and they either rebooted recently or attempted to change their password recently.

About a year or more ago we changed our audit policy and started using Advanced Auditing. We were concerned about user account and group account management, but it turns out we should have put in computer account management as well. When a computer object is deleted, event 4743 is logged in the security log of the domain controller. We searched and couldn’t find that entry anywhere. It was only when I started researching that event that I found you need to tick the boxes for computer account management.

Along those lines we had a similar issue: our admin accounts in our QA domain were disabled. Since we do very little auditing at all in there, I enabled the same features so we can see when that happens. When a user account is disabled, event 4725 is logged. To go along with both of these events I’m going to update our reporting in Ops on things like this.
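Once the auditing is turned on, pulling those two events back out of a domain controller is pretty short. This is only a sketch: New-AccountAuditFilter and the 'DC01' server name are placeholders I made up, and the seven-day window is an arbitrary default.

```powershell
# Event IDs mentioned above: 4743 = computer account deleted,
# 4725 = user account disabled.
function New-AccountAuditFilter {
    param(
        [int[]]    $EventId = @(4743, 4725),
        [datetime] $Since   = (Get-Date).AddDays(-7)
    )
    # Hashtable in the shape that Get-WinEvent -FilterHashtable expects.
    @{ LogName = 'Security'; Id = $EventId; StartTime = $Since }
}

# Usage (needs rights to read the Security log on the DC; 'DC01' is a placeholder):
# Get-WinEvent -ComputerName 'DC01' -FilterHashtable (New-AccountAuditFilter) |
#     Select-Object TimeCreated, Id, Message

# The subcategory can also be switched on locally from an elevated prompt,
# although a domain GPO would normally own this setting:
# auditpol /set /subcategory:"Computer Account Management" /success:enable
```

Keeping the filter in its own function means the same query can be reused for whatever reporting gets built on top of it.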

While doing all this I found a very nice support article listing out the various event IDs and what they mean.

All of the servers that are supposed to report in to System Center Advisor are now doing so. I feel rather stupid about the original issue. My first problem was that I wasn’t patched up to where I needed to be in order to even use the preview, so that was step 1. The next part is where it gets a little fuzzy. I don’t actually recall patching the clients on any of the agents reporting in, yet all 3 domain controllers reported an updated agent. Coincidentally, those 3 domain controllers were the only servers showing up. After some investigating, the SCA guys from Microsoft quickly realized I had not patched my agents. So, I must have patched the DCs, I just don’t recall doing it, hence the stupid.

So the result of this is a working program in SCCM that will patch outdated clients, which is good as my next step in this whole saga is to patch production. It’s either patch, or move over to R2 and currently I’m leaning towards patching. So currently in QA when a server gets discovered the ops client gets pushed down to them, now it will also get patched. Then the only manual part of this process left is to add them to the advisor management pack.

It’s been lots of fun talking with these guys about stuff, I’ve been invited to participate in an SCA board to go over new features and talk about how things work. My recent experiences dealing with some of the internal folks with Microsoft really make me want to work there more.

I’ve done some fun things with PowerShell this week. A new SQL module has been fleshed out and validated against just about all instances of SQL. I’m still having a hard time working with a connectionstring for the Windows Internal Database, but it will come. I’ll most likely write about this module after I’m done with this WIR.

I’ve updated the Orchestrator module. The Start-scoRunbook function worked incredibly well if you only ever had one parameter, but as soon as you throw more than one at it, it freaks out. How I originally handled it was dumb, so now the function requires a hashtable object. It then compares the key (property name) field against what the Parameter object returns. This worked out extremely well, and again it’s probably a topic for a whole blog post as well.
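To give a rough idea of the hashtable approach, here is a sketch of the matching step on its own. It is not the module's actual code: Resolve-RunbookParameter is a made-up name, and it assumes each parameter object exposes Name and Id properties.

```powershell
# Hypothetical helper: takes the caller's hashtable (parameter name -> value)
# and the runbook's parameter objects, and returns a hashtable keyed by
# parameter Id, which is the form a start request could then be built from.
function Resolve-RunbookParameter {
    param(
        [Parameter(Mandatory = $true)] [hashtable] $Values,
        [Parameter(Mandatory = $true)] [object[]]  $Parameter
    )
    $resolved = @{}
    foreach ($key in $Values.Keys) {
        # Find the runbook parameter whose name matches this key (case-insensitive).
        $match = $Parameter | Where-Object { $_.Name -eq $key } | Select-Object -First 1
        if (-not $match) {
            throw "Runbook has no parameter named '$key'."
        }
        $resolved[$match.Id] = $Values[$key]
    }
    $resolved
}

# Usage with stand-in parameter objects:
# $params = @(
#     [pscustomobject]@{ Name = 'ComputerName'; Id = 'id-1' }
#     [pscustomobject]@{ Name = 'Owner';        Id = 'id-2' }
# )
# Resolve-RunbookParameter -Values @{ ComputerName = 'SRV01'; Owner = 'me' } -Parameter $params
```

Failing on an unknown key up front is a design choice; it surfaces a typo in the hashtable before anything gets sent to the server.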

One last pure PowerShell item is a function that writes functions. It’s not too terribly complicated and I *WILL* post about this later, but basically the idea is that Orchestrator contains Runbooks that perform some action, my module reads those Runbooks in, gets their parameters and allows you the admin to run them. What if we could have a function that would build cmdlets based on that information on the fly…
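Until that post shows up, here is a minimal sketch of the general trick, assuming all you have is a runbook name and a list of parameter names. New-RunbookFunction is a hypothetical name, and the generated function only echoes its bound parameters rather than actually starting a Runbook.

```powershell
# Hypothetical generator: builds a function at run time whose parameters
# mirror a runbook's parameter names.
function New-RunbookFunction {
    param(
        [Parameter(Mandatory = $true)] [string]   $Name,
        [Parameter(Mandatory = $true)] [string[]] $ParameterName
    )
    # The names end up inside generated code, so only allow simple identifiers.
    $paramBlock = ($ParameterName | ForEach-Object {
        if ($_ -notmatch '^\w+$') { throw "Invalid parameter name '$_'." }
        "[string]`$$_"
    }) -join ', '

    # A real version would start the runbook here; this body just returns
    # whatever the caller bound so the plumbing can be checked.
    $body = "param($paramBlock)`n`$PSBoundParameters"

    Set-Item -Path "function:global:$Name" -Value ([scriptblock]::Create($body))
}

# Usage:
# New-RunbookFunction -Name 'Invoke-DemoRunbook' -ParameterName 'ComputerName', 'Owner'
# Invoke-DemoRunbook -ComputerName 'SRV01' -Owner 'me'
```

Building the param block as text and handing it to [scriptblock]::Create means tab completion works on the generated parameters, which is most of the appeal.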

SharePoint Online! How much fun is it working with UserProfiles in SPO? Well, let me tell you, in order to do anything meaningful it appears you have to access a 10yr old web service that must be ripe for deprecation but has been forgotten about! I’d really like to get some more information direct from Microsoft about that. At any rate, I’ve got some POC code that will allow me to programmatically populate a SharePoint user’s profile with information that we glean from another source. The next step down this rabbit hole is using a 7yr old SDK (Office Server 2007) to see if I can create UserProfile subtypes! I’ve got some examples of how this works, but I’ve not written anything up yet to see if it will go, fun times ahead!

Keeping in line with the SharePoint Online topic: creating admin cloud accounts. We have an Azure subscription that allows us to get into Azure AD for our tenant, which isn’t anything special. If you have an Office365 subscription, you can create an Azure account, hook the two together and boom…Azure AD! So I created an admin account for me, and one for one of the other guys on the project. After that I enabled Multi-Factor Authentication on these two accounts. Now, when I log in with my admin account, I receive a text message with a verification code. We have looked at this as THE way to secure access to these accounts as we begin to think about the cloud.

With that out of the way, I can talk about the Orchestration. I’ve created a Runbook that will connect to our tenant and provision a user. This came out of the Provisioning project for the larger SPO project. This code takes a single parameter, samaccountname, and then provisions that user in o365 with the appropriate licensing. There are two differences between an o365 user and a cloud admin. The first is licensing: a cloud admin gets none by default (our design). The second is the all-important UPN, user@tenant.onmicrosoft.com. The idea is these accounts live solely in the cloud, and are used specifically for administering cloud things. I have a couple of modifications in mind. First I need to populate the AlternateAddresses field, as well as the MobilePhone field. Then I need to see if I can enable MFA in Azure for these accounts automatically.

Lots of Orchestrator this week, but now that I’m ready and the network is ready it’s time to start working on Orchestrating Windows Updates. I’ve started a rough draft of that at the moment:

  1. basically get a list of servers (or service)
  2. for each server start maintenance mode (ops and Zenoss)
  3. get the applicable updates (SCCM perhaps)
  4. apply the updates
  5. reboot if needed
  6. make sure the server is back online
  7. check if required services are running
  8. leave maintenance mode
  9. and move on to the next server

If one server in a group fails then we need to stop the update process and throw an alert in ops and Zenoss. This will prevent an entire service from going offline if the updates cause an issue.
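The stop-on-first-failure part of that draft can be roughed out without any of the real plumbing. In this sketch every step is just a scriptblock that takes a server name and throws when it fails; Invoke-ServerUpdateRun and the step contents are hypothetical stand-ins, not working maintenance-mode or SCCM calls.

```powershell
# Hypothetical driver: runs each step against each server in order and
# stops the whole run as soon as any step throws.
function Invoke-ServerUpdateRun {
    param(
        [Parameter(Mandatory = $true)] [string[]]      $ComputerName,
        # Ordered steps (maintenance mode, apply updates, reboot, verify, ...).
        # Each scriptblock receives the server name and must throw on failure.
        [Parameter(Mandatory = $true)] [scriptblock[]] $Step
    )
    $done = @()
    foreach ($server in $ComputerName) {
        foreach ($s in $Step) {
            try {
                & $s $server | Out-Null
            }
            catch {
                # Bail out here so the rest of the service stays untouched;
                # the caller can raise the ops and Zenoss alerts from this result.
                return [pscustomobject]@{
                    Succeeded = $false
                    FailedOn  = $server
                    Completed = $done
                    Error     = "$_"
                }
            }
        }
        $done += $server
    }
    [pscustomobject]@{ Succeeded = $true; FailedOn = $null; Completed = $done; Error = $null }
}

# Usage with placeholder steps:
# $steps = @(
#     { param($n) Write-Verbose "start maintenance mode on $n" },
#     { param($n) if (-not (Test-Connection -ComputerName $n -Count 1 -Quiet)) { throw "$n is offline" } }
# )
# Invoke-ServerUpdateRun -ComputerName 'SRV01', 'SRV02' -Step $steps
```

Returning a result object instead of throwing leaves the decision about alerting with whatever calls this, which fits a Runbook that branches on success or failure.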

System Center Orchestrator PowerShell Module

This is one I’ve had on the back burner for a while, so yesterday morning I roughed up the basic framework for a PowerShell module. I have a few Runbooks at work, that it would be super cool to just run from PowerShell, and since lately I’ve been all up in the web services this was as good a time as any.

The Get cmdlets were all pretty simple; in fact there is really only one that does any real work, Get-scoWebFeed. I probably could have used Invoke-WebRequest, but that’s no fun, so I used .Net to make my own, and it’s really pretty simple. I just go ask the Orchestrator server (on a specially crafted URL) to spit out the XML, then I just return it.

The individual functions for getting Runbooks, Jobs and Activities handle building the special URL, which isn’t really special as much as it is specific.
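For anyone who wants the general shape without digging through the module, here is a sketch. These are not the module's functions: Get-ScoFeedUrl and Get-ScoFeedXml are made-up names, and the port (81) and path (Orchestrator2012/Orchestrator.svc) are the out-of-the-box defaults for the Orchestrator 2012 web service, so adjust them if your install differs.

```powershell
# Builds the collection URL for the Orchestrator web service.
function Get-ScoFeedUrl {
    param(
        [Parameter(Mandatory = $true)] [string] $Server,
        [Parameter(Mandatory = $true)]
        [ValidateSet('Runbooks', 'Jobs', 'Activities')] [string] $Collection,
        [int] $Port = 81   # assumed default web service port
    )
    "http://${Server}:$Port/Orchestrator2012/Orchestrator.svc/$Collection"
}

# Asks the server for the feed using the current Windows credentials
# and hands back the parsed XML document.
function Get-ScoFeedXml {
    param([Parameter(Mandatory = $true)] [string] $Url)
    $client = New-Object System.Net.WebClient
    $client.UseDefaultCredentials = $true
    try     { [xml]$client.DownloadString($Url) }
    finally { $client.Dispose() }
}

# Usage ('scorch01' is a placeholder server name):
# $feed = Get-ScoFeedXml -Url (Get-ScoFeedUrl -Server 'scorch01' -Collection Runbooks)
# $feed.feed.entry | ForEach-Object { $_.title.'#text' }
```

Splitting the URL building from the download is what lets one worker function serve Runbooks, Jobs and Activities alike.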

The Start-Runbook function was the most complicated. I actually borrowed some code from MSDN and another guy’s blog (Part 1, Part 2) to build mine. Turns out some of the XML you have to build to send up has to go in a certain way. I need to adjust my code to handle Runbooks with Parameters, but right now it’s good for what I need it to do.

You can find the up to the minute code on GitHub, or you can find it in the TechNet Gallery.

Hyper-V Server 2012 Cluster with Powershell Deployment Toolkit

I recently came across a lovely show on Channel 9. It talks about setting up a simple Hyper-V Server 2012 cluster for use in a lab or test environment or whatever. I won’t go over the details, watch the show, it’s great! In addition to that I had come across an article on the Building Clouds Blog, about the PowerShell Deployment Toolkit. So over Memorial Day weekend I decided to stand up my cluster and spin up a test environment similar to what I use at work.

In my environment I have 6 servers: 3 set aside for Hyper-V, one as my firewall, one as a Domain Controller and the last as a management server. I’m using my DC as the file server as well. I didn’t need the iSCSI target stuff, as I’m using Windows Server 2012 and used the new File and Storage Services to configure my iSCSI drives.

I decided to let vmcreator.ps1 build the VMs for me. Originally I had spun up my own, but I was having difficulties getting the installer to work properly. Turns out that there is a requirement that the PDT tools be run from the C: drive of your computer. Also, if you’re running them from the server OS, you will need to install the Hyper-V role in order for vmcreator.ps1 to function properly. I don’t recall seeing either of those things mentioned in the TechNet article, but I may have overlooked that part.

So, linked from the vmcreator.ps1 article is a great utility, Convert-WindowsImage.ps1, that I used to create my base OS image. The utility is super handy and has a GUI or command-line version. I wimped out initially and used the GUI version, pointed it at an ISO of Windows Server 2012, and after a while I had a lovely vhdx ready for vmcreator.ps1.

After renaming the half dozen VMs the script had created for me, in record time btw, I ran installer.ps1. There’s not really a whole lot mentioned in the article about its use, but it is rather self-explanatory, and once you realize the limitation to the C: drive it’s a no-brainer. That part took me a bit to figure out, as I had an external drive with all the bits that downloader.ps1 had downloaded for me.

The end result is I now have the basic System Center infrastructure that I can play with locally to try out new features, or test the scripts and apps I create for work. It was really very slick, and I could totally see how I would use something like this in our QA environment at work.

 

Operations Manager, Orchestrator and PowerShell Remoting

It’s been a very long time since I last posted, the primary reason is most likely laziness on my part and secondly I’ve not had a lot to write about. Recently I’ve been messing around with Orchestrator and automation as a means of passing information off to Zenoss. On the face of it, it seemed a rather trivial task, but it took much longer than I anticipated.

The first go round with this was a very simple runbook. It had two activities, Monitor Alert and Run .Net Script. The Monitor Alert activity was configured to look for alerts that were not Information alerts. Once an alert occurred that met that criteria it was passed off to the Run .Net Script. The Run .Net Script activity simply created a simple log entry with PowerShell.


New-EventLog -LogName 'SCOM Alerts' -Source Category
Write-EventLog -LogName 'SCOM Alerts' -Source Category -EntryType Severity -EventId 1 -Message Name

Note : I didn’t include all the gibberish typically seen when copying a runbook into notepad, so you can assume that Category, Severity and Name are prefixed by a big nasty GUID.

The first hurdle I had to get around was creating new sources. Since I didn’t know in advance what they would be, it seemed to me it would be easier to have them created programmatically. That’s what the first line does, but the context under which this runs didn’t have the ability to do that. So I created a group and added the service account to that group, and then added that group to the local Administrators group on the server. Finally I needed to disable UAC which was preventing this from happening, if someone has a better way of doing this I’m all ears.
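For what it’s worth, the two lines above can be wrapped so the source only gets registered when it is actually missing. This is a sketch, not what runs in my runbook: the function names are made up, and the severity mapping assumes the alert severity arrives as text like Critical, Error, Warning or Information. It still needs the same administrative rights to register a new source, so it doesn’t solve the UAC problem.

```powershell
# Maps an alert severity string onto a value Write-EventLog accepts
# for -EntryType. Anything unrecognized is logged as Information.
function ConvertTo-EventEntryType {
    param([string] $Severity)
    switch ($Severity) {
        'Critical' { 'Error' }
        'Error'    { 'Error' }
        'Warning'  { 'Warning' }
        default    { 'Information' }
    }
}

# Hypothetical wrapper around the two lines above.
function Write-ScomAlertEvent {
    param(
        [Parameter(Mandatory = $true)] [string] $Source,
        [Parameter(Mandatory = $true)] [string] $Severity,
        [Parameter(Mandatory = $true)] [string] $Message,
        [string] $LogName = 'SCOM Alerts'
    )
    # New-EventLog errors out if the source is already registered,
    # so check first and only create it once.
    if (-not [System.Diagnostics.EventLog]::SourceExists($Source)) {
        New-EventLog -LogName $LogName -Source $Source
    }
    Write-EventLog -LogName $LogName -Source $Source -EventId 1 `
        -EntryType (ConvertTo-EventEntryType $Severity) -Message $Message
}

# Usage (Windows PowerShell, elevated):
# Write-ScomAlertEvent -Source 'Health Service' -Severity 'Critical' -Message 'Heartbeat failure'
```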

The nice part about this stage is I was able to get some alerts generated and have them show up in the newly created log. For testing I picked a server that I was monitoring and then stopped the HealthService service. This would generate a failed heartbeat alert similar to a computer going offline unexpectedly. With some sample log entries I was able to configure the Zenoss server to pull in the specific log and start generating alerts with Zenoss.

While this worked well enough to get started I wasn’t satisfied with the quality of the data being returned. Specifically I noted that while some alerts contained the name of the computer with the problem, not all did. Looking at the data returned by the Monitor Alert activity it didn’t seem to me I was getting as many of the details as I needed.

So I decided that some remoting might do the trick for me. With remoting I’m able to use the

links
http://blog.tyang.org/2012/05/09/using-scom-powershell-snap-in-and-sdk-client-with-a-powershell-remote-session/
http://blogs.msdn.com/b/powershell/archive/2008/06/05/credssp-for-second-hop-remoting-part-i-domain-account.aspx
http://blogs.technet.com/b/stefan_stranger/archive/2010/11/02/using-powershell-remoting-to-connect-to-opsmgr-root-management-server-and-use-the-opsmgr-cmdlets.aspx
http://blogs.technet.com/b/jonathanalmquist/archive/2009/03/19/resolve-all-open-alerts-generated-by-specific-agent.aspx
http://www.systemcentercentral.com/BlogDetails/tabid/143/IndexID/70177/Default.aspx

my thread
http://social.technet.microsoft.com/Forums/en-US/operationsmanagergeneral/thread/360f3a42-9153-4e2e-b060-73740e8ffe4f/#360f3a42-9153-4e2e-b060-73740e8ffe4f

SCOM 2007 R2 and Get-Event

For whatever reason I’ve not been able to find what I’ve been looking for regarding this cmdlet, namely a decent example of its use with regards to the –Criteria parameter. For better or worse I have several event collectors set up, and it would be nice to ask SCOM for a list of specific events. Normally you would think that would be simple, and perhaps for some it is, but I was having some issues, that is until yesterday.

I poked around in my history but I couldn’t find the page I was looking at that enlightened me, so I’ll just add my own here in case anyone else is having the same problem.

So I’m looking at a screen that has the following columns:

  • Level
  • Date and Time
  • Source
  • Name
  • Event Number

Now the examples I have seen show that you pass field=value into the –Criteria parameter, but the problem for me is that Event Number or EventNumber aren’t things. In the Event Viewer it’s called ID, but in SCOM ID is the ID of the specific entry you’re looking at, much like a primary key in a database.

It turns out that the Event Number field, in SCOM is simply Number. I literally felt like Homer Simpson, D’OH!

Get-Event -Criteria 'Number=4729'

That actually yields useful information, well assuming you’re logging Event ID 4729. At any rate, I needed to write this down somewhere as it’s a regular thing for me, that up until now has been very difficult.