Chef Windows Perf — Disabling Ohai Plugins

From Doug Ireton’s ChefConf slides: here’s a code snippet for disabling underperforming or irrelevant Ohai plugins on your Windows Chef node.

Ohai::Config::disabled_plugins= [
 #
 # add to client.rb file -- c:\chef\client.rb
 #
 # ref: http://www.slideshare.net/opscode/chef-conf-windowsdougireton # slide 30
 # ohai plugins that have poor perf or are irrelevant to windows
 #
 "c", "cloud", "ec2", "rackspace", "eucalyptus", "command", "dmi", "dmi_common",
 "erlang", "groovy", "ip_scopes", "java", "keys", "lua", "mono", "network_listeners",
 "passwd", "perl", "php", "python", "ssh_host_key", "uptime", "virtualization",
 "windows::virtualization", "windows::kernel_devices"
 ]

The above snippet is also available here: https://gist.github.com/tcotav/7566353

Here’s a link to the relevant opscode docs on this: https://wiki.opscode.com/display/chef/Disabling+Ohai+Plugins

Idempotency, Chef, Powershell, Windows

Here are some notes on writing idempotent recipes in Chef using Powershell on Windows.

Make the powershell fully functional as a standalone script on windows first (and second, and third)
– are you sure you got the first thing done properly? y/n?

Here’s a crude skeleton for a base powershell script:

######################
try {
 # placeholder for the real cmdlet; -ErrorAction Stop makes failures land in catch
 Some-Crazy-Command -ErrorAction Stop;
 $message = "We made it";
 $exitVal=2;
}
catch
{
 $message = "Join Error - ";
 # tack on the thrown error string here, $_
 $message += $_;
 $exitVal=1;
}
write-host $message;
exit $exitVal;
########################

Basically, you just want to pass along an exit status and message that Chef will be able to key off of.

You could easily add additional states with corresponding exit codes to the try block. Why do we do this? Well, we need to know if the script did what it was supposed to or not. In the example below, we want to know whether to restart the host after adding it to an AD domain.

So then, within the powershell script itself is where you want to manage your idempotency (or at least, it is more likely where you’re going to HAVE to manage it).

In the join AD example (assuming a new host — not a change… see below for notes on that), in the powershell script we would:

1) check whether the host is already a member of an AD domain
2) if not, join it to the domain

* The case where it is a member of a different domain and you want to change it is more complicated as you would probably need to Remove-Computer and then reboot. On coming up, it would be a member of no domain, and on next chef run, would join new domain.
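The decision described above (already joined, not joined, joined to the wrong domain) can be sketched as a tiny Ruby function. This is only an illustration: domain_join_action and its return symbols are made-up names, and it assumes you have some way of learning the current domain (nil meaning "not joined").

```ruby
# Hypothetical helper: decide what a Chef run should do about domain membership.
# current_domain is nil when the host is not joined to any domain.
def domain_join_action(current_domain, desired_domain)
  return :join if current_domain.nil?
  return :nothing if current_domain.downcase == desired_domain.downcase
  # Joined to a different domain: leave it and reboot; the next run will join.
  :leave_and_reboot
end

puts domain_join_action(nil, 'corp.example.com')
puts domain_join_action('CORP.example.com', 'corp.example.com')
puts domain_join_action('old.example.com', 'corp.example.com')
```

Keeping the decision separate from the commands that act on it makes the "run it a million times" behavior easy to reason about.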

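The powershell would look something like this. Note that the script is embedded in a Ruby string in the recipe, which is why the double quotes are escaped and the last line uses #{...} interpolation: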

function addComputer { param([string]$username, [string]$password, [string]$domain)
try {
 if ((gwmi win32_computersystem).partofdomain -eq $true) {
  # arguably here, I would check if it is the RIGHT domain... next rev...
  # $domain = [System.DirectoryServices.ActiveDirectory.Domain]::GetCurrentDomain()
  # $domainName = $domain.name
  # < compare with passed in value >
  $message = \"The system is joined to the domain\";
  $exitVal=2;
 }
 else {
  add-computer -domain $domain -credential (New-Object System.Management.Automation.PSCredential ($username, (ConvertTo-SecureString $password -AsPlainText -Force))) -passthru -verbose
  $message = \"computer joined to domain\";
  $exitVal=3;
 }
}
catch {
  $message = \"Join Error - \";
  $message += $_;
  $exitVal=1;
}
  write-host $message;
  exit $exitVal;
}

# this next line uses ruby
addComputer #{node['ad']['user']} #{node['ad']['pwd']} #{node['ad']['domain']}

Here’s a gist of a more final (and better formatted) version of this: https://gist.github.com/tcotav/7489860

Now ANOTHER (potentially more *nix-y) way to do this: instead of a single monolithic script, you shell out for each of the bits and then process the output in chef/ruby. The possible issue with going that way is that it is more expensive, in resources and probably in time, to continually spin up a powershell process for each command. This matters more if you have, say, 30 little cmdlets that you want to invoke.
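Here is a rough sketch of that "many small shell-outs" style using Ruby's standard Open3, not Chef's own mixin. The run_step helper and StepResult struct are invented for illustration, and the Ruby interpreter stands in for powershell so the sketch runs anywhere.

```ruby
require 'open3'
require 'rbconfig'

StepResult = Struct.new(:exitstatus, :stdout, :stderr)

# Run one external command and capture the three things we care about.
def run_step(*cmd)
  stdout, stderr, status = Open3.capture3(*cmd)
  StepResult.new(status.exitstatus, stdout, stderr)
end

# Stand-in commands. On a Windows node each entry would instead be something
# like ['powershell.exe', '-NoProfile', '-Command', '<one cmdlet>'].
steps = [
  [RbConfig.ruby, '-e', 'puts "check"'],
  [RbConfig.ruby, '-e', 'warn "oops"; exit 3']
]

results = steps.map { |cmd| run_step(*cmd) }
results.each do |r|
  puts "exit=#{r.exitstatus} out=#{r.stdout.strip} err=#{r.stderr.strip}"
end
```

Every entry in steps is a separate process launch, which is exactly the cost mentioned above.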

Okay, you’ve looked at the gist and seen a line that made you wonder WTF?  It looked like this:

::Chef::Recipe.send(:include, Chef::Mixin::PowershellOut)

Well, we need that more than anything because we use it to capture the exit status and stdout/stderr of the powershell shellout.  There’s some debug log code in there to dump out these values (you know — for posterity).  That bit is:

result = powershell_out(script)
Chef::Log.debug("powershell exit #{result.exitstatus}")
Chef::Log.debug("powershell error #{result.stderr}")
Chef::Log.debug("powershell stdout #{result.stdout}")

This is just what it looks like.  We run the script (a variable brilliantly named “script” here).  The results go into… result.  Then from that object we access the 3 variables mentioned earlier.  Those then are what we use to pass the messages back OUT of the powershell process to Chef.

  # same as shell_out
  if result.exitstatus == 2
    Chef::Log.debug("Already part of domain: #{result.stdout}")
  elsif result.exitstatus == 3
    Chef::Log.debug("Joined domain: #{result.stdout}")
    # reboot if joining domain
    notifies :request, 'windows_reboot[5]', :delayed
  else
    Chef::Log.error("Domain join fail: #{result.stdout}")
    # any other actions here?  maybe flag the node?
  end

We don’t do much with the return, but we do something: we notify the windows reboot resource we CLEVERLY inserted earlier into our Chef recipe. That line tells the recipe to queue up a windows reboot AFTER the rest of the run list for this host is done. In the context of our little example, it shows how we can take action based on the result of a powershell run.
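If the if/elsif chain grows, one option is to pull the exit-code convention into a small lookup that can be tested without Chef or powershell. A minimal sketch, assuming the codes used in this post (2 = already joined, 3 = just joined, anything else = failure); classify_join and Outcome are hypothetical names.

```ruby
# Exit codes follow the convention from the powershell above:
# 2 = already joined, 3 = joined just now, anything else = failure.
Outcome = Struct.new(:state, :reboot, :log_level)

def classify_join(exitstatus)
  case exitstatus
  when 2 then Outcome.new(:already_joined, false, :debug)
  when 3 then Outcome.new(:joined, true, :debug)
  else        Outcome.new(:failed, false, :error)
  end
end

[2, 3, 1].each do |code|
  o = classify_join(code)
  puts "exit #{code}: #{o.state} reboot=#{o.reboot} log=#{o.log_level}"
end
```

The recipe code would then only need to look at the reboot flag to decide whether to send the notification.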

And that’s about it for now.

Google Glass Mirror API Quickstart, Continued

This might be the most duh thing I’ve made publicly available (you should see the stuff I keep secret…), but when I reached the end of the Google provided Mirror API/Glass quickstart tutorial, I looked for the next page of explanation.  There was no next page.  What I was looking for was something that described the webapp just launched.  Maybe it isn’t necessary for anyone else, but it felt like an omission to me.  So, here’s my continuation of the quickstart tutorial docs…

(reference: https://developers.google.com/glass/quickstart/python)

Here’s a screenshot of the bottom section of the page that gets launched from the quickstart code:

[Screenshot: Glass Sample Project Webapp]

You may be looking at this image and thinking “well, that’s pretty self-explanatory given the h3’d titles over each section”, and you may be right.   That wasn’t the problem here though.  The problem was the gap between the launch of the webapp via appengine tool invocation and then a review of what a powerful tool this webapp is.

So, how does this help you?

This webapp allows you to exercise all of the Mirror API functionality without sullying yourself with any development. Of course, we’re in it for the development, so what we’re seeing is the result of the sample project.  Play with it a bit.  You can use the timeline tool to add items into your timeline.  Those items will (magically) appear in the timeline section across the top of the page (or in your Glass).  Once you get done fiddling around there, you can dig into the code.

First, open up /templates/index.html and walk through the html to see the invocations of the POST methods for the timeline form area:

<form action="/" method="post">
 <input type="hidden" name="operation" value="insertItem">
 <input type="hidden" name="message"
 value="A solar eclipse of Saturn. Earth is also in this photo. Can you find it?">
 <input type="hidden" name="imageUrl"
 value="/static/images/saturn-eclipse.jpg">
 <input type="hidden" name="contentType" value="image/jpeg">
<button type="submit">A picture
 <img src="/static/images/saturn-eclipse.jpg">
</button>
 </form>

Of note here is that the form POSTs to a single endpoint (“/”) and passes along the desired action as a parameter (a hidden field named “operation” in the form shown).  Okay, it’s still just a form, nothing bleeding edge there.

The python code that handles this routes all of the POSTs to /.  Let’s dig into the python code.  First, open up the main.py file.  Most of its meat is hidden away in other files, but what is done here is to register all of the routes that this webapp handles.  For example, the oauth handler provides code for the routes ‘/auth’ and ‘/oauth2callback’.

Next we’ll open up the main handler file /main_handler.py.  Look at the post method:

 @util.auth_required
 def post(self):
   """Execute the request and render the template."""
   operation = self.request.get('operation')
   # Dict of operations to easily map keys to methods.
   operations = {
     'insertSubscription': self._insert_subscription,
     'deleteSubscription': self._delete_subscription,
     'insertItem': self._insert_item,
     'insertItemWithAction': self._insert_item_with_action,
     'insertItemAllUsers': self._insert_item_all_users,
     'insertContact': self._insert_contact,
     'deleteContact': self._delete_contact
   }
   if operation in operations:
     message = operations[operation]()

First up, you see the decorator there @util.auth_required. Open up the corresponding file, util.py, and check out the auth_required method there. The simple text translation of the code found there is: is user authorized? Yes, then carry on. No, forward request to /auth for authorization. This is in place so that all requests to the app will pass through authorization.

Back to post method source, you can see the map of the form’s “hidden” operations to actual method invocations in the source. These methods are then invoked at line 117,

message = operations[operation]() 

So if we posted a form that had the “operation” parameter with the value “insertItem”, this code would invoke it as

message=operations['insertItem']()

which would REALLY be, after the dict lookup:

message=self._insert_item()
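For anyone coming from the Chef posts on this blog, here is the same lookup-then-call dispatch sketched in Ruby. It is not part of the Glass sample; the Handler class and its methods are made up purely to show the shape of the pattern.

```ruby
# Hypothetical handler; it only mirrors the shape of the sample's post method.
class Handler
  OPERATIONS = {
    'insertItem'    => :insert_item,
    'insertContact' => :insert_contact
  }.freeze

  def post(operation)
    handler = OPERATIONS[operation]
    return nil unless handler # unknown operation: do nothing
    send(handler)
  end

  private

  def insert_item
    'A timeline item has been inserted.'
  end

  def insert_contact
    'A contact has been inserted.'
  end
end

puts Handler.new.post('insertItem')
```

The hidden form field picks the key, and the table decides which method runs.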

From there, you can dig into how _insert_item() handles the data and interacts with the user timeline.  My notes appear as comments in the code below.

def _insert_item(self):
  """Insert a timeline item."""
  logging.info('Inserting timeline item')
  body = {
    'notification': {'level': 'DEFAULT'}
  }
  # self.request is the object holding the http form data POST'd
  if self.request.get('html') == 'on':
    body['html'] = [self.request.get('message')]
  else:
    body['text'] = self.request.get('message')

  media_link = self.request.get('imageUrl')
  # here we check if you passed along an image or if just a plain ol' insert
  if media_link:
    # lets go out and grab the bytes of your media so we can stuff 'em in timeline
    if media_link.startswith('/'):
      media_link = util.get_full_url(self, media_link)
    resp = urlfetch.fetch(media_link, deadline=20)
    media = MediaIoBaseUpload(
        io.BytesIO(resp.content), mimetype='image/jpeg', resumable=True)
  else:
    media = None

  # self.mirror_service is initialized in util.auth_required. <- existing comment
  # but important in that this does a LOT of stuff -- see below
  self.mirror_service.timeline().insert(body=body, media_body=media).execute()
  # this is the message spit back to webapp
  return 'A timeline item has been inserted.'

The one line

self.mirror_service.timeline().insert(body=body, media_body=media).execute()

does a number of things.  First, we grab the mirror_service that we initialized as part of our authorization.  From that mirror_service, we get an instance of a timeline.  On that timeline, we build an insert request that contains the body that we built in our method (including any media that we sent along).  That might be a bit confusing, as insert is a common verb used as a method name for putting objects into collections.  Here, insert() does not perform the insert itself; it returns a request object.  That’s why the last thing we have to do is call execute() on that object to have the data sent to the timeline.  (I didn’t dig into the code but I suspect that’s what happens there).
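The "build a request, then execute it" idea is easier to see in a toy model. The sketch below is in Ruby (the language of the Chef posts here) and is NOT the Google client library; Timeline and Request are invented classes that only imitate the two-step call.

```ruby
# Toy model only; these classes are hypothetical and are not the Google client.
class Request
  def initialize(sink, body)
    @sink = sink
    @body = body
  end

  # Nothing is sent until execute is called.
  def execute
    @sink << @body
    @body
  end
end

class Timeline
  attr_reader :items

  def initialize
    @items = []
  end

  # Builds a request object; does not touch @items yet.
  def insert(body:)
    Request.new(@items, body)
  end
end

timeline = Timeline.new
request = timeline.insert(body: { 'text' => 'hello' })
puts timeline.items.length # nothing has been executed yet
request.execute
puts timeline.items.length
```

Splitting the call this way lets the caller hold on to a request, inspect it, or decide not to send it at all.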

Next up: our first kinda app, a simple canned powerpoint presentation that you can push to subscribers.

gnslngr.us business card

Getting started with Windows and Chef

Start here https://learnchef.opscode.com/quickstart/workstation-setup/ and work through the following sections

– workstation setup
– ignore vagrant/virtualbox for now
– set up your workstation

Confirm that you’ve got git installed

$ which git

Then install chef on your workstation:

$ curl -L https://www.opscode.com/chef/install.sh | sudo bash
$ echo 'export PATH="/opt/chef/embedded/bin:$PATH"' >> ~/.bash_profile && source ~/.bash_profile

– create a hosted chef account

https://community.opscode.com/users/new

– set up your local repo

https://learnchef.opscode.com/quickstart/chef-repo/

Follow this up to the section “Setup your chef-repo”.

– create or use a windows azure/ec2/rackspace account. We’ll use Azure since we’re all windowsy.

Follow the tutorial up to the section titled “How to create and manage a Windows Azure Web Site”:

http://www.windowsazure.com/en-us/manage/linux/how-to-guides/command-line-tools/

Here’s my script to spin up and tear down groups of vms. You can do all of this via the UI as well, but where’s the fun in that? My script is written for OSX/linux, but you can do something similar in Powershell.

#!/bin/bash

password='p@ssword!'

## get list of vms
# azure vm image list

for i in {101..103}
do
# create command
azure vm create gnslngr${i} a699494373c04fc0bc8f2bb1389d6106__Win2K8R2SP1-Datacenter-201305.01-en.us-127GB.vhd 'Administrator' "${password}" --location "West US" --rdp

# delete command
# azure vm delete gnslngr${i}
done

azure vm list

#--------------------------------

Now for some of the manual bits per vm. First, enabling winrm. For chef windows work, we’ll do most everything over knife winrm. So first we’ll have to enable that. (Windows Azure team has this working via azure cli but haven’t released it yet… I think…)

# fire up newly imaged windows vm, open up PowerShell and paste in the following to activate winrm

(New-Object System.Net.WebClient).DownloadFile('http://code.can.cd/winrm_setup.bat','winrm_setup.bat') ; .\winrm_setup.bat

BAM — winrm now works.

Now go into the Azure cloud UI, and add a new endpoint for this VM mapping the public IP’s port back to the internal port for winrm. I use 5985 for both (choosing not to obfuscate). EDIT: this works via CLI now, so use the script below to do the same:

#!/bin/bash

password='p@ssword!'

## get list of vms
# azure vm image list

for i in {101..103}
do
azure vm endpoint create gnslngr${i} 5985 5985
done

azure vm list

#--------------------------------

Now for some Chef/knife work. We’re going to download a few windows cookbooks that we’ll need pretty much every single time we interact with windows via Chef. cd into the same directory that has your .chef directory from earlier. These commands pull the mentioned cookbooks from the opscode community cookbook site down to your local file system.

$ knife cookbook site install chef_handler
$ knife cookbook site install windows
$ knife cookbook site install powershell

You’ll get annoyed with this as it wants to put this stuff into git and throws an error. Nothing gets done. Why oh why did we do that? Then you’ll shake your fist at the screen and run…

$ knife cookbook site download chef_handler
$ knife cookbook site download windows
$ knife cookbook site download powershell

Now you’ll have to manually tar -xzvf all of those into your cookbooks directory.

Ok, NOW you’ve got all of the other cookbooks that you need on your workstation. We need to send them all up to the chef server:

$ knife cookbook upload -a

Backing up a few steps, the runlist for a given node is the list of recipes that will be applied each run.  We set that on a per node basis.  We can do this via the UI, via initial node bootstrap, or the way we’re going to now — by creating a role that contains the base windows cookbooks that we use.

First we’ll create a json file, windows_host_role.json as follows:

{
  "name": "windowshost",
  "chef_type": "role",
  "json_class": "Chef::Role",
  "description": "The base role for all windows systems",
  "run_list": [
    "recipe[chef_handler]",
    "recipe[windows]",
    "recipe[powershell]",
    "recipe[vim-windows]"
  ]
}

Then we’ll create the role on the server with the following knife command:

$ knife role from file windows_host_role.json
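If you end up with more than a couple of roles, you may prefer to generate the JSON rather than hand-edit it. A minimal sketch using Ruby's standard json library; the keys and values are the ones from the role file above, and writing the file is left commented out.

```ruby
require 'json'

# Cookbooks that make up the base windows role, in run order.
cookbooks = %w[chef_handler windows powershell vim-windows]

role = {
  'name' => 'windowshost',
  'chef_type' => 'role',
  'json_class' => 'Chef::Role',
  'description' => 'The base role for all windows systems',
  'run_list' => cookbooks.map { |c| "recipe[#{c}]" }
}

role_json = JSON.pretty_generate(role)
puts role_json
# To write it out for `knife role from file`:
# File.write('windows_host_role.json', role_json)
```

Every cookbook named in the run_list has to exist on the chef server before a node with this role can converge.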

That’s all the setup we need. We’re ready to start loading the chef client on to nodes and then engage in some converging.

First we bootstrap install the chef client on our first node:

$ knife bootstrap windows winrm mynewvm01.cloudapp.net -x Administrator -P 'p@ssword!' -N nodenameformynewvm01 --run-list "role[windowshost]"

Note, we could’ve done the same thing by sending up the recipes/cookbooks in list but using roles is cleaner. Here’s an example of sending up the recipes though:

$ knife bootstrap windows winrm mynewvm01.cloudapp.net -x Administrator -P 'p@ssword!' -N nodenameformynewvm01 --run-list "recipe[chef_handler],recipe[windows],recipe[powershell]"

Ok, we’re all set on the server. Now you’ll start work on your new cookbook, MyCookbook. We could’ve waited until after we had a first version of this working to do our bootstrap but whatever.

Confirm that you’re in the right dir (and that knife can find its config) by asking knife to list the cookbooks that are now on the server:

$ knife cookbook list

This command dumps out the following:

chef_handler 1.1.4
powershell 1.0.8
windows 1.8.10

Now we’ll use knife to create a base cookbook. The following command will create a skeletal cookbook in your cookbooks directory.

$ knife cookbook create MyCookbook --license mit --email yourEmail@gmail.com --user yourUserName

Note, knife is good about dumping out options. If you want to see all the switches that you can use with cookbook create, type:

$ knife cookbook create

It will error and then tell you what flags are available.

Edit your cookbook (which is a whole other CATEGORY of posts…).

When you get it in a state you like, you’ll need to add it to the node that you’re working on. First upload it to the server

$ knife cookbook upload MyCookbook

Knife will do a simple parse here and complain if it finds any bad syntax before sullying the chef server with your code. Then do the following to add it to your node:

$ knife node run_list add nodenameformynewvm01 'recipe[MyCookbook]'

Once you start writing code in MyCookbook, your development workflow with knife will be:

1) make changes to your cookbook

2) upload your cookbook to the chef server

$ knife cookbook upload MyCookbook

3) converge the node by remotely invoking the chef-client and testing out your new cookbook.

$ knife winrm 'mynewvm01.cloudapp.net' 'chef-client -c c:/chef/client.rb' -m -x Administrator -P 'p@ssword!'

4) check changes into git
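Steps 2 and 3 get typed a lot, so they are a natural thing to wrap in a small script. Here is a sketch that only builds and prints the two knife commands shown above; iteration_commands is a made-up helper, and actually running the commands is left commented out.

```ruby
require 'shellwords'

# Build (but do not run) the upload and converge commands for one iteration.
def iteration_commands(cookbook:, host:, user:, password:)
  [
    ['knife', 'cookbook', 'upload', cookbook],
    ['knife', 'winrm', host, 'chef-client -c c:/chef/client.rb',
     '-m', '-x', user, '-P', password]
  ]
end

cmds = iteration_commands(cookbook: 'MyCookbook',
                          host: 'mynewvm01.cloudapp.net',
                          user: 'Administrator',
                          password: 'p@ssword!')
cmds.each { |c| puts Shellwords.join(c) }
# To actually run them, stopping at the first failure:
# cmds.each { |c| system(*c) or abort("failed: #{Shellwords.join(c)}") }
```

Passing each command as an array avoids having to worry about shell quoting of the password.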

Other great reads and references:

http://docs.opscode.com # step by step guide

http://gettingstartedwithchef.com

vim

I love vim. I use it for everything. So, my recent foray into windows has left me installing vim wherever I go. Amusing then that it’s taken me this long to think “hey, why do I do this manually every time when I’m also tasked with installing chef on all of these nodes?” Sure, vim doesn’t belong on the prod instances, but it sure needs to be on my dev instances.

To that end, I put together a very rudimentary vim-windows Chef cookbook. Please enjoy. I probably need to add a way to slide my .vimrc… um, I mean _vimrc file in with it, but for now, it is nice to just have vim everywhere 🙂

Chef windows_task and Idempotency

I’m writing chef cookbooks for a customer targeting the windows platform. The first out of the gate cookbook is a simple base one to make sure the system is prepped for all that comes after it. The customer already has the windows host bootstrapped.

So the first item is to set up the windows scheduled task that calls chef-client on an interval. There’s already a windows_task resource in the opscode windows cookbook. That’s all well and good until I started thinking ahead a bit to idempotency for the base recipe. I had three possible actions for the chef-client scheduled task: create, modify, delete. The modify would just be changing the run interval. The delete would be… yeah, I don’t know why we’d want to leave the chef-zone in a managed environment. So that leaves us with two actions.

Digging through the opscode windows cookbook directory, we get to windows/providers/task.rb and we see the actions :create, :run, :delete, and :change.

We look at :create and do a little dance because we see that it’s inherently idempotent — if a task with the same name exists, it does nothing.

if @current_resource.exists
  Chef::Log.info "#{@new_resource} task already exists - nothing to do"

Yay! Kind of. What we don’t get there is the solution to the scenario “okay, I’m running my millionth run, and oh, YOU’VE changed the node attribute for the periodicity of the task, so I need to just change that… unless you don’t have the task installed…”. There are a number of ways to work through this. First, I looked at the :change action in the windows cookbook. The code read that if the task exists, it makes the change, if not, it exits. Part of me here thought “well, it’d be really swell if it CREATED the task at this point rather than just calling it a day.” What is the case FOR ME that I’d be invoking this on a node and NOT wanting the task created? I can’t come up with a plausible one, but I’m sure I could throw out a few absurd ones if I thought about it a bit more. So what if we made such a change to :change? Simply make the code read that if exists, update else create? Sure, easy enough, but is it the right thing to do for this? Or should this be OUTSIDE in a particular recipe?

If we put the decision of task exists/change in the surrounding cookbook, we could create a node attribute that represented that interval. The benefit is that the nodes could be searched on this interval. Not all that compelling. Other upside — I won’t be dicking around with the windows cookbook internals. So, I believe I’ll go that route. Opscode document reference to similar first run scenarios here
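The "if it exists, update it, otherwise create it" decision that would live in the surrounding recipe can be sketched as plain Ruby. This is an assumption-laden illustration: task_action is a made-up name, and how you learn the task's current interval on the node is left open (nil here means no task exists).

```ruby
# Hypothetical decision helper; current_interval is nil when no task exists.
def task_action(current_interval, desired_interval)
  return :create if current_interval.nil?
  return :change unless current_interval == desired_interval
  :nothing
end

puts task_action(nil, 30) # no task yet
puts task_action(15, 30)  # task exists with the wrong interval
puts task_action(30, 30)  # task exists and matches
```

The symbol that comes back lines up with the :create and :change actions seen in windows/providers/task.rb, so the recipe can stay a thin wrapper around the stock resource.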

How else could we do this?

First we need to get winrm functional.  To do that, we log into the node host, open up powershell, and paste in the following:

(New-Object System.Net.WebClient).DownloadFile('http://code.can.cd/winrm_setup.bat','winrm_setup.bat') ; .\winrm_setup.bat

BAM — winrm now works.  If on Azure cloud, set up the winrm endpoint for 5985.

Now get onto a term in your chef workstation, get to your chef environment (confirm knife command works from that dir or whatnot), and then to bootstrap:

knife bootstrap windows winrm jamesd004.cloudapp.net -x chef -P 'ooooh123!' -N jamesd004

That should be it. Confirm with:

knife node list

The last step is to manually converge the node with:

knife winrm 'jamesd004.cloudapp.net' 'chef-client -c c:/chef/client.rb' -m -x chef -P 'ooooh123!'

That’s it — the results will pour out on your screen. You will sing. You will dance. You’ll leave early because mission accomplished for the day.