Saturday, June 27, 2015

PowerShell: The Pipeline, begin/process/end blocks, and You

So, one of PowerShell's strengths is that it works with objects natively, rather than with strings like pretty much every other shell in existence.  These objects can be passed down the pipeline from one cmdlet to the next, which lets you do a lot of processing on a lot of stuff with a minimal amount of code.  This is nice and all, but can our custom functions get in on this action?

Well, yes.  Yes they can.  You'd be naïve to think otherwise.  If user-written code couldn't fully harness the power of the pipeline, this would be SlightlyStrongShell, not PowerShell.

I'm only writing about this because I've just run across a situation where I was trying to do something and this was the only way it would work.  So basically, this entire post is "just learned why I want to use it, so I gotta share it with others" syndrome.

To me, the annoying part is that hashtables won't work.  Hashtables have a nice, simple shortcut syntax (@{ key1 = value1; key2 = value2; ... keyN = valueN }), which is why I tried to use them first.  But no: even though you can access a hashtable's entries with the same dot notation you'd use on an object, a hashtable is an associative array, not an object that has those entries as properties.  What your function sees is a single Hashtable whose properties are things like Keys and Values, with nothing named after the keys you gave it, so there's nothing for the parameter binder to match against instead of the neatly grouped and named values you're giving it.

No, you need a real object for this to work.  We can get the definition of said object down to a reasonably short length, but it's still much longer than the shortcut syntax for a hashtable.  Behold.
new-object psobject -property @{ key1 = value1; key2 = value2; ... keyN = valueN }
Yeah, that's the best we can do without getting into aliases, which for whatever reason ship with PowerShell even though pretty much all PowerShell literature in existence advises against using them.  Strangely, though, type accelerators are fine, even though they, too, might not exist on a given system.  (Speaking of which: if you can count on PowerShell 3.0 or later, the [pscustomobject]@{ ... } cast is shorter still.)  Anyone sensible would argue that the default aliases in PowerShell SHOULD (RFC 2119) be safe, but common sense seems to be lost on Microsoft.  Also, this definition makes use of a hashtable, for what that's worth.  Anyway.
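
To make the hashtable-versus-object difference concrete, here's a minimal sketch.  Get-PersonName is a made-up helper for illustration only; it isn't part of the sample function discussed in this post.

```powershell
# Hypothetical helper: binds $Name from a same-named property on each input object.
function Get-PersonName {
    param(
        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [string] $Name
    )
    process { "Name is '$Name'" }
}

$asObject    = New-Object psobject -Property @{ Name = 'Alice'; Age = 16 }
$asHashtable = @{ Name = 'Alice'; Age = 16 }

# The object exposes Name as a real property, so by-property-name binding works.
$asObject | Get-PersonName

# The hashtable's own properties are things like Keys, Values, and Count.
# Nothing in this list is called Name, so there is nothing to bind to.
$asHashtable | Get-Member -MemberType Property | Select-Object -ExpandProperty Name
```

Dot notation ($asHashtable.Name) still works interactively, which is exactly what makes this easy to trip over.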

Making a function that can process multiple objects via the pipeline takes some extra syntax, but it's worth it in that the result will Just Work™ with full-on cmdlets and any other PowerShell code whose results you can pipe into it.  Note that this is geared towards working with input objects by having parameter binding automatically split them up by property name, which to be honest, I find pretty sexy.  It will remain an exercise for the reader to modify this to work with an entire input object at once.

So, let's get a sample function in here.  It won't do much that's useful, but that doesn't matter.  Grab it from this pastebin and run it in PowerShell.

That code is about the bare minimum, with some basic functionality implemented so we can see results based on easy-to-generate sample inputs.  But wait... begin, process, and end?  Why are they there?  Functions can work just fine, even with pipeline input, without them!

Well, yes, that's technically true.  However, a function body without any of these blocks is treated as one big end block, so by omitting them, you're effectively limited to one object on the pipeline, period.  If you try to pipe more than one in, the function runs only once, after all the input has arrived, and will only 'see' and operate on the last one.  There may be some obscure use case for that, but in general, when you pipe multiple objects into a function, you want that function to operate on all of those objects.  Now, let's break down what's going on here.
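
Here's a small sketch of that last-object-only behavior.  Get-NameNoBlocks and the two sample objects are made up for this illustration.

```powershell
# Hypothetical function with no begin/process/end blocks:
# the whole body behaves like an end block and runs once.
function Get-NameNoBlocks {
    param(
        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [string] $Name
    )
    "Saw $Name"
}

$people = @(
    ( New-Object psobject -Property @{ Name = 'Alice' } ),
    ( New-Object psobject -Property @{ Name = 'Sarah' } )
)

# Emits a single line, for the last object piped in: Saw Sarah
$people | Get-NameNoBlocks
```

Wrap that one line of the body in process { } and you get one line of output per object instead.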

When the function starts up, the begin block runs.  This is where you should put any initialization code, such as variable declarations or opening a database connection.  (Pipeline input hasn't arrived yet at this point, so don't expect pipeline-bound parameters to hold anything here.)  Next, for each input object, the process block runs.  The vast majority of your function's code should go here.  Finally, once everything has been processed, the end block runs.  This is where you do all your cleanup, such as calling the Dispose() method on any objects that require it, or closing your database connection.  Note that both the begin and end blocks only run once per function call.  Furthermore, all three blocks share the same scope.  Also, if you want to confuse anyone reading your code, you can reorder the blocks all you like.
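
As a rough sketch of that three-block layout: the function below is NOT the pastebin code, just a stand-in I'm assuming looks broadly similar.  The name Show-SamplePerson, its messages, and the $count variable are all invented for illustration.

```powershell
# Hypothetical stand-in for a pipeline-aware function; not the original sample.
function Show-SamplePerson {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipelineByPropertyName = $true)]
        [string] $Name,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [int] $Age,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [bool] $WearsGlasses
    )

    begin {
        # Runs once, before any pipeline input arrives.
        $count = 0
        'Processing list of people...'
    }

    process {
        # Runs once per input object, with the parameters rebound each time.
        $count++
        $glasses = if ($WearsGlasses) { 'wears glasses' } else { 'does not wear glasses' }
        "$Name is $Age and $glasses."
    }

    end {
        # Runs once, after the last input object; $count was set up in begin.
        "Done! Processed $count people."
    }
}

# Two made-up inputs, just to exercise all three blocks.
$samplePeople = @(
    ( New-Object psobject -Property @{ Name = 'Alice'; Age = 16; WearsGlasses = $true } ),
    ( New-Object psobject -Property @{ Name = 'Melissa'; Age = 22; WearsGlasses = $false } )
)
$samplePeople | Show-SamplePerson
```

The counter is there only to show the shared scope: it's created in begin, bumped in process, and read in end.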

So, we've got sample code, but we need a set of sample objects to show it off.  You could just make up one such set on your own, using the function's parameters as a guide, but...  You could also be lazy, and just copy/paste the one I'm about to give you.  Also, note that each item in this array is enclosed within parentheses, seemingly needlessly.  After all, array definitions in PowerShell are comma-delimited, right?  However, evaluating it without them throws an exception, because inside a command invocation the comma gets parsed as part of that command's arguments rather than as a separator between array elements.  So consider this yet another example of "IF IT DOESN'T WORK IN POWERSHELL, THROW MORE PARENTHESES AT IT".
@(
    ( new-object psobject -property @{ Name = 'Alice'; Age = 16; WearsGlasses = $true } ),
    ( new-object psobject -property @{ Name = 'Sarah'; Age = 19; WearsGlasses = $true } ),
    ( new-object psobject -property @{ Name = 'Jennifer'; Age = 23; WearsGlasses = $true } ),
    ( new-object psobject -property @{ Name = 'Melissa'; Age = 22; WearsGlasses = $false } )
)
Pipe it into Process-SampleObject and you'll see your result:
C:\PS> @( ( new-object psobject -property @{ Name = 'Alice'; Age = 16; WearsGlasses = $true } ), ( new-object psobject -property @{ Name = 'Sarah'; Age = 19; WearsGlasses = $true } ), ( new-object psobject -property @{ Name = 'Jennifer'; Age = 23; WearsGlasses = $true } ), ( new-object psobject -property @{ Name = 'Melissa'; Age = 22; WearsGlasses = $false } ) ) | Process-SampleObject
Processing list of people...

Stay away from Alice, you pedo. Even though those glasses are cute.
Porn, tobacco, and sex are okay for Sarah, but alcohol is not. As a bonus, those glasses are sexy.
Anything goes for Jennifer! Giggity. On second thought, thick-frame glasses? No thanks.
Anything goes for Melissa! Giggity.

Done!
As I stated in the middle of one of those paragraphs up there, modifying this to work with an entire object as opposed to splitting it up into its properties will be left as an exercise to the reader.  One thing to note, though: If you're lazy and don't have a parameter block, then inside the process block PowerShell provides you with the current input object through the automatic variable $_, which you should be used to from undoubtedly having already done countless $objects | where-object { $_.DoesntWork } | foreach-object { $_.Defenestrate() } constructs.  Have fun.
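
If you'd like a head start on that exercise, here's one possible sketch of both approaches.  The function names and the sample object are made up; treat this as a starting point, not the one true way.

```powershell
# Hypothetical variant 1: take the whole input object through a single parameter.
function Show-WholePerson {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [psobject] $Person
    )
    process { "Got $($Person.Name), age $($Person.Age)" }
}

# Hypothetical variant 2: no param block at all; $_ is the current
# pipeline object inside the process block.
function Show-WholePersonLazy {
    process { "Got $($_.Name), age $($_.Age)" }
}

$alice = New-Object psobject -Property @{ Name = 'Alice'; Age = 16 }

$alice | Show-WholePerson
$alice | Show-WholePersonLazy
```

The first variant has the advantage that the function can also be called with -Person directly, without a pipeline.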
