Saturday, March 3, 2012

PowerShell pretty printer / code cleaner V1

PowerShell Beautifier


Now on GitHub: https://GitHub.com/DTW-DanWard/PowerShell-Beautifier

Formatting Matters

Tabs or spaces; spaces or tabs? If spaces, how many? We sure do take whitespace seriously. But when writing ‘commit-worthy’ PowerShell code, there’s more than just whitespace to think about. Shouldn’t you use cmdlet names instead of aliases? And shouldn’t you have correct casing for cmdlets, methods and types?

PowerShell Beautifier is a PowerShell command-line utility for cleaning and reformatting PowerShell script files, written in PowerShell. Sure, it will change all indentation to tabs or spaces for you - but it will do more than just that. A picture is worth 1KB words; here’s a before/after showing all types of changes including spaces & tabs:

Here's a simpler pic focusing on the alias-replacement and casing changes:



The PowerShell Beautifier makes these changes:
  • properly indents code inside {}, [], () and $() groups
  • replaces aliases with the command names: dir → Get-ChildItem
  • fixes command name casing: get-childitem → Get-ChildItem
  • fixes parameter name casing: Test-Path -path → Test-Path -Path
  • fixes [type] casing
    • changes all PowerShell shortcuts to lower: [STRING] → [string]
    • changes other types (if in memory): [system.exception] → [System.Exception]
  • cleans/rearranges all whitespace within a line
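
To make the alias-replacement idea concrete, here is a minimal sketch. It is not the Beautifier's actual code: the function name Expand-AliasSketch is hypothetical, and it assumes the built-in PSParser tokenizer plus Get-Alias are enough to find and resolve command aliases. It handles only aliases, not casing or indentation.

```powershell
# Hypothetical helper for illustration only; not part of PowerShell Beautifier.
function Expand-AliasSketch {
  param([string]$Script)

  # Tokenize so that only real command tokens are touched,
  # never text inside strings or comments.
  $errors = $null
  $tokens = [System.Management.Automation.PSParser]::Tokenize($Script, [ref]$errors)

  # Replace from the end of the string backwards so earlier offsets stay valid.
  $commandTokens = @($tokens |
      Where-Object { $_.Type -eq 'Command' } |
      Sort-Object -Property Start -Descending)

  foreach ($token in $commandTokens) {
    # Escape wildcard characters so aliases such as ? or % are looked up literally.
    $name = [System.Management.Automation.WildcardPattern]::Escape($token.Content)
    $alias = Get-Alias -Name $name -ErrorAction SilentlyContinue
    if ($alias) {
      $Script = $Script.Remove($token.Start, $token.Length).Insert($token.Start, $alias.Definition)
    }
  }
  $Script
}

# Usage: gci is an alias, Where-Object is already a full command name.
Expand-AliasSketch 'gci | Where-Object { $_.Name }'
```

Working from tokens rather than plain text search is what keeps a word like dir inside a string or comment from being rewritten.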



Monday, February 20, 2012

Get file encoding even if no Byte Order Marker

Note: you can find the latest version of the encoding functions in my PowerShell beautifier project:
https://github.com/DTW-DanWard/PowerShell-Beautifier
Check out file src/DTW.PS.FileSystem.Encoding.psm1

Every now and then you need to be able to programmatically determine a file's encoding. Maybe you are writing a utility that edits files and you want to ensure you maintain the original encoding type. Perhaps you want to make sure that certain files have a Byte Order Marker (BOM).

If the file has a BOM, this is easy. If it doesn't... aw, crap. At that point you have to analyze the file's contents and make a judgement call based on what you see. I wrote a function to do this: Get-DTWFileEncoding
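
The easy case, a file that does have a BOM, can be sketched roughly like this. This is not the code of Get-DTWFileEncoding; Get-BomEncodingSketch is a hypothetical name, it takes the file's leading bytes rather than a path, and it only knows the four encodings listed in the candidate array.

```powershell
# Hypothetical helper for illustration only; not Get-DTWFileEncoding.
function Get-BomEncodingSketch {
  param([byte[]]$Bytes)

  # UTF-32 LE (FF FE 00 00) must be checked before UTF-16 LE (FF FE),
  # because the shorter preamble is a prefix of the longer one.
  $candidates = @(
    [System.Text.Encoding]::UTF32,
    [System.Text.Encoding]::UTF8,
    [System.Text.Encoding]::BigEndianUnicode,
    [System.Text.Encoding]::Unicode
  )

  foreach ($encoding in $candidates) {
    $preamble = $encoding.GetPreamble()
    if ($Bytes.Length -lt $preamble.Length) { continue }

    # Compare the leading bytes with this encoding's preamble (its BOM).
    $isMatch = $true
    for ($i = 0; $i -lt $preamble.Length; $i++) {
      if ($Bytes[$i] -ne $preamble[$i]) { $isMatch = $false; break }
    }
    if ($isMatch) { return $encoding }
  }

  # No known BOM found.
  return $null
}

# Usage: FE FF is the big-endian UTF-16 BOM, followed here by the letter A.
(Get-BomEncodingSketch -Bytes ([byte[]](0xFE, 0xFF, 0x00, 0x41))).CodePage
```

For a real file you could pass in the result of [System.IO.File]::ReadAllBytes with a full path, although reading only the first four bytes would be kinder to large files.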

Get-DTWFileEncoding returns a System.Text.Encoding type based on the file specified. Here's an example of a big-endian file with a BOM:






As you can see, the System.Text.Encoding type is returned and the BOM type has the correct value: FE FF

Here's an example for another big-endian file, this time with no BOM:
The returned Encoding type info looks the same as the first but if you inspect the Preamble, there's no value.
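
For a rough idea of what a "judgement call" on a BOM-less file can look like, here is one possible heuristic. It is an assumption for illustration, not necessarily what Get-DTWFileEncoding does: the hypothetical Get-Utf16GuessSketch counts zero bytes at even and odd offsets, which only works for mostly-ASCII text, and returns an encoding object whose preamble is empty.

```powershell
# Hypothetical heuristic for illustration only; not Get-DTWFileEncoding.
function Get-Utf16GuessSketch {
  param([byte[]]$Bytes)

  $pairs = [Math]::Floor($Bytes.Length / 2)
  if ($pairs -eq 0) { return $null }

  # Count 0x00 bytes at even and odd offsets.
  $evenZeros = 0
  $oddZeros = 0
  for ($i = 0; $i -lt $Bytes.Length; $i++) {
    if ($Bytes[$i] -eq 0) {
      if ($i % 2 -eq 0) { $evenZeros++ } else { $oddZeros++ }
    }
  }

  # Mostly-ASCII UTF-16 BE stores the 0x00 high byte first (even offsets);
  # UTF-16 LE stores it second (odd offsets).
  if ((($evenZeros / $pairs) -gt 0.5) -and ($oddZeros -eq 0)) {
    # Arguments: bigEndian = $true, byteOrderMark = $false (empty preamble)
    return New-Object System.Text.UnicodeEncoding -ArgumentList $true, $false
  }
  if ((($oddZeros / $pairs) -gt 0.5) -and ($evenZeros -eq 0)) {
    # Arguments: bigEndian = $false, byteOrderMark = $false (empty preamble)
    return New-Object System.Text.UnicodeEncoding -ArgumentList $false, $false
  }

  # Not confident enough to guess.
  return $null
}

# Usage: big-endian UTF-16 bytes with no BOM in front.
$sample = [System.Text.Encoding]::BigEndianUnicode.GetBytes('Hello')
$guess = Get-Utf16GuessSketch -Bytes $sample
$guess.CodePage
$guess.GetPreamble().Length
```

A real detector needs more than this (UTF-8 validation, non-ASCII text, very short files), which is exactly why the no-BOM case is the painful one.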




There are some other handy functions in there as well:
  • Add-DTWFileEncodingByteOrderMarker - adds a byte order marker to a file.
  • Compare-DTWFiles - compares two files and returns $true if same, $false otherwise.  Uses the two functions below to do comparisons.
  • Compare-DTWFilesIgnoringBOM - compares two files, ignoring BOMs, returning $true if same, $false otherwise.
  • Compare-DTWFilesIncludingBOM - compares two files, including BOMs, returning $true if same, $false otherwise.

Again, you can get the encoding functions in the beautifier project: https://github.com/DTW-DanWard/PowerShell-Beautifier


Wednesday, December 7, 2011

ScriptCop rule to find Return in ForEach cmdlet loop

I love the ForEach-Object cmdlet; it embodies the spirit of pipeline processing and helps make logic clear. To me, this simple ForEach-Object cmdlet command:
Split-Path -Path $Home -Parent | dir | foreach {$_.Name}
is far easier to read and understand than this foreach keyword version:
foreach ($Item in (dir (Split-Path -Path $Home -Parent))) {$Item.Name}

However, there's a quirky issue with ForEach-Object cmdlet loops: return statements do not exit the enclosing function. (They do in foreach keyword loops.) Here's a quick example:

function f1 {
  foreach ($i in 1..4) { $i; if ($i -gt 2) { return }; "Inside" }
  "After"
}

function f2 {
  1..4 | foreach { $_; if ($_ -gt 2) { return }; "Inside" }
  "After"
}

C:\> f1
1
Inside
2
Inside
3

C:\> f2
1
Inside
2
Inside
3
4
After

As you can see, the return in the foreach keyword loop exits the function entirely at the moment it is run - good! However, in the foreach cmdlet loop the return only exits the script block for the current item: it skips the following "Inside" for that item, but the loop keeps running AND the function still outputs "After" once the loop finishes - bad! (See PowerShell in Action V2 section "Using the return statement with ForEach-Object" for more information.)

After discovering ScriptCop, the first custom rule I wanted to write for it was a return-inside-foreach-cmdlet detector. So here it is. The rule works in two steps: it matches the contents of a foreach cmdlet loop using a regex, then it tokenizes the matching content and looks for a return keyword token. The regex is a little nasty but works great. Tokenizing the content and searching through the tokens was far easier and more accurate than trying to match the word 'return' with a regular expression that didn't accidentally match that word in a string, comment or block comment.
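
The token-search half of that approach can be sketched in a few lines. This is not the ScriptCop rule itself: Test-ReturnKeywordSketch is a hypothetical name, and it assumes the loop body has already been extracted as a string (the regex half is not shown).

```powershell
# Hypothetical helper for illustration only; not the actual ScriptCop rule.
function Test-ReturnKeywordSketch {
  param([string]$Code)

  # The tokenizer classifies every piece of the code, so 'return' inside a
  # string or comment comes back as a String or Comment token, not a Keyword.
  $errors = $null
  $tokens = [System.Management.Automation.PSParser]::Tokenize($Code, [ref]$errors)

  $found = @($tokens | Where-Object { $_.Type -eq 'Keyword' -and $_.Content -eq 'return' })
  $found.Count -gt 0
}

# Usage: the first body contains a real return statement, the second only
# mentions the word in a string and in a comment.
Test-ReturnKeywordSketch '$_; if ($_ -gt 2) { return }; "Inside"'
Test-ReturnKeywordSketch '"return"; # return'
```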

See the ScriptCop documentation for notes about adding this file to your setup. If you aren't familiar with ScriptCop, check out the ScriptCop site.

The first time you run it you'll find a whole bunch of items to clean up but after that it's easy to keep your code nice and clean and proper.

Thursday, November 24, 2011

First post!

I've been working with PowerShell since Monad Beta 2 came out. In that time so many great people out there have shared their utilities and support... it's long overdue that I give back to the community as well. I created this blog to post some of the utilities/thoughts/garbage that I've come up with. Coming soon:
  • ScriptCop rule for Return statement inside ForEach cmdlet loop
  • PowerShell code formatter/pretty printer
  • Subversion/FishEye PowerShell 'client'
  • Global settings drive
  • Task-based processor
  • Crazy logger utility
  • Module framework thoughts