Thursday, May 13, 2010

Powershell Scenario: Convert Non-tabular Text file to tabular structured file

I had a file structured like this.

================

Item ID: 1

Item Name: abc1\def1\

        Item Child ID: 1 , Item Child path: child11 \ child 12

================

Item ID: 2

Item Name: abc2\def2\

        Item Child ID: 1 , Item Child path: child11 \ child 12

        Item Child ID: 2 , Item Child path: child21 \ child 22

================

Item ID: 3

Item Name: abc3\def3\

       Item Child ID: 1 , Item Child path: child31 \ child 32

It was a big file with this hierarchical structure. I wanted to see this as

Item ID Item Name Item Child ID Item Child path
1 abc1\def1\ 1 child11 \ child 12
2 abc2\def2\ 1 child11 \ child 12
2 abc2\def2\ 2 child21 \ child 22
3 abc3\def3\ 1 child31 \ child 32

New Powershell features used

  1. Custom objects
  2. ConvertTO-HTML

This is specific to Powershell 2 on windows 7.

Convert-HiearchyToTable.ps1
  1. param ($FileName)
  2.  
  3. if (!$FileName) {Write-Error "It would be nice to have some input file."; return}
  4.  
  5. Add-Type @'
  6. public class TaggingObject
  7. {
  8.     public string ItemID = "";
  9.     public string ItemName = "";
  10.     public string ChildID = "";
  11.     public string ChildPath = "";
  12. }
  13. '@
  14.  
  15. $LineList = Get-Content $FileName
  16. trap {Write-Error "Check file name"; break}
  17. $TagInstanceList = @()
  18.  
  19. $ItemID = ""
  20. $ItemName = ""
  21. foreach ($LineTemp in $LineList){
  22.     $Line = $LineTemp.Trim()
  23.     if (($Line.StartsWith("=====")) -or `
  24.         ($Line.Trim().length -eq 0)){
  25.         continue
  26.     }
  27.     
  28.     if ($Line.ToLower().StartsWith("job name")){
  29.         $ItemName = ($Line.Split(":", [StringSplitOptions]::RemoveEmptyEntries))[1]
  30.         continue
  31.     }
  32.     
  33.     if ($Line.ToLower().StartsWith("item child id")){
  34.         $TagInstance = New-Object TaggingObject
  35.         $TagInstance.ItemID = $ItemID
  36.         $TagInstance.ItemName = $ItemName
  37.         
  38.         $CategoryValues = $Line.Split(@(':',','), [StringSplitOptions]::RemoveEmptyEntries)
  39.         $TagInstance.ItemChildID = $CategoryValues[1]
  40.         $TagInstance.ItemChildPath = $CategoryValues[3]
  41.         
  42.         $TagInstanceList += $TagInstance
  43.         continue
  44.     }
  45. }
  46.  
  47. $FileNameHTML = $FileName.replace(".txt", ".htm")
  48. $TagInstanceList | ConvertTo-Html > $FileNameHTML

Save this as Convert-HiearchyToTable.ps1 and call as

Convert-HiearchyToTable.ps1 [TEXT_FILE.TXT]

If will generate TEXT_FILE.htm in the same path as TEXT_FILE.TXT. Off course, you should have write access to that folder.

Now this HTML file will show as very nice tabular structure which can be opened in excel also for some formatting. When it is opened in excel, excel will complain that this files i used by ‘another user’ for modification. Workaround is to open that file as read only and then save as excel to continue modification.

1 comment:

Khairil Irfan said...

Below link contain small utilities that can be used to convert non tabular data to tabular format (different file format from your example though)

http://www.khairilthegreat.web.id/2012/04/convert-non-tabular-keyvalue-to-tabular.html