How can I get the file count in a zipped file

Question

Wednesday, October 19, 2011 9:32 PM

Is there a PowerShell way to get the file count within a .zip file?  My only thought is to unzip it to a folder, count the files, and then delete the unzipped files.  It seems like there has to be a better way than that.

All replies (32)

Thursday, October 20, 2011 1:27 AM ✅Answered | 1 vote

$data = @"
Length     Method       Size     Ratio    Date     Time    CRC-32  Attr  Name
                   
        51421  DeflatX          7834  85%  08/01/2011  07:31  d35ad156 --w-* D080111rpts.ctl
    179450364  DeflatX      15360545  92%  08/01/2011  07:12  73ddacbf --w-* D080111rpts.dat
                                                 
    179501785               15368379  92%                                            2
"@

$input_data = $data -split "`n" |% {$_.trim()}

function get-zipcontent {
 begin {
     $regex = "\s*(\d+)\s+(\w+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
     $props = {
      @{
         Name =   $matches[8]
         Date =   $matches[5] -as [datetime]
         Length = $matches[1] -as [int]
         Method = $matches[2]
         Size =   $matches[3] -as [int]
         Ratio =  $matches[4] -as [int]
         CRC32 =  $matches[6]
         Attr =   $matches[7]
         }
     }
     $proplist = $props.tostring().split("`n") -match '^\s*(\S+)\s*=.+$' -replace '^\s*(\S+)\s*=.+$','$1'
 }
 
process { 
    if ($_ -match $regex){
     new-object psobject -property (&$props) | select $proplist
      }
   } 
}
 
$zipcontent = $input_data | get-zipcontent
 
$zipcontent

Name   : D080111rpts.ctl
Date   : 8/1/2011 7:31:00 AM
Length : 51421
Method : DeflatX
Size   : 7834
Ratio  : 85
CRC32  : d35ad156
Attr   : --w-*

Name   : D080111rpts.dat
Date   : 8/1/2011 7:12:00 AM
Length : 179450364
Method : DeflatX
Size   : 15360545
Ratio  : 92
CRC32  : 73ddacbf
Attr   : --w-*

[string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "


Tuesday, October 25, 2011 2:27 PM ✅Answered

This will work as long as you never have more than one embedded space in a file name.

\s((\S+\s+\S+)$|(\S+)$)

This should work with any number of spaces.

\s(\w[^.]+\.\w{1,3})\s*$
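
A hedged alternative, assuming the -vb column layout posted in this thread (Length, Size, Ratio, Date, Time, Name) and the MM/dd/yyyy date format seen in the samples: rather than matching the name by its shape, anchor the five leading columns and take the rest of the line as the name. The variable names are illustrative only.

```powershell
# Sketch: capture the whole file name from a `-vb` listing line,
# regardless of how many spaces or dots the name contains.
$vbRegex = '^\s*(\d+)\s+(\d+)\s+(\d{1,3})%\s+(\d{2}/\d{2}/\d{4})\s+(\d{2}:\d{2})\s+(.+?)\s*$'

# Sample lines copied from the listing in this thread; the last one is the totals line.
$sample = @(
    '162630        149418   9%  09/21/2011  09:39  10551 D.19167 (2).pdf'
    '1455720       1298255  11%  09/21/2011  09:33  10583 A .pdf'
    '122066767     105365848  14%                           128'
)

$names = foreach ($line in $sample) {
    # Group 6 is everything after the time column, trimmed of trailing spaces.
    if ($line -match $vbRegex) { $matches[6] }
}
$names
```

The totals line has no date or time, so it does not match and is skipped.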



Wednesday, October 19, 2011 9:36 PM

What are you using to zip them?


Wednesday, October 19, 2011 9:42 PM

Winzip, although, technically it's the command line utilities wzzip and wunzip.  I am not averse to using something else though.
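
If something other than WinZip is acceptable, one possible approach is the .NET ZipFile class. This is only a sketch: it assumes .NET Framework 4.5 or later (so PowerShell 3.0 or later), and Get-ZipFileCount is a hypothetical helper name, not something from this thread. It reads the archive's entry list without extracting anything; whether that works for a particular encrypted archive is an assumption to verify.

```powershell
# Requires .NET 4.5+; not available to PowerShell 2.0 on .NET 2.0/3.5.
Add-Type -AssemblyName System.IO.Compression.FileSystem

function Get-ZipFileCount {   # hypothetical helper name
    param([Parameter(Mandatory = $true)][string]$Path)

    # Pass a full path: .NET resolves relative paths against the process
    # working directory, which may differ from the PowerShell location.
    $zip = [System.IO.Compression.ZipFile]::OpenRead($Path)
    try {
        # Directory entries have names ending in '/'; count only file entries.
        @($zip.Entries | Where-Object { -not $_.FullName.EndsWith('/') }).Count
    } finally {
        $zip.Dispose()
    }
}

# Usage (placeholder path):
# Get-ZipFileCount -Path 'C:\somezip.zip'
```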


Wednesday, October 19, 2011 9:54 PM

What does wzzip -vb <zipfile name> produce?



Wednesday, October 19, 2011 10:01 PM

I am continuing to work with this.  As a side note, I have seen where folks turn output from utilities like this into proper objects that can be searched.  How would one do that?  For instance, if I run the -vm command it returns the following.  (The -vb command does the same.)  I am thinking if I turned this into an object I could count a certain number of lines in to skip the header.  NOTE: This file is empty, so, the info below is correct.

PS C:\Users\wsteele> & "C:\Program Files (x86)\WinZip\WZUNZIP.EXE" -vm "C:\somezip.zip"
WinZip(R) Command Line Support Add-On Version 3.2 (Build 8668)
Copyright (c) 1991-2009 WinZip International LLC - All Rights Reserved

Zip file: C:\somezip.zip


    Length     Method       Size     Ratio    Date     Time    CRC-32  Attr  Name
                      
Warning [C:\somezip.zip]:  Zip file is empty
                                                  
            0                      0   0%                                            0

Wednesday, October 19, 2011 10:08 PM

Here's what I use:

http://mjolinor.wordpress.com/2011/10/08/new-object-from-a-hash-table-in-a-script-block/

Typically, when you convert utility program output to PS objects, you get one object per line of output data, so you'd parse the output into an array of PS objects, and then filter those if needed, and get the .count property of the result.
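
A minimal sketch of that idea, using the two data lines and the totals line from the -vm listing posted in this thread. The regex and property names are illustrative assumptions, not the blog's code. Wrapping the result in @() keeps .Count reliable when only one object (or none) comes back, which matters on PowerShell 2.0.

```powershell
# Lines copied from the -vm listing in this thread.
$lines = @(
    '        51421  DeflatX          7834  85%  08/01/2011  07:31  d35ad156 --w-* D080111rpts.ctl'
    '    179450364  DeflatX      15360545  92%  08/01/2011  07:12  73ddacbf --w-* D080111rpts.dat'
    '    179501785               15368379  92%                                            2'
)

# One object per line that looks like a file entry; the totals line does not match.
$entries = @(
    foreach ($line in $lines) {
        if ($line -match '^\s*(\d+)\s+\w+\s+(\d+)\s+\d{1,3}%\s+\S+\s+\S+\s+\w+\s+\S+\s+(.+?)\s*$') {
            New-Object PSObject -Property @{
                Name   = $matches[3]
                Length = [long]$matches[1]
                Size   = [long]$matches[2]
            }
        }
    }
)

$entries.Count                                                # 2 file entries
@($entries | Where-Object { $_.Name -like '*.dat' }).Count    # 1 after filtering
```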

 

 



Wednesday, October 19, 2011 10:35 PM

Below is the script I am using. It's pretty clearly commented if you want to get the flow of the script.  One problem I have is being able to analyze whether a decrypted zip matches an encrypted zip.  I was using the same logic as above (wzunzip.exe -vb), piping the output to a variable and then comparing the length of that object, but that doesn't seem to work.  Anyone have a better suggestion?  That portion of the process is: 1) unzip encrypted file 2) place files in new unencrypted zip 3) copy original to archive and 4) move unencrypted zip to processing directory.  Prior to this copy/move I want to validate that the two zip files do in fact have the same contents.

<#
    .AUTHOR
        Will Steele

    .DEPENDENCIES
        None.

    .DESCRIPTION
        This script unpacks encrypted .zip files, repacks the files into a new .zip
        file. Next, it copies the encrypted .zip file to the archive drive and moves
        the unencrypted .zip file to the processing drive.
    
    .EXAMPLE
        foreach($zip in (dir C:\*.zip)) {
            & "C:\ProcessZip.ps1" -zip_file $zip.fullname -rootdrive %rootdrive%
        }
    
    .EXTERNALHELP
        None.
        
    .FORWARDHELPTARGETNAME
        None.
        
    .INPUTS
        System.String.
        
    .LINK
        None.
        
    .NAME
        ProcessZip.ps1
        
    .NOTES
        Revision History:
            Created 10/03/2011.
        
    .OUTPUTS
        None.
        
    .PARAMETER zip_file
        A mandatory parameter specifying the location of the .zip file to be unpacked.
    
    .PARAMETER password
        An optional parameter specifying the password of the .zip file.  Default is 
        'hellomydarling'.
    
    .PARAMETER processing_directory
        An optional parameter specifying the location of the processing directory. The
        default is 'c:\data'.
    
    .SYNOPSIS
        Unzip encrypted .zip file and pack new, unencrypted .zip file.
#>

[CmdletBinding()]
Param(
    [Parameter(HelpMessage="Enter the path to the encrypted .zip file.", Mandatory=$true)]
    [ValidateNotNullOrEmpty()]
    [ValidateScript({Test-Path $_})]
    [IO.FileInfo] $zip_file,
    
    [Parameter(HelpMessage="Enter the password to decrypt the .zip file.", Mandatory=$false)]
    [ValidateNotNullOrEmpty()]
    [String] $password = 'hellomydarling',
    
    [Parameter(HelpMessage="Enter the output location of the unencrypted .zip file.", Mandatory=$false)]
    [ValidateScript({Test-Path $_})]
    [ValidateNotNullOrEmpty()]  
    [IO.DirectoryInfo] $processing_directory = "C:\data",
    
    [Parameter(HelpMessage="Enter the archive location of the encrypted .zip file.", Mandatory=$false)]
    [ValidateScript({Test-Path $_})]
    [ValidateNotNullOrEmpty()]
    [IO.DirectoryInfo] $archive_directory = "c:\archive",
    
    [Parameter(HelpMessage="Enter the root drive.", Mandatory=$true)]
    [ValidateScript({Test-Path $_})]
    [ValidateNotNullOrEmpty()]
    [IO.DirectoryInfo] $rootdrive
)

#requires -Version 2.0
Set-StrictMode -Version 2.0

#region variables

    # Encrypted .Zip file stored in archive
    $archive_zip = "$archive_directory\" + (Split-Path $zip_file -Leaf);
    
    # Main log file
    $log_file = "$rootdrive\logfile.log";
    
    # Unencrypted .zip file MOVED to process directory
    $processing_zip = "$processing_directory\unencrypted.$(Split-Path $zip_file -Leaf)";
    
    # Unencrypted .zip file in original directory
    $unencrypted_zip = "$(Split-Path $zip_file)\unencrypted.$(Split-Path $zip_file -Leaf)";
    
    # New folder for temporary processing of unzipped files
    if(!(Test-Path $((Split-Path $zip_file)+"\unzip"))) {
        $unzip = New-Item -Path (Split-Path $zip_file) -ItemType Directory -Name Unzip;
    } else {
        $unzip = "$(Split-Path $zip_file)\unzip";
    }
    
#endregion variables

#region Functions

    function Report-UnZipError {
        # $zip_file is the script parameter; the original referenced an undefined $file.
        Write-Output "ERROR!!!!! $zip_file failed to unzip. Date and Time of error follows $((Get-Date).ToString('ddd MM/dd/yyyy hh:mm tt'))." | Out-File -FilePath $log_file -Encoding ASCII -Append
    } # end function Report-UnZipError

    function Report-ZipError {
        Write-Output "ERROR!!!!! $zip_file failed to zip. Date and Time of error follows $((Get-Date).ToString('ddd MM/dd/yyyy hh:mm tt'))." | Out-File -FilePath $log_file -Encoding ASCII -Append
    } # end function Report-ZipError

    function Report-CopyError {
        Write-Output "ERROR!!!!! $zip_file failed to copy. Date and Time of error follows $((Get-Date).ToString('ddd MM/dd/yyyy hh:mm tt'))." | Out-File -FilePath $log_file -Encoding ASCII -Append
    } # end function Report-CopyError

    function Report-MoveError {
        Write-Output "ERROR!!!!! $unencrypted_zip failed to move. Date and Time of error follows $((Get-Date).ToString('ddd MM/dd/yyyy hh:mm tt'))." | Out-File -FilePath $log_file -Encoding ASCII -Append
    } # end function Report-MoveError
    
    function Main {
        # Validate machine can access drives (to ensure active cluster node is being accessed).
        Write-Verbose "Validating access to the rootdrive to ensure the SAN is active and this is an active cluster node.";
        
        if(Test-Path $rootdrive) {
            # Report: successfully accessed $rootdrive
            Write-Verbose "This script can access the rootdrive.  Continuing processing.";
            
            # Check to see if unencrypted file exists yet
            Write-Verbose "Verifying current file ($zip_file) has not been processed.";
            if(Test-Path $unencrypted_zip) {
                Write-Verbose "The file $unencrypted_zip has already been unencrypted.  Cancelling processing.";
                Write-Error "The file $unencrypted_zip has already been unencrypted.  Cancelling processing.";
            } elseif(Test-Path $archive_zip) {
                Write-Verbose "The file $archive_zip has already been unencrypted.  Cancelling processing.";
                Write-Error "The file $archive_zip has already been unencrypted.  Cancelling processing.";
            } else {
                # Report file has not been processed
                Write-Verbose "The file ($zip_file) has not been processed yet.  Continuing processing.";
                
                # Unpack encrypted file
                Write-Verbose "Attempting to unzip file ($zip_file).";
                # Quote both paths so that directories containing spaces are passed as single arguments.
                Start-Process -FilePath "C:\Program Files (x86)\WinZip\wzunzip.exe" -ArgumentList "-e `"$zip_file`" `"$unzip`" -s$Password" -Wait;
                Write-Verbose "Unzip process completed.";

                # Pack unencrypted files
                Write-Verbose "Attempting to zip file ($unencrypted_zip).";
                Start-Process -FilePath "C:\Program Files (x86)\WinZip\wzzip.exe" -ArgumentList "-a -m `"$unencrypted_zip`" `"$unzip\*.*`"" -Wait
                Write-Verbose "Zip process completed.";
                
                # Compare unencrypted and encrypted zip files to validate new zip file.
                Write-Verbose "Gathering statistics for encrypted .zip ($zip_file) for analysis.";
                $encrypted_summary = & "C:\program files (x86)\winzip\wzunzip.exe" -vb $zip_file;
                
                Write-Verbose "Gathering statistics for unencrypted .zip ($unencrypted_zip) for analysis.";
                $unencrypted_summary = & "C:\program files (x86)\winzip\wzzip.exe" -vb $unencrypted_zip;
                
                # Move encrypted .zip to archive
                Write-Verbose "Copying encrypted zip file ($zip_file) to archive ($archive_directory).";
                Move-Item -Path $zip_file -Destination $archive_directory;
                Write-Verbose "Encrypted .zip file copy ($zip_file) complete.";
                
                # Verifying encrypted .zip was copied to archive
                Write-Verbose "Verifying encrypted .zip file ($archive_zip) was copied correctly.";
                if(Test-Path $archive_zip) {
                    Write-Verbose "The encrypted .zip file ($archive_zip) was copied correctly.";
                } else {
                    Write-Verbose "The encrypted .zip file ($archive_zip) was not copied correctly.";
                    Report-CopyError;
                    return; # 'break' outside a loop would also end a loop in the calling script
                }

                # Move unencrypted .zip to processing directory
                Write-Verbose "Moving unencrypted zip file ($unencrypted_zip) to processing directory ($processing_directory).";
                Move-Item -Path $unencrypted_zip -Destination $processing_directory;
                Write-Verbose "Unencrypted .zip file ($unencrypted_zip) move complete.";
                
                # Verifying unencrypted .zip was moved to processing
                Write-Verbose "Verifying unencrypted .zip file ($unencrypted_zip) was moved correctly.";
                if(Test-Path $processing_zip) {
                    Write-Verbose "The unencrypted .zip file ($processing_zip) was moved correctly.";
                } else {
                    Write-Verbose "The unencrypted .zip file ($processing_zip) was not moved correctly.";
                    Report-MoveError;
                    return; # 'break' outside a loop would also end a loop in the calling script
                }
            }
        } else {
            Write-Output "This machine cannot access the rootdrive ($rootdrive).  It is either not the active cluster node or the SAN is inaccessible.";
            return; # 'break' outside a loop would also end a loop in the calling script
        }
    }
    
#endregion Functions

#region ScriptBody

    . Main

#endregion ScriptBody

Wednesday, October 19, 2011 10:46 PM

What is it exactly you're trying to accomplish?  The only way to tell if an encrypted file matches an unencrypted file is to either decrypt the encrypted file, or encrypt the unencrypted file prior to doing the comparison.

  Comparing the length is not a good test if you need to know if the files are identical.  A better test would be to compare the MD5 hashes of the files.  There's a PS script in the repository for calculating the MD5 hash of a file:

http://gallery.technet.microsoft.com/scriptcenter/2a0cb3b9-267c-4e4d-b489-df5d907bca75
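
For reference, a hash comparison can also be sketched directly with the .NET MD5 class. This is not the gallery script linked above, just a minimal illustration; Get-Md5Hex is a hypothetical helper name and the paths in the usage comment are placeholders.

```powershell
function Get-Md5Hex {   # hypothetical helper name
    param([Parameter(Mandatory = $true)][string]$Path)

    $md5    = [System.Security.Cryptography.MD5]::Create()
    $stream = [System.IO.File]::OpenRead($Path)
    try {
        $bytes = $md5.ComputeHash($stream)
        # Render the 16 hash bytes as a lowercase hex string.
        ($bytes | ForEach-Object { $_.ToString('x2') }) -join ''
    } finally {
        $stream.Close()
        $md5.Clear()
    }
}

# Usage (placeholder paths): two files are byte-identical if their hashes match.
# (Get-Md5Hex 'C:\a.dat') -eq (Get-Md5Hex 'C:\b.dat')
```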



Wednesday, October 19, 2011 11:30 PM

Sorry, too many words diluted my comment.  I want to verify that the unencryption process (creating a new zip and copying all of the original zip file's contents into it) produces the same file set.  I'm not trying to see a byte-for-byte match; you're right that the encryption would make that impossible.  I just want to be sure all the files that should be there are.


Thursday, October 20, 2011 12:14 AM

I think parsing out the output of the -vb option will be the ticket. 

I can help you write a regex for that if you can post a sample of the output that has values.  It's hard to know what to match for if there wasn't anything to report.
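
Once both listings are parsed into objects, one way to check that the two archives hold the same file set is Compare-Object on the name and uncompressed length. The sketch below uses hand-built stand-in objects (hypothetical data based on the sample listing in this thread) in place of real parser output; the variable names are illustrative.

```powershell
# Stand-ins for the parsed listings of the encrypted and unencrypted archives.
$encrypted = @(
    New-Object PSObject -Property @{ Name = 'D080111rpts.ctl'; Length = 51421 }
    New-Object PSObject -Property @{ Name = 'D080111rpts.dat'; Length = 179450364 }
)
$unencrypted = @(
    New-Object PSObject -Property @{ Name = 'D080111rpts.dat'; Length = 179450364 }
    New-Object PSObject -Property @{ Name = 'D080111rpts.ctl'; Length = 51421 }
)

# Compare-Object ignores order; it emits one row per entry found on only one side.
$diff = @(Compare-Object -ReferenceObject $encrypted -DifferenceObject $unencrypted -Property Name, Length)

$sameContents = ($diff.Count -eq 0)
$sameContents
```

If the listing includes the CRC-32 column (as the -vm output does), adding it to -Property would make the check stricter than name and length alone.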



Thursday, October 20, 2011 12:49 AM

Yeah, if you could help me build some parsing technique that would be awesome.  Here is a snippet for a zip with 2 files:

PS C:\Users\wsteele> & "C:\Program Files (x86)\WinZip\WZUNZIP.EXE" -vm "C:\somezip.zip"
WinZip(R) Command Line Support Add-On Version 3.2 (Build 8668)
Copyright (c) 1991-2009 WinZip International LLC - All Rights Reserved

Zip file: C:\somezip.zip


    Length     Method       Size     Ratio    Date     Time    CRC-32  Attr  Name
 ------------  ------   ------------ -----    ----     ----   -------- ----  ----
        51421  DeflatX          7834  85%  08/01/2011  07:31  d35ad156 --w-* D080111rpts.ctl
    179450364  DeflatX      15360545  92%  08/01/2011  07:12  73ddacbf --w-* D080111rpts.dat
 ------------           ------------  ---                                    ---------
    179501785               15368379  92%                                            2

I will be examining the contents of 1 encrypted file and 1 unencrypted file.  Apparently the -vm option doesn't care about encryption when you only list the contents; it only matters when you try to open the files.


Thursday, October 20, 2011 1:50 AM

Thanks for the snippet.  I have never really worked with RegExes much, but you may have given me my first reason to really dig into them.  I have a copy of a RegEx book on my shelf that has been staring at me for a while.  

To be sure I get what you did (and don't just reuse the code blindly) correct me if I miss anything:

1) In the function, parse input with RegEx (which I'll get to in a second) and match hits to "fields".

2) Pair up valid RegEx hits with properties as objects.

3) Using the function, gather input (I am assuming the raw output from my utility would qualify if stored in a variable) into a new variable.

I am thinking I could simply use the $zipcontent.count property for a "file" count.

Now, about that regex, what I see is for $regex = "\s*(\d+)\s+(\w+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$" is:

spaces - digits (for length) - spaces - words (for Method) - spaces - digits (for size) - spaces - digits (for percentage) - spaces - and I can't make out the rest.

I like this trick of finding patterns and creating fields.  Very cool.  Thanks for working this up.  Got me thinking in all sorts of new directions.

What does this line do? $proplist = $props.tostring().split("`n") -match '^\s*(\S+)\s*=.+$' -replace '^\s*(\S+)\s*=.+$','$1'


Thursday, October 20, 2011 2:09 AM

(\S+\s+\S+) = one or more non-space characters, followed by one or more spaces, followed by one or more non-space characters.  This is the Date and Time combined.  It's already in the correct format to be able to parse it directly to [datetime].

\s+

(\w+) = one or more word characters (this matches A-Z,a-z,0-9, and the underscore). This is the CRC-32 field

\s+

(\S+) one or more non-space characters (this is the attributes)

\s+

(\S+) one or more non-space characters (this is the name)

\s*$ zero or more spaces, to the end of the line.
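
To see how the groups line up, the regex from the answer can be run against one of the sample lines and each capture printed by index. This is only an illustration using the data already posted in this thread.

```powershell
# Regex copied from the get-zipcontent function in the accepted answer.
$regex = "\s*(\d+)\s+(\w+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
$line  = '        51421  DeflatX          7834  85%  08/01/2011  07:31  d35ad156 --w-* D080111rpts.ctl'

if ($line -match $regex) {
    # $matches[0] is the whole match; 1..8 are the capture groups, in order:
    # Length, Method, Size, Ratio, Date+Time, CRC-32, Attr, Name.
    foreach ($i in 1..8) { '{0} = {1}' -f $i, $matches[$i] }
}
```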

 

This:

$props.tostring().split("`n") -match '^\s*(\S+)\s*=.+$' -replace '^\s*(\S+)\s*=.+$','$1'

is explained a little better in the blog.  When you create an object from a hash table, the properties don't come out in any predictable order.  This parses the names from the $props script block, in the order they appear in the hash table definition. That's re-used as the argument list  of the | select when the object is created.  If you want the properties in a different order in your object, just re-order them in the script block / hash table, and they'll automatically get re-ordered in the created objects.
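
A small stand-alone illustration of that ordering trick, with a made-up three-property hash table (the values are placeholders, not WinZip output):

```powershell
# A script block wrapping a hash table; the text of the block preserves the written order.
$props = {
    @{
        Name   = 'a.txt'
        Length = 10
        Size   = 5
    }
}

# Split the block's text into lines, keep the "key = value" lines,
# and reduce each of them to just the key name.
$proplist = $props.ToString().Split("`n") -match '^\s*(\S+)\s*=.+$' -replace '^\s*(\S+)\s*=.+$', '$1'
$proplist

# Invoke the block to get the hash table, then force the property order with Select-Object.
New-Object PSObject -Property (& $props) | Select-Object $proplist
```

On PowerShell 3.0 and later, an [ordered] hash table is another way to get a fixed property order, but that was not available when this thread was written.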



Thursday, October 20, 2011 9:17 PM

Really appreciate the help on this one.  You've shown me some things I'm excited about studying a little and throwing in the toolbox.


Thursday, October 20, 2011 9:21 PM

Thanks!  That's what I like to hear!


Friday, October 21, 2011 2:45 AM

As I am trying to embed this into a script I am having problems with the last part, pipelining the data through Get-ZipContent.  I ran through each step comparing your example, which works great, and my actual solution.  The only difference was that the data you used was written in a here-string, while I grabbed mine from the wzunzip.exe output.  I used this code to get it to a variable:

$argumentlist = "-vb `"C:\Data\Conv000001.zip`"";
$standarderror = [System.IO.Path]::GetTempFileName();
$standardoutput = [System.IO.Path]::GetTempFileName();
Start-Process -FilePath "C:\Program Files\WinZip\WZUNZIP.EXE" -ArgumentList $argumentlist -RedirectStandardError $standarderror   -RedirectStandardOutput $standardoutput  -NoNewWindow -Wait;

$zip_input_data = (gc $standardoutput) -split "`n" | % {$_.trim()};

I checked the data at each step of the process.  When I checked gc $standardoutput it gave me data as expected.  When I checked $zip_input_data it also gave me the expected results (basically a summary like the one you put in the variable in your example).  However, when I ran the last step

$zipcontent = $zip_input_data | get-zipcontent;

I didn't get any results.  Having used your function, I should get about 128 objects.  In my case, I get 0.  It seems the data just isn't being passed as expected or the fields no longer line up.  Is it possible having used a .txt file instead of a variable could be the issue?  Maybe some underlying datatype mismatch because I used a different data source stream?  I'm going to double check and ensure the fields all match up to the field layout I provided (and which you got working).

 


Friday, October 21, 2011 2:56 AM

Whoops again.  The -vb switch drops a field that the -vm switch provides.  I'll change the function you provided and post it once I get it going.  Disregard the issue.


Friday, October 21, 2011 3:02 AM

Ok, here's the corrected function.  When I run it everything works, except I only get the file name.  I am trying to get the name with the extension:

function get-zipcontent {
    begin {
    $regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
    $props = {
            @{
        Name =   $matches[6]
        Date =   $matches[4] -as [datetime]
        Length = $matches[1] -as [int]
        Size =   $matches[2] -as [int]
        Ratio =  $matches[3] -as [int]
        CRC32 =  $matches[5]
      }
    }
    
        $proplist = $props.tostring().split("`n") -match '^\s*(\S+)\s*=.+$' -replace '^\s*(\S+)\s*=.+$','$1'
    }
 
    process { 
    if ($_ -match $regex){
            new-object psobject -property (&$props) | select $proplist
    }
    } 
}

Friday, October 21, 2011 3:15 AM

 

function get-zipcontent {
 begin {
    $regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
    $props = {
 @{
        Name =   $matches[6]
        Date =   $matches[4] -as [datetime]
        Length = $matches[1] -as [int]
        Size =   $matches[2] -as [int]
        Ratio =  $matches[3] -as [int]
        CRC32 =  $matches[5]
      }
    }
 
$proplist = $props.tostring().split("`n") -match '^\s*(\S+)\s*=.+$' -replace '^\s*(\S+)\s*=.+$','$1'
 }
 
process { 
  if ($_ -match $regex){
  write-debug "Matched record: `n"
  write-debug "$_"
  write-debug "***********************"
  write-debug "Matches:`n"
  write-debug "$($matches[1..6])"
 new-object psobject -property (&$props) | select $proplist
  }
 } 
}

Let's add some debug.

Set $debugpreference = "Continue"

 



Friday, October 21, 2011 2:15 PM

I ran both processes to compare the sample dataset and the output from a live zip I am working with.  A good bit of info, but, I commented it so it should be clear what's going on.  I did as you suggested: added the write-debug lines and set $debugpreference = 'Continue':

Approach 1: Data from variable
$data variable output:
Length     Method       Size     Ratio    Date     Time    CRC-32  Attr  Name
                      
        51421  DeflatX          7834  85%  08/01/2011  07:31  d35ad156 --w-* D080111rpts.ctl
    179450364  DeflatX      15360545  92%  08/01/2011  07:12  73ddacbf --w-* D080111rpts.dat
                                                  
    179501785               15368379  92%                                            2
Spliting data.
Professing data with Get-ZipContent-Old
Output of $variable_content.


Name   : D080111rpts.ctl
Date   : 8/1/2011 7:31:00 AM
Length : 51421
Method : DeflatX
Size   : 7834
Ratio  : 85
CRC32  : d35ad156
Attr   : --w-*

Name   : D080111rpts.dat
Date   : 8/1/2011 7:12:00 AM
Length : 179450364
Method : DeflatX
Size   : 15360545
Ratio  : 92
CRC32  : 73ddacbf
Attr   : --w-*

Approach 2: Data from wzunzip
Collecting import.txt data into $import_file.
Configuring wzunzip analysis.
Starting processing.
Complete wzunzip processing.
Splitting data
Output of $zip_input_data
WinZip(R) Command Line Support Add-On Version 3.2 (Build 8668)
Copyright (c) 1991-2009 WinZip International LLC - All Rights Reserved

Zip file: C:\Data\Conv000001.zip

Length         Size     Ratio    Date     Time   Name
               
1218497       1024416  16%  09/21/2011  10:01  10097 E.pdf
1616270       1402640  14%  09/21/2011  09:33  10114 A.pdf
1079692        943306  13%  09/21/2011  09:46  1013 C.pdf
181716        153372  16%  09/21/2011  09:46  1013 MC.pdf
1019725        896758  13%  09/21/2011  10:16  10132 C.pdf
699030        600858  15%  09/21/2011  09:56  10132 D.pdf
135857        129956   5%  09/21/2011  09:34  10152 A.17594.pdf
878353        761607  14%  09/21/2011  09:34  10152 A.pdf
1046299        882238  16%  09/21/2011  11:16  10152 B.pdf
888498        773885  13%  09/21/2011  09:57  10152 F.pdf
1009413        856657  16%  09/21/2011  10:32  10152 G.pdf
92171         85836   7%  09/21/2011  10:47  10170 A.25909.pdf
757799        625612  18%  09/21/2011  10:47  10170 A.pdf
317116        295492   7%  09/21/2011  11:09  10170 B.27442.pdf
1007145        858549  15%  09/21/2011  11:09  10170 B.pdf
92750         86244   8%  09/21/2011  10:28  10170 D.24581.pdf
508113        437821  14%  09/21/2011  10:28  10170 D.pdf
317012        295312   7%  09/21/2011  11:09  10170 E.27443.pdf
1072913        902921  16%  09/21/2011  11:09  10170 E.pdf
859630        722500  16%  09/21/2011  11:16  10170 F.pdf
1331056        964189  28%  09/21/2011  10:16  10170 MC.pdf
99576         82660  17%  09/21/2011  09:47  10232 A.20588.pdf
1043727        916334  13%  09/21/2011  09:47  10232 A.pdf
1919256       1711546  11%  09/21/2011  10:06  10232 B.pdf
239660        209044  13%  09/21/2011  09:46  1026 MC.pdf
1278687       1140173  11%  09/21/2011  10:16  10260 B.pdf
1551213       1391788  11%  09/21/2011  10:16  10262 B.pdf
4016061       3518105  13%  09/21/2011  10:16  10285 A.pdf
325337        287375  12%  09/21/2011  10:16  10287 ODP.pdf
358069        287281  20%  09/21/2011  10:16  10289 MC.pdf
1499359       1317263  13%  09/21/2011  09:45  1032 A.pdf
1318355       1087831  18%  09/21/2011  11:13  1032 B.pdf
1616523       1337912  18%  09/21/2011  10:27  1032 D.pdf
792675        660027  17%  09/21/2011  10:16  10364 MC.pdf
1168987       1020574  13%  09/21/2011  09:40  10393 A.pdf
28692         19200  34%  09/21/2011  09:48  10395 A.20771.pdf
938371        830101  12%  09/21/2011  09:48  10395 A.pdf
366707        301509  18%  09/21/2011  10:15  10459 MC.pdf
688418        459683  34%  09/21/2011  10:15  10471 ODP.pdf
284516        236212  17%  09/21/2011  10:15  10477 MC.pdf
2949227       2618832  12%  09/21/2011  09:44  1048 C.pdf
66757         59947  11%  09/21/2011  09:46  1048 MC.20455.pdf
491267        412167  17%  09/21/2011  09:46  1048 MC.pdf
502375        439110  13%  09/21/2011  09:46  1048 ODP.pdf
1213269       1071776  12%  09/21/2011  09:36  10507 A.pdf
694618        613913  12%  09/21/2011  10:15  10513 ODP.pdf
2796482       2469334  12%  09/21/2011  10:16  10527 A.pdf
1748419       1528567  13%  09/21/2011  09:39  10527 B.pdf
713023        639872  11%  09/21/2011  10:16  10527 ODP.pdf
2744509       2400483  13%  09/21/2011  10:16  10550 A.pdf
328463        300256   9%  09/21/2011  10:16  10550 MC.pdf
162630        149418   9%  09/21/2011  09:39  10551 D.19167 (2).pdf
58563         49594  16%  09/21/2011  09:39  10551 D.19167.pdf
1256560       1102793  13%  09/21/2011  09:39  10551 D.pdf
123059        112906   9%  09/21/2011  10:16  10551 G.23920.pdf
1042783        921643  12%  09/21/2011  10:16  10551 G.pdf
61605         56138   9%  09/21/2011  10:00  10551 H.22240.pdf
1209095       1067721  12%  09/21/2011  10:00  10551 H.pdf
127390        116145   9%  09/21/2011  10:16  10557 C.23922 (2).pdf
78602         72082   9%  09/21/2011  10:16  10557 C.23922.pdf
2658867       2386537  11%  09/21/2011  10:16  10557 C.pdf
346059        281572  19%  09/21/2011  10:16  10573 MC.pdf
1455720       1298255  11%  09/21/2011  09:33  10583 A .pdf
937539        813216  14%  09/21/2011  09:33  10583 A.pdf
135119         90741  33%  09/21/2011  10:06  10583 B.22942 (2).pdf
36082         29954  17%  09/21/2011  10:06  10583 B.22942 (3).pdf
239963        223101   8%  09/21/2011  10:06  10583 B.22942.pdf
1216937       1050041  14%  09/21/2011  10:06  10583 B.pdf
1811111       1559723  14%  09/21/2011  11:13  10583 C.pdf
274670        227410  18%  09/21/2011  10:16  10583 MC.pdf
474134        419677  12%  09/21/2011  10:16  10583 ODP.pdf
64025         53785  16%  09/21/2011  10:16  10590 B.23926.pdf
1818539       1653010  10%  09/21/2011  10:16  10590 B.pdf
1000799        881160  12%  09/21/2011  09:33  10592 B.pdf
24540         18840  24%  09/21/2011  09:37  10592 D.18280.pdf
1576821       1369732  14%  09/21/2011  09:36  10592 D.pdf
987660        869827  12%  09/21/2011  10:01  10592 E.pdf
314344        246759  22%  09/21/2011  10:16  10592 MC.pdf
618164        515294  17%  09/21/2011  10:16  10627 MC.pdf
703529        595902  16%  09/21/2011  10:03  10633 B.pdf
5022028       4247748  16%  09/21/2011  10:16  10633 C.pdf
95873         82742  14%  09/21/2011  10:16  10633 ODP.23931 (2).pdf
1489700       1299456  13%  09/21/2011  10:16  10633 ODP.23931.pdf
435716        363564  17%  09/21/2011  10:16  10633 ODP.pdf
74863         67418  10%  09/21/2011  09:34  10642 B.17205 (2).pdf
734770        595726  19%  09/21/2011  09:34  10642 B.17205 (3).pdf
71036         63799  11%  09/21/2011  09:34  10642 B.17205 (4).pdf
288309        242518  16%  09/21/2011  09:34  10642 B.17205 (5).pdf
1168909       1103561   6%  09/21/2011  09:34  10642 B.17205 (6).pdf
55382         45015  19%  09/21/2011  09:34  10642 B.17205 (7).pdf
79608         73494   8%  09/21/2011  09:34  10642 B.17205.pdf
2853155       2373573  17%  09/21/2011  09:34  10642 B.pdf
490201        447654   9%  09/21/2011  10:16  10642 ODP.pdf
774920        681587  13%  09/21/2011  10:16  10659 ODP.pdf
1845292       1574149  15%  09/21/2011  11:11  10661 A.pdf
1494626       1294812  14%  09/21/2011  10:16  10661 H.pdf
329066        276921  16%  09/21/2011  10:27  10661 MC.pdf
555382        469319  16%  09/21/2011  10:16  10661 ODP.pdf
1718393       1469860  15%  09/21/2011  10:16  10695 C.pdf
930302        812443  13%  09/21/2011  10:16  10695 ODP.pdf
685521        589140  15%  09/21/2011  09:41  10725 A.pdf
441649        391158  12%  09/21/2011  10:16  10725 MC.pdf
1017894        916292  10%  09/21/2011  10:16  10725 ODP.pdf
1889286       1571676  17%  09/21/2011  10:54  10739 A.pdf
211446        180234  15%  09/21/2011  10:16  10750 B.23940.pdf
924173        796491  14%  09/21/2011  10:16  10750 B.pdf
26670         21109  21%  09/21/2011  09:44  10750 D.20019.pdf
1428894       1248140  13%  09/21/2011  09:44  10750 D.pdf
1792431       1500211  17%  09/21/2011  10:53  10750 E.pdf
1694033       1424490  16%  09/21/2011  10:53  10772 G.pdf
3092496       2748958  12%  09/21/2011  10:16  10788 A.pdf
67377         62669   7%  09/21/2011  09:55  10788 MC.21719 (2).pdf
67482         62781   7%  09/21/2011  09:55  10788 MC.21719 (3).pdf
289775        247598  15%  09/21/2011  09:55  10788 MC.21719.pdf
404803        344141  15%  09/21/2011  09:55  10788 MC.pdf
324848        273075  16%  09/21/2011  10:16  10812 MC.pdf
6245069       5178851  18%  09/21/2011  10:02  10819 A.pdf
608881        562430   8%  09/21/2011  10:16  10820 ODP.pdf
213629        173664  19%  09/21/2011  10:16  10831 MC.pdf
69566         63059  10%  09/21/2011  09:49  10845 F.20857 (2).pdf
68773         61770  11%  09/21/2011  09:49  10845 F.20857.pdf
599463        513460  15%  09/21/2011  09:49  10845 F.pdf
543348        462596  15%  09/21/2011  10:16  10845 MC.pdf
4152349       3595934  14%  09/21/2011  10:16  10870 D.pdf
1049217        922875  13%  09/21/2011  10:16  10915 A.pdf
1031344        901848  13%  09/21/2011  09:56  10915 E.pdf
856813        759660  12%  09/21/2011  10:16  10937 ODP.pdf
1035444        904159  13%  09/21/2011  09:38  10940 K.pdf
                         
122066767     105365848  14%                           128
Professing data with Get-ZipContent-New
DEBUG: Matched record: 

DEBUG: 162630        149418   9%  09/21/2011  09:39  10551 D.19167 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 162630 149418 9 09/21/2011  09:39 10551 D.19167
DEBUG: Matched record: 

DEBUG: 127390        116145   9%  09/21/2011  10:16  10557 C.23922 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 127390 116145 9 09/21/2011  10:16 10557 C.23922
DEBUG: Matched record: 

DEBUG: 1455720       1298255  11%  09/21/2011  09:33  10583 A .pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 1455720 1298255 11 09/21/2011  09:33 10583 A
DEBUG: Matched record: 

DEBUG: 135119         90741  33%  09/21/2011  10:06  10583 B.22942 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 135119 90741 33 09/21/2011  10:06 10583 B.22942
DEBUG: Matched record: 

DEBUG: 36082         29954  17%  09/21/2011  10:06  10583 B.22942 (3).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 36082 29954 17 09/21/2011  10:06 10583 B.22942
DEBUG: Matched record: 

DEBUG: 95873         82742  14%  09/21/2011  10:16  10633 ODP.23931 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 95873 82742 14 09/21/2011  10:16 10633 ODP.23931
DEBUG: Matched record: 

DEBUG: 74863         67418  10%  09/21/2011  09:34  10642 B.17205 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 74863 67418 10 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 734770        595726  19%  09/21/2011  09:34  10642 B.17205 (3).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 734770 595726 19 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 71036         63799  11%  09/21/2011  09:34  10642 B.17205 (4).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 71036 63799 11 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 288309        242518  16%  09/21/2011  09:34  10642 B.17205 (5).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 288309 242518 16 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 1168909       1103561   6%  09/21/2011  09:34  10642 B.17205 (6).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 1168909 1103561 6 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 55382         45015  19%  09/21/2011  09:34  10642 B.17205 (7).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 55382 45015 19 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 67377         62669   7%  09/21/2011  09:55  10788 MC.21719 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 67377 62669 7 09/21/2011  09:55 10788 MC.21719
DEBUG: Matched record: 

DEBUG: 67482         62781   7%  09/21/2011  09:55  10788 MC.21719 (3).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 67482 62781 7 09/21/2011  09:55 10788 MC.21719
DEBUG: Matched record: 

DEBUG: 69566         63059  10%  09/21/2011  09:49  10845 F.20857 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 69566 63059 10 09/21/2011  09:49 10845 F.20857
Output of $zip_file_content.
Name   : D.19167
Date   : 9/21/2011 9:39:00 AM
Length : 162630
Size   : 149418
Ratio  : 9
CRC32  : 10551

Name   : C.23922
Date   : 9/21/2011 10:16:00 AM
Length : 127390
Size   : 116145
Ratio  : 9
CRC32  : 10557

Name   : A
Date   : 9/21/2011 9:33:00 AM
Length : 1455720
Size   : 1298255
Ratio  : 11
CRC32  : 10583

Name   : B.22942
Date   : 9/21/2011 10:06:00 AM
Length : 135119
Size   : 90741
Ratio  : 33
CRC32  : 10583

Name   : B.22942
Date   : 9/21/2011 10:06:00 AM
Length : 36082
Size   : 29954
Ratio  : 17
CRC32  : 10583

Name   : ODP.23931
Date   : 9/21/2011 10:16:00 AM
Length : 95873
Size   : 82742
Ratio  : 14
CRC32  : 10633

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 74863
Size   : 67418
Ratio  : 10
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 734770
Size   : 595726
Ratio  : 19
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 71036
Size   : 63799
Ratio  : 11
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 288309
Size   : 242518
Ratio  : 16
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 1168909
Size   : 1103561
Ratio  : 6
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 55382
Size   : 45015
Ratio  : 19
CRC32  : 10642

Name   : MC.21719
Date   : 9/21/2011 9:55:00 AM
Length : 67377
Size   : 62669
Ratio  : 7
CRC32  : 10788

Name   : MC.21719
Date   : 9/21/2011 9:55:00 AM
Length : 67482
Size   : 62781
Ratio  : 7
CRC32  : 10788

Name   : F.20857
Date   : 9/21/2011 9:49:00 AM
Length : 69566
Size   : 63059
Ratio  : 10
CRC32  : 10845

As you can see, some of the records in the output from the wzunzip are not in the result set. Below is my script:

cls

$DebugPreference = 'Continue';

function get-zipcontent-old {
    begin {
        $regex = "\s*(\d+)\s+(\w+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
        $props = {
            @{
                Name =   $matches[8]
                Date =   $matches[5] -as [datetime]
                Length = $matches[1] -as [int]
                Method = $matches[2]
                Size =   $matches[3] -as [int]
                Ratio =  $matches[4] -as [int]
                CRC32 =  $matches[6]
                Attr =   $matches[7]
            }
        }

        $proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1';
    }

    process {
        if ($_ -match $regex){
            new-object psobject -property (&$props) | select $proplist
        }
    }
}

function get-zipcontent-new {
    begin {
        $regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
        $props = {
            @{
                Name =   $matches[6]
                Date =   $matches[4] -as [datetime]
                Length = $matches[1] -as [int]
                Size =   $matches[2] -as [int]
                Ratio =  $matches[3] -as [int]
                CRC32 =  $matches[5]
            }
        }

        $proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
    }
 
    process { 
        if ($_ -match $regex){
            write-debug "Matched record: `n"
            write-debug "$_"
            write-debug "***********************"
            write-debug "Matches:`n"
            write-debug "$($matches[1..6])"
            new-object psobject -property (&$props) | select $proplist
        }
    } 
}

Write-Output "Approach 1: Data from variable";

$data = @"
Length     Method       Size     Ratio    Date     Time    CRC-32  Attr  Name
                      
        51421  DeflatX          7834  85%  08/01/2011  07:31  d35ad156 --w-* D080111rpts.ctl
    179450364  DeflatX      15360545  92%  08/01/2011  07:12  73ddacbf --w-* D080111rpts.dat
                                                  
    179501785               15368379  92%                                            2
"@

Write-Output "`$data variable output:";
$data;

Write-Output "Spliting data.";
$input_data = $data -split "`n" |% {$_.trim()}

Write-Output "Professing data with Get-ZipContent-Old";
$variable_content = $input_data | get-zipcontent-Old;
 
Write-Output "Output of `$variable_content.";
$variable_content;

Write-Output "Approach 2: Data from wzunzip";

Write-Output "Collecting import.txt data into `$import_file.";
$import_file = gc 'C:\Data\import.txt';

Write-Output "Configuring wzunzip analysis.";
$argumentlist = "-vb `"C:\Data\Conv000001.zip`"";
$standarderror = [System.IO.Path]::GetTempFileName();
$standardoutput = [System.IO.Path]::GetTempFileName();

Write-output "Starting processing."
Start-Process -FilePath "C:\Program Files\WinZip\WZUNZIP.EXE" -ArgumentList $argumentlist -RedirectStandardError $standarderror   -RedirectStandardOutput $standardoutput  -NoNewWindow -Wait;

Write-Output "Complete wzunzip processing.";

Write-Output "Splitting data";
$zip_input_data = (gc $standardoutput) -split "`n" | % {$_.trim()};

Write-Output "Output of `$zip_input_data";
$zip_input_data;

Write-Output "Professing data with Get-ZipContent-New";
$zip_file_content = $zip_input_data | get-zipcontent-New;
 
Write-Output "Output of `$zip_file_content.";
$zip_file_content;

Friday, October 21, 2011 2:30 PM

It appears some of your file names have embedded spaces.  That will cause this (bolded) part of the regex to only match up to that space.

$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"

Try this for your regex:

$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+([^.]+\.\S+)\s+(\S+)\s*$"

([^.]+\.\S+) = one or more of any character other than a dot, followed by a dot, followed by one or more non-space characters.
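To make the effect of the embedded spaces concrete, here is a small sketch using two lines taken from the listing earlier in the thread. It only illustrates how the groups line up. The variable names are made up for the example, and the name pattern is written with an escaped dot (\.), which is what its description calls for.

```powershell
# The regex from get-zipcontent-new: after the time column it expects exactly
# three whitespace-separated tokens, captured as groups 5, 6 and 7.
$regex = '\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$'

# Hypothetical variable names; the lines themselves come from the wzunzip listing.
$threeTokenName = '162630        149418   9%  09/21/2011  09:39  10551 D.19167 (2).pdf'
$twoTokenName   = '703529        595902  16%  09/21/2011  10:03  10633 B.pdf'

$threeTokenName -match $regex   # True
$matches[5..7]                  # 10551, D.19167, (2).pdf
$twoTokenName -match $regex     # False: only two tokens follow the time

# The name pattern on its own stops at the first space after the dot.
'D.19167 (2).pdf' -match '^([^.]+\.\S+)'   # True
$matches[1]                                # D.19167
```

This would explain why only the names with two embedded spaces show up in the result set, and why the Name property holds just the middle token.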
 


Friday, October 21, 2011 2:45 PM

That new regex gave me this:

DEBUG: Matched record: 

DEBUG: 162630        149418   9%  09/21/2011  09:39  10551 D.19167 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 162630 149418 9 09/21/2011  09:39 10551 D.19167
DEBUG: Matched record: 

DEBUG: 127390        116145   9%  09/21/2011  10:16  10557 C.23922 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 127390 116145 9 09/21/2011  10:16 10557 C.23922
DEBUG: Matched record: 

DEBUG: 135119         90741  33%  09/21/2011  10:06  10583 B.22942 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 135119 90741 33 09/21/2011  10:06 10583 B.22942
DEBUG: Matched record: 

DEBUG: 36082         29954  17%  09/21/2011  10:06  10583 B.22942 (3).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 36082 29954 17 09/21/2011  10:06 10583 B.22942
DEBUG: Matched record: 

DEBUG: 95873         82742  14%  09/21/2011  10:16  10633 ODP.23931 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 95873 82742 14 09/21/2011  10:16 10633 ODP.23931
DEBUG: Matched record: 

DEBUG: 74863         67418  10%  09/21/2011  09:34  10642 B.17205 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 74863 67418 10 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 734770        595726  19%  09/21/2011  09:34  10642 B.17205 (3).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 734770 595726 19 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 71036         63799  11%  09/21/2011  09:34  10642 B.17205 (4).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 71036 63799 11 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 288309        242518  16%  09/21/2011  09:34  10642 B.17205 (5).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 288309 242518 16 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 1168909       1103561   6%  09/21/2011  09:34  10642 B.17205 (6).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 1168909 1103561 6 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 55382         45015  19%  09/21/2011  09:34  10642 B.17205 (7).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 55382 45015 19 09/21/2011  09:34 10642 B.17205
DEBUG: Matched record: 

DEBUG: 67377         62669   7%  09/21/2011  09:55  10788 MC.21719 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 67377 62669 7 09/21/2011  09:55 10788 MC.21719
DEBUG: Matched record: 

DEBUG: 67482         62781   7%  09/21/2011  09:55  10788 MC.21719 (3).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 67482 62781 7 09/21/2011  09:55 10788 MC.21719
DEBUG: Matched record: 

DEBUG: 69566         63059  10%  09/21/2011  09:49  10845 F.20857 (2).pdf
DEBUG: ***********************
DEBUG: Matches:

DEBUG: 69566 63059 10 09/21/2011  09:49 10845 F.20857
Output of $zip_file_content.
Name   : D.19167
Date   : 9/21/2011 9:39:00 AM
Length : 162630
Size   : 149418
Ratio  : 9
CRC32  : 10551

Name   : C.23922
Date   : 9/21/2011 10:16:00 AM
Length : 127390
Size   : 116145
Ratio  : 9
CRC32  : 10557

Name   : B.22942
Date   : 9/21/2011 10:06:00 AM
Length : 135119
Size   : 90741
Ratio  : 33
CRC32  : 10583

Name   : B.22942
Date   : 9/21/2011 10:06:00 AM
Length : 36082
Size   : 29954
Ratio  : 17
CRC32  : 10583

Name   : ODP.23931
Date   : 9/21/2011 10:16:00 AM
Length : 95873
Size   : 82742
Ratio  : 14
CRC32  : 10633

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 74863
Size   : 67418
Ratio  : 10
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 734770
Size   : 595726
Ratio  : 19
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 71036
Size   : 63799
Ratio  : 11
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 288309
Size   : 242518
Ratio  : 16
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 1168909
Size   : 1103561
Ratio  : 6
CRC32  : 10642

Name   : B.17205
Date   : 9/21/2011 9:34:00 AM
Length : 55382
Size   : 45015
Ratio  : 19
CRC32  : 10642

Name   : MC.21719
Date   : 9/21/2011 9:55:00 AM
Length : 67377
Size   : 62669
Ratio  : 7
CRC32  : 10788

Name   : MC.21719
Date   : 9/21/2011 9:55:00 AM
Length : 67482
Size   : 62781
Ratio  : 7
CRC32  : 10788

Name   : F.20857
Date   : 9/21/2011 9:49:00 AM
Length : 69566
Size   : 63059
Ratio  : 10
CRC32  : 10845

Friday, October 21, 2011 2:59 PM

Have to go move a server right now.  I'll take some test data and re-do that regex when I get back.


Friday, October 21, 2011 3:36 PM

See if this doesn't work better:

"\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S.+\S)\s*$"


Friday, October 21, 2011 3:36 PM

No worries.  I know most of us have jobs that keep us from posting right away.  Whenever you can get to it.

Thanks.


Friday, October 21, 2011 3:39 PM

You must have hit enter about the same time I did.  That one gets the full file name but only the ones that start with Alpha characters.  Some names have numbers too.  So, that last field can start with both alpha and numeric characters.  Isn't there a regex option for alphanumerics?  I can just throw that in.


Friday, October 21, 2011 3:57 PM

Don't know if you missed it, but I posted an updated regex just before your last post.


Saturday, October 22, 2011 1:55 AM

You must have hit enter about the same time I did.  That one gets the full file name but only the ones that start with Alpha characters.  Some names have numbers too.  So, that last field can start with both alpha and numeric characters.  Isn't there a regex option for alphanumerics?  I can just throw that in.

I missed this post earlier.  The regex was originally written for a different output format (using a different command line option), and I don't have a good sample of data from that format.  I'm kind of guessing at these changes.  If you can give me the first 3-4 lines of output from the command it would help.

In the meantime, here's an alternate solution (again I'm guessing at the data format) that has a much simpler regex, and uses substrings instead of regex captures to differentiate the data:

 

$line = "74863         67418  10%  09/21/2011  09:34  10642 B.17205 (2).pdf"

function get-zipcontent {
 begin {
    $regex = "^\d+"
    $props = {
        @{
        Name = $_.substring(51)
        Length = $_.substring(0,9) -as [int]
        Size = $_.substring(14,6) -as [int]
        Ratio = $_.substring(21,2) -as [int]
        Date = $_.substring(26,17) -as [datetime]
        CRC32 = $_.substring(45,5) 
        }
     }
 
$proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
 }
 
process { 
  if ($_ -match $regex){
  new-object psobject -property (&$props) | select $proplist
  }
 } 
}

$line | get-zipcontent

Edit: I went back and bolded what changed from the original script.  It's still the same snippet we started with, except for the regex and the hash table.

If you prefer doing it this way, this helps:

http://mjolinor.wordpress.com/2011/05/30/position-map-a-string-with-powershell/
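The linked post is not reproduced here. As a rough sketch of the same idea, the hypothetical helper below (the name Show-PositionMap and its output layout are assumptions, not the code from that post) prints a line with two ruler rows beneath it, so the substring offsets can be read off by eye.

```powershell
# Hypothetical helper: echo the text, then a "tens" row and a "units" row
# giving the zero-based index of every character.
function Show-PositionMap {
    param([string]$Text)

    if ($Text.Length -eq 0) { return }

    $indexes = 0..($Text.Length - 1)
    $tens  = -join ($indexes | ForEach-Object { [int][math]::Floor([double]$_ / 10) % 10 })
    $units = -join ($indexes | ForEach-Object { $_ % 10 })

    $Text
    $tens
    $units
}

$line = '74863         67418  10%  09/21/2011  09:34  10642 B.17205 (2).pdf'
Show-PositionMap $line
```

Reading down from the first character of a field gives its start offset (tens digit over units digit), which is the first argument to substring().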


Monday, October 24, 2011 9:31 PM

Here is a sample with the spaces in the file names AND one without spaces:

 

Length         Size     Ratio    Date     Time   Name
------------  ------------ -----    ----     ----   ----
1218497       1024416  16%  09/21/2011  10:01  10097 E.pdf
1616270       1402640  14%  09/21/2011  09:33  10114 A.pdf
1079692        943306  13%  09/21/2011  09:46  1013C.pdf
181716        153372  16%  09/21/2011  09:46  1013 MC.pdf

 


Tuesday, October 25, 2011 2:15 PM

mjolinor, I pulled out my copy of Andrew Watt's Beginning Regular Expressions and came up with this:

$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+([0-1][0-9]/[0-3][0-9]/[1-2][0-9][0-9][0-9])\s+(\d+:\d+)\s+((\S+\s+\S+)$|(\S+)$)"

It works in all my testing so far.  Let me know if I missed anything.
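One case that may be worth testing: the name group in the regex above accepts either one token or two tokens separated by whitespace, so a name with two embedded spaces, such as "10551 D.19167 (2).pdf" from the earlier listing, does not appear to match. The sketch below is one possible variation, not a tested replacement. It assumes the Length, Size, Ratio, Date, Time, Name layout from the sample above and treats everything after the time as the name. The variable names are illustrative.

```powershell
# Regex from the post above, unchanged.
$opRegex = '\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+([0-1][0-9]/[0-3][0-9]/[1-2][0-9][0-9][0-9])\s+(\d+:\d+)\s+((\S+\s+\S+)$|(\S+)$)'

# Variation (assumption): capture everything after the time column as the name.
$anyNameRegex = '^\s*(\d+)\s+(\d+)\s+(\d{1,3})%\s+(\d{2}/\d{2}/\d{4})\s+(\d{1,2}:\d{2})\s+(\S.*?)\s*$'

# Lines copied from the listings in this thread.
$sample = @(
    '1218497       1024416  16%  09/21/2011  10:01  10097 E.pdf'
    '1079692        943306  13%  09/21/2011  09:46  1013C.pdf'
    '162630        149418   9%  09/21/2011  09:39  10551 D.19167 (2).pdf'
)

$sample[2] -match $opRegex   # False: the name has three tokens

$entries = foreach ($line in $sample) {
    if ($line -match $anyNameRegex) {
        New-Object psobject -Property @{
            Name   = $matches[6]
            Date   = "$($matches[4]) $($matches[5])" -as [datetime]
            Length = $matches[1] -as [long]
            Size   = $matches[2] -as [long]
            Ratio  = $matches[3] -as [int]
        }
    }
}

$entries | Select-Object Name, Date, Length, Size, Ratio

# The file count asked about in the original question.
@($entries).Count
```

Because the header, separator, and totals lines have no date in the fourth column, they fall through the -match test, so the count covers only file entries.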


Tuesday, October 25, 2011 5:07 PM

Thank you again sir.  This is the first time I have really worked with regexes.  I can see the power in them now and will certainly be getting more familiar with them.