Question
Wednesday, October 19, 2011 9:32 PM
Is there a powershell way to get the file count within a .zip file? My only thought is to unzip it to a folder, count the files, then, delete the unzipped files. It seems like there has to be a better way than that.
All replies (32)
Thursday, October 20, 2011 1:27 AM ✅Answered | 1 vote
$data = @"
      Length  Method         Size  Ratio        Date   Time    CRC-32  Attr   Name
       51421  DeflatX        7834   85%   08/01/2011  07:31  d35ad156  --w-*  D080111rpts.ctl
   179450364  DeflatX    15360545   92%   08/01/2011  07:12  73ddacbf  --w-*  D080111rpts.dat
   179501785             15368379   92%                                       2
"@
$input_data = $data -split "`n" |% {$_.trim()}
function get-zipcontent {
begin {
$regex = "\s*(\d+)\s+(\w+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
$props = {
@{
Name = $matches[8]
Date = $matches[5] -as [datetime]
Length = $matches[1] -as [int]
Method = $matches[2]
Size = $matches[3] -as [int]
Ratio = $matches[4] -as [int]
CRC32 = $matches[6]
Attr = $matches[7]
}
}
$proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
}
process {
if ($_ -match $regex){
new-object psobject -property (&$props) | select $proplist
}
}
}
$zipcontent = $input_data | get-zipcontent
$zipcontent
Name : D080111rpts.ctl
Date : 8/1/2011 7:31:00 AM
Length : 51421
Method : DeflatX
Size : 7834
Ratio : 85
CRC32 : d35ad156
Attr : --w-*
Name : D080111rpts.dat
Date : 8/1/2011 7:12:00 AM
Length : 179450364
Method : DeflatX
Size : 15360545
Ratio : 92
CRC32 : 73ddacbf
Attr : --w-*
[string](0..33|%{[char][int](46+("686552495351636652556262185355647068516270555358646562655775 0645570").substring(($_*2),2))})-replace " "
Tuesday, October 25, 2011 2:27 PM ✅Answered
This will work as long as you never have more than one embedded space in a file name.
\s((\S+\s+\S+)$|(\S+)$)
This should work with any number of spaces.
\s(\w[^.]+\.\w{1,3})\s*$
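Another way to cope with spaces in names is sketched below. It assumes the five-column `-vb` layout (Length, Size, Ratio, Date Time, Name) that appears later in this thread. The idea is to anchor on the fixed numeric and date columns and capture whatever remains as the name. The pattern and the variable names (`$rowPattern`, `$sampleRows`) are illustrative assumptions, not taken from the replies.

```powershell
# Assumed -vb row layout: Length  Size  Ratio%  Date  Time  Name
# Everything after the time column is treated as the file name,
# so embedded spaces, dots, and parentheses are all preserved.
$rowPattern = '^\s*(\d+)\s+(\d+)\s+(\d{1,3})%\s+(\d{2}/\d{2}/\d{4})\s+(\d{2}:\d{2})\s+(.+?)\s*$'

# Sample rows copied from the listing in this thread (spacing is approximate).
$sampleRows = @(
    '  162630    149418   9%  09/21/2011 09:39  10551 D.19167 (2).pdf',
    ' 1455720   1298255  11%  09/21/2011 09:33  10583 A .pdf',
    '122066767 105365848  14%                    128'
)

# The totals row has no date, so it does not match and is skipped.
$names = foreach ($row in $sampleRows) {
    if ($row -match $rowPattern) { $matches[6] }
}
$names
```

Because the name is simply "the rest of the line", this sketch does not care how many spaces the name contains, but it does depend on the date and time columns always being present in a file row.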
Wednesday, October 19, 2011 9:36 PM
What are you using to zip them?
Wednesday, October 19, 2011 9:42 PM
Winzip, although, technically it's the command line utilities wzzip and wunzip. I am not averse to using something else though.
Wednesday, October 19, 2011 9:54 PM
What does wzzip -vb <zipfile name> produce?
Wednesday, October 19, 2011 10:01 PM
I am continuing to work with this. As a side note, I have seen where folks turn output from utilities like this into proper objects that can be searched. How would one do that? For instance, if I run the -vm command it returns the following. (The -vb command does the same.) I am thinking if I turned this into an object I could count a certain number of lines in to skip the header. NOTE: This file is empty, so, the info below is correct.
PS C:\Users\wsteele> & "C:\Program Files (x86)\WinZip\WZUNZIP.EXE" -vm "C:\somezip.zip"
WinZip(R) Command Line Support Add-On Version 3.2 (Build 8668)
Copyright (c) 1991-2009 WinZip International LLC - All Rights Reserved
Zip file: C:\somezip.zip
Length Method Size Ratio Date Time CRC-32 Attr Name
Warning [C:\somezip.zip]: Zip file is empty
0 0 0% 0
Wednesday, October 19, 2011 10:08 PM
Here's what I use:
http://mjolinor.wordpress.com/2011/10/08/new-object-from-a-hash-table-in-a-script-block/
Typically, when you convert utility program output to PS objects, you get one object per line of output data, so you'd parse the output into an array of PS objects, filter those if needed, and then get the .count property of the result.
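A rough sketch of the filter-and-count part of that idea, assuming `-vm` style rows like the sample posted later in the thread. `$lines` is a stand-in for captured utility output, and `$filePattern` is an assumed pattern, not something the utility documents.

```powershell
# Stand-in for captured utility output (one string per line).
$lines = @(
    '   Length  Method       Size  Ratio  Date  Time  CRC-32  Attr  Name',
    '    51421  DeflatX      7834  85%  08/01/2011 07:31  d35ad156  --w-*  D080111rpts.ctl',
    '179450364  DeflatX  15360545  92%  08/01/2011 07:12  73ddacbf  --w-*  D080111rpts.dat',
    '179501785           15368379  92%                                      2'
)

# A file row starts with Length, Method, Size, Ratio%, and a date;
# the header and the totals line do not fit that shape.
$filePattern = '^\s*\d+\s+\w+\s+\d+\s+\d{1,3}%\s+\d{2}/\d{2}/\d{4}\s'

# @() forces an array, so .Count is defined even for zero or one match.
$fileRows = @($lines | Where-Object { $_ -match $filePattern })
$fileRows.Count
```

Wrapping the pipeline in `@()` matters on older PowerShell versions, where a single result is a scalar without a `Count` property.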
Wednesday, October 19, 2011 10:35 PM
Below is the script I am using. It's pretty clearly commented if you want to get the flow of the script. One problem I have is verifying that a decrypted zip matches the encrypted zip. I was using the same logic as above (running wzunzip.exe -vb, piping that to a variable, then comparing the length of that object), but that doesn't seem to work. Anyone have a better suggestion? That portion of the process is: 1) unzip the encrypted file, 2) place the files in a new unencrypted zip, 3) copy the original to the archive, and 4) move the unencrypted zip to the processing directory. Prior to this copy/move I want to validate that the two zip files do in fact have the same contents.
<#
.AUTHOR
Will Steele
.DEPENDENCIES
None.
.DESCRIPTION
This script unpacks encrypted .zip files, repacks the files into a new .zip
file. Next, it copies the encrypted .zip file to the archive drive and moves
the unencrypted .zip file to the processing drive.
.EXAMPLE
foreach($zip in (dir C:\*.zip)) {
& "C:\ProcessZip.ps1" -zip_file $zip.fullname -rootdrive %rootdrive%
}
.EXTERNALHELP
None.
.FORWARDHELPTARGETNAME
None.
.INPUTS
System.String.
.LINK
None.
.NAME
ProcessZip.ps1
.NOTES
Revision History:
Created 10/03/2011.
.OUTPUTS
None.
.PARAMETER zip_file
A mandatory parameter specifying the location of the .zip file to be unpacked.
.PARAMETER password
An optional parameter specifying the password of the .zip file. Default is
'hellomydarling'.
.PARAMETER processing_directory
An optional parameter specifying the location of the processing directory. The
default is 'c:\data'.
.SYNOPSIS
Unzip encrypted .zip file and pack new, unencrypted .zip file.
#>
[CmdletBinding()]
Param(
[Parameter(HelpMessage="Enter the path to the encrypted .zip file.", Mandatory=$true)]
[ValidateNotNullOrEmpty()]
[ValidateScript({Test-Path $_})]
[IO.FileInfo] $zip_file,
[Parameter(HelpMessage="Enter the password to decrypt the .zip file.", Mandatory=$false)]
[ValidateNotNullOrEmpty()]
[String] $password = 'hellomydarling',
[Parameter(HelpMessage="Enter the output location of the unencrypted .zip file.", Mandatory=$false)]
[ValidateScript({Test-Path $_})]
[ValidateNotNullOrEmpty()]
[IO.DirectoryInfo] $processing_directory = "C:\data",
[Parameter(HelpMessage="Enter the archive location of the unencrypted .zip file.", Mandatory=$false)]
[IO.DirectoryInfo]
[ValidateScript({Test-Path $_})]
[ValidateNotNullOrEmpty()] $archive_directory="c:\archive",
[Parameter(HelpMessage="Enter the root drive.", Mandatory=$true)]
[ValidateScript({Test-Path $_})]
[ValidateNotNullOrEmpty()]
[IO.DirectoryInfo] $rootdrive
)
#requires -Version 2.0
Set-StrictMode -Version 2.0
#region variables
# Encrypted .Zip file stored in archive
$archive_zip = "$archive_directory\" + (Split-Path $zip_file -Leaf);
# Main log file
$log_file = "$rootdrive\logfile.log";
# Unencrypted .zip file MOVED to process directory
$processing_zip = "$processing_directory\unencrypted.$(Split-Path $zip_file -Leaf)";
# Unencrypted .zip file in original directory
$unencrypted_zip = "$(Split-Path $zip_file)\unencrypted.$(Split-Path $zip_file -Leaf)";
# New folder for temporary processing of unzipped files
if(!(Test-Path $((Split-Path $zip_file)+"\unzip"))) {
$unzip = New-Item -Path (Split-Path $zip_file) -ItemType Directory -Name Unzip;
} else {
$unzip = "$(Split-Path $zip_file)\unzip";
}
#endregion variables
#region Functions
function Report-UnZipError {
Write-Output "ERROR!!!!! $file failed to unzip. Date and Time of error follows $((get-Date).ToString(`"ddd MM/dd/yyyy hh:mm tt`"))." | Out-File -FilePath $log_file -Encoding ASCII -Append
} # end function Report-UnZipError
function Report-ZipError {
Write-Output "ERROR!!!!! $file failed to zip. Date and Time of error follows $((get-Date).ToString(`"ddd MM/dd/yyyy hh:mm tt`"))." | Out-File -FilePath $log_file -Encoding ASCII -Append
} # end function Report-ZipError
function Report-CopyError {
Write-Output "ERROR!!!!! $zip_file failed to copy. Date and Time of error follows $((get-Date).ToString(`"ddd MM/dd/yyyy hh:mm tt`"))." | Out-File -FilePath $log_file -Encoding ASCII -Append
} # end function Report-CopyError
function Report-MoveError {
Write-Output "ERROR!!!!! $(Split-Path $zip_file)\unencrypted.$(Split-Path $zip_file -Leaf) failed to move. Date and Time of error follows $((get-Date).ToString(`"ddd MM/dd/yyyy hh:mm tt`"))." | Out-File -FilePath $log_file -Encoding ASCII -Append
} # end function Report-MoveError
function Main {
# Validate machine can access drives (to ensure active cluster node is being accessed).
Write-Verbose "Validating access to the rootdrive to ensure the SAN is active and this is an active cluster node.";
if(Test-Path $rootdrive) {
# Report: successfully accessed $rootdrive
Write-Verbose "This script can access the rootdrive. Continuing processing.";
# Check to see if unencrypted file exists yet
Write-Verbose "Verifying current file ($zip_file) has not been processed.";
if(Test-Path $unencrypted_zip) {
Write-Verbose "The file $unencrypted_zip has already been unencrypted. Cancelling processing.";
Write-Error "The file $unencrypted_zip has already been unencrypted. Cancelling processing.";
} elseif(Test-Path $archive_zip) {
Write-Verbose "The file $archive_zip has already been unencrypted. Cancelling processing.";
Write-Error "The file $archive_zip has already been unencrypted. Cancelling processing.";
} else {
# Report file has not been processed
Write-Verbose "The file ($zip_file) has not been processed yet. Continuing processing.";
# Unpack encrypted file
Write-Verbose "Attempting to unzip file ($zip_file).";
Start-Process -FilePath "C:\Program Files (x86)\WinZip\wzunzip.exe" -ArgumentList "-e $zip_file $unzip -s$Password" -Wait;
Write-Verbose "Unzip process completed.";
# Pack unencrypted files
Write-Verbose "Attempting to zip file ($unencrypted_zip).";
Start-Process -FilePath "C:\Program Files (x86)\WinZip\wzzip.exe" -ArgumentList "-a -m `"$unencrypted_zip`" `"$unzip\*.*`"" -Wait
Write-Verbose "Zip process completed.";
# Compare unencrypted and encrypted zip files to validate new zip file.
Write-Verbose "Gathering statistics for encrypted .zip ($zip_file) for analysis.";
$encrypted_summary = & "C:\program files (x86)\winzip\wzunzip.exe" -vb $zip_file;
Write-Verbose "Gathering statistics for unencrypted .zip ($unencrypted_zip) for analysis.";
$unencrypted_summary = & "C:\program files (x86)\winzip\wzzip.exe" -vb $unencrypted_zip;
# Move encrypted .zip to archive
Write-Verbose "Copying encrypted zip file ($zip_file) to archive ($archive_directory).";
Move-Item -Path $zip_file -Destination $archive_directory;
Write-Verbose "Encrypted .zip file copy ($zip_file) complete.";
# Verifying encrypted .zip was copied to archive
Write-Verbose "Verifying encrypted .zip file ($archive_zip) was copied correctly.";
if(Test-Path $archive_zip) {
Write-Verbose "The encrypted .zip file ($archive_zip) was copied correctly.";
} else {
Write-Verbose "The encrypted .zip file ($archive_zip) was not copied correctly.";
Report-CopyError;
break;
}
# Move unencrypted .zip to processing directory
Write-Verbose "Moving unencrypted zip file ($unencrypted_zip) to processing directory ($processing_directory).";
Move-Item -Path $unencrypted_zip -Destination $processing_directory;
Write-Verbose "Unencrypted .zip file ($unencrypted_zip) move complete.";
# Verifying unencrypted .zip was moved to processing
Write-Verbose "Verifying unencrypted .zip file ($unencrypted_zip) was moved correctly.";
if(Test-Path $processing_zip) {
Write-Verbose "The unencrypted .zip file ($processing_zip) was moved correctly.";
} else {
Write-Verbose "The unencrypted .zip file ($processing_zip) was not moved correctly.";
Report-MoveError;
break;
}
}
} else {
Write-Output "This machine cannot access the rootdrive ($rootdrive). It is either not the active cluster node or the SAN is inaccessible.";
break;
}
}
#endregion Functions
#region ScriptBody
. Main
#endregion ScriptBody
Wednesday, October 19, 2011 10:46 PM
What is it exactly you're trying to accomplish? The only way to tell if an encrypted file matches an unencrypted file is to either decrypt the encrypted file, or encrypt the unencrypted file prior to doing the comparison.
Comparing the length is not a good test if you need to know if the files are identical. A better test would be to compare the MD5 hashes of the files. There's a PS script in the repository for calculating the MD5 hash of a file:
http://gallery.technet.microsoft.com/scriptcenter/2a0cb3b9-267c-4e4d-b489-df5d907bca75
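For reference, a minimal sketch of computing an MD5 hash with the .NET `System.Security.Cryptography.MD5` class. This is not the gallery script linked above; the function name `Get-MD5Hash` is a hypothetical helper chosen for illustration.

```powershell
# Hypothetical helper: returns the MD5 hash of a file as a lowercase hex string.
function Get-MD5Hash {
    param([Parameter(Mandatory = $true)][string] $Path)

    # .NET resolves relative paths against the process directory, so resolve first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $md5 = [System.Security.Cryptography.MD5]::Create()
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $bytes = $md5.ComputeHash($stream)
    }
    finally {
        # Always release the file handle, even if hashing fails.
        $stream.Dispose()
    }
    ([System.BitConverter]::ToString($bytes) -replace '-', '').ToLower()
}

# Usage: hash a small temporary file, then clean it up.
$tmp = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllBytes($tmp, [System.Text.Encoding]::ASCII.GetBytes('hello'))
Get-MD5Hash -Path $tmp
Remove-Item -LiteralPath $tmp
```

Two files with the same hash can be treated as identical for this purpose. Note that this compares whole files byte for byte, so an encrypted zip and its re-zipped, unencrypted counterpart would not be expected to match.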
Wednesday, October 19, 2011 11:30 PM
Sorry, too many words diluted my comment. I want to verify that the unencryption process (creating a new zip and copying all of the original zip file's contents into it) produces a zip with the same file set. I'm not trying to get a byte-for-byte match; you're right that the encryption would make that impossible. I just want to be sure all the files that should be there are.
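One way such a file-set check could look, as a sketch only: once each zip listing has been reduced to a list of names, `Compare-Object` can report anything present on one side and not the other. The two name lists below are hypothetical stand-ins for parsed listings.

```powershell
# Hypothetical name lists, as they might come from parsing each zip's listing.
$encryptedNames   = @('D080111rpts.ctl', 'D080111rpts.dat')
$unencryptedNames = @('D080111rpts.dat', 'D080111rpts.ctl')

# Compare-Object emits only the differences; no output means both lists
# contain the same names. @() keeps .Count valid for zero or one difference.
$differences = @(Compare-Object -ReferenceObject $encryptedNames -DifferenceObject $unencryptedNames)

if ($differences.Count -eq 0) {
    'Both zip files list the same file names.'
} else {
    # SideIndicator '<=' : only in the encrypted zip; '=>' : only in the new zip.
    $differences
}
```

This only confirms that the names agree. Comparing the uncompressed Length of each entry as well would be a natural extension if a stronger check is wanted.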
Thursday, October 20, 2011 12:14 AM
I think parsing out the output of the -vb option will be the ticket.
I can help you write a regex for that if you can post a sample of the output that has values. It's hard to know what to match for if there wasn't anything to report.
Thursday, October 20, 2011 12:49 AM
Yeah, if you could help me build some parsing technique that would be awesome. Here is a snippet for a zip with 2 files:
PS C:\Users\wsteele> & "C:\Program Files (x86)\WinZip\WZUNZIP.EXE" -vm "C:\somezip.zip"
WinZip(R) Command Line Support Add-On Version 3.2 (Build 8668)
Copyright (c) 1991-2009 WinZip International LLC - All Rights Reserved
Zip file: C:\somezip.zip
Length Method Size Ratio Date Time CRC-32 Attr Name
------------ ------ ------------ ----- ---- ---- -------- ---- ----
51421 DeflatX 7834 85% 08/01/2011 07:31 d35ad156 --w-* D080111rpts.ctl
179450364 DeflatX 15360545 92% 08/01/2011 07:12 73ddacbf --w-* D080111rpts.dat
------------ ------------ --- ---------
179501785 15368379 92% 2
I will be examining the contents of 1 encrypted file and 1 unencrypted file. Apparently this -vm option doesn't care if you analyze the contents, rather, only if you try to open them.
Thursday, October 20, 2011 1:50 AM
Thanks for the snippet. I have never really worked with regexes much, but you may have given me my first reason to really dig into them. I have a copy of a regex book on my shelf that has been staring at me for a while.
To be sure I get what you did (and don't just reuse the code blindly) correct me if I miss anything:
1) In the function, parse input with RegEx (which I'll get to in a second) and match hits to "fields".
2) Pair up valid RegEx hits with properties as objects.
3) Using the function, gather input (I am assuming the raw output from my utility would qualify if stored in a variable) into a new variable.
I am thinking I could simply use the $zipcontent.count property for a "file" count.
Now, about that regex, what I see is for $regex = "\s*(\d+)\s+(\w+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$" is:
spaces - digits (for length) - spaces - words (for Method) - spaces - digits (for size) - spaces - digits (for percentage) - spaces - and I can't make out the rest.
I like this trick of finding patterns and creating fields. Very cool. Thanks for working this up. Got me thinking in all sorts of new directions.
What does this line do? $proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
Thursday, October 20, 2011 2:09 AM
(\S+\s+\S+) = one or more non-space characters, followed by one or more spaces, followed by one or more non-space characters. This is the Date and Time combined. It's already in the correct format to be able to parse it directly to [datetime].
\s+
(\w+) = one or more word characters (this matches A-Z,a-z,0-9, and the underscore). This is the CRC-32 field
\s+
(\S+) one or more non-space characters (this is the attributes)
\s+
(\S+) one or more non-space characters (this is the name)
\s*$ zero or more spaces, to the end of the line.
This:
$props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
is explained a little better in the blog. When you create an object from a hash table, the properties don't come out in any predictable order. This parses the names from the $props script block, in the order they appear in the hash table definition. That's re-used as the argument list of the | select when the object is created. If you want the properties in a different order in your object, just re-order them in the script block / hash table, and they'll automatically get re-ordered in the created objects.
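A small, self-contained illustration of that property-ordering trick. The two-field input line and its pattern are made up for the example; only the `$props` / `$proplist` technique comes from the code above.

```powershell
# Property definitions live in a script block so their text can be inspected.
$props = {
    @{
        Name   = $matches[2]
        Length = $matches[1] -as [int]
    }
}

# Read the script block's own source, keep the "Key = value" lines,
# and reduce each one to just the key, preserving source order.
$proplist = $props.ToString().Split("`n") -match '^\s*(\S+)\s*=.+$' -replace '^\s*(\S+)\s*=.+$', '$1'

# Made-up two-field input line: a length followed by a name.
if ('51421  D080111rpts.ctl' -match '^(\d+)\s+(\S+)$') {
    # Select-Object re-emits the properties in the order given by $proplist.
    $obj = New-Object psobject -Property (& $props) | Select-Object $proplist
}
$obj
```

On PowerShell 3.0 and later, an `[ordered]` hashtable preserves key order by itself, which may make the parsing step unnecessary there; the script in this thread targets version 2.0, where that is not available.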
Thursday, October 20, 2011 9:17 PM
Really appreciate the help on this one. You've shown me some things I'm excited about studying a little and throwing in the toolbox.
Thursday, October 20, 2011 9:21 PM
Thanks! That's what I like to hear.
Friday, October 21, 2011 2:45 AM
As I am trying to embed this into a script, I am having problems with the last part: pipelining the data through Get-ZipContent. I ran through each step comparing your example, which works great, against my actual solution. The only difference was that the data you used was written in a here-string, while I grabbed mine from the wzunzip.exe output. I used this code to get it into a variable:
$argumentlist = "-vb `"C:\Data\Conv000001.zip`"";
$standarderror = [System.IO.Path]::GetTempFileName();
$standardoutput = [System.IO.Path]::GetTempFileName();
Start-Process -FilePath "C:\Program Files\WinZip\WZUNZIP.EXE" -ArgumentList $argumentlist -RedirectStandardError $standarderror -RedirectStandardOutput $standardoutput -NoNewWindow -Wait;
$zip_input_data = (gc $standardoutput) -split "`n" | % {$_.trim()};
I checked the data at each step of the process. When I checked gc $standardoutput it gave me data as expected. When I checked $zip_input_data it also gave me the expected results (basically a summary like the one you put in the variable in your example). However, when I ran the last step
$zipcontent = $zip_input_data | get-zipcontent;
I didn't get any results. Having used your function, I should get about 128 objects. In my case, I get 0. It seems the data just isn't being passed as expected or the fields no longer line up. Is it possible having used a .txt file instead of a variable could be the issue? Maybe some underlying datatype mismatch because I used a different data source stream? I'm going to double check and ensure the fields all match up to the field layout I provided (and which you got working).
Friday, October 21, 2011 2:56 AM
Whoops again. The -vb switch drops a field that the -vm switch provides. I'll change the function you provided and post it once I get it going. Disregard the issue.
Friday, October 21, 2011 3:02 AM
OK, here's the corrected function. When I run it everything works, except I only get the file name. I am trying to get the name with the extension:
function get-zipcontent {
begin {
$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
$props = {
@{
Name = $matches[6]
Date = $matches[4] -as [datetime]
Length = $matches[1] -as [int]
Size = $matches[2] -as [int]
Ratio = $matches[3] -as [int]
CRC32 = $matches[5]
}
}
$proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
}
process {
if ($_ -match $regex){
new-object psobject -property (&$props) | select $proplist
}
}
}
Friday, October 21, 2011 3:15 AM
function get-zipcontent {
begin {
$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
$props = {
@{
Name = $matches[6]
Date = $matches[4] -as [datetime]
Length = $matches[1] -as [int]
Size = $matches[2] -as [int]
Ratio = $matches[3] -as [int]
CRC32 = $matches[5]
}
}
$proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
}
process {
if ($_ -match $regex){
write-debug "Matched record: `n"
write-debug "$_"
write-debug "***********************"
write-debug "Matches:`n"
write-debug "$($matches[1..6])"
new-object psobject -property (&$props) | select $proplist
}
}
}
Let's add some debug.
Set $debugpreference = "Continue"
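A tiny sketch of how that preference variable affects `Write-Debug`. The function name `Test-DebugOutput` is made up for the example.

```powershell
function Test-DebugOutput {
    param([string] $Line)
    # Emitted only when $DebugPreference is something other than 'SilentlyContinue'.
    Write-Debug "Processing: $Line"
    $Line.Trim()
}

$DebugPreference = 'SilentlyContinue'   # default: debug messages are suppressed
Test-DebugOutput '  quiet  '

$DebugPreference = 'Continue'           # debug messages are printed, execution continues
Test-DebugOutput '  shown  '
```

The debug text goes to the debug stream rather than the pipeline, so turning it on does not change the objects the function returns.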
Friday, October 21, 2011 2:15 PM
I ran both processes to compare the sample dataset and the output from a live zip I am working with. A good bit of info, but, I commented it so it should be clear what's going on. I did as you suggested: added the write-debug lines and set $debugpreference = 'Continue':
Approach 1: Data from variable
$data variable output:
Length Method Size Ratio Date Time CRC-32 Attr Name
51421 DeflatX 7834 85% 08/01/2011 07:31 d35ad156 --w-* D080111rpts.ctl
179450364 DeflatX 15360545 92% 08/01/2011 07:12 73ddacbf --w-* D080111rpts.dat
179501785 15368379 92% 2
Spliting data.
Professing data with Get-ZipContent-Old
Output of $variable_content.
Name : D080111rpts.ctl
Date : 8/1/2011 7:31:00 AM
Length : 51421
Method : DeflatX
Size : 7834
Ratio : 85
CRC32 : d35ad156
Attr : --w-*
Name : D080111rpts.dat
Date : 8/1/2011 7:12:00 AM
Length : 179450364
Method : DeflatX
Size : 15360545
Ratio : 92
CRC32 : 73ddacbf
Attr : --w-*
Approach 2: Data from wzunzip
Collecting import.txt data into $import_file.
Configuring wzunzip analysis.
Starting processing.
Complete wzunzip processing.
Splitting data
Output of $zip_input_data
WinZip(R) Command Line Support Add-On Version 3.2 (Build 8668)
Copyright (c) 1991-2009 WinZip International LLC - All Rights Reserved
Zip file: C:\Data\Conv000001.zip
Length Size Ratio Date Time Name
1218497 1024416 16% 09/21/2011 10:01 10097 E.pdf
1616270 1402640 14% 09/21/2011 09:33 10114 A.pdf
1079692 943306 13% 09/21/2011 09:46 1013 C.pdf
181716 153372 16% 09/21/2011 09:46 1013 MC.pdf
1019725 896758 13% 09/21/2011 10:16 10132 C.pdf
699030 600858 15% 09/21/2011 09:56 10132 D.pdf
135857 129956 5% 09/21/2011 09:34 10152 A.17594.pdf
878353 761607 14% 09/21/2011 09:34 10152 A.pdf
1046299 882238 16% 09/21/2011 11:16 10152 B.pdf
888498 773885 13% 09/21/2011 09:57 10152 F.pdf
1009413 856657 16% 09/21/2011 10:32 10152 G.pdf
92171 85836 7% 09/21/2011 10:47 10170 A.25909.pdf
757799 625612 18% 09/21/2011 10:47 10170 A.pdf
317116 295492 7% 09/21/2011 11:09 10170 B.27442.pdf
1007145 858549 15% 09/21/2011 11:09 10170 B.pdf
92750 86244 8% 09/21/2011 10:28 10170 D.24581.pdf
508113 437821 14% 09/21/2011 10:28 10170 D.pdf
317012 295312 7% 09/21/2011 11:09 10170 E.27443.pdf
1072913 902921 16% 09/21/2011 11:09 10170 E.pdf
859630 722500 16% 09/21/2011 11:16 10170 F.pdf
1331056 964189 28% 09/21/2011 10:16 10170 MC.pdf
99576 82660 17% 09/21/2011 09:47 10232 A.20588.pdf
1043727 916334 13% 09/21/2011 09:47 10232 A.pdf
1919256 1711546 11% 09/21/2011 10:06 10232 B.pdf
239660 209044 13% 09/21/2011 09:46 1026 MC.pdf
1278687 1140173 11% 09/21/2011 10:16 10260 B.pdf
1551213 1391788 11% 09/21/2011 10:16 10262 B.pdf
4016061 3518105 13% 09/21/2011 10:16 10285 A.pdf
325337 287375 12% 09/21/2011 10:16 10287 ODP.pdf
358069 287281 20% 09/21/2011 10:16 10289 MC.pdf
1499359 1317263 13% 09/21/2011 09:45 1032 A.pdf
1318355 1087831 18% 09/21/2011 11:13 1032 B.pdf
1616523 1337912 18% 09/21/2011 10:27 1032 D.pdf
792675 660027 17% 09/21/2011 10:16 10364 MC.pdf
1168987 1020574 13% 09/21/2011 09:40 10393 A.pdf
28692 19200 34% 09/21/2011 09:48 10395 A.20771.pdf
938371 830101 12% 09/21/2011 09:48 10395 A.pdf
366707 301509 18% 09/21/2011 10:15 10459 MC.pdf
688418 459683 34% 09/21/2011 10:15 10471 ODP.pdf
284516 236212 17% 09/21/2011 10:15 10477 MC.pdf
2949227 2618832 12% 09/21/2011 09:44 1048 C.pdf
66757 59947 11% 09/21/2011 09:46 1048 MC.20455.pdf
491267 412167 17% 09/21/2011 09:46 1048 MC.pdf
502375 439110 13% 09/21/2011 09:46 1048 ODP.pdf
1213269 1071776 12% 09/21/2011 09:36 10507 A.pdf
694618 613913 12% 09/21/2011 10:15 10513 ODP.pdf
2796482 2469334 12% 09/21/2011 10:16 10527 A.pdf
1748419 1528567 13% 09/21/2011 09:39 10527 B.pdf
713023 639872 11% 09/21/2011 10:16 10527 ODP.pdf
2744509 2400483 13% 09/21/2011 10:16 10550 A.pdf
328463 300256 9% 09/21/2011 10:16 10550 MC.pdf
162630 149418 9% 09/21/2011 09:39 10551 D.19167 (2).pdf
58563 49594 16% 09/21/2011 09:39 10551 D.19167.pdf
1256560 1102793 13% 09/21/2011 09:39 10551 D.pdf
123059 112906 9% 09/21/2011 10:16 10551 G.23920.pdf
1042783 921643 12% 09/21/2011 10:16 10551 G.pdf
61605 56138 9% 09/21/2011 10:00 10551 H.22240.pdf
1209095 1067721 12% 09/21/2011 10:00 10551 H.pdf
127390 116145 9% 09/21/2011 10:16 10557 C.23922 (2).pdf
78602 72082 9% 09/21/2011 10:16 10557 C.23922.pdf
2658867 2386537 11% 09/21/2011 10:16 10557 C.pdf
346059 281572 19% 09/21/2011 10:16 10573 MC.pdf
1455720 1298255 11% 09/21/2011 09:33 10583 A .pdf
937539 813216 14% 09/21/2011 09:33 10583 A.pdf
135119 90741 33% 09/21/2011 10:06 10583 B.22942 (2).pdf
36082 29954 17% 09/21/2011 10:06 10583 B.22942 (3).pdf
239963 223101 8% 09/21/2011 10:06 10583 B.22942.pdf
1216937 1050041 14% 09/21/2011 10:06 10583 B.pdf
1811111 1559723 14% 09/21/2011 11:13 10583 C.pdf
274670 227410 18% 09/21/2011 10:16 10583 MC.pdf
474134 419677 12% 09/21/2011 10:16 10583 ODP.pdf
64025 53785 16% 09/21/2011 10:16 10590 B.23926.pdf
1818539 1653010 10% 09/21/2011 10:16 10590 B.pdf
1000799 881160 12% 09/21/2011 09:33 10592 B.pdf
24540 18840 24% 09/21/2011 09:37 10592 D.18280.pdf
1576821 1369732 14% 09/21/2011 09:36 10592 D.pdf
987660 869827 12% 09/21/2011 10:01 10592 E.pdf
314344 246759 22% 09/21/2011 10:16 10592 MC.pdf
618164 515294 17% 09/21/2011 10:16 10627 MC.pdf
703529 595902 16% 09/21/2011 10:03 10633 B.pdf
5022028 4247748 16% 09/21/2011 10:16 10633 C.pdf
95873 82742 14% 09/21/2011 10:16 10633 ODP.23931 (2).pdf
1489700 1299456 13% 09/21/2011 10:16 10633 ODP.23931.pdf
435716 363564 17% 09/21/2011 10:16 10633 ODP.pdf
74863 67418 10% 09/21/2011 09:34 10642 B.17205 (2).pdf
734770 595726 19% 09/21/2011 09:34 10642 B.17205 (3).pdf
71036 63799 11% 09/21/2011 09:34 10642 B.17205 (4).pdf
288309 242518 16% 09/21/2011 09:34 10642 B.17205 (5).pdf
1168909 1103561 6% 09/21/2011 09:34 10642 B.17205 (6).pdf
55382 45015 19% 09/21/2011 09:34 10642 B.17205 (7).pdf
79608 73494 8% 09/21/2011 09:34 10642 B.17205.pdf
2853155 2373573 17% 09/21/2011 09:34 10642 B.pdf
490201 447654 9% 09/21/2011 10:16 10642 ODP.pdf
774920 681587 13% 09/21/2011 10:16 10659 ODP.pdf
1845292 1574149 15% 09/21/2011 11:11 10661 A.pdf
1494626 1294812 14% 09/21/2011 10:16 10661 H.pdf
329066 276921 16% 09/21/2011 10:27 10661 MC.pdf
555382 469319 16% 09/21/2011 10:16 10661 ODP.pdf
1718393 1469860 15% 09/21/2011 10:16 10695 C.pdf
930302 812443 13% 09/21/2011 10:16 10695 ODP.pdf
685521 589140 15% 09/21/2011 09:41 10725 A.pdf
441649 391158 12% 09/21/2011 10:16 10725 MC.pdf
1017894 916292 10% 09/21/2011 10:16 10725 ODP.pdf
1889286 1571676 17% 09/21/2011 10:54 10739 A.pdf
211446 180234 15% 09/21/2011 10:16 10750 B.23940.pdf
924173 796491 14% 09/21/2011 10:16 10750 B.pdf
26670 21109 21% 09/21/2011 09:44 10750 D.20019.pdf
1428894 1248140 13% 09/21/2011 09:44 10750 D.pdf
1792431 1500211 17% 09/21/2011 10:53 10750 E.pdf
1694033 1424490 16% 09/21/2011 10:53 10772 G.pdf
3092496 2748958 12% 09/21/2011 10:16 10788 A.pdf
67377 62669 7% 09/21/2011 09:55 10788 MC.21719 (2).pdf
67482 62781 7% 09/21/2011 09:55 10788 MC.21719 (3).pdf
289775 247598 15% 09/21/2011 09:55 10788 MC.21719.pdf
404803 344141 15% 09/21/2011 09:55 10788 MC.pdf
324848 273075 16% 09/21/2011 10:16 10812 MC.pdf
6245069 5178851 18% 09/21/2011 10:02 10819 A.pdf
608881 562430 8% 09/21/2011 10:16 10820 ODP.pdf
213629 173664 19% 09/21/2011 10:16 10831 MC.pdf
69566 63059 10% 09/21/2011 09:49 10845 F.20857 (2).pdf
68773 61770 11% 09/21/2011 09:49 10845 F.20857.pdf
599463 513460 15% 09/21/2011 09:49 10845 F.pdf
543348 462596 15% 09/21/2011 10:16 10845 MC.pdf
4152349 3595934 14% 09/21/2011 10:16 10870 D.pdf
1049217 922875 13% 09/21/2011 10:16 10915 A.pdf
1031344 901848 13% 09/21/2011 09:56 10915 E.pdf
856813 759660 12% 09/21/2011 10:16 10937 ODP.pdf
1035444 904159 13% 09/21/2011 09:38 10940 K.pdf
122066767 105365848 14% 128
Professing data with Get-ZipContent-New
DEBUG: Matched record:
DEBUG: 162630 149418 9% 09/21/2011 09:39 10551 D.19167 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 162630 149418 9 09/21/2011 09:39 10551 D.19167
DEBUG: Matched record:
DEBUG: 127390 116145 9% 09/21/2011 10:16 10557 C.23922 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 127390 116145 9 09/21/2011 10:16 10557 C.23922
DEBUG: Matched record:
DEBUG: 1455720 1298255 11% 09/21/2011 09:33 10583 A .pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 1455720 1298255 11 09/21/2011 09:33 10583 A
DEBUG: Matched record:
DEBUG: 135119 90741 33% 09/21/2011 10:06 10583 B.22942 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 135119 90741 33 09/21/2011 10:06 10583 B.22942
DEBUG: Matched record:
DEBUG: 36082 29954 17% 09/21/2011 10:06 10583 B.22942 (3).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 36082 29954 17 09/21/2011 10:06 10583 B.22942
DEBUG: Matched record:
DEBUG: 95873 82742 14% 09/21/2011 10:16 10633 ODP.23931 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 95873 82742 14 09/21/2011 10:16 10633 ODP.23931
DEBUG: Matched record:
DEBUG: 74863 67418 10% 09/21/2011 09:34 10642 B.17205 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 74863 67418 10 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 734770 595726 19% 09/21/2011 09:34 10642 B.17205 (3).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 734770 595726 19 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 71036 63799 11% 09/21/2011 09:34 10642 B.17205 (4).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 71036 63799 11 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 288309 242518 16% 09/21/2011 09:34 10642 B.17205 (5).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 288309 242518 16 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 1168909 1103561 6% 09/21/2011 09:34 10642 B.17205 (6).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 1168909 1103561 6 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 55382 45015 19% 09/21/2011 09:34 10642 B.17205 (7).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 55382 45015 19 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 67377 62669 7% 09/21/2011 09:55 10788 MC.21719 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 67377 62669 7 09/21/2011 09:55 10788 MC.21719
DEBUG: Matched record:
DEBUG: 67482 62781 7% 09/21/2011 09:55 10788 MC.21719 (3).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 67482 62781 7 09/21/2011 09:55 10788 MC.21719
DEBUG: Matched record:
DEBUG: 69566 63059 10% 09/21/2011 09:49 10845 F.20857 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 69566 63059 10 09/21/2011 09:49 10845 F.20857
Output of $zip_file_content.
Name : D.19167
Date : 9/21/2011 9:39:00 AM
Length : 162630
Size : 149418
Ratio : 9
CRC32 : 10551
Name : C.23922
Date : 9/21/2011 10:16:00 AM
Length : 127390
Size : 116145
Ratio : 9
CRC32 : 10557
Name : A
Date : 9/21/2011 9:33:00 AM
Length : 1455720
Size : 1298255
Ratio : 11
CRC32 : 10583
Name : B.22942
Date : 9/21/2011 10:06:00 AM
Length : 135119
Size : 90741
Ratio : 33
CRC32 : 10583
Name : B.22942
Date : 9/21/2011 10:06:00 AM
Length : 36082
Size : 29954
Ratio : 17
CRC32 : 10583
Name : ODP.23931
Date : 9/21/2011 10:16:00 AM
Length : 95873
Size : 82742
Ratio : 14
CRC32 : 10633
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 74863
Size : 67418
Ratio : 10
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 734770
Size : 595726
Ratio : 19
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 71036
Size : 63799
Ratio : 11
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 288309
Size : 242518
Ratio : 16
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 1168909
Size : 1103561
Ratio : 6
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 55382
Size : 45015
Ratio : 19
CRC32 : 10642
Name : MC.21719
Date : 9/21/2011 9:55:00 AM
Length : 67377
Size : 62669
Ratio : 7
CRC32 : 10788
Name : MC.21719
Date : 9/21/2011 9:55:00 AM
Length : 67482
Size : 62781
Ratio : 7
CRC32 : 10788
Name : F.20857
Date : 9/21/2011 9:49:00 AM
Length : 69566
Size : 63059
Ratio : 10
CRC32 : 10845
As you can see, some of the records in the wzunzip output are missing from the result set. Below is my script:
cls
$DebugPreference = 'Continue';
function get-zipcontent-old {
begin {
$regex = "\s*(\d+)\s+(\w+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
$props = {
@{
Name = $matches[8]
Date = $matches[5] -as [datetime]
Length = $matches[1] -as [int]
Method = $matches[2]
Size = $matches[3] -as [int]
Ratio = $matches[4] -as [int]
CRC32 = $matches[6]
Attr = $matches[7]
}
}
$proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1';
}
process {
if ($_ -match $regex){
new-object psobject -property (&$props) | select $proplist
}
}
}
function get-zipcontent-new {
begin {
$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
$props = {
@{
Name = $matches[6]
Date = $matches[4] -as [datetime]
Length = $matches[1] -as [int]
Size = $matches[2] -as [int]
Ratio = $matches[3] -as [int]
CRC32 = $matches[5]
}
}
$proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
}
process {
if ($_ -match $regex){
write-debug "Matched record: `n"
write-debug "$_"
write-debug "***********************"
write-debug "Matches:`n"
write-debug "$($matches[1..6])"
new-object psobject -property (&$props) | select $proplist
}
}
}
Write-Output "Approach 1: Data from variable";
$data = @"
Length Method Size Ratio Date Time CRC-32 Attr Name
51421 DeflatX 7834 85% 08/01/2011 07:31 d35ad156 --w-* D080111rpts.ctl
179450364 DeflatX 15360545 92% 08/01/2011 07:12 73ddacbf --w-* D080111rpts.dat
179501785 15368379 92% 2
"@
Write-Output "`$data variable output:";
$data;
Write-Output "Splitting data.";
$input_data = $data -split "`n" |% {$_.trim()}
Write-Output "Processing data with Get-ZipContent-Old";
$variable_content = $input_data | get-zipcontent-Old;
Write-Output "Output of `$variable_content.";
$variable_content;
Write-Output "Approach 2: Data from wzunzip";
Write-Output "Collecting import.txt data into `$import_file.";
$import_file = gc 'C:\Data\import.txt';
Write-Output "Configuring wzunzip analysis.";
$argumentlist = "-vb `"C:\Data\Conv000001.zip`"";
$standarderror = [System.IO.Path]::GetTempFileName();
$standardoutput = [System.IO.Path]::GetTempFileName();
Write-output "Starting processing."
Start-Process -FilePath "C:\Program Files\WinZip\WZUNZIP.EXE" -ArgumentList $argumentlist -RedirectStandardError $standarderror -RedirectStandardOutput $standardoutput -NoNewWindow -Wait;
Write-Output "Complete wzunzip processing.";
Write-Output "Splitting data";
$zip_input_data = (gc $standardoutput) -split "`n" | % {$_.trim()};
Write-Output "Output of `$zip_input_data";
$zip_input_data;
Write-Output "Processing data with Get-ZipContent-New";
$zip_file_content = $zip_input_data | get-zipcontent-New;
Write-Output "Output of `$zip_file_content.";
$zip_file_content;
Friday, October 21, 2011 2:30 PM
It appears some of your file names have embedded spaces. That will cause the (\S+) name group in the regex to match only up to that space.
$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s*$"
Try this for your regex:
$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+([^.]+\.\S+)\s+(\S+)\s*$"
([^.]+\.\S+) = one or more of any character other than a dot, followed by a dot, followed by one or more non-space characters.
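To see what the ([^.]+\.\S+) group can and cannot capture, it can be tried by itself against a bare file name. This is only a sketch: the ^ anchor and the second sample name ('my file.pdf') are made up for illustration and do not come from the wzunzip output.

```powershell
# The group from the post above, anchored to the start of the string for this test only.
$group = '^([^.]+\.\S+)'

# Name taken from the debug output in this thread.
$null = 'B.17205 (2).pdf' -match $group
$afterDot = $Matches[1]     # \S+ stops at the space that follows the first dot

# Hypothetical name with the space before the first dot.
$null = 'my file.pdf' -match $group
$beforeDot = $Matches[1]    # [^.]+ is allowed to cross that space

"$afterDot | $beforeDot"    # B.17205 | my file.pdf
```

So a space that comes after the first dot still cuts the name short, because \S+ cannot match whitespace.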
Friday, October 21, 2011 2:45 PM
That new regex gave me this:
DEBUG: Matched record:
DEBUG: 162630 149418 9% 09/21/2011 09:39 10551 D.19167 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 162630 149418 9 09/21/2011 09:39 10551 D.19167
DEBUG: Matched record:
DEBUG: 127390 116145 9% 09/21/2011 10:16 10557 C.23922 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 127390 116145 9 09/21/2011 10:16 10557 C.23922
DEBUG: Matched record:
DEBUG: 135119 90741 33% 09/21/2011 10:06 10583 B.22942 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 135119 90741 33 09/21/2011 10:06 10583 B.22942
DEBUG: Matched record:
DEBUG: 36082 29954 17% 09/21/2011 10:06 10583 B.22942 (3).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 36082 29954 17 09/21/2011 10:06 10583 B.22942
DEBUG: Matched record:
DEBUG: 95873 82742 14% 09/21/2011 10:16 10633 ODP.23931 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 95873 82742 14 09/21/2011 10:16 10633 ODP.23931
DEBUG: Matched record:
DEBUG: 74863 67418 10% 09/21/2011 09:34 10642 B.17205 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 74863 67418 10 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 734770 595726 19% 09/21/2011 09:34 10642 B.17205 (3).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 734770 595726 19 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 71036 63799 11% 09/21/2011 09:34 10642 B.17205 (4).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 71036 63799 11 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 288309 242518 16% 09/21/2011 09:34 10642 B.17205 (5).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 288309 242518 16 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 1168909 1103561 6% 09/21/2011 09:34 10642 B.17205 (6).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 1168909 1103561 6 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 55382 45015 19% 09/21/2011 09:34 10642 B.17205 (7).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 55382 45015 19 09/21/2011 09:34 10642 B.17205
DEBUG: Matched record:
DEBUG: 67377 62669 7% 09/21/2011 09:55 10788 MC.21719 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 67377 62669 7 09/21/2011 09:55 10788 MC.21719
DEBUG: Matched record:
DEBUG: 67482 62781 7% 09/21/2011 09:55 10788 MC.21719 (3).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 67482 62781 7 09/21/2011 09:55 10788 MC.21719
DEBUG: Matched record:
DEBUG: 69566 63059 10% 09/21/2011 09:49 10845 F.20857 (2).pdf
DEBUG: ***********************
DEBUG: Matches:
DEBUG: 69566 63059 10 09/21/2011 09:49 10845 F.20857
Output of $zip_file_content.
Name : D.19167
Date : 9/21/2011 9:39:00 AM
Length : 162630
Size : 149418
Ratio : 9
CRC32 : 10551
Name : C.23922
Date : 9/21/2011 10:16:00 AM
Length : 127390
Size : 116145
Ratio : 9
CRC32 : 10557
Name : B.22942
Date : 9/21/2011 10:06:00 AM
Length : 135119
Size : 90741
Ratio : 33
CRC32 : 10583
Name : B.22942
Date : 9/21/2011 10:06:00 AM
Length : 36082
Size : 29954
Ratio : 17
CRC32 : 10583
Name : ODP.23931
Date : 9/21/2011 10:16:00 AM
Length : 95873
Size : 82742
Ratio : 14
CRC32 : 10633
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 74863
Size : 67418
Ratio : 10
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 734770
Size : 595726
Ratio : 19
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 71036
Size : 63799
Ratio : 11
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 288309
Size : 242518
Ratio : 16
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 1168909
Size : 1103561
Ratio : 6
CRC32 : 10642
Name : B.17205
Date : 9/21/2011 9:34:00 AM
Length : 55382
Size : 45015
Ratio : 19
CRC32 : 10642
Name : MC.21719
Date : 9/21/2011 9:55:00 AM
Length : 67377
Size : 62669
Ratio : 7
CRC32 : 10788
Name : MC.21719
Date : 9/21/2011 9:55:00 AM
Length : 67482
Size : 62781
Ratio : 7
CRC32 : 10788
Name : F.20857
Date : 9/21/2011 9:49:00 AM
Length : 69566
Size : 63059
Ratio : 10
CRC32 : 10845
Friday, October 21, 2011 2:59 PM
Have to go move a server right now. I'll take some test data and re-do that regex when I get back.
Friday, October 21, 2011 3:36 PM
See if this doesn't work better:
"\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S.+\S)\s*$"
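A quick check of that regex against two sample lines, as a sketch. The numbers and names come from this thread, but the column padding is an assumption: the copies posted here have their whitespace collapsed, and the regex needs at least two spaces after the percent sign.

```powershell
# Regex from the post above, in single quotes so PowerShell leaves the $ alone.
$regex = '\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+(\S+\s+\S+)\s+(\w+)\s+(\S.+\S)\s*$'

# Hypothetical padding; only the values are from the thread.
$withSpace    = '   74863    67418  10%  09/21/2011 09:34  10642 B.17205 (2).pdf'
$withoutSpace = ' 1079692   943306  13%  09/21/2011 09:46  1013C.pdf'

if ($withSpace -match $regex) {
    $number = $Matches[5]   # 10642, picked up by (\w+)
    $name   = $Matches[6]   # B.17205 (2).pdf, embedded space included
}

# (\w+)\s+ needs a separate token before the name, so a one-token name fails.
$matchedWithoutSpace = $withoutSpace -match $regex   # False
```

Under these assumptions the pattern keeps the embedded space, but it skips any line whose name has no space in it at all.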
Friday, October 21, 2011 3:36 PM
No worries. I know most of us have jobs that keep us from posting right away. Whenever you can get to it.
Thanks.
Friday, October 21, 2011 3:39 PM
You must have hit enter about the same time I did. That one gets the full file name but only the ones that start with Alpha characters. Some names have numbers too. So, that last field can start with both alpha and numeric characters. Isn't there a regex option for alphanumerics? I can just throw that in.
Friday, October 21, 2011 3:57 PM
Don't know if you missed it, but I posted an updated regex just before your last post.
Saturday, October 22, 2011 1:55 AM
You must have hit enter about the same time I did. That one gets the full file name but only the ones that start with Alpha characters. Some names have numbers too. So, that last field can start with both alpha and numeric characters. Isn't there a regex option for alphanumerics? I can just throw that in.
I missed this post earlier. The regex was originally written for a different output format (using a different command line option), and I don't have a good sample of data from that format. I'm kind of guessing at these changes. If you can give me the first 3-4 lines of output from the command it would help.
In the meantime, here's an alternate solution (again I'm guessing at the data format) that has a much simpler regex, and uses substrings instead of regex captures to differentiate the data:
$line = "74863 67418 10% 09/21/2011 09:34 10642 B.17205 (2).pdf"
function get-zipcontent {
begin {
$regex = "^\d+"
$props = {
@{
Name = $_.substring(51)
Size = $_.substring(0,9) -as [int]
Length = $_.substring(14,6) -as [int]
Ratio = $_.substring(21,2) -as [int]
Date = $_.substring(26,16) -as [datetime]
CRC32 = $_.substring(45,5)
}
}
$proplist = $props.tostring().split("`n") -match "^\s*(\S+)\s*=.+$" -replace "^\s*(\S+)\s*=.+$",'$1'
}
process {
if ($_ -match $regex){
new-object psobject -property (&$props) | select $proplist
}
}
}
$line | get-zipcontent
edit: What changed from the original script is the regex and the hash table. Otherwise it's still the same snippet we started with.
If you prefer doing it this way, this helps:
http://mjolinor.wordpress.com/2011/05/30/position-map-a-string-with-powershell/
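The substring approach depends on knowing the exact column offsets, so it helps to print an index ruler under a real line of output. The helper below is a hypothetical sketch of that idea (Show-PositionMap is an invented name, and this is not the code from the linked post). The sample text is a fragment with collapsed spacing, so the offsets it shows are not the real wzunzip offsets.

```powershell
# Hypothetical helper: emits the text, then the tens digit and the ones digit of each index.
function Show-PositionMap {
    param([string]$Text)
    if ($Text.Length -eq 0) { return }
    $last = $Text.Length - 1
    # Integer arithmetic only: (index - index mod 10) / 10 is always an exact division.
    $tens = -join (0..$last | ForEach-Object { ($_ - ($_ % 10)) / 10 % 10 })
    $ones = -join (0..$last | ForEach-Object { $_ % 10 })
    $Text
    $tens
    $ones
}

$sample = '74863 67418 10% 09/21/2011'
$map = Show-PositionMap $sample
$map
```

Reading the ruler, the ratio in this sample starts at index 12, so $sample.Substring(12, 3) returns 10%. Run it against a line copied straight from the command output to get offsets that fit the real data.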
Monday, October 24, 2011 9:31 PM
Here is a sample with the spaces in the file names AND one without spaces:
      Length         Size  Ratio  Date        Time   Name
------------ ------------  -----  ----        ----   ----
     1218497      1024416   16%   09/21/2011  10:01  10097 E.pdf
     1616270      1402640   14%   09/21/2011  09:33  10114 A.pdf
     1079692       943306   13%   09/21/2011  09:46  1013C.pdf
      181716       153372   16%   09/21/2011  09:46  1013 MC.pdf
Tuesday, October 25, 2011 2:15 PM
mjolinor, I pulled out my copy of Andrew Watt's Beginning Regular Expressions and came up with this:
$regex = "\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+([0-1][0-9]/[0-3][0-9]/[1-2][0-9][0-9][0-9])\s+(\d+:\d+)\s+((\S+\s+\S+)$|(\S+)$)"
It works in all my testing so far. Let me know if I missed anything.
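As a rough check, and to tie it back to the original question of counting the files, the regex can be run over a handful of listing lines. This is a sketch under assumptions: the values are taken from the samples in this thread, but the column padding is invented, and the dashed line stands in for the separator row.

```powershell
# Regex from the post above, in single quotes so PowerShell leaves the $ alone.
$regex = '\s*(\d+)\s+(\d+)\s+(\d{1,2})% \s+([0-1][0-9]/[0-3][0-9]/[1-2][0-9][0-9][0-9])\s+(\d+:\d+)\s+((\S+\s+\S+)$|(\S+)$)'

# Values from the thread; the padding is hypothetical.
$lines = @(
    '------------ ------------ -----  ----       ----  ----'
    '     1218497      1024416  16%  09/21/2011 10:01  10097 E.pdf'
    '     1079692       943306  13%  09/21/2011 09:46  1013C.pdf'
    '      181716       153372  16%  09/21/2011 09:46  1013 MC.pdf'
    '       74863        67418  10%  09/21/2011 09:34  10642 B.17205 (2).pdf'
)

# Group 6 is the whole name, whichever side of the alternation matched.
$names = @($lines | ForEach-Object { if ($_ -match $regex) { $Matches[6] } })

$names
"File count: $($names.Count)"
```

With this input the count is 3. The separator row is skipped as intended, but so is the last line: its name has two embedded spaces, and the pattern allows at most one.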
Tuesday, October 25, 2011 5:07 PM
Thank you again sir. This is the first time I have really worked with regexes. I can see the power in them now and will certainly be getting more familiar with them.