Question
Wednesday, April 11, 2012 2:27 PM
How do I find or search by actually reading data from the file system ... without using a "search" index?
- I don't want an index of possible future searches limited by options in an "Indexing Options" control applet of Windows Search Service!
- I do want all file types and file name extensions included
- I do want to match on partial strings (find cde within abcdefg for example)
- I do want some flexibility such as start path, subfolder, wildcard values to match, etc., typical end-user request
- I do not want to install a 3rd party tool on the server, I'd like to use already available built-in Microsoft utilities
- I repeat: I do NOT want to use the (next to worthless) Windows Search Service Index
The production servers I typically administer do NOT even have the Windows Search Service installed. I just want an on-demand search that actually reads the NTFS volume contents searching EVERYTHING in a specified path (no matter what the file type or file name extension).
I also recognize there are 3rd-party tools available (and other operating systems) which do this very well, but I am limited to what is already present in the Windows operating system and as a best practice I do not want to add more code that will need to be maintained. For example, is there a PowerShell cmdlet that will read everything in a given path and select results based on a partial string match? I don't want to write a script, but a couple pipelined commands might do the trick. Something that I can memorize and type again without having to load a PowerShell library of scripting tools would be nice.
All replies (9)
Wednesday, April 25, 2012 1:08 AM ✅Answered
Hi George,
I'll give you a basic example. Let's say you want to find the word "office" in all text files (.txt) under C:\ excluding any text files located in any "Temp" directories. This is what you'd get:
for /f "tokens=*" %a in ('dir /s /b C:\*.txt') do @echo %a | findstr /i /v "\\Temp\\" > nul & if %errorlevel%==0 findstr /i /m "office" "%a"
I'll break this down for you so you can take the logic and construct your own searches:
for /f | Sets up a "for" loop. You can get more information on for by running "for /?". |
tokens=* | Tells the loop not to split on spaces as delimiters, since filenames can and often do contain spaces. We want the whole path to be stored in the variable "%a". |
('dir /s /b C:\*.txt') | This is our basic file search specification. The output from this command is the value placed in %a for each pass of the loop. For example, it might find a match for C:\Data\Blah.txt, which is what would then be loaded into %a for that pass. |
do | Everything after this keyword is what's executed for each pass of the loop. |
@echo %a | This is just used to echo the path of the file found into the first findstr command. |
findstr /i /v "\\Temp\\" | This first findstr command checks that we don't have a path that contains \temp\. The /v means only provide a match if the supplied string doesn't contain the specified substring. The > nul just means redirect any output to the void - don't display it on the screen. |
if %errorlevel%==0 | When a program exits, it can provide what's called a return code - which is just a number, which can be used for further processing courtesy of the %errorlevel% variable. Programs are not obliged to do this, but fortunately findstr does. 0 means a match was found while 1 means there was no match. |
findstr /i /m "office" "%a" | Last but not least, this findstr actually searches the specified file (in %a) for the string you provide (in this example, it's "office"). If it finds a match, the /m switch directs findstr to simply print the filename. The default behaviour otherwise would have been to output the matching lines of the file, which makes for messy reading. |
I could kick myself, as I edited this reply and lost the final two paragraphs. So, I'll summarise as best I can.
This isn't the most complex example, but it highlights the key components. More complex examples will mostly just revolve around chaining more findstr commands.
One thing I can't remember is whether findstr can handle Unicode-encoded files. If it can't, that will affect the search results. You could use regular expressions to address this to a degree, but I'm going to leave those as "beyond the scope of this article" for now.
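For comparison, here is a minimal PowerShell sketch of the same filter-then-search logic: skip anything under an excluded folder, then print the path of each file that contains the text. The function name Find-TextInFiles and its parameter names are made up for illustration; only Get-ChildItem, Where-Object, Select-String and Select-Object are built-in cmdlets. Select-String reads files through .NET rather than the console code page, so it may cope with Unicode files better than findstr, but treat that as an assumption to test on your own data.

```powershell
# Hypothetical helper (the name and parameters are illustrative, not built in).
function Find-TextInFiles {
    param(
        [string]$Path,             # folder to start from
        [string]$Text,             # literal text to look for (not a regex)
        [string]$Filter = '*',     # file name wildcard, e.g. *.txt
        [string]$ExcludeDir = ''   # folder name to skip, e.g. Temp
    )
    # Escape the path separator so it is matched literally ("\" on Windows).
    $sep = [regex]::Escape([string][IO.Path]::DirectorySeparatorChar)

    Get-ChildItem -Path $Path -Filter $Filter -Recurse -ErrorAction SilentlyContinue |
        # Keep files only; PSIsContainer is true for directories.
        Where-Object { -not $_.PSIsContainer } |
        # Drop any path that contains \<ExcludeDir>\ (same idea as findstr /v).
        Where-Object {
            -not $ExcludeDir -or
            $_.FullName -notmatch ($sep + [regex]::Escape($ExcludeDir) + $sep)
        } |
        # -SimpleMatch: literal, case-insensitive text; -List: stop at the first hit per file.
        Select-String -SimpleMatch -Pattern $Text -List |
        # Print just the file path, like findstr /m.
        Select-Object -ExpandProperty Path
}
```

Something like Find-TextInFiles -Path C:\ -Text 'office' -Filter *.txt -ExcludeDir Temp would be the rough equivalent of the command line above. Wrapping it in a function is only for readability; the pipeline inside can be typed directly at the prompt.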
Cheers,
Lain
Thursday, April 26, 2012 2:26 AM ✅Answered
Hi George,
I don't seem to suffer the precise problem as described above, but interestingly, I did have problems with a related statement that made use of %errorlevel% in the context of being executed within a batch file. This only came to light yesterday in a batch file I wrote to help another Technet user.
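The upshot is the same: the %errorlevel%==<somevalue> notation isn't 100% reliable, which is disconcerting - so a good pick-up on your behalf there. As far as I can tell, %errorlevel% is substituted once, when the whole line is parsed, so inside the loop it still holds whatever the previous command left behind.
Thankfully, the "if" construct supports errorlevel on a built-in basis, and that form is checked when the command actually runs. With that in mind, here's an ever-so-slightly different example command line that should prove more robust: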
for /f "tokens=*" %a in ('dir /s /b C:\*.txt') do @echo %a | findstr /i "\\Temp\\" > nul & if errorlevel 1 findstr /i /m "office" "%a"
It does the same thing, but I'm hoping (based on spamming the command quite a few times I haven't run into issues yet) it avoids the inconsistent behaviour of %errorlevel%. The odd thing is I haven't run into this issue before, but then I don't write a lot of fancy command lines or batch files anymore as most of what I do lends itself very much to Powershell.
Let me know how you fare.
Cheers,
Lain
Thursday, October 11, 2012 8:23 AM ✅Answered | 3 votes
So seriously, the answer to the question is to write a complicated DOS or PowerShell script?
How about this. Keep an XP box on our network somewhere and whenever we want to search the logon scripts for which one contains the "m:" drive mapping, we use the XP box.
I know it's an incorrect standard theory that people in charge of anything are idiots, but MS aren't doing a lot to dispel this.
"I know, let's create a server OS where you can't easily search the contents of a file - who'd want to do that?"
Wednesday, April 11, 2012 3:02 PM | 1 vote
Hi George,
If you're only looking to pattern match based on the filename, then you can use a couple of old DOS-style commands or Powershell without it being complicated. Of course, you can use the same tools and make it complicated - it just depends on what level of granularity you want out of the results.
Command Prompt
Tools:
- The command prompt itself (cmd.exe)
- dir shell statement
- findstr.exe command executable (present on Windows systems by default)
Examples:
- dir /s /b /a-d c:\
Lists literally all files under c:\, including hidden and system files, in a bare format that shows the entire file path
- dir /s /b /a-d c:\ | findstr /i "this that other"
Finds any files that have "this", "that" or "other" anywhere in their filename (or path, given /b dictates using the full path)
- dir /s /b /a-d c:\ | findstr /i /v "\.tmp \.bak"
Finds any files that don't have ".tmp" or ".bak" in their filename. The period has to be prefixed with the escape character (the \) to indicate it really is the period we want. A period on its own matches any "normal" character - it's like a single-character wildcard.
Powershell
Tools:
- Just the regular Powershell console
Examples:
- Get-ChildItem -Path c:\ -Recurse | where {$_.Name -match "\.xml"}
Lists files whose names contain ".xml"
- Get-ChildItem -Path c:\ -Recurse | where {$_.FullName -match "\\Temp\\"}
Lists any files found in any directories or subdirectories that ultimately branch out from a "\Temp\" directory
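Because -match treats its right-hand side as a regular expression, characters such as the period and the backslash need escaping. A minimal sketch of one way to avoid doing that by hand is to pass the fragment through [regex]::Escape first. The function name Find-FileByName and its parameters are hypothetical and only there to keep the example tidy.

```powershell
# Hypothetical helper (the name is illustrative, not a built-in cmdlet).
function Find-FileByName {
    param(
        [string]$Path,      # folder to start from
        [string]$Fragment   # literal text that must appear somewhere in the file name
    )
    # Escape regex metacharacters so ".xml" means a real period followed by "xml".
    $pattern = [regex]::Escape($Fragment)

    # -Force includes hidden and system items, much like dir /a.
    Get-ChildItem -Path $Path -Recurse -Force -ErrorAction SilentlyContinue |
        # Keep files only, then match the fragment anywhere in the name (case-insensitive).
        Where-Object { -not $_.PSIsContainer -and $_.Name -match $pattern } |
        # Print the full path, like dir /s /b.
        Select-Object -ExpandProperty FullName
}
```

For example, Find-FileByName -Path C:\ -Fragment 'cde' would list a file named abcdefg.log, which covers the partial-string requirement from the question.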
Hopefully this helps put you on the right track. To be clear, the one thing these basic commands won't do is search the content of the files. You didn't ask for that, so it shouldn't be an issue. In principle, that too can be done, but in practice it would take more effort with the commands and the performance would be positively horrid.
Cheers,
Lain
Tuesday, April 24, 2012 7:06 PM
Lain,
Those are great answers! (Although nothing I can memorize and recall at will. I will just tuck that syntax away somewhere handy.)
You're right I didn't ask for finding content within files, but it was an oversight I didn't include that as a "requirement". As to performance being horrid - well that is in the eye of the beholder. Sometimes when you need to find something you have no choice but to look for it. I'd rather wait while a search ran a long time looking, than to not find what I'm looking for or be under the misdirection (by Windows Explorer or Search service) that what I searched for wasn't present.
If it isn't too much trouble, would you care to take on the challenge of the original list of requirements and include finding within file contents?
P.S. Rhetorical question: Why didn't Microsoft include the equivalent of [dir /s /b /a-d c:\ | findstr /i "this that other"] in a find menu item in the Windows Explorer GUI!?
Wednesday, April 25, 2012 3:02 PM
Lain, you rock! This is exactly what I was looking for! (Practical application: finding "missing" purchase orders from our ERP system.)
The only quirk I found was that if I ran the for loop command stream to search for values, no results are displayed on alternating cycles (within the same command window). It is as if there is a condition stored in memory that must be cleared upon entry/exit and it isn't cleared unless the command is run twice. Here is a copy from a command window showing the success and then the null result.
C:\>for /f "tokens=*" %a in ('dir /s /b D:\MHCIMPORTP\*.*') do @echo %a | findstr /i /v "\\Temp\\" > nul & if %errorlevel%==0 findstr /i /m "1174" "%a"
D:\MHCIMPORTP\p-form1.041012.132725.791.old
D:\MHCIMPORTP\p-form1.041012.133548.661.old
D:\MHCIMPORTP\p-form1.Apr_10_2012132735.old
D:\MHCIMPORTP\p-form1.Apr_10_2012133644.old
D:\MHCIMPORTP\p-form2.041012.130217.800.old
D:\MHCIMPORTP\p-form2.041012.133548.692.old
D:\MHCIMPORTP\p-form2.Apr_10_2012130313.old
D:\MHCIMPORTP\p-form2.Apr_10_2012133644.old
D:\MHCIMPORTP\p-form3.041012.133548.723.old
D:\MHCIMPORTP\p-form3.Apr_10_2012133644.old
C:\>for /f "tokens=*" %a in ('dir /s /b D:\MHCIMPORTP\*.*') do @echo %a | findstr /i /v "\\Temp\\" > nul & if %errorlevel%==0 findstr /i /m "1174" "%a"
C:\>
P.S. If I open a new command window each time, then it works just fine.
Thursday, April 26, 2012 2:45 PM
The new version seems to work more consistently.
Your other caveat that FINDSTR won't work on UNICODE files is true. :( That is unfortunate. For the particular search I need to accomplish occasionally (missing purchase orders) this is okay - they are ANSI text files. But it prevents this script from being truly universal.
I find it incredibly ironic that "finding" and "searching" is so hard for Microsoft to do as a basic and essential feature of the Windows operating system. There are other operating systems and 3rd party utilities for windows that seem to do this without any issues. Perhaps by version 22 (that's how many releases of Windows there have been since Windows 1.0) you'd think Microsoft would have figured this out!?
Thursday, April 26, 2012 3:36 PM
Regarding the unicode business, I thought that may be the case. My memory's not great, but I thought I remembered something like that as being a limitation.
Powershell would be one answer to the issue. If I can muster a little motivation, then perhaps I'll take a quick look at it this weekend in the name of posterity. My brain's a little too fried at the moment to contemplate it (11:35pm here).
I should ask though, is the platform you're running the script on Server 2008/Vista or later? If not, then I'd revert back to the Windows Scripting Host model.
Cheers,
Lain
Thursday, April 26, 2012 4:21 PM
Windows Server 2008 R2 (for the specific business application I'm dealing with) but we have all versions of Windows (since Windows 2000) in production use in one way or another.
(Windows Explorer's built-in search functionality was thoroughly broken by Microsoft in Windows Vista/2008. On XP/2003 it seems to be able to find things with a little tweaking... Lots of community forum complaints on this subject I might add.)