Searching Text Files in a Directory
I have several years of vendor invoices, in text file format, in some directories on a share. I need to search through these text files to find an order number, manifest number, or some other piece of information. I can’t search everything because it would take too long. And I don’t have control over the server, so if there is some indexing that could be done, I can’t do it. I’m stuck with good old VBA.
The folders are yyyymmdd (ex: 20150725 for July 25th) and corresponds to the invoice dates for any invoices in the file. Each file starts with a three letter abbreviation of the vendors name. Invoice date and vendor name are the only two pieces of information I can use to limit the search. The final piece of information is, of course, the search term. Here’s what the form looks like
I have a table of vendors and codes to populate the Vendor combobox. The QuickDate combobox populates the Date Range textboxes and contains common date ranges, namely, Last Month, This Month, Last Quarter, This Quarter, Last Year, This Year. I can change the dates to whatever I want if there isn’t a Quick Date that suits me. The Search Terms textbox takes a space separated list of terms to search for.
And now the fun part. The code. This converts the Quick Dates into real dates
Dim dtStart As Date, dtEnd As Date
Select Case Me.cbxQuick.Value
Case "Last Month"
dtStart = DateSerial(Year(Now), Month(Now) – 1, 1)
dtEnd = DateSerial(Year(Now), Month(Now), 0)
Case "This Month"
dtStart = DateSerial(Year(Now), Month(Now), 1)
dtEnd = DateSerial(Year(Now), Month(Now) + 1, 0)
Case "Last Quarter"
dtStart = DateSerial(Year(Now), Month(Now) – (((Month(Now) – 1) Mod 3) + 3), 1)
dtEnd = DateSerial(Year(dtStart), Month(dtStart) + 3, 0)
Case "This Quarter"
dtStart = DateSerial(Year(Now), Month(Now) – (((Month(Now) – 1) Mod 3)), 1)
dtEnd = DateSerial(Year(dtStart), Month(dtStart) + 3, 0)
Case "Last Year"
dtStart = DateSerial(Year(Now) – 1, 1, 1)
dtEnd = DateSerial(Year(Now), 1, 0)
Case "This Year"
dtStart = DateSerial(Year(Now), 1, 1)
dtEnd = DateSerial(Year(Now) + 1, 1, 0)
End Select
Me.tbxStartDate.Text = Format(dtStart, "mm/dd/yyyy")
Me.tbxEndDate.Text = Format(dtEnd, "mm/dd/yyyy")
End Sub
This makes sure a real date is entered, but provides for 6 or 8 digit date entry.
If IsDate(Me.tbxEndDate.Value) Then
tbxEndDate.Text = FormatDateTime(tbxEndDate.Value, vbShortDate)
ElseIf Len(tbxEndDate.Text) = 6 Then
tbxEndDate.Text = DateSerial(Right(tbxEndDate.Text, 2), Left(tbxEndDate.Text, 2), Mid(tbxEndDate.Text, 3, 2))
ElseIf Len(tbxEndDate.Text) = 8 Then
tbxEndDate.Text = DateSerial(Right(tbxEndDate.Text, 4), Left(tbxEndDate.Text, 2), Mid(tbxEndDate.Text, 3, 2))
Else
MsgBox "You must enter a valid date."
Cancel = True
End If
End Sub
And the big one, the actual search. This is pretty long and needs to be refactored, but it works for now.
Dim vaTerms As Variant
Dim i As Long, j As Long
Dim aFolders() As String
Dim sFolder As String, sFile As String, lFile As Long
Dim lCnt As Long
Dim dtFolder As Date
Dim sText As String
Const sPATH As String = "\yourserverrawdata"
Me.lbxResults.Clear
ReDim aFolders(1 To 1000)
sFolder = Dir(sPATH & "*", vbDirectory)
‘get a list of folders in the date range
Do While Len(sFolder) > 0
If Len(sFolder) = 8 Then
dtFolder = DateSerial(Left$(sFolder, 4), Mid$(sFolder, 5, 2), Right$(sFolder, 2))
If dtFolder >= CDate(Me.tbxStartDate.Text) And dtFolder <= CDate(Me.tbxEndDate.Text) Then
lCnt = lCnt + 1
aFolders(lCnt) = sFolder
sFolder = Dir
End If
End If
sFolder = Dir
Loop
ReDim Preserve aFolders(1 To lCnt)
lCnt = 0
vaTerms = Split(Me.tbxSearch.Text, Space(1))
‘Make a dummy result
Me.lbxResults.AddItem vbNullString
For i = LBound(aFolders) To UBound(aFolders)
sFolder = sPATH & aFolders(i) & ""
sFile = Dir(sFolder & Me.cbxVendor.Value & "*.IN?")
Do While Len(sFile) > 0
‘Show the current folder as a result
Me.lbxResults.Column(0, 0) = sFolder & sFile
Me.Repaint
‘Open the file and read in all the text
lFile = FreeFile
Open sPATH & aFolders(i) & "" & sFile For Binary As lFile
sText = Space$(LOF(lFile))
Get #1, , sText
Close lFile
‘Loop through the space separated search terms and see if
‘they’re in the file
For j = LBound(vaTerms) To UBound(vaTerms)
If InStr(1, sText, vaTerms(j), vbTextCompare) > 0 Then
‘This is the animation part
Me.lbxResults.AddItem vbNullString, 0
Me.lbxResults.TopIndex = 0
lCnt = lCnt + 1
DoEvents
Exit For
End If
Next j
sFile = Dir
Loop
Next i
‘Get rid of the dummy
Me.lbxResults.RemoveItem 0
End Sub
It takes about 60 seconds per month to search the files. That’s a long time so it’s necessary to entertain the user while he waits. The top entry in the results listbox is whatever the current file is. It rapidly changes the display as it loops through the folder. When there’s a hit, that file becomes the second entry and any prior hits move down. This little animation lets the user know that it’s still working and gives him a list of what hits have been found already.
You can download SearchTextFiles.zip