Question
Friday, February 23, 2007 3:55 AM
Hi experts
I'm looking for help to split a large text file into multiple text files based on size of the File.
Can anybody help me in this?
All replies (12)
Friday, February 23, 2007 5:16 AM | 1 vote
Some more information about your requirements would help. What exactly do you mean when you say "based on size of the File"? If you wanted each file to be a maximum of a certain size you could do something like:
string sourceFileName = @"C:\VS2005 SP1.exe";
string destFileLocation = @"C:\";
int index = 0;
long maxFileSize = 52428800;
byte[] buffer = new byte[65536];
using (Stream source = File.OpenRead(sourceFileName))
{
    while (source.Position < source.Length)
    {
        index++;
        // Create a new sub-file, and copy into it
        string newFileName = Path.Combine(destFileLocation, Path.GetFileNameWithoutExtension(sourceFileName));
        newFileName += index.ToString() + Path.GetExtension(sourceFileName);
        using (Stream destination = File.OpenWrite(newFileName))
        {
            while (destination.Position < maxFileSize)
            {
                // Work out how many bytes to read without overshooting the size limit
                int bytesToRead = (int) Math.Min(maxFileSize - destination.Position, buffer.Length);
                int bytes = source.Read(buffer, 0, bytesToRead);
                destination.Write(buffer, 0, bytes);
                // Are we at the end of the source file?
                if (bytes < bytesToRead)
                {
                    break;
                }
            }
        }
    }
}
Friday, February 23, 2007 5:21 AM
Thanks for the reply, Sean Hederman.
Suppose I have a text (.txt) file larger than 500KB; I want to split it into multiple files.
Ex :
temp.txt is file with 1209KB size
Now the result should be
temp1.txt
temp2.txt
temp3.txt
Friday, February 23, 2007 5:37 AM
But what size should each file be? Whatever that size is, set maxFileSize to it and run my code, and it will automatically split the file into the directory specified in destFileLocation.
Friday, February 23, 2007 8:14 AM
Thank you, I'll try that.
Friday, February 23, 2007 8:55 AM
Can you tell me what the maxFileSize field is here?
I want to split at 500KB per file.
If the main file exceeds 500KB, then I want to split it.
Friday, February 23, 2007 10:19 AM
Well, 500KB is 500x1024 bytes which means the maximum file size should be 512000.
Friday, February 23, 2007 10:56 AM
Thank you, got it.
But I'm reading lines from the text file.
This method splits a line in the middle and places the truncated part in the other file.
Friday, February 23, 2007 11:12 AM
It would have been useful to know upfront that you needed intact lines.
Basically, you'd rewrite it to use StreamReader and ReadLine. Then, for each line, you'd decide whether writing it would take the file beyond its maximum size: if it wouldn't, write it to the current file; if it would, start a new file.
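A rough sketch of that line-aware approach (the source path and the 500KB limit are assumptions carried over from earlier in the thread, not part of Sean's original code):

```csharp
using System;
using System.IO;
using System.Text;

class LineSplitter
{
    static void Main()
    {
        string sourceFileName = @"C:\temp.txt"; // assumed path, from the example above
        long maxFileSize = 512000;              // 500KB, as discussed above
        int index = 0;
        long written = 0;
        StreamWriter writer = null;

        using (StreamReader reader = new StreamReader(sourceFileName))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Size of this line in bytes, including its line terminator
                int lineBytes = Encoding.UTF8.GetByteCount(line) + Environment.NewLine.Length;

                // Start a new part if this line would push the current file past the limit
                if (writer == null || written + lineBytes > maxFileSize)
                {
                    if (writer != null) writer.Dispose();
                    index++;
                    string partName = Path.Combine(
                        Path.GetDirectoryName(sourceFileName),
                        Path.GetFileNameWithoutExtension(sourceFileName) + index +
                        Path.GetExtension(sourceFileName));
                    writer = new StreamWriter(partName);
                    written = 0;
                }

                writer.WriteLine(line);
                written += lineBytes;
            }
        }
        if (writer != null) writer.Dispose();
    }
}
```

With temp.txt at 1209KB this would produce temp1.txt, temp2.txt, temp3.txt, each at most 500KB, with no line split across files.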
Tuesday, July 12, 2011 4:02 AM
I already split it, but it takes time; my file size is more than 2-3GB. Even the split takes time, before I've even read the data and inserted it into the DB. Any suggestion on reading a CSV file and storing it in a MySQL database in an efficient/fast way? Please help. Thanks.
Tuesday, July 12, 2011 8:44 AM | 1 vote
Hi,
Steps to achieve the goal:
- Find the size of temp.txt in KB and divide it by 500. This tells you how many files you need to split into. For example, 1209/500 = 2.41, so you will need 3 files.
- Create a StringBuilder and start reading line by line using ReadLine of StreamReader.
- After reading each line, calculate the size of the StringBuilder contents in KB. If it is < 500, then continue reading and storing. If it goes over 500, remove the last line from the StringBuilder.
- Write the StringBuilder contents to the next numbered file.
- Continue reading lines until you reach EOF, saving to files as you go. Repeat steps 3 and 4.
Step 3 can also be done another way:
StringBuilder sb = new StringBuilder();
string line = streamReader.ReadLine();
int sbBytes = Encoding.UTF8.GetByteCount(sb.ToString()); // byte size of sb
int lineBytes = Encoding.UTF8.GetByteCount(line);        // byte size of line
if ((sbBytes + lineBytes) <= maxBytes)
    sb.Append(line);
else
    // write sb to the current file and start a new one
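The accumulate-and-flush idea above could be fleshed out roughly as follows (the file paths and the 500KB budget are assumptions from the example earlier in the thread; in C# the StringBuffer counterpart is StringBuilder):

```csharp
using System;
using System.IO;
using System.Text;

class BufferSplit
{
    static void Main()
    {
        string sourceFileName = @"C:\temp.txt"; // assumed, from the example above
        int maxBytes = 512000;                  // 500KB per part
        StringBuilder sb = new StringBuilder();
        int sbBytes = 0, part = 0;

        foreach (string line in File.ReadLines(sourceFileName))
        {
            int lineBytes = Encoding.UTF8.GetByteCount(line) + Environment.NewLine.Length;

            // Flush the buffer to the next numbered part before it overflows
            if (sbBytes + lineBytes > maxBytes && sb.Length > 0)
            {
                part++;
                File.WriteAllText(@"C:\temp" + part + ".txt", sb.ToString());
                sb.Clear();
                sbBytes = 0;
            }

            sb.AppendLine(line);
            sbBytes += lineBytes;
        }

        // Write whatever is left in the buffer as the final part
        if (sb.Length > 0)
        {
            part++;
            File.WriteAllText(@"C:\temp" + part + ".txt", sb.ToString());
        }
    }
}
```

Tracking the running byte count in a variable avoids re-measuring the whole StringBuilder on every line, which matters for large files.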
Hope this helps. If you have any concerns, feel free to ask.
Thanks
If you find any answer helpful, then click "Vote As Helpful" and if it also solves your question then also click "Mark As Answer".
Tuesday, July 12, 2011 10:20 AM
Hi!
Can you please test this code:
Public Function SplitFile(ByVal Filename As String, ByVal RecordsToRead As Integer, ByVal Parts As Integer) As Boolean
    Dim data() As String = IO.File.ReadAllLines(Filename)
    If (Parts * RecordsToRead <= data.Length) Then
        Dim portion(RecordsToRead - 1) As String
        For i As Integer = 0 To Parts - 1
            ' Copy the next block of lines and write it to a numbered file
            Array.ConstrainedCopy(data, RecordsToRead * i, portion, 0, RecordsToRead)
            IO.File.WriteAllText(Filename.Replace(".", i + 1 & "."), String.Join(vbCrLf, portion))
        Next
    Else
        Return False
    End If
    Return True
End Function
Here 'Filename' is the name of the file, 'RecordsToRead' is the number of records you want to read from the file into each new file, and 'Parts' is how many files you want to create. This may shed some light on your issue.
regards,
Shahan
Friday, September 28, 2012 1:22 PM
Thank you so much! With your code and some others, I came up with the following solution. I have added a link at the bottom to some code I wrote that used some of the logic from this page. I figured I'd give honor where honor was due. Thanks!
Below is an explanation of what I needed:
I wrote this because I have some very large '|'-delimited files that have \r\n inside some of the columns, and I needed to use \r\n as the end-of-line delimiter. I was trying to import some files using SSIS packages, but corrupted data in the files prevented it. The file was over 5GB, so it was too large to open and fix manually. I looked through lots of forums to understand how streams work and ended up with a solution that reads each character in the file and emits lines based on the definitions I give it. It's for use in a command-line application, complete with help :). I hope this helps some other people out; I haven't found quite the same solution anywhere else, although the ideas were inspired by this forum and others.