Splitting bulk text files every n lines

linux shell unix awk cygwin

for f in filename*.txt; do split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"; done

Or, written over multiple lines:

for f in filename*.txt
do
    split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"
done

How it works:

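-d tells split to use numeric suffixes.
-a1 tells split to use a suffix that is one character long. With -d, that allows at most 10 pieces (0 through 9) per input file; use -a2 or more if a file may need more pieces.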
-l10000 tells split to split every 10,000 lines.
--additional-suffix=.txt tells split to add .txt to the end of the names of the new files.
"$f" tells split the name of the file to split.
"${f%.txt}-" tells split the prefix name to use for the split files.

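To make the prefix argument concrete, here is a minimal sketch of what the parameter expansion produces. The file name is a hypothetical example, and no files are read or written.

```shell
# Hypothetical file name, used only for illustration.
f="filename1.txt"

# ${f%.txt} removes a trailing ".txt" from the value of $f.
prefix="${f%.txt}-"
echo "$prefix"          # filename1-

# With -d -a1 and --additional-suffix=.txt, split appends one digit
# and then ".txt" to this prefix, so the first piece would be named:
first_piece="${prefix}0.txt"
echo "$first_piece"     # filename1-0.txt
```
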
Suppose that we start with these files:

$ ls
filename1.txt  filename2.txt

Then we run our command:

$ for f in filename*.txt; do split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"; done

When this is done, we now have the original files and the new split files:

$ ls
filename1-0.txt  filename1-1.txt  filename1.txt  filename2-0.txt  filename2-1.txt  filename2.txt
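
To try the loop safely before running it on real data, something like the following sketch may help. It assumes GNU split (for -d and --additional-suffix), works in a throwaway directory, and uses a made-up 25-line input with -l10 in place of -l10000 so the result is easy to inspect.

```shell
# Work in a temporary directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical input: one file containing the numbers 1 to 25, one per line.
seq 1 25 > filename1.txt

# Same loop as in the answer, but splitting every 10 lines.
for f in filename*.txt; do
    split -d -a1 -l10 --additional-suffix=.txt "$f" "${f%.txt}-"
done

# Show the pieces and their line counts.
wc -l filename1-*.txt
```

With this input the pieces should be filename1-0.txt and filename1-1.txt with 10 lines each, and filename1-2.txt with the remaining 5.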

If your split does not offer --additional-suffix, then consider:

for f in filename*.txt
do
    split -d -a1 -l10000 "$f" "${f%.txt}-"
    for g in "${f%.txt}-"*
    do
        mv "$g" "$g.txt"
    done
done


No shell loop is needed; a single awk command handles all the files:

awk 'FNR%1000==1{if(FNR==1)c=0; close(out); out=FILENAME; sub(/\.txt$/,"-" (++c) ".txt",out)} {print > out}' *.txt
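
For readers who want to see the awk approach step by step, here is a commented sketch of the same idea. The variable n (chunk size), the temporary directory, and the two sample files are assumptions made for illustration; n is set to 10 so the output is small.

```shell
# Work in a temporary directory with two hypothetical input files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
seq 1 25 > filename1.txt
seq 1 12 > filename2.txt

awk -v n=10 '
# FNR is the line number within the current file, so this rule
# fires on lines 1, n+1, 2n+1, ... of each input file (n > 1).
FNR % n == 1 {
    if (FNR == 1) c = 0             # new input file: restart the piece counter
    if (out != "") close(out)       # finish the previous piece
    out = FILENAME                  # e.g. filename1.txt
    sub(/\.txt$/, "-" (++c) ".txt", out)   # e.g. filename1-1.txt
}
# Every line goes to the current piece.
{ print > out }
' filename1.txt filename2.txt

wc -l filename*-*.txt
```

Note that the pieces are numbered from 1 here, whereas split -d numbers them from 0. Closing each piece before opening the next keeps the number of open files small when there are many pieces.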

CodeHunter