
Splitting bulk text files every n lines


for f in filename*.txt; do split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"; done

Or, written over multiple lines:

for f in filename*.txt
do
    split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"
done

How it works:

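  • -d tells split to use numeric suffixes.

  • -a1 tells split to use a suffix that is one character long. This allows at most ten pieces (0 through 9) per input file; use -a2 or more if a file will produce more pieces than that.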

  • -l10000 tells split to split every 10,000 lines.

  • --additional-suffix=.txt tells split to add .txt to the end of the names of the new files.

  • "$f" tells split the name of the file to split.

  • "${f%.txt}-" tells split the prefix name to use for the split files.
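
To see what the prefix expression produces, here is a minimal sketch of the ${f%.txt} expansion on its own. The file name is hypothetical and used only for illustration:

```shell
# Hypothetical file name, for illustration only.
f="filename1.txt"

# ${f%.txt} removes the shortest trailing match of ".txt" from $f;
# the "-" is then appended as the separator before split's numeric suffix.
prefix="${f%.txt}-"

echo "$prefix"   # prints: filename1-
```

split then appends its own suffix (0, 1, ...) and the additional suffix (.txt) to this prefix.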

Example

Suppose that we start with these files:

$ ls
filename1.txt  filename2.txt

Then we run our command:

$ for f in filename*.txt; do split -d -a1 -l10000 --additional-suffix=.txt "$f" "${f%.txt}-"; done

When this is done, we now have the original files and the new split files:

$ ls
filename1-0.txt  filename1-1.txt  filename1.txt  filename2-0.txt  filename2-1.txt  filename2.txt
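
If you want to check the result before running this on real data, the following is a small sketch that can be run in a scratch directory. It assumes GNU split (for --additional-suffix) and uses a made-up 25-line file with a chunk size of 10 instead of 10,000 to keep the demonstration short:

```shell
# Work in a scratch directory so nothing real is touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Build a 25-line sample file (made-up data).
seq 1 25 > filename1.txt

# Same options as above, but splitting every 10 lines.
split -d -a1 -l10 --additional-suffix=.txt filename1.txt filename1-

# Show the line count of each piece.
wc -l filename1-?.txt

# Concatenating the pieces in order should reproduce the original file.
cat filename1-?.txt | cmp - filename1.txt && echo "pieces match the original"
```

The last piece holds whatever lines are left over, so it is usually shorter than the others.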

Using older, less featureful forms of split

If your split does not offer --additional-suffix, then consider:

for f in filename*.txt
do
    split -d -a1 -l10000 "$f" "${f%.txt}-"
    for g in "${f%.txt}-"*
    do
        mv "$g" "$g.txt"
    done
done


There is no need for a shell loop; a single awk command handles all the files (here splitting every 1,000 lines and numbering the pieces from 1):

awk 'FNR%1000==1{if(FNR==1)c=0; close(out); out=FILENAME; sub(/\.txt$/,"-" (++c) ".txt",out)} {print > out}' *.txt
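
For readers who find the one-liner dense, here is a commented, multi-line sketch of the same idea. It assumes that sub() is meant to act on the variable out, and it adds a chunk-size variable n (an illustrative name, not part of the original command). The sample files are made up, and the chunk size is 10 to keep the demonstration small:

```shell
# Scratch directory with two made-up sample files.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
seq 1 25 > filename1.txt
seq 1 12 > filename2.txt

awk -v n=10 '
# FNR is the line number within the current input file, so this
# condition is true on the first line of every n-line chunk.
FNR % n == 1 {
    if (FNR == 1) c = 0          # new input file: restart the counter
    if (out != "") close(out)    # release the previous output file
    out = FILENAME
    # filename1.txt -> filename1-1.txt, filename1-2.txt, ...
    sub(/\.txt$/, "-" (++c) ".txt", out)
}
# Every line goes to whichever output file is current.
{ print > out }
' filename*.txt

ls
```

Closing each output file before opening the next keeps the number of open files small, which matters when there are many pieces. Note that the condition never fires when n is 1, so this sketch is only meant for chunk sizes of 2 or more.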