Shell script - convert Excel (xlsx) to CSV - remove blank space / tab space Shell script - convert Excel (xlsx) to CSV - remove blank space / tab space unix unix

Shell script - convert Excel (xlsx) to CSV - remove blank space / tab space


Update: the desired platform for the script has been clarified and a response is no longer applicable. However, I will leave this response here in case a future viewer of this question stumbles upon it and finds it useful. Anyone writing a shell script in a Ubuntu language may be able to port over some aspects of this vbscript as well.

Here is something to get you started. If you record actions with Excel's macro recorder remember that using the same commands in a VBS means you have to get rid of all of the named parameters.

prep_xlsx.vbs

Set objExcel = WScript.CreateObject ("Excel.Application")objExcel.Visible = true 'False  'True for testingstrFileName = "c:\tmp\vbs_test.xlsx" set objWb = objExcel.WorkBooks.open(strFileName) set objWs = objWb.Worksheets(1)with objWs    with .cells(1, 1).CurrentRegion        .Cells.SpecialCells(4) = "**NewDATA**"  ' 4 is xlCellTypeBlanks        .Cells.RemoveDuplicates 2, 1                ' Columns:=2, Header:=xlYes        for c = 1 to .Columns.Count            with .columns(c)                .TextToColumns .Cells(1), 2 ', Array(0, 1)  'Range("C1"), DataType:=xlFixedWidth, FieldInfo:=Array(0, 1)            end with        next    'next c    end withend withobjWb.Close True   'save on closeobjExcel.QuitSet objExcel = Nothing

It should be noted that removing leading / trailing spaces with the Range.TextToColumns method with xlFixedWidth can attempt to split the column into two if there are too many leading spaces. Currently, this will halt the process as it will ask for confirmation on overwriting the next columns values (which you do not want to do). There has to be a significant number of spaces to have Excel guess that it belongs in two columns so unless there are more spaces than a typical word there is nothing to worry about; just something to be aware about. e.g. if there were twice as many leading spaces in D6, it might want to split across two columns.

    vbscript_before
        vbs_test.xlsx before prep_xlsx.vbs

    vbscript_after
        vbs_test.xlsx after prep_xlsx.vbs