(2017-05-01) Batch File Filtering

Yeah, it's been a long time again. I'm still working on a secret project, which is the primary reason I haven't updated this site.

Anyway. I had some client work I needed to sort out, which is why this post is about Windows batch files.

The need was to remove lines from a text file (specifically a CSV file) based on the contents of another. Both files were laid out the same way (in the format Name,Email,ID), and the secondary file essentially worked as a list of duplicates; I just needed to delete those entries from the primary list.

In bash, this would be pretty trivial, but batch files are significantly less... advanced? Regardless, it took a while to work out exactly how to read files line by line, and what I came up with is no doubt not optimal. Honestly, though, it does what it needs to do, and it was 1 AM by the time I'd sorted it out. It essentially works backwards: it copies the master file to the output file, then, for each line of the filter file, removes any matching lines from the output.

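I added a bit of extra functionality after finishing, so you can pass the files in as command line arguments (master file, filter file, then output file, defaulting to master.csv, filter.csv and output.csv), and it properly checks that both input files exist.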

@echo off
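rem keep variable changes local to this script
rem (delayed expansion is left off: nothing here needs it, and it would strip ! characters from the lines being read)
setlocal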

rem read command line arguments, stripping any surrounding quotes
if "%~1" == "" (
	set "LIST_TO_FILTER=master.csv"
) else (
	set "LIST_TO_FILTER=%~1"
)
if "%~2" == "" (
	set "LIST_TO_FILTER_FROM=filter.csv"
) else (
	set "LIST_TO_FILTER_FROM=%~2"
)
if "%~3" == "" (
	set "OUTPUT=output.csv"
) else (
	set "OUTPUT=%~3"
)

rem count variable
set /A COUNT=0

rem print files and wait for confirmation
echo.
echo File to Filter: %LIST_TO_FILTER%
echo Filter From: %LIST_TO_FILTER_FROM%
echo Final Output: %OUTPUT%
echo.
echo ---------------
echo.
echo Press any key to start...
@pause >nul 2>&1

rem check that both input files exist
if not exist "%LIST_TO_FILTER%" (
	echo "%LIST_TO_FILTER%" does not exist
	echo Press any key to abort...
	@pause >nul 2>&1
	exit /B 1
)
if not exist "%LIST_TO_FILTER_FROM%" (
	echo "%LIST_TO_FILTER_FROM%" does not exist
	echo Press any key to abort...
	@pause >nul 2>&1
	exit /B 1
)

rem copy LIST_TO_FILTER to OUTPUT, overwriting any previous output
@copy /y "%LIST_TO_FILTER%" "%OUTPUT%" >nul 2>&1

rem read LIST_TO_FILTER_FROM line by line
<nul set /p "=Filtering"
for /f "usebackq delims=" %%A in ("%LIST_TO_FILTER_FROM%") do (
	rem copy every line of OUTPUT that does not contain this filter line to OUTPUT.tmp
	rem note: find matches substrings, so a longer line containing this text is removed too
	find /v "%%~A" < "%OUTPUT%" > "%OUTPUT%.tmp"
	rem move OUTPUT.tmp back to OUTPUT
	@move /y "%OUTPUT%.tmp" "%OUTPUT%" >nul 2>&1
	rem count lines
	set /A COUNT+=1
	rem print indicator
	<nul set /p "=."
)

rem print final processing count and end
echo.
echo Done processing %COUNT% lines
echo.
echo Press any key to close...
@pause >nul 2>&1
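
A possible single-pass alternative, offered as a sketch only and not as what the script above does: findstr can read its search strings from a file with /G, so the whole loop could in principle collapse into one command. The file names below are just the same defaults used above, and the /I switch (case-insensitive matching) is an added assumption rather than something the original job required.

```batch
@echo off
setlocal

rem sketch only: same default file names as the script above
set "LIST_TO_FILTER=master.csv"
set "LIST_TO_FILTER_FROM=filter.csv"
set "OUTPUT=output.csv"

rem /V   keep only the lines that do NOT match
rem /X   a match has to cover the whole line, not just part of it
rem /L   treat the search strings literally, not as regular expressions
rem /I   ignore case (an assumption; drop it if case matters)
rem /G:  read the search strings from the filter file, one per line
findstr /V /X /L /I /G:"%LIST_TO_FILTER_FROM%" "%LIST_TO_FILTER%" > "%OUTPUT%"
```

Because of /X, this only removes lines that match a filter line in full, whereas the find-based loop also removes any longer line that merely contains the filter text. findstr has its share of quirks with unusual characters and very long search strings, so it would be worth comparing its output against the loop's on real data before relying on it.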