I am trying to process some .csv files on Linux.
Some fields contain embedded newline characters, like so:
"Bob Smith
531 Pennsylvania Avenue
Washington, DC"
(I verified that the " characters are present using WordPad. The file is too large to edit by hand in WordPad to get all the data for each row onto a single line.)
What Linux command can I use on the files to get the data in each cell onto one line?
I have tried:
1. awk -v RS="" '{gsub (/\n/,"")}1' file > newfile
but the cell data was still being read in as if "531 Pennsylvania Avenue" were a brand new row in the CSV file.
2. Command 1 followed by awk -v RS="" '{gsub (/\r/,"")}1' newfile > finalFile
but that resulted in all of the data in the file being put onto a single line.
3. awk -v RS="" '{gsub (/\r\n/,"")}1' file > newFile
but the result was the same as in attempt 2.
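The results of attempts 2 and 3 would be consistent with the file using Windows (CRLF) line endings, though I have not confirmed that. A minimal sketch of a check, where count_crlf_lines is a hypothetical helper name and the input is read from standard input:

```shell
# Hypothetical helper: print how many input lines end in a carriage return.
# A non-zero count suggests the file uses CRLF (Windows) line endings.
count_crlf_lines() {
  awk '/\r$/ { n++ } END { print n + 0 }'
}
```

It could be run as count_crlf_lines < file.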
How can I preprocess the file so that:
"Bob Smith
531 Pennsylvania Avenue
Washington, DC"
is read as a single field on a single line as part of the row it should be associated with, like
"Bob Smith 531 Pennsylvania Avenue Washington, DC"