Hello everyone,
Although it seems easy, I've been stuck on this problem for a while now and I can't figure out a way to get it done.
My problem is the following:
I have a file where each line is a sequence of IP addresses, for example:
Line 1: 10.0.0.1 10.0.0.2
Line 2: 10.0.0.5 10.0.0.1 10.0.0.2
...
What I'd like to do is remove lines that are completely contained in other lines. In the previous example, "Line 1" would be deleted as it is contained in "Line 2".
So far, I've worked with Python and set() objects to get the job done, but I've got more than 100K lines and the set lookups are becoming more time consuming as the program goes on :/
Thanks for your help
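One way to avoid comparing every pair of lines is an inverted index from each address to the lines that contain it. Below is a minimal Python sketch of that idea, assuming the addresses on a line are whitespace-separated. The names `drop_contained_lines` and `holders` are made up for illustration, and exact duplicate lines are treated as contained in their first occurrence.

```python
from collections import defaultdict


def drop_contained_lines(lines):
    """Keep only lines whose address set is not contained in another line's set."""
    parsed = [(frozenset(line.split()), line) for line in lines if line.strip()]
    # Visit larger sets first: a line can only be contained in one at least as large.
    order = sorted(range(len(parsed)), key=lambda i: -len(parsed[i][0]))
    holders = defaultdict(set)  # address -> indexes of kept lines that contain it
    kept = []
    for i in order:
        addresses = parsed[i][0]
        # Intersect the kept lines holding each address; a survivor contains them all.
        candidates = None
        for address in addresses:
            found = holders.get(address, set())
            candidates = found if candidates is None else candidates & found
            if not candidates:
                break
        if candidates:
            continue  # some kept line already contains every address on this line
        for address in addresses:
            holders[address].add(i)
        kept.append(i)
    # Return the surviving lines in their original order.
    return [parsed[i][1] for i in sorted(kept)]
```

Each line is then only compared against kept lines that share its addresses, which may help when most lines have little overlap. How much it helps on the real 100K-line file would need measuring.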
I have a log file with a header (which I can skip with awk), and a footer, which I need to find a way to remove. The goal is to extract the middle lines from a file. Specifically, there is a header (1 line) and a footer (1 line).
The only way I can figure out how to do this is if I already know how many lines are in the file to begin with. For example, if the file looks like this:
line 1 (header)
line 2 (interesting line)
line 3 (interesting line)
line 4 (footer)
I just want to extract the middle "interesting lines" without the header/footer lines.
I can't use grep to remove the header/footer, because I don't know what those lines will contain, only that they exist and are exactly 1 line each. In general, I don't know how many lines are in the file.
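The line count does not need to be known in advance if one line is held back while reading: the held line is only printed once another line follows it, so the footer is never emitted. Here is a minimal Python sketch of that idea; `middle_lines` is a made-up name, and it assumes exactly one header line and one footer line as described above.

```python
def middle_lines(lines):
    """Yield every line except the first (header) and the last (footer)."""
    it = iter(lines)
    next(it, None)          # discard the header
    held = next(it, None)   # hold one line back
    for line in it:
        yield held          # safe to emit: something follows it, so it is not the footer
        held = line
    # The last held line is the footer and is deliberately dropped.
```

For a real file this could be called as `middle_lines(open("some.log"))`, where `some.log` is a placeholder name.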
I have a little bash script that cats out a file and tells me if there is a line
where the 11th column has more than 6 characters in it.
It emails me when there is a bad line in a file - bad meaning that it will break a
downstream process.
Anyhow, when I get the email saying that there is a bad file I just log in to the PC via
VPN and then I sed out of the file the lines that I get in the email. The bad lines are
always in danny.csv, not danny1.csv.
It has been the same lines killing the downstream process for a few weeks, so I put the "sed -i" commands into
the script and it does it automagically.
[CODE]
for i in danny.csv danny1.csv
do
cat /come/and/play/with/$i | perl -ne 'print if length((split /,/)[10]) > 6' | mail -s "danny.csv bad line" casper@casperr.com
done
# it would be nice to find a perl one-liner that changes the file in place
sed -i '/D,642,0642,UBF,EVL,,M,,S,S,FOREVER,213,213,/d' /come/and/play/with/us/danny.csv
sed -i '/D,642,0642,UBF,EVL,,M,,S,S,QSP-U=C,4,4,/d' /come/and/play/with/us/danny.csv
[CODE]
However, when a new bad line gets put into this file, I am going to have to log in and take out the line.
So I have been trying to write a perl one-liner that will edit the file in place, like sed, and make a
backup of the file. I just need a perl one-liner that will delete any line where the 11th column has more
than 6 characters in it.
[CODE]
perl -p -i.bak -e 's/\,\w{7}\,//g'   # which does not work
[CODE]
I tried something like this:
[CODE]
perl -nle 'print if /\,\w{7}\,/' /come/and/play/with/us/danny.csv
[CODE]
but that does not catch the QSP-U=C line and it catches more lines than just the
FOREVER one. For a solution I need to focus on the 11th column.
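The same test the emailing step uses (split on commas, look at the 11th field) can also drive the deletion. This is a rough Python sketch rather than a perl one-liner. `keep_line` and `clean_in_place` are made-up names, and the plain comma split assumes no quoted commas inside fields, which is the same assumption the perl `split /,/` makes.

```python
import shutil


def keep_line(line, column=11, max_len=6):
    """Return True unless the given 1-based column is longer than max_len."""
    fields = line.rstrip("\r\n").split(",")
    # Lines with fewer columns than `column` are kept untouched.
    return len(fields) < column or len(fields[column - 1]) <= max_len


def clean_in_place(path, backup_suffix=".bak"):
    """Rewrite path without the bad lines, keeping a copy as path + backup_suffix."""
    backup = path + backup_suffix
    shutil.copy2(path, backup)  # backup first, similar in spirit to perl -i.bak
    with open(backup) as src, open(path, "w") as dst:
        dst.writelines(line for line in src if keep_line(line))
```

Because the check looks at the length of the field rather than at `\w` characters, values such as QSP-U=C are caught as well.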
I want to append lines after a match in a file.
##file name is ssl.conf
##match is this
<Directory "/var/www/cgi-bin">
SSLOptions +StdEnvVars
</Directory>
After the above block I need to append these lines:
<Directory "/">
SSLRenegBufferSize 26215000
</Directory>
So the final result should look like this:
<Directory "/var/www/cgi-bin">
SSLOptions +StdEnvVars
</Directory>
<Directory "/">
SSLRenegBufferSize 26215000
</Directory>
######Thank You in Advance
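One possible approach is to walk the file line by line, remember when the opening `<Directory "/var/www/cgi-bin">` line has been seen, and emit the new block right after the next `</Directory>`. Here is a minimal Python sketch of that. `append_after_directory` and `ADDITION` are made-up names, and lines are compared after stripping whitespace, which assumes the tags sit on lines of their own.

```python
ADDITION = [
    '<Directory "/">',
    "SSLRenegBufferSize 26215000",
    "</Directory>",
]


def append_after_directory(lines, opening='<Directory "/var/www/cgi-bin">',
                           addition=ADDITION):
    """Return lines with `addition` inserted after the block started by `opening`."""
    out = []
    inside = False
    for line in lines:
        out.append(line)
        stripped = line.strip()
        if stripped == opening:
            inside = True              # now inside the block we are looking for
        elif inside and stripped == "</Directory>":
            out.extend(addition)       # block just closed: append the new lines
            inside = False
    return out
```

The function works on lines without trailing newlines, for example the result of `text.splitlines()` on the contents of ssl.conf.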
Hello,
I have a flat file where I expect to have 5 values delimited by 4 commas (","):
agf,sdya,geg,fgd,gdfgr
but sometimes I have the following:
agf,sdya,geg,fgd
agf,sdya,geg
agf,sdya
agf
For those lines, I want to append commas at the end of the line in order to always have a total of 4 commas:
agf,sdya,geg,fgd,
agf,sdya,geg,,
agf,sdya,,,
agf,,,,
The following awk command already gives me the number of "," on each line:
awk -F\, '{print NF-1}' "MyFile"
But I am not sure where to go from there.
Basically I want to do the following:
If CommaCount For CurrentLine != 4 Then Append (4 - "number of commas found") Commas To Line
Else CheckNextLine.
Thanks for your help !
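The pseudocode above amounts to padding each line with empty fields until it has 5. Here is a minimal Python sketch of it. `pad_fields` is a made-up name, and lines that already have 5 or more fields are left unchanged.

```python
def pad_fields(line, expected=5):
    """Append empty fields so the line has at least `expected` comma-separated values."""
    fields = line.split(",")
    # A negative count yields an empty list, so longer lines are left as they are.
    fields += [""] * (expected - len(fields))
    return ",".join(fields)
```

For example, `pad_fields("agf,sdya,geg")` gives `agf,sdya,geg,,`, matching the third sample line above.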
I have two files which have exactly the same number of lines.
I want the first line of the first file to be the filename of a new file, and the content of this new file to be the first line of the second file.
Then the second line of the first file should be the filename of another new file, whose content is the second line of the second file.
Then the third line of the first file should be the filename of yet another new file, whose content is the third line of the second file,
and so on...
I am trying to do it using a for loop, but I cannot work out how to loop over both lists at the same time.
This is what I have done
Code:
IFS=$'\n'
var=$(sed 's/\"http\(.*\)\/\(.*\).wav\"\,\".*/\2/g' 1797.csv) # filenames of all files
var2=$(sed 's/\"http\(.*\)\/\(.*\).wav\"\,\"\(.*\)\"$/\3/g' 1797.csv) # contents of all files
for j in $var;
do
#Here I do not know how to use $var2
done
Please help.
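What is being asked for is stepping through both lists in lockstep rather than nesting two loops. Here is a rough Python sketch using `zip`, assuming the names and the contents are already in two separate files with one entry per line. `pair_lines`, `write_files` and both path arguments are made-up names.

```python
from pathlib import Path


def pair_lines(names, contents):
    """Pair the n-th name with the n-th content line, skipping blank names."""
    pairs = []
    for name, content in zip(names, contents):
        name = name.strip()
        if name:
            pairs.append((name, content.rstrip("\n")))
    return pairs


def write_files(names_path, contents_path, out_dir="."):
    """Create one file per line of names_path, filled from contents_path."""
    with open(names_path) as names, open(contents_path) as contents:
        for name, content in pair_lines(names, contents):
            (Path(out_dir) / name).write_text(content + "\n")
```

`zip` stops at the shorter input, so if the two files ever differ in length the extra lines are silently ignored.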
Hello
I have a text file which has blocks like
Code:
dir1/dir2/dir3/name_run_number1:
line1_run_number1_part1
line2_run_number1_part2
line3_run_number1_part3...
Each block is separated by a blank line, and there is a ":" at the end of the "header" of each one. Every line in a block carries the same "number1" after "run_".
What I want to do is, for each block, extract the "number1" shown in the first line and then, for the lines below, count from 1 to 20 and print a message if a "partX" line is missing. Any bash or python would be fine.
Thanks
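Here is a possible Python sketch for this. It assumes the header line ends in `_run_<number>:` and that each following line contains `_part<N>`. Both regular expressions are guesses based on the sample block, and `missing_parts` is a made-up name.

```python
import re


def missing_parts(text, expected=range(1, 21)):
    """Map each block's run number to the list of part numbers that are missing."""
    report = {}
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if not lines:
            continue
        header = re.search(r"_run_([^_:/\s]+):\s*$", lines[0])
        if not header:
            continue  # not a block header in the assumed format
        seen = set()
        for line in lines[1:]:
            part = re.search(r"_part(\d+)", line)
            if part:
                seen.add(int(part.group(1)))
        report[header.group(1)] = [n for n in expected if n not in seen]
    return report
```

A message could then be printed for every run whose list is not empty, for example `run 7: missing parts [2]`.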
I am trying to process some .csv files with Linux as follows:
Some fields have data with newline characters embedded, like so:
"Bob Smith
531 Pennsylvania Avenue
Washington, DC"
(I verified the existence of the " via Wordpad. The file is too large to easily edit in Wordpad to get all the data for each row on a single line).
What Linux command would I use on the files to get the data in each cell on one line?
I have tried:
1. awk -v RS="" '{gsub (/\n/,"")}1' file > newfile
but the cell data was still being read in as if "531 Pennsylvania Avenue" was a brand new row in the CSV file.
2. Command 1 followed by awk -v RS="" '{gsub (/\r/,"")}1' newfile > finalFile
but that resulted in all of the data in the file being put onto a single line.
3. awk -v RS="" '{gsub (/\r\n/,"")}1' file > newFile
But that result was the same as attempt number 2.
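None of the awk attempts above know that a newline inside double quotes belongs to the field, so a CSV-aware parser may be easier than a regular expression. Here is a minimal sketch using Python's standard `csv` module, which handles quoted fields that span lines. `flatten_rows` and `flatten_text` are made-up names, and replacing each embedded line break with a single space is an assumption about the desired result.

```python
import csv
import io


def flatten_rows(rows):
    """Replace line breaks inside every field with a single space."""
    for row in rows:
        yield [" ".join(field.splitlines()) for field in row]


def flatten_text(text):
    """Parse CSV text and return it with one record per physical line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # newline="" leaves line endings untouched so the csv module can interpret them.
    writer.writerows(flatten_rows(csv.reader(io.StringIO(text, newline=""))))
    return out.getvalue()
```

For a large file the same `csv.reader` and `csv.writer` calls can be pointed at files opened with `newline=""`, so the data is processed row by row instead of being loaded at once.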
How can I preprocess the file so that:
"Bob Smith
531 Pennsylvania Avenue
Washington, DC"
is read as a single field on a single line as part of the row it should be associated with, like
"Bob Smith 531 Pennsylvania Avenue Washington, DC"
Hi!
I have one file that looks like this
Code:
>Unc14086
AGAGUUUGAU CCUGGCUCAG AAUCAACGCU GGCGGCGUGC CUAACACAUG
CAAGUCGAAC GAGAAAGUGG AGCAAUCCAU GAGUAAAGUG GCGCACGGGU
GAGUAACACG UGACUAACCU ACCCUUGAGU GGGGGAUAAC UGAGGGAAAC
>Unc35443
GCACGAGAAA GUGGAGCAAU CCAUGAGUAA AGUGGCGUAC GGGUGAGUAA
CACGUGACUA ACCUACCCUC GAGUGGGGAA UAACUUCGGG AAACCGGAGC
UAAUACCGCA UAACACCUAC GGGUCAAAGG AGCAAUUCGC UUGAGGAGGG
So, after every n lines (n may vary) there is a line starting with ">", which marks the beginning of a new block of information.
I have another tab-delimited file:
Code:
Unc14086 InformationalTextExample
Unc35443 InformationalTextExampleII
My goal is to look up the IDs found in the lines starting with ">" in the first file in the second file. Whenever a matching pair occurs, I want to append the matching text (e.g. "InformationalTextExample") to that line, possibly separated by "_":
Code:
>Unc14086_InformationalTextExample
AGAGUUUGAU CCUGGCUCAG AAUCAACGCU GGCGGCGUGC CUAACACAUG
CAAGUCGAAC GAGAAAGUGG AGCAAUCCAU GAGUAAAGUG GCGCACGGGU
GAGUAACACG UGACUAACCU ACCCUUGAGU GGGGGAUAAC UGAGGGAAAC
>Unc35443_InformationalTextExampleII
GCACGAGAAA GUGGAGCAAU CCAUGAGUAA AGUGGCGUAC GGGUGAGUAA
CACGUGACUA ACCUACCCUC GAGUGGGGAA UAACUUCGGG AAACCGGAGC
UAAUACCGCA UAACACCUAC GGGUCAAAGG AGCAAUUCGC UUGAGGAGGG
How would that be possible?
Thank you!
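One way this could be done is to load the tab-delimited file into a dictionary and then rewrite only the ">" lines of the first file. Here is a minimal Python sketch under those assumptions. `annotate_headers` is a made-up name, IDs are matched exactly, and headers without a match are left unchanged.

```python
def annotate_headers(fasta_lines, mapping_lines, sep="_"):
    """Yield the FASTA lines with matching header IDs extended by their text."""
    mapping = {}
    for line in mapping_lines:
        parts = line.rstrip("\r\n").split("\t", 1)  # ID <tab> text
        if len(parts) == 2:
            mapping[parts[0]] = parts[1]
    for line in fasta_lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            key = line[1:].strip()
            if key in mapping:
                line = ">" + key + sep + mapping[key]
        yield line  # sequence lines pass through unchanged
```

The lines are yielded without trailing newlines, so they can be written back out with `print` or joined with "\n".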
hi,
I am a newbie in Linux shell scripting. Can anybody help me check for the presence of a file identified by a variable in a shell script?
For example: I am reading the content of a file using a while loop as below:
"while read -r line
do
code block
done < file_name"
Now in this case every line in the file gets stored in the variable 'line' one by one. The problem here is that every line in the file is the file path of another file, say xyz.txt, and I am checking for the presence of this xyz.txt file using the command below:
if [-f $line]
as 'line' is the variable which stores the file path of xyz.txt, but it is not working. It is unable to check for the presence of this xyz.txt file when I address it through the variable 'line'.
Please help me. Thanks in advance.
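In the shell, `[` is a command, so it needs spaces around its arguments, and quoting the variable protects paths containing spaces: `[ -f "$line" ]`. For comparison, here is the same check as a small Python sketch. `check_paths` is a made-up name, and the input is assumed to be one path per line.

```python
import os


def check_paths(lines):
    """Return (path, exists) pairs for every non-empty line."""
    results = []
    for line in lines:
        path = line.rstrip("\r\n")  # drop the line ending before testing the path
        if path:
            results.append((path, os.path.isfile(path)))
    return results
```

It could be called as `check_paths(open("file_name"))`, with `file_name` being the list file from the loop above.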
Hello All,
I would like to know if there is a better way to achieve the below task:
Task: Copy lines 10 to 50 from a file that contains 100 lines.
My approach:
1st) Use the head command to list the first 50 lines and then tail to keep the last 41 of those (lines 10 to 50 inclusive).
2nd) Do a while loop with a counter: when the counter reaches 10, start writing lines to another file, and exit once the counter passes 50.
I would like to know if there is any other simple and straightforward approach with which I can directly copy lines 10 to 50.
Help and suggestions on this question are greatly appreciated.
Thanks.
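The counter approach can be expressed without an explicit loop. Here is a minimal Python sketch using `itertools.islice`; `line_range` is a made-up name and both bounds are 1-based and inclusive.

```python
from itertools import islice


def line_range(lines, first, last):
    """Return an iterator over lines first..last (1-based, inclusive)."""
    # islice counts from 0 and excludes the stop index, hence first - 1 and last.
    return islice(lines, first - 1, last)
```

Reading stops after line `last`, so the rest of the file is never consumed. For example, `line_range(open("input.txt"), 10, 50)` yields 41 lines, where `input.txt` is a placeholder name.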