Hi,

I have only been learning Perl for less than 2 weeks. I am a C++ programmer.
I have attached a portion of the data below. The data is in file1.txt. I would like to copy the data from file1.txt to file2.txt, but I only want to keep the numeric columns.

Eg: I want row 1 to look like this:
1       1549367 11     8       3      11      0       -12.00  6.00    -0.25   -3.00   0.00    -1.67   -12.00  6.00    -0.64


Instead of this:

1       Chr26   1549367 11      GGGGGGGAAGA     8       3       Transition      11      0       -12.00  6.00    -0.25   -3.00   0.00    -1.67   -12.00  6.00    -0.64


This is what I have done so far (file1.txt will be in @ARGV):

open FILE2, "+>file2.txt" or die "Cant not open file2.txt!";
my $line;
while($line = readline(ARGV))
{
        print FILE2 $line;
}


The code above only copies the content of file1.txt (ARGV) into file2.txt.
I tried to use ‘seek’ and ‘tell()’ to solve the problem above, but I got confused :(

I also tried this:

open(FILE, "file1.txt");
@theFile = <FILE>;


This puts every row in the array @theFile. But how can I now modify the elements of one row? (I’m still a novice Perl programmer)

Thank you for your help

/………………………………………………………………………………………../
The file portion

1       Chr26   1549367 11      GGGGGGGAAGA     8       3       Transition      11      0       -12.00  6.00    -0.25   -3.00   0.00    -1.67   -12.00  6.00    -0.64
1       Chr26   1549501 15      ccCctctccccctCC 12      3       Transition      3       12      -17.00  6.00    0.50    1.00    6.00    2.67    -17.00  6.00    0.93
1       Chr26   1549552 14      AagAAaaAAAagga  11      3       Transition      6       8       -31.00  6.00    -2.09   -12.00  3.00    -5.67   -31.00  6.00    -2.86
1       Chr26   1549563 14      tAAaaAAAattat^Ft        9       5       Transversion    5       9       -7.00   6.00    0.22    -64.00  4.00    -18.40  -64.00  6.00    -6.43
1       Chr26   1549726 14      TtTtctTtTtTTTT  13      1       Transition      8       6       -3.00   6.00    1.92    6.00    6.00    6.00    -3.00   6.00    2.21
2       Chr26   1549737 16      T+1Atttt+1aT+1At+1aTt+1aT+1AT+1AT+1AT+1AtT+1A^FA        15      11      Transversion    16      10      -64.00  6.00    -35.67  -64.00  6.00    -46.18  -64.00  6.00    -40.12
2       Chr26   1549815 9       CtCTTTTTT       7       2       Transition      8       1       -3.00   6.00    -0.14   -9.00   0.00    -4.50   -9.00   6.00    -1.11
1       Chr26   1549914 12      gGGGGGGGAGgg    11      1       Transition      9       3       -9.00   6.00    1.18    -4.00   -4.00   -4.00   -9.00   6.00    0.75
1       Chr26   1550018
Updated 19-Oct-11 4:55am

1 solution

You could do this in a couple of different ways.

The best approach would be to write a regular expression that cleans the data the way you want it. Another option is to use one of the techniques below and then add a check for non-numeric characters in each column. I found this in the Perl FAQ (http://perldoc.perl.org):


How do I extract selected columns from a string?
(contributed by brian d foy)
If you know the columns that contain the data, you can use substr to extract a single column.
PERL
my $column = substr( $line, $start_column, $length );
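As a small illustration of substr on a fixed-width record: the sample string and offsets below are made up for the example and are not the real layout of file1.txt. Offsets are zero-based.

```perl
use strict;
use warnings;

# Hypothetical fixed-width record: 8 characters for the name, then the position.
my $line = "Chr26   1549367";

my $chrom = substr( $line, 0, 5 );   # 'Chr26'
my $pos   = substr( $line, 8, 7 );   # '1549367'

print "$chrom $pos\n";
```

This only works when every row uses the same column widths, which does not appear to be the case for the data in the question.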

You can use split if the columns are separated by whitespace or some other delimiter, as long as whitespace or the delimiter cannot appear as part of the data.
PERL
my $line    = ' fred barney   betty   ';
my @columns = split /\s+/, $line;
    # ( '', 'fred', 'barney', 'betty' );
my $line = 'fred||barney||betty';
my @columns = split /\|/, $line;
    # ( 'fred', '', 'barney', '', 'betty' );

If you want to work with comma-separated values, don't do this since that format is a bit more complicated. Use one of the modules that handle that format, such as Text::CSV , Text::CSV_XS , or Text::CSV_PP .
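Applied to the data in the question, the split approach could look roughly like the sketch below. It assumes the columns are separated by spaces or tabs and that a column counts as numeric when it is an optional sign, digits, and an optional decimal part. The sub name numeric_fields is made up for this example.

```perl
use strict;
use warnings;

# Return only the fields of a whitespace-separated line that look like numbers.
sub numeric_fields {
    my ($line) = @_;

    # split ' ' skips leading whitespace, so no empty first field is produced
    # (unlike split /\s+/).
    my @fields = split ' ', $line;

    # Keep fields made of an optional sign, digits, and an optional decimal part.
    return grep { /^[-+]?\d+(?:\.\d+)?$/ } @fields;
}

my $row = "1  Chr26  1549367  11  GGGGGGGAAGA  8  3  Transition  11  0  -12.00  6.00  -0.25";
print join( "\t", numeric_fields($row) ), "\n";
```

Text columns such as Chr26, the base string, and Transition are dropped because they contain characters other than digits, a sign, or a decimal point.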

If you want to break apart an entire line of fixed columns, you can use unpack with the A (ASCII) format. By using a number after the format specifier, you can denote the column width. See the pack and unpack entries in perlfunc for more details.
PERL
# unpack takes the template first, then the string to unpack
my @fields = unpack( "A8 A8 A8 A16 A4", $line );

Note that spaces in the format argument to unpack do not denote literal spaces. If you have space separated data, you may want split instead.
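A minimal sketch of unpack on a fixed-width record follows. The record and the widths (8, 8, 4) are invented for illustration. The A format strips trailing spaces from each field.

```perl
use strict;
use warnings;

# Build a hypothetical fixed-width record: two 8-character columns and one 4-character column.
my $record = sprintf "%-8s%-8s%-4s", "Chr26", "1549367", "11";

# Template first, then the string; each A<n> consumes n characters.
my @fields = unpack( "A8 A8 A4", $record );

print join( "|", @fields ), "\n";
```

Because the rows in the question have columns of varying width, split is probably the better fit there.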


I don't have time at the moment to write the regex for you (that would be my primary choice) or to adapt the code above to fit your question completely, but hopefully this gets you on the right path. I'll update my answer if I get a chance in the next couple of hours.
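One possible shape for the regex approach is sketched below. It is an assumption about what "clean the data" should mean here: it pulls out every whitespace-delimited token that is a plain signed integer or decimal. The sub name extract_numbers is hypothetical.

```perl
use strict;
use warnings;

# Return every token in $line that is a number standing on its own,
# i.e. not glued to other non-space characters.
sub extract_numbers {
    my ($line) = @_;

    # (?<!\S) and (?!\S) require whitespace (or the string boundary) on both
    # sides, so the digits inside 'Chr26' or 'T+1A' are not picked up.
    return $line =~ /(?<!\S)[-+]?\d+(?:\.\d+)?(?!\S)/g;
}

my @numbers = extract_numbers("1\tChr26\t1549737\t16\tT+1Atttt\t-64.00\t6.00");
print join( "\t", @numbers ), "\n";
```

In list context the /g match returns all matches at once, so no explicit loop over the columns is needed.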
 
 
Comments
The_Real_Chubaka 19-Oct-11 9:59am    
Wow! Thanks for the many options.
Let me get to work now. I will contact you in a bit.
Simon Bang Terkildsen 19-Oct-11 11:10am    
My 5
The_Real_Chubaka 19-Oct-11 14:08pm    
Hello Marcus,

Thanks very much for your help.
I am unfortunately still having problems.

This is what I now have:

[code]
#!/usr/bin/perl -w
require 5.10.1; ## The required version
use strict;

open FILE1, "<../file1.txt" or die "Cant not create the file!";
open FILE2, "+>file2.txt" or die "Cant not create file2.txt!";

while(my $line = <file1>)
{
my @words = split /\s+/,$line;
my $line_out;
foreach my $word (@words)
{
if ($word !~ m/[^-+.d]/)
{
$line_out .= $word . '';
}
}
print FILE2 "$line_out\n";
}
close FILE1;
close FILE2;
[\code]

This is my error message:
[code]
Use of uninitialized value $_ in split at extractColumns3.pl line 25, <> line 1.

Use of uninitialized value $_ in split at extractColumns3.pl line 25, <> line 2.

Use of uninitialized value $_ in split at extractColumns3.pl line 25, <> line 3.

Use of uninitialized value $_ in split at extractColumns3.pl line 25, <> line 4.


[\code]
Apologies for the basic questions. I’m just a beginner.

Thank you for your help.

Herve
fjdiewornncalwe 20-Oct-11 13:09pm    
It doesn't look to me like you are reading anything from the file into $line either before you hit your while, or within the loop to get the next line from the file.
Instead of using $line at all, you could follow the suggestions from perlfect.com. Scroll down to the "Reading Files" section; I would suggest using that approach to iterate through the lines in the file.
The_Real_Chubaka 20-Oct-11 13:49pm    
Sorry! I made a mistake in one of the lines.
It is -------> while(my $line = <file1>)
Not -------> while(my $line = )
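For reference, a complete version of the script from the comment above might look like the sketch below. It is one way to do it, not the only one, and it makes some assumptions: file names come from the command line, output columns are joined with tabs, and a column is numeric if it is an optional sign, digits, and an optional decimal part. The sub name filter_file and the script name are made up. Two details of the posted code are worth checking: the handle read in the while condition has to match the opened handle exactly, including case (FILE1 versus file1), and the character class needs \d rather than d to match digits.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy $in_path to $out_path, keeping only the numeric columns of each row.
sub filter_file {
    my ( $in_path, $out_path ) = @_;

    # Three-argument open with lexical handles avoids bareword handle mix-ups.
    open my $in,  '<', $in_path  or die "Cannot open $in_path: $!";
    open my $out, '>', $out_path or die "Cannot create $out_path: $!";

    while ( my $line = <$in> ) {
        # split ' ' also discards the trailing newline along with other whitespace.
        my @numbers = grep { /^[-+]?\d+(?:\.\d+)?$/ } split ' ', $line;
        print {$out} join( "\t", @numbers ), "\n";
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
}

# Usage: perl extract_numbers.pl file1.txt file2.txt
filter_file(@ARGV) if @ARGV == 2;
```

Joining with a tab (or a space) keeps the columns apart in file2.txt; appending an empty string after each word, as in the posted loop, would run all the numbers together.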

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


