Problem in incrementation???

Question

j.arevathi 0 Newbie Poster

17 Years Ago

Hello there,
I have a CSV file that contains:

Chromosom_id    fstart  fstop   Count
1               105     1       14.5
1               105     1       14.5
1               105     1       14.5
1               813     797     4
1               813     797     22
1               813     797     4

Here fstart is the position where a match with the genome starts and fstop is the position where it stops (so the first match starts at 105 and ends at 1). Count is the number of similar matches, all of equal length, found within that region (1-105). If the count is greater than some arbitrary value (say 7), the region is to be taken into account. I have attached the code below.

open (FILE, "$file") or die "Cannot open the file\n";
my @hit_clusters = <FILE>;
close FILE;

my ($id, $fstart, $fstop, $count);
my ($cluster_start, $cluster_stop, $cluster_dist);
my $row_number = 0;

foreach my $file_line (@hit_clusters) {

    next if $file_line =~ m/^\s*$/;               # skip blank lines
    next if $file_line =~ m/^(Chromosom_id.+)$/;  # skip the header row

    if ($file_line =~ m/^(.+?)\t(\d+?)\t(\d+?)\t(\d+?)\b/) {
        ($id, $fstart, $fstop, $count) = ($1, $2, $3, $4);

        if ($count >= $mini_num_hits) {  # keep only rows whose count reaches the arbitrary value

            if (!$row_number) {
                if ($fstart > $fstop) {  # if fstart is greater than fstop, assign fstop to cluster_start
                    $cluster_start = $fstop;
                } else {                 # otherwise assign fstart to cluster_start
                    $cluster_start = $fstart;
                }

                if ($fstop < $fstart) {  # same check as above, for cluster_stop
                    $cluster_stop = $fstart;
                } else {
                    $cluster_stop = $fstop;
                }

                ++$row_number;
            }

But the problem is that $row_number is not incrementing, and the program prints the same values every time:

1       105     1
1       105     1
1       105     1
1       105     1
1       105     1
1       105     1
1       105     1
1       105     1
1       105     1
1       105     1

What I have to do is this: set the first fstart in the file as $cluster_start. Then, while reading through the file, if I get another fstop that is less than 250 away from that fstart, I have to add the counts together and extend the region from the first fstart to the current fstop. After that I reset $cluster_start to the new fstart and continue.

Thanks in advance,

perl

Edited 12 Years Ago by Nick Evan because: Fixed formatting

katharnakh 7 Posting Whiz in Training

17 Years Ago

hi,

I cannot say what the problem is, because your code is incomplete. Maybe this line in your code is causing the problem: if ($count >= $mini_num_hits){ #to check the counts greater than the arbitrary value
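Since the posted code is incomplete, the following is only a guess at one more candidate worth checking. The block guarded by if (!$row_number) can run only while $row_number is 0, so after the first accepted row the counter stays at 1 and $cluster_start and $cluster_stop are never reassigned. A minimal sketch of that behaviour, where the rows and the @seen array are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical (fstart, fstop) rows standing in for the file's data.
my @rows = ([105, 1], [813, 797], [1200, 1100]);

my ($cluster_start, $cluster_stop);
my $row_number = 0;
my @seen;   # what would be printed for each row

for my $row (@rows) {
    my ($fstart, $fstop) = @$row;

    # True only while $row_number is 0, i.e. for the first row.
    if (!$row_number) {
        ($cluster_start, $cluster_stop) =
            $fstart > $fstop ? ($fstop, $fstart) : ($fstart, $fstop);
        ++$row_number;
    }

    push @seen, "$row_number $cluster_start $cluster_stop";
}

print "$_\n" for @seen;   # every line shows the first row's values
```

If that is what happens in the full program, the increment and the comparison with the next row would need to live outside the guard.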

You can use the split() function to get the fields you want from each line of the file.

use strict;
use warnings;

my ($cluster_start, $cluster_stop);
my $row_number = 0;

my $file_name = 'C:\Documents and Settings\kath\Desktop\input.txt';
open(my $fh, '<', $file_name) or die "Cannot open $file_name: $!\n";

while (my $line = <$fh>) {
    next if $line =~ /^\s*$/;          # skip blank lines
    next if $line =~ /^Chromosom_id/;  # skip the header row

    # split ' ' ignores leading whitespace and splits on runs of whitespace
    my ($id, $fstart, $fstop, $count) = split ' ', $line;

    # the smaller coordinate becomes the cluster start, the larger the stop
    if ($fstart > $fstop) {
        $cluster_start = $fstop;
        $cluster_stop  = $fstart;
    }
    else {
        $cluster_start = $fstart;
        $cluster_stop  = $fstop;
    }
    ++$row_number;
    print "ROW-$row_number: ID: $id, FSTART: $fstart, FSTOP: $fstop, COUNT: $count -- CLUSTERSTART: $cluster_start, CLUSTERSTOP: $cluster_stop\n";
}

close($fh);

kath.

KevinADC 192 Practically a Posting Shark · Answer 1 · 2008-01-23T14:43:46+00:00

Personally, I cannot understand the specification of what the program is supposed to do. Posting partial code is not much help: nothing in the posted code even prints anything, so there is no way to tell why it's not working properly. And you say you have a CSV file, but you are using tabs in the regexp to pull the data fields out of the lines.
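To illustrate the delimiter point, here is a small sketch that applies the regexp from the question to the same record written three ways. The three sample strings and the %matched hash are assumptions made up for the demonstration; the actual file's delimiter is not known from the post.

```perl
use strict;
use warnings;

# The pattern from the question: fields must be separated by tabs.
my $re = qr/^(.+?)\t(\d+?)\t(\d+?)\t(\d+?)\b/;

# The same record with three different (assumed) delimiters.
my @samples = (
    [tab   => "1\t105\t1\t14.5"],
    [space => "1       105     1       14.5"],
    [comma => "1,105,1,14.5"],
);

my %matched;   # delimiter name => captured count, or undef if no match
for my $sample (@samples) {
    my ($name, $line) = @$sample;
    $matched{$name} = $line =~ $re ? $4 : undef;
}

for my $name (qw(tab space comma)) {
    my $result = defined $matched{$name} ? "count = $matched{$name}" : 'no match';
    print "$name: $result\n";
}
```

Only the tab-separated line matches, and even there \d+ stops at the decimal point, so a count of 14.5 is captured as 14.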

You have to do better at explaining your program specs, and please start using code tags around your Perl code.
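Because the specification is ambiguous, the following is only one possible reading of the clustering described in the question, not a confirmed solution. It assumes that rows below the threshold are skipped, that a row joins the current cluster when its fstop is less than 250 away from the previous accepted fstart, and that a change of chromosome id starts a new cluster. The names $min_hits, $max_gap, $anchor and @clusters are invented for the sketch, and the rows are the sample data from the question.

```perl
use strict;
use warnings;
use List::Util qw(min max);

my $min_hits = 7;     # assumed threshold (the question's "say 7")
my $max_gap  = 250;   # distance limit from the question

# Sample rows from the question: id, fstart, fstop, count.
my @rows = (
    [1, 105, 1, 14.5], [1, 105, 1, 14.5], [1, 105, 1, 14.5],
    [1, 813, 797, 4],  [1, 813, 797, 22], [1, 813, 797, 4],
);

my (@clusters, $cur, $anchor);
for my $row (@rows) {
    my ($id, $fstart, $fstop, $count) = @$row;
    next if $count < $min_hits;   # ignore regions with too few hits

    my ($lo, $hi) = (min($fstart, $fstop), max($fstart, $fstop));

    if ($cur && $cur->{id} eq $id && abs($fstop - $anchor) < $max_gap) {
        # Close enough: add the counts and extend the region.
        $cur->{start} = min($cur->{start}, $lo);
        $cur->{stop}  = max($cur->{stop},  $hi);
        $cur->{count} += $count;
    }
    else {
        # First accepted row, new id, or too far away: begin a new cluster.
        $cur = { id => $id, start => $lo, stop => $hi, count => $count };
        push @clusters, $cur;
    }
    $anchor = $fstart;   # the reference point moves to the latest fstart
}

print join("\t", @{$_}{qw(id start stop count)}), "\n" for @clusters;
```

With the sample rows this yields two clusters: region 1-105 with a summed count of 43.5, and region 797-813 with a count of 22. A different reading of the spec would change the join condition, which is confined to the single if test.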