Hi,
I currently have two files (I've labeled the columns to help with the explanation):
mature.txt
[0m]         [1m]   [2m] [3m] [4m] [5m] [6m] [7m]
scaffold1088 121550 D    18   ppy  mi   164g 88.6
scaffold1141 262270 D    18   ppy  mi   896  90.2
scaffold1168 54635  D    18   peu  mi   2138 87.5
scaffold1168 56190  D    18   ppt  mi   2218 87.5
hairpin.txt
[0h] [1h] [2h]         [3h]  [4h] [5h]   [6h]   [7h]
ptc  164e scaffold1088 97.56 41   121570 121530 73.8
ppt  896  scaffold113  90    60   478993 478934 71.9
ppt  896  scaffold1141 90    60   257204 257145 71.9
ppt  896  scaffold1141 90    60   262302 262243 71.9
mmu  2138 scaffold1168 97.18 71   54618  54688  88.4
peu  2914 scaffold1168 100   63   56162  56224  86.7
What I want to do is:
1) Find every occurrence where [0m] matches [2h] AND [1m] falls between [5h] and [6h] (these two can appear in either order, e.g. 121570 121530).
2) If condition 1) is met, then I need to know whether [1h] and [6m] match (the program should compare only the digits, not the letters, so 164e and 164g count as a match). If yes, then I want to print [0h] [1h] [2h] [7m] [1m] [3h] [5h] [6h] as one line to a new file.
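The two comparisons in the list above could be sketched as a pair of small helpers. This is only an illustration: the names family_number and is_between are made up here, and treating "between" as inclusive of both end points is an assumption, since the post does not say either way.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the digits of a family label,
# so that "164g" and "164e" both reduce to "164".
sub family_number {
    my ($label) = @_;
    (my $digits = $label) =~ tr/0-9//cd;    # delete every non-digit character
    return $digits;
}

# Hypothetical helper: true if $pos lies between $x and $y (inclusive, an
# assumption), regardless of which of the two end points is the larger one.
sub is_between {
    my ($pos, $x, $y) = @_;
    my ($lo, $hi) = $x <= $y ? ($x, $y) : ($y, $x);
    return $pos >= $lo && $pos <= $hi;
}

# Values taken from the first rows of mature.txt and hairpin.txt.
my $same_family = family_number('164g') eq family_number('164e');
my $inside      = is_between(121550, 121570, 121530);
print "same family\n" if $same_family;
print "inside\n"      if $inside;
```

Normalising the end points first means the hairpin rows whose [5h] is larger than [6h] need no special handling.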
Sample output would look like this:
ptc 164 scaffold1088 88.6 121550 97.56 121570 121530
ppt 896 scaffold1141 90.2 262270 90 262302 262243
mmu 2138 scaffold1168 87.5 54635 97.18 54618 54688
*Note that the last item in hairpin.txt meets criterion 1, but because [6m] and [1h] differ (2218 vs. 2914), it is not a match.
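Putting both conditions together, one possible approach is to index the hairpin rows by scaffold name in a hash and then walk through the mature rows. The sketch below is not a finished program: find_matches is a made-up name, the rows are given inline and split on whitespace (the real files may be tab-separated), and [1h] is printed without its letter because that is how it appears in the sample output.

```perl
use strict;
use warnings;

# Sketch: return one output line per mature/hairpin pair that satisfies
# conditions 1) and 2). Both arguments are refs to arrays of rows; each row
# is an array ref holding the eight columns labelled in the post.
sub find_matches {
    my ($mature, $hairpin) = @_;

    # Index hairpin rows by scaffold name [2h], so each mature row is only
    # compared against hairpins on exactly the same scaffold.
    my %by_scaffold;
    push @{ $by_scaffold{ $_->[2] } }, $_ for @$hairpin;

    my @out;
    for my $m (@$mature) {
        for my $hp (@{ $by_scaffold{ $m->[0] } || [] }) {
            # Condition 1: [1m] lies between [5h] and [6h], in either order
            # (inclusive bounds are an assumption).
            my ($lo, $hi) = sort { $a <=> $b } @{$hp}[5, 6];
            next unless $m->[1] >= $lo && $m->[1] <= $hi;

            # Condition 2: compare only the digits of [1h] and [6m].
            (my $hnum = $hp->[1]) =~ tr/0-9//cd;
            (my $mnum = $m->[6])  =~ tr/0-9//cd;
            next unless $hnum eq $mnum;

            # [0h] [1h] [2h] [7m] [1m] [3h] [5h] [6h]
            push @out, join ' ', $hp->[0], $hnum, $hp->[2], $m->[7], $m->[1],
                                 @{$hp}[3, 5, 6];
        }
    }
    return @out;
}

# The rows shown in the post.
my @mature = map { [split ' '] } (
    'scaffold1088 121550 D 18 ppy mi 164g 88.6',
    'scaffold1141 262270 D 18 ppy mi 896 90.2',
    'scaffold1168 54635 D 18 peu mi 2138 87.5',
    'scaffold1168 56190 D 18 ppt mi 2218 87.5',
);
my @hairpin = map { [split ' '] } (
    'ptc 164e scaffold1088 97.56 41 121570 121530 73.8',
    'ppt 896 scaffold113 90 60 478993 478934 71.9',
    'ppt 896 scaffold1141 90 60 257204 257145 71.9',
    'ppt 896 scaffold1141 90 60 262302 262243 71.9',
    'mmu 2138 scaffold1168 97.18 71 54618 54688 88.4',
    'peu 2914 scaffold1168 100 63 56162 56224 86.7',
);

print "$_\n" for find_matches(\@mature, \@hairpin);
```

With the rows above this should print the three lines of the sample output. For the real files, the two arrays would be filled from the opened file handles instead, and the lines printed to a handle opened for writing.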
Below is the code I started working on, but I need some help figuring out how to compare the array pieces.
Many thanks!
use strict;
use FileHandle;
use Data::Dumper;
my ($hairpin,$mature) = @ARGV;
my $fh = new FileHandle;
#build array from hairpin miRNA data
open (my $h, '<', $hairpin) or die "Could not open $hairpin: $!";
my (@h,@hout);
while (<$h>) {
chomp;
my @t = split(/\t/); #fields [0h]..[7h] of the current line
push @h, [@t]; #Build an array of arrays
my $hhit = $t[0]."\t".$t[1]."\t".$t[2]."\t".$t[5]."\t".$t[6]; #species, miRNAfamily, scaffold_location, start, end
}
#build array from mature miRNA data
open (my $m, '<', $mature) or die "Could not open $mature: $!";
my (@m,@mout);
while (<$m>) {
chomp;
my @t = split(/\t/); #fields [0m]..[7m] of the current line
push @m, [@t]; #Build array
my $mhit = $t[0]."\t".$t[1]."\t".$t[4]."\t".$t[6]."-".$t[7]; #scaffold_location, start, species, family, subidentity
}
#compare scaffold name e.g. "scaffold201_4" in the hairpin data to the scaffold names in the mature data to identify matches
#then compare
my ($hscaffold,$hlocatn,$hfamily,$mscaffold,$mlocatn,$mfamily);
foreach my $mscaffold (map { $_->[0] } @m) {#foreach scaffold reference (column [0m] of every mature row)
if (!defined($mscaffold)