Dear All,
I have two files. file1, the test file, looks like this:
ENSG00000000003.10 0
ENSG00000000005.5 0
ENSG00000000419.8 392
ENSG00000000457.8 24
and file2, the GTF file, looks like this:
chr1 HAVANA 11869 14412 . + . + . gene_id ENSG00000223972.4 transcript_id ENSG00000223972.4 gene_type pseudogene gene_status KNOWN gene_name DDX11L1 transcript_type pseudogene transcript_status KNOWN
chr1 HAVANA 11869 14409 . + . + . gene_id ENSG00000223972.4 transcript_id ENST00000456328.2 gene_type pseudogene gene_status KNOWN gene_name DDX11L1 transcript_type processed_transcript transcript_status KNOWN
chr1 HAVANA 11869 12227 . + . + . gene_id ENSG00000223972.4 transcript_id ENST00000456328.2 gene_type pseudogene gene_status KNOWN gene_name DDX11L1 transcript_type processed_transcript transcript_status KNOWN
chr1 HAVANA 12613 12721 . + . + . gene_id ENSG00000223972.4 transcript_id ENST00000456328.2 gene_type pseudogene gene_status KNOWN gene_name DDX11L1 transcript_type processed_transcript transcript_status KNOWN
chr1 HAVANA 13221 14409 . + . + . gene_id ENSG00000223972.4 transcript_id ENST00000456328.2 gene_type pseudogene gene_status KNOWN gene_name DDX11L1 transcript_type processed_transcript transcript_status KNOWN
My requirement is to compare the first column of file1 with the 11th column of file2 (the gene ID) and, for every match, print the file1 line together with the 13th and 14th columns of file2.
The output should look like this:
ENSG00000000003.10 0 gene_type protein_coding
ENSG00000000005.5 0 gene_type protein_coding
ENSG00000000419.8 392 gene_type protein_coding
ENSG00000000457.8 248 gene_type protein_coding
Please help me with a solution that uses a Perl hash. I have a script that uses Perl arrays, but since my file is huge it takes a long time to run. It would be great to have something based on Perl hashes.
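A minimal sketch of what such a hash-based lookup might look like, not a tested solution for the real data. It assumes both files are whitespace-separated, and it finds gene_id and gene_type by name rather than by position, because when the sample GTF lines above are split on whitespace, gene_type lands in field 14 and its value in field 15. The sub names load_gene_types and annotate_counts are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Read the GTF once and build a hash: gene_id => "gene_type <value>".
# Assumption: "gene_id" and "gene_type" are each followed directly by
# their value, as in the sample lines above.
sub load_gene_types {
    my ($gtf_fh) = @_;
    my %type_for;
    while (my $line = <$gtf_fh>) {
        next if $line =~ /^#/;                     # skip header comments
        my @f = split ' ', $line;
        my ($id, $type);
        for my $i (0 .. $#f - 1) {
            (my $val = $f[$i + 1]) =~ s/[";]//g;   # tolerate "quoted"; values
            $id   = $val if $f[$i] eq 'gene_id';
            $type = $val if $f[$i] eq 'gene_type';
        }
        next unless defined $id && defined $type;
        # A gene has many GTF records; keep the first one seen.
        $type_for{$id} = "gene_type $type" unless exists $type_for{$id};
    }
    return \%type_for;
}

# Print every file1 line whose first column is a key in the hash.
sub annotate_counts {
    my ($counts_fh, $type_for, $out_fh) = @_;
    while (my $line = <$counts_fh>) {
        $line =~ s/\r?\n\z//;                      # strip the line ending
        my ($id) = split ' ', $line;
        next unless defined $id && exists $type_for->{$id};
        print {$out_fh} "$line $type_for->{$id}\n";
    }
}

if (@ARGV == 2) {
    my ($counts_file, $gtf_file) = @ARGV;
    open my $gtf_fh, '<', $gtf_file or die "Cannot open $gtf_file: $!";
    my $type_for = load_gene_types($gtf_fh);
    close $gtf_fh;
    open my $counts_fh, '<', $counts_file or die "Cannot open $counts_file: $!";
    annotate_counts($counts_fh, $type_for, \*STDOUT);
    close $counts_fh;
}
else {
    warn "Usage: $0 file1 file2.gtf\n";
}
```

Saved under a hypothetical name such as match_gene_type.pl, it would be run as: perl match_gene_type.pl file1 file2.gtf. The GTF is read only once, so each file1 line costs a single hash lookup instead of a scan over an array.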
Thank you all.