When parsing an EMBL record (attached), can I follow the same approach as for a GenBank record? Once the record is parsed, I have to print out the ID, KW, OC, and SQ fields. I already have code that parses a GenBank record and would like to follow the same route if possible.
#!/usr/bin/perl
# Extract the annotation and sequence sections from the first
# record of a GenBank library
use strict;
use warnings;
use BeginPerlBioinfo;
# Declare and initialize variables
my $annotation = '';
my $dna = '';
my $record = '';
my $filename = 'sequence.gb';
my $save_input_separator = $/;
# Open GenBank library file
unless (open(GBFILE, '<', $filename)) {
    print STDERR "Cannot open GenBank file \"$filename\": $!\n\n";
    exit 1;
}
# Set input separator to "//\n" and read in a record to a scalar
$/ = "//\n";
$record = <GBFILE>;
# reset input separator
$/ = $save_input_separator;
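# Now separate the annotation from the sequence data
($annotation, $dna) = ($record =~ /^(LOCUS.*ORIGIN\s*\n)(.*)\/\/\n/s);
# Stop if the pattern did not match (an EMBL record, for example,
# has no LOCUS or ORIGIN lines, so both variables would be undefined)
unless (defined $annotation and defined $dna) {
    print STDERR "Record in \"$filename\" does not look like GenBank format\n";
    exit 1;
}
# Print the two pieces, which should give us the same as the
# original GenBank file, minus the // at the end
print $annotation, $dna;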
exit;
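
The same record-at-a-time route should carry over to EMBL, because EMBL records also end with a "//" line. The difference is that EMBL marks every line with a two-letter code (ID, KW, OC, SQ, and so on) instead of using LOCUS and ORIGIN, so the LOCUS...ORIGIN regular expression above will not match. What follows is only a minimal sketch under stated assumptions, not a definitive parser: the subroutine name parse_embl_record, the hash key "sequence", and the sample record are all made up for illustration, and the sample is not real data. Continuation lines for KW and OC are joined with a single space, and position counts are stripped from the sequence lines.

```perl
#!/usr/bin/perl
# Sketch: pull the ID, KW, OC and SQ fields out of one EMBL record
use strict;
use warnings;

# Hypothetical helper: takes one EMBL record as a string and returns
# a hash with the ID, KW, OC and SQ lines plus the bare sequence
sub parse_embl_record {
    my ($record) = @_;
    my %field = (ID => '', KW => '', OC => '', SQ => '', sequence => '');
    my $in_sequence = 0;

    foreach my $line (split /\n/, $record) {
        # "//" terminates the record
        last if $line =~ m{^//};

        if ($line =~ /^SQ\s+(.*)/) {
            # SQ header line; the sequence lines follow it
            $field{SQ} = $1;
            $in_sequence = 1;
        }
        elsif ($in_sequence) {
            # Sequence lines: drop the spaces and the position count
            (my $bases = $line) =~ s/[\s\d]//g;
            $field{sequence} .= $bases;
        }
        elsif ($line =~ /^(ID|KW|OC)\s+(.*)/) {
            # KW and OC may span several lines; join them with a space
            $field{$1} .= ' ' if $field{$1} ne '';
            $field{$1} .= $2;
        }
    }
    return %field;
}

# Made-up record used only to exercise the subroutine
my $sample_record = <<'END_RECORD';
ID   X00001; SV 1; linear; genomic DNA; STD; PLN; 20 BP.
XX
KW   example keyword; second keyword.
XX
OS   Hypothetical organism
OC   Eukaryota; Viridiplantae;
OC   Streptophyta.
XX
SQ   Sequence 20 BP; 5 A; 5 C; 5 G; 5 T; 0 other;
     acgtacgtac gtacgtacgt                                                    20
//
END_RECORD

my %field = parse_embl_record($sample_record);
foreach my $code (qw(ID KW OC SQ)) {
    print "$code   $field{$code}\n";
}
print "$field{sequence}\n";
```

To use this on a file rather than the inline sample, the record could be read exactly as in the GenBank script: set $/ to "//\n", read one record from the filehandle into a scalar, restore $/, and pass the scalar to the subroutine.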