Hello,

I am new to Perl - and so far I am enjoying it. Unfortunately I do not have the luxury of starting completely from scratch. I have a problem here that I am struggling to solve. I have spent many hours on it without any success, which is why I am asking (or begging, whichever makes you feel better ;)) for help.

Problem

In Isodraw (technical illustration app) I am exporting a filename to a text file. Perl accesses the file, and places the text into a variable. I compare this variable to cell data within a spreadsheet until a positive match is made. I have everything working perfectly except that when Perl reads the text inside the text file it reads it differently:


V6558-04505-011_01 (Original - Text File)

■V 6 5 5 8 - 0 4 5 0 5 - 0 1 1 _ 0 1 (PROBLEM - Perl)


From my research it is due to different encoding. My options within Isodraw when creating the text file are either UNICODE or 8-bit ASCII. Neither gives a good result in Perl, but I cannot change this inside Isodraw, so Perl has to deal with it. (Note: if I manually save the text file in Notepad as ANSI, Perl reads it perfectly.)

I desperately need some assistance, as this is currently beyond my knowledge. If anyone can help I would really appreciate it.

Many thanks

Alan


Example code

#!/usr/bin/perl -w

use v5.10.0;
use strict;


############# READ NOTE HERE ##############

###### -Uncomment below to see it working perfectly!
#our $VarFileName = "V6558-01501-011_01";


##### IF you wish to see it reading from the file comment above and uncomment Notes Y and Z.



my $record;

our $VarFileName; 			############ NOTE Y

my $VarISS = "VarISS_TestValue";
my $VarICN = "VarICN_TestValue";

############## READ FILE FROM ISODRAW ##################
open (ReadFILE, "<D:/ForJim/FROM_ISODRAW.txt") or die "couldn't open the file!";

while ($record = <ReadFILE>)
{
say $record;
chomp($record);

$VarFileName = $record; 		############ NOTE Z

}
#############################

#############################

	my $VarComparison = "V6558-01501-011_01"; ### TEMP
	if ($VarComparison eq $VarFileName)
		{
		say "MATCH!!!";
		} else {
			say "NOT THE SAME!";
			}

#say our $varFilename;

Please post the file "FROM_ISODRAW.txt" as an attachment. That should give us a file with the original encoding preserved so we can reproduce the problem. Click the "Manage Attachments" button to attach your text file.

Hi

It would not let me edit my post above, did previously but not now for some reason (I am logged in).

Here's the file - and thanks for spending the time to help.

Strange, one of my text editors (gedit) tells me the file is plain text and another (Komodo Edit) says it is UTF-16 Little Endian. Try replacing the statement that opens the file with the following:

#Change the following to your path and file name
my $filename = '/home/david/Programming/data/FROM_ISODRAW.txt';

############## READ FILE FROM ISODRAW ##################
open (ReadFILE, '<:encoding(UTF-16)', $filename) or die "couldn't open $filename: $!";
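
When editors disagree about a file like this, looking at its first bytes usually settles it. The following is only a sketch: the sample bytes are a hypothetical stand-in for FROM_ISODRAW.txt (a UTF-16 little-endian byte order mark followed by the name), and guess_utf16 is an illustrative helper, not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical stand-in for the contents of FROM_ISODRAW.txt:
# a UTF-16LE byte order mark (FF FE), then the name and a CRLF.
my $bytes = "\xFF\xFE" . encode('UTF-16LE', "V6558-04505-011_01\r\n");

# Guess the encoding from the byte order mark at the start of the data.
# Returns undef when no UTF-16 byte order mark is present.
sub guess_utf16 {
    my ($raw) = @_;
    return 'UTF-16LE' if $raw =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $raw =~ /\A\xFE\xFF/;
    return;
}

my $guess = guess_utf16($bytes);
print "Detected: ", (defined $guess ? $guess : 'no BOM'), "\n";

# Dump the first eight bytes in hex. Every character is followed by a
# zero byte, which is what shows up as the "spaces" between letters
# when the file is read without an encoding layer.
print join(' ', map { sprintf '%02X', ord } split //, substr($bytes, 0, 8)), "\n";
```

For a real file, the same check could be run on the first two bytes read from a filehandle opened with the '<:raw' layer.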

Thanks for sharing,

I tried your suggestion but unfortunately it didn't work; a friend managed to assist me. Below is the code. I think he did the same as you, but the string still contained a lot of extra space (data), and when compared to the variable it still wasn't equal, so it would return 'NOT THE SAME'. I personally would have thought that the encoding would have taken care of this issue... but it hasn't.

If the answer below can be shortened to a more compact version, or there is a better workaround, then please feel free to add any input. I have removed unnecessary elements for this test.

Thank you for your time and effort!

Alan

#!/usr/bin/perl -w

# Declare the subroutines
sub trim($);
sub ltrim($);
sub rtrim($);

# Right trim function to remove trailing whitespace
sub rtrim($)
{
	my $string = shift;
	$string =~ s/\s+$//;
	return $string;
}

use v5.10.0;
use strict;



my $record;
our $ansi;

our $VarFileName; 	

############## READ FILE FROM ISODRAW ##################

open(INFILE, "<:encoding(UTF-16)", "C:/ForJim/FROM_ISODRAW.txt");
while(<INFILE>)
{
$record=$_;

print "$record \n";

$VarFileName = $record; 
}
close(INFILE);

######## Trim the trailing whitespace ########
$VarFileName = rtrim($VarFileName);

#################################

	my $VarComparison = "V6558-04505-011_01"; ### TEMP
	if ($VarComparison eq $VarFileName)
		{
		say "MATCH!!!";
		} else {
			say "NOT THAT SAME!";
			}

Looks OK, except that it opens the file without testing whether the open succeeded. If the file fails to open, the program carries on without giving an error until it tries to read a record from the unopened filehandle, which can be confusing. For that reason we usually add an or die... or an || die... clause to the open statement. See "Simple Opens" in http://perldoc.perl.org/5.10.0/perlopentut.html
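
A more compact version might look like the sketch below. It makes some assumptions: the temporary file stands in for FROM_ISODRAW.txt with made-up contents, read_name is a hypothetical helper name, and the "extra space" in the string is probably the carriage return from the Windows line ending, which chomp does not remove. Unlike the code above, it keeps the last non-empty line rather than the last line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a stand-in for FROM_ISODRAW.txt (hypothetical contents):
# UTF-16LE with a byte order mark and a Windows CRLF line ending.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
binmode($tmp_fh, ':raw:encoding(UTF-16LE)');
print {$tmp_fh} "\x{FEFF}V6558-04505-011_01\r\n";
close($tmp_fh) or die "couldn't close $filename: $!";

# Read the last non-empty line of a UTF-16 file that starts with a
# byte order mark, with all trailing whitespace removed.
sub read_name {
    my ($path) = @_;
    open(my $in, '<:raw:encoding(UTF-16)', $path)
        or die "couldn't open $path: $!";
    my $name;
    while (my $line = <$in>) {
        $line =~ s/\s+\z//;    # strips "\r" as well as "\n"
        $name = $line if length $line;
    }
    close($in);
    return $name;
}

my $VarFileName   = read_name($filename);
my $VarComparison = "V6558-04505-011_01";
print $VarComparison eq $VarFileName ? "MATCH!!!\n" : "NOT THE SAME!\n";
```

The lexical filehandle and three-argument open keep the encoding layer separate from the file name, and the single substitution replaces the separate rtrim subroutine.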
