Hi,
I am trying my hand at Java regex. My program is below, along with a description of what I need it to do.
MyKeyWord may occur multiple times in a file.
My program works for a file like this:
(\\S+)<tab>MyKeyWord<tab>(\\S+)<tab>(\\S+)
(\\S+)<tab>MyKeyWord<tab>(\\S+)<tab>(\\S+)
But with a file like this:
(\\S+)<tab>MyKeyWord<tab>(\\S+)<tab>(\\S+)
(\\S+)<tab>OtherKeyWord<tab>(\\S+)<tab>(\\S+)
it fails with a runtime error:
Exception in thread "main" java.lang.IllegalStateException: No match found
at java.util.regex.Matcher.group(Matcher.java:485)
The exception is thrown at this line: out.write(m.group(1)+"\t"+m.group(2)+"\t.....)
Thanks
import java.util.regex.*;
import java.io.*;

public class DataMine {
    public static void main(String[] args) throws Exception {
        File fin = new File(args[0]);
        File fout = new File(args[1]);
        FileInputStream fis = new FileInputStream(fin);
        FileOutputStream fos = new FileOutputStream(fout);
        BufferedReader in = new BufferedReader(new InputStreamReader(fis));
        BufferedWriter out = new BufferedWriter(new OutputStreamWriter(fos));

        Pattern p = Pattern.compile("(\\S+)\tMyKeyWord\t(\\d+)\t(\\S+)");
        /* The file can also contain other lines, such as
           (\\S+)\tSomeOtherKeyWord\t(\\d+)\t(\\S+). When the program reaches such a
           line it throws the runtime error shown above, although it compiles OK.
           The file holds several kinds of lines, but I want to mine only the ones
           that match p. */
        String aLine = null;
        while ((aLine = in.readLine()) != null) {
            Matcher m = p.matcher(aLine);
            m.find();
            // I want only this type of line to be written to my output file.
            out.write(m.group(1) + "\t" + "MyKeyWord" + "\t" + m.group(2) + "\t" + m.group(3));
            out.newLine();
        }
        in.close();
        out.close();
    }
}
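
A possible way around the exception, offered as a sketch rather than a definitive fix: Matcher.find() returns false when a line does not match, and calling group() after a failed find() is what raises IllegalStateException. Checking the return value and skipping non-matching lines should avoid it. The class name KeywordFilter and the helper formatIfMatch are hypothetical names introduced only for this illustration; the pattern is the one from the program above, and the files are read with the platform default charset, as in the original.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this sketch.
final class KeywordFilter {
    // Same pattern as in the program above.
    private static final Pattern P =
            Pattern.compile("(\\S+)\tMyKeyWord\t(\\d+)\t(\\S+)");

    // Hypothetical helper: returns the reformatted line,
    // or null when the line does not contain the pattern.
    static String formatIfMatch(String line) {
        Matcher m = P.matcher(line);
        if (!m.find()) {
            // Calling m.group(...) here would throw IllegalStateException.
            return null;
        }
        return m.group(1) + "\tMyKeyWord\t" + m.group(2) + "\t" + m.group(3);
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes both files even if an exception occurs.
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]));
             BufferedWriter out = new BufferedWriter(new FileWriter(args[1]))) {
            String line;
            while ((line = in.readLine()) != null) {
                String result = formatIfMatch(line);
                if (result != null) { // write only the lines that matched
                    out.write(result);
                    out.newLine();
                }
            }
        }
    }
}
```

With this guard, a line containing OtherKeyWord is simply skipped instead of stopping the program. If a line must match from its first character to its last, m.matches() could be used in place of m.find().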