Hiya, I'm currently working on a little Java program that scans through a directory, reads the PHP files, and dumps out a list of function names for each file. Does anyone know how I can modify my regex to exclude functions that are inside /* */ comments? I can work out how to match functions inside comments using the following with the DOTALL constant:

(/\*).*?function (.+).*?(\*/)

but I've tried and I can't seem to invert it so that it only matches functions outside comments. Is anyone able to help? Please...

Btw, here is my draft code so far...

import java.io.*;
import java.util.*;
import java.util.regex.*;

public class ScanDir {
	FilenameFilter phpFileFilter;

	public static void main(String[] args) {

		new ScanDir().listOfFiles("/var/www/website/functions/");

	}

	public ScanDir() {
		
		// Set up the filter to include dirs and .php files
		phpFileFilter = new FilenameFilter() {
			public boolean accept(File path, String name) {
				File f = new File(path, name);
				if(f.isDirectory()) {
					return true;
				} else if(name.endsWith(".php")) {
					return true;
				} else {
					return false;
				}	
			}
		};
	}

	public void listOfFiles(String path) {
		File directory = new File(path);

		File[] files = directory.listFiles(phpFileFilter);
		if(files == null) {
			// Might not be a directory or directory does not exist
		} else {
			// Loop through directory listings
			for(int i=0; i<files.length; i++) {
				if(files[i].isDirectory()) {
					this.listOfFiles(files[i].getAbsolutePath());
				} else {
					fetchFunctions(files[i]);
				}
			}
		}
	}

	public void fetchFunctions(File file) {

		// Reads a php file and parses it for functions before printing it
		BufferedReader inputStream = null;

		ArrayList<String> functions = new ArrayList<String>();

		try {
			try {
				StringBuilder text = new StringBuilder();
				inputStream = new BufferedReader(new FileReader(file));
				int c;

				while ((c = inputStream.read()) != -1) {
					text.append( (char) c);
				}

				functions.addAll( scanTextForFunctions(text.toString()) );

			} finally {
				if(inputStream != null) {
					inputStream.close();
				}
			}

		} catch (IOException e) {
			System.out.println("Cannot process file! " + e.getMessage());
		}

		System.out.println();
		System.out.println(file.getName() + " functions");
		System.out.println("-------------------------------------");
		for(String f : functions) {
			System.out.println(f);
		}
	}

	public ArrayList<String> scanTextForFunctions(String text) {

		// Hunts down functions in the php text

		Pattern pattern = Pattern.compile("function (.+) \\{");
		Matcher matcher = pattern.matcher(text);
		ArrayList<String> functions = new ArrayList<String>();

		while(matcher.find()) {
			functions.add(matcher.group(1));
		}

		return functions;
	}

}

If you first run the text through an expression that separates the uncommented code sections from the commented out sections, you can then parse only the uncommented sections.
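As a rough sketch of that two-pass idea (not necessarily the exact approach meant above): strip the /* */ blocks with one expression first, then run the function-name pattern over what is left. The class name CommentStripDemo, the helper stripBlockComments, and the tightened function pattern are all my own assumptions for illustration; only scanTextForFunctions mirrors the name in the draft code. It only handles /* */ comments, so // and # line comments, or a "/*" that sits inside a PHP string literal, would need extra handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentStripDemo {

    // Matches /* ... */ block comments. DOTALL lets '.' span newlines,
    // and the reluctant '.*?' stops at the first closing "*/".
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Assumption: a tightened variant of the draft's "function (.+) \\{"
    // that captures only the function name, up to the opening parenthesis.
    private static final Pattern FUNCTION_NAME =
            Pattern.compile("function\\s+&?\\s*(\\w+)\\s*\\(");

    // Pass 1: remove every block comment from the text.
    static String stripBlockComments(String text) {
        return BLOCK_COMMENT.matcher(text).replaceAll("");
    }

    // Pass 2: look for function names in the uncommented text only.
    static List<String> scanTextForFunctions(String text) {
        List<String> functions = new ArrayList<String>();
        Matcher matcher = FUNCTION_NAME.matcher(stripBlockComments(text));
        while (matcher.find()) {
            functions.add(matcher.group(1));
        }
        return functions;
    }

    public static void main(String[] args) {
        String php = "<?php\n"
                + "function foo($a) {\n}\n"
                + "/*\nfunction hidden($b) {\n}\n*/\n"
                + "function bar() {\n}\n";
        System.out.println(scanTextForFunctions(php)); // prints [foo, bar]
    }
}
```

In the draft above, the same effect could be had by calling something like stripBlockComments on text.toString() before it is handed to scanTextForFunctions, leaving the rest of the program unchanged.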

Thank you Ezzaral, I feel a bit silly now lol. I spent so long trying to work out the perfect regex pattern that does it all that I hadn't thought of splitting it up!

Richard
