I have a basic question. Suppose I have a directory on my computer, C:\Abc\New. I can access this directory using:
File file = new File("C:\\Abc\\New");
I can then list all the files inside this directory using:
File[] files = file.listFiles();
And I can process the files individually the way I want to.
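Put together, the local case might look like the sketch below. The class name LocalListing and the helper fileNames are made up for illustration. One detail worth guarding against: listFiles() returns null when the path is not a directory or cannot be read.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class LocalListing {
    /** Returns the names of the regular files directly inside dir (no recursion). */
    static List<String> fileNames(File dir) {
        List<String> names = new ArrayList<>();
        File[] files = dir.listFiles();
        if (files == null) {
            // listFiles() returns null when dir is not a directory or cannot be read
            return names;
        }
        for (File f : files) {
            if (f.isFile()) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The path from the post; an empty list is printed if it does not exist
        System.out.println(fileNames(new File("C:\\Abc\\New")));
    }
}
```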

Suppose I want to do the same thing with a directory on the web. How do I go about it? I cannot use a File object here. How do I map a URL to a File object? In other words, how can I access all the files stored in a web directory in the manner shown above?

Any help will be greatly appreciated.

HttpURLConnection and its getInputStream method.

Hi. Thanks for your reply, but I am not quite sure what you mean to say here. I mean using the HttpURLConnection and then using its getInputStream() method will access the contents at that particular address, right?

I will try to reframe my question. For example, the URL of this page is:
http://www.daniweb.com/forums/post1232098.html#post1232098

What I want to know is this: if I have the web directory http://www.daniweb.com/forums, how can I use it to access all the posts that are stored in this directory? That is, what are the sub-URLs of this directory?

If this is possible using the solution you gave above, then could you please tell me how?

Use HttpURLConnection (see the API docs) to connect, then call getInputStream and read that stream to get the data. Use an HTML parser (Google for one, there are many) to extract the links, then open an HttpURLConnection for each of those links and process each stream in the same fashion. Give it a try.
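A minimal sketch of those steps follows. The class name LinkLister and its methods are hypothetical, the page is assumed to be UTF-8, and a simple regular expression stands in for the real HTML parser recommended above, so it will miss or mangle links in anything but straightforward markup.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkLister {
    // Crude stand-in for a real HTML parser: captures the target of href="..."
    // or href='...', stopping before any #fragment.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /** Downloads the body at the given address as text (UTF-8 is an assumption). */
    static String fetch(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    /** Returns every link found in the HTML, resolved against the page address. */
    static List<String> extractLinks(String baseAddress, String html) {
        URI base = URI.create(baseAddress);
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                links.add(base.resolve(m.group(1).trim()).toString());
            } catch (IllegalArgumentException e) {
                // Skip href values that are not valid URIs
            }
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java LinkLister <url>");
            return;
        }
        // Each printed link could be passed to fetch() again and processed the same way
        for (String link : extractLinks(args[0], fetch(args[0]))) {
            System.out.println(link);
        }
    }
}
```

Note that this only finds what the page itself links to. It is not a directory listing, and files that no page links to stay invisible.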

Hi. What you are describing would effectively work the way a web crawler does, right? I mean, from a page it will harvest all the links, then visit them and repeat this routine recursively.
However, this is not what I aim to do. I am trying to build an index from a web directory. The files contained there are .jsp files. These .jsp files contain relative paths to audio files stored in the directory.

As an example, suppose a .jsp file having the path $root_dir$/menu.jsp contains the tag <audio src="$root_dir$/Audio/<%=Name%>.wav">.

Now, I am supposed to index all the audio files that match this audio name. For this, I need to access all the files inside the $root_dir$/Audio directory. If $root_dir$ were a path on my computer, I could do so easily in the way I showed in my first post. However, if $root_dir$ is a URL, how do I access these files? This is basically my question.

And there is no way to do it other than through the same HTTP access that everything else uses. So you need to do it with HttpURLConnection and parse those results.
What is so hard to understand about that? With Java, or any other language for that matter, you cannot see any part of a website that a normal user in front of a browser can't see, and you can't see it in any other manner than they can see it (i.e. over HTTP). What makes you think you could?

Hey. Well, I did not know whether it could be done or not; that is why I asked. However, now that you have told me that I can only access the same content that a user in front of a browser can, I guess I will have to think of alternatives for my problem. Thanks for your help though.
