

get files list from url

Question

Saturday, August 20, 2011 2:41 PM

hi,

I need to access a URL and get a list of all the files under it.

How can I do that? Do I need to use HttpWebRequest? I know how to access the URL, but I'm having problems getting the file list.

Any help would be appreciated.

All replies (4)

Saturday, August 20, 2011 3:18 PM ✅ Answered | 2 votes

Try the following code
namespace Example
{
    using System;
    using System.Net;
    using System.IO;
    using System.Text.RegularExpressions;

    public class MyExample
    {
        // Returns a regex for parsing the HTML directory listing of a known site.
        // Non-greedy quantifiers (.*?) keep each match to a single anchor tag,
        // so multiple links on one line are not swallowed into a single match.
        public static string GetDirectoryListingRegexForUrl(string url)
        {
            if (url.Equals("http://www.ibiblio.org/pub/"))
            {
                return "<a href=\".*?\">(?<name>.*?)</a>";
            }
            throw new NotSupportedException();
        }
        public static void Main(String[] args)
        {
            string url = "http://www.ibiblio.org/pub/";
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    string html = reader.ReadToEnd();
                    Regex regex = new Regex(GetDirectoryListingRegexForUrl(url));
                    MatchCollection matches = regex.Matches(html);
                    if (matches.Count > 0)
                    {
                        foreach (Match match in matches)
                        {
                            if (match.Success)
                            {
                                Console.WriteLine(match.Groups["name"]);
                            }
                        }
                    }
                }
            }

            Console.ReadLine();
        }
    }
}
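For reference, the download step in the sample above can be written more compactly with WebClient (also in System.Net), which wraps HttpWebRequest. This is only a sketch of the same idea, and it assumes the target page is a plain HTML directory listing like the one at the URL above:

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public class WebClientExample
{
    public static void Main()
    {
        // Download the page HTML in one call; WebClient handles the
        // request/response/stream plumbing internally.
        using (WebClient client = new WebClient())
        {
            string html = client.DownloadString("http://www.ibiblio.org/pub/");

            // Non-greedy quantifiers keep each match to a single anchor tag.
            foreach (Match match in Regex.Matches(html, "<a href=\".*?\">(?<name>.*?)</a>"))
            {
                Console.WriteLine(match.Groups["name"].Value);
            }
        }
    }
}
```

Note this only works when the server actually renders a directory listing as HTML; the regex would need adjusting for any other page layout.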



Saturday, August 20, 2011 3:20 PM | 1 vote

Generally speaking, this is not possible without appropriate server-side support, and it depends on the scheme (protocol) of your URL. Over HTTP it could work via WebDAV as an interactive protocol, or via a simple directory listing stored in a file with a known name at the root. FTP supports directory listing by default.
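To illustrate the FTP case: FtpWebRequest can request a directory listing directly, with no HTML parsing needed. This is a minimal sketch; the host name and anonymous credentials below are placeholders, assuming a server that permits anonymous access:

```csharp
using System;
using System.IO;
using System.Net;

public class FtpListingExample
{
    public static void Main()
    {
        // Placeholder host -- substitute a real FTP server you have access to.
        string url = "ftp://ftp.example.com/pub/";

        FtpWebRequest request = (FtpWebRequest)WebRequest.Create(url);
        request.Method = WebRequestMethods.Ftp.ListDirectory;
        request.Credentials = new NetworkCredential("anonymous", "guest@example.com");

        using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            // The server returns one file or directory name per line.
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

Use WebRequestMethods.Ftp.ListDirectoryDetails instead if you also want sizes and timestamps in the server's listing format.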

 


Saturday, August 20, 2011 8:02 PM

thank you both for the help


Sunday, August 21, 2011 2:13 AM | 1 vote


Make the call asynchronous, because you don't know how long it will take to complete, and put a time limit on it, since the site may be down or the network connection may fail.
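The advice above can be sketched with BeginGetResponse/EndGetResponse plus a timeout that aborts the request. This is one possible pattern, not the only one; the 10-second limit is an arbitrary example value:

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

public class AsyncListingExample
{
    private const int TimeoutMilliseconds = 10000; // example limit: give up after 10 s

    public static void Main()
    {
        HttpWebRequest request =
            (HttpWebRequest)WebRequest.Create("http://www.ibiblio.org/pub/");

        // Start the request without blocking the calling thread.
        IAsyncResult result = request.BeginGetResponse(OnResponse, request);

        // Abort the request if it has not completed within the time limit.
        ThreadPool.RegisterWaitForSingleObject(
            result.AsyncWaitHandle, TimeoutCallback, request,
            TimeoutMilliseconds, true);

        Console.ReadLine(); // keep the console alive while the callback runs
    }

    private static void OnResponse(IAsyncResult result)
    {
        HttpWebRequest request = (HttpWebRequest)result.AsyncState;
        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                Console.WriteLine(reader.ReadToEnd());
            }
        }
        catch (WebException ex)
        {
            // Raised both for server errors and when TimeoutCallback aborts us.
            Console.WriteLine("Request failed or was aborted: " + ex.Status);
        }
    }

    private static void TimeoutCallback(object state, bool timedOut)
    {
        if (timedOut)
        {
            // Aborting makes the pending EndGetResponse throw a WebException.
            ((HttpWebRequest)state).Abort();
        }
    }
}
```

Note that HttpWebRequest.Timeout only applies to the synchronous GetResponse call, which is why the asynchronous path needs the explicit RegisterWaitForSingleObject timeout above.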