Simon Mourier's Blog - retrieving partial content using multiple http requsets to fetch data via parllel tasks

retrieving partial content using multiple http requsets to fetch data via parllel tasks

Dec 3, 2012 See the question and my original answer on StackOverflow

If the server support what's wikipedia calls byte serving, you can multiplex a file download spawning multiple requests with a specific Range header value (using the AddRange method. See also How to download the data from the server discontinuously？). Most serious HTTP servers do support byte-range.

Here is some sample code that implements a parallel download of a file using byte range:

    public static void ParallelDownloadFile(string uri, string filePath, int chunkSize)
    {
        if (uri == null)
            throw new ArgumentNullException("uri");

        // determine file size first
        long size = GetFileSize(uri);

        using (FileStream file = new FileStream(filePath, FileMode.Create, FileAccess.Write, FileShare.Write))
        {
            file.SetLength(size); // set the length first

            object syncObject = new object(); // synchronize file writes
            Parallel.ForEach(LongRange(0, 1 + size / chunkSize), (start) =>
            {
                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
                request.AddRange(start * chunkSize, start * chunkSize + chunkSize - 1);
                HttpWebResponse response = (HttpWebResponse)request.GetResponse();

                lock (syncObject)
                {
                    using (Stream stream = response.GetResponseStream())
                    {
                        file.Seek(start * chunkSize, SeekOrigin.Begin);
                        stream.CopyTo(file);
                    }
                }
            });
        }
    }

    public static long GetFileSize(string uri)
    {
        if (uri == null)
            throw new ArgumentNullException("uri");

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.Method = "HEAD";
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        return response.ContentLength;
    }

    private static IEnumerable<long> LongRange(long start, long count)
    {
        long i = 0;
        while (true)
        {
            if (i >= count)
            {
                yield break;
            }
            yield return start + i;
            i++;
        }
    }

And sample usage:

    private static void TestParallelDownload()
    {
        string uri = "http://localhost/welcome.png";
        string fileName = Path.GetFileName(uri);

        ParallelDownloadFile(uri, fileName, 10000);
    }

PS: I'd be curious to know if it's really more interesting to do this parallel thing rather than to just use WebClient.DownloadFile... Maybe in slow network scenarios?