Undocumented RCurl "progressfunction" with URL redirection


I think RCurl is faithfully forwarding the values from curl; e.g., as documented for curl_easy_setopt under CURLOPT_PROGRESSFUNCTION, unknown values are passed as 0. If there's a bug, then it's in curl. Here's a simple program (see here to get going):

#include <stdio.h>
#include <curl/curl.h>

/* Progress callback: libcurl expects an int return value;
   returning non-zero aborts the transfer. */
static int progress(void *clientp, double dltotal, double dlnow,
                    double ultotal, double ulnow)
{
    (void)clientp; /* unused */
    fprintf(stderr, "PROGRESS: %.0f %.0f %.0f %.0f\n",
            dltotal, dlnow, ultotal, ulnow);
    return 0;
}

int main(int argc, char **argv)
{
    CURL *curl;
    CURLcode res;

    if (argc < 2) {
        fprintf(stderr, "usage: %s URL\n", argv[0]);
        return 1;
    }
    curl = curl_easy_init();
    if (!curl)
        return 1;
    curl_easy_setopt(curl, CURLOPT_URL, argv[1]);
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L); /* follow redirects */
    curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);     /* enable progress callbacks */
    curl_easy_setopt(curl, CURLOPT_PROGRESSFUNCTION, progress);
    res = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return res == CURLE_OK ? 0 : 1;
}

and its output:

$ clang curl.c -lcurl && ./a.out http://google.com > /dev/null
PROGRESS: 0 0 0 0
PROGRESS: 0 0 0 0
PROGRESS: 219 219 0 0
PROGRESS: 219 219 0 0
PROGRESS: 219 219 0 0
PROGRESS: 219 219 0 0
PROGRESS: 219 0 0 0
PROGRESS: 219 2097 0 0
PROGRESS: 219 6441 0 0
PROGRESS: 219 12233 0 0
PROGRESS: 219 20921 0 0
PROGRESS: 219 32505 0 0
PROGRESS: 219 45360 0 0
PROGRESS: 219 45360 0 0
PROGRESS: 219 45360 0 0
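In the output above, dlnow falls from 219 back to 0 at the point where curl starts reading the redirected response, while dltotal keeps the stale value 219. A progress callback could use that drop as a signal that a new response has begun. The following is only a minimal sketch of that heuristic: the names `leg_tracker` and `note_progress` are hypothetical, and treating "dlnow decreased" as "a redirect hop started" is an assumption drawn from this one trace, not documented libcurl behaviour.

```c
/* Hypothetical helper: remembers the last dlnow value seen so that a
   decrease can be interpreted as the start of a new response. */
struct leg_tracker {
    double last_dlnow; /* dlnow from the previous callback invocation */
    int legs;          /* number of restarts observed so far */
};

/* Call with the dlnow value from each progress callback invocation.
   Returns 1 if dlnow went backwards (assumed: a new response began),
   0 otherwise. */
int note_progress(struct leg_tracker *t, double dlnow)
{
    int restarted = dlnow < t->last_dlnow;
    if (restarted)
        t->legs++;
    t->last_dlnow = dlnow;
    return restarted;
}
```

A tracker initialised to `{0.0, 0}` could be passed to the callback through CURLOPT_PROGRESSDATA, and the callback could stop trusting dltotal once `note_progress` returns 1.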


There are various relevant answers here, for example:

In short, it's not possible to create a progress bar for a site that uses chunked transfer encoding (i.e., the situations where there is no "Content-Length" header).

You'll have to either skip the progress bar in those cases (see, as an example, my answer to your previous question) or set a very high initial overestimate for the file size, knowing that the bar will never actually reach 100%.
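The "skip the progress bar" option can be sketched in C, the language of the test program earlier in this thread, as a helper that decides whether a percentage can be shown at all. The names `progress_percent` and `report_progress` are hypothetical. Treating `dlnow > dltotal` as "total unknown" is an assumption motivated by the stale total seen after a redirect, not something libcurl guarantees.

```c
#include <stdio.h>

/* Hypothetical helper: returns a percentage in 0..100, or -1 when the
   total is unknown (reported as 0) or cannot be trusted (smaller than
   the number of bytes already received). */
int progress_percent(double dltotal, double dlnow)
{
    if (dltotal <= 0.0 || dlnow > dltotal)
        return -1;
    return (int)(100.0 * dlnow / dltotal);
}

/* Prints a percentage when one is available, otherwise just the raw
   byte count, so the display never shows a misleading bar. */
void report_progress(double dltotal, double dlnow)
{
    int pct = progress_percent(dltotal, dlnow);
    if (pct < 0)
        fprintf(stderr, "%.0f bytes received\n", dlnow);
    else
        fprintf(stderr, "%d%%\n", pct);
}
```

Calling `report_progress(dltotal, dlnow)` from inside a progress callback would fall back to a byte counter for chunked responses instead of drawing a bar that cannot be filled correctly.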


Based also on your feedback (i.e., no rationale has been given for the described behaviour), this looks like an actual bug in curl.

One way to work around it in RCurl is to manually requery the server whenever a Location redirect header is found in the server's response.

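library(RCurl)

curlDown = function(url, curl = NULL) {
    if (is.null(curl)) curl = getCurlHandle()
    # collect the response headers so a redirect can be detected
    h = basicHeaderGatherer()
    x = getURL(url, curl = curl, noprogress = FALSE,
        headerfunction = h$update,
        progressfunction = function(down, up) cat(down, '\n'))
    # if the server sent a Location header, request the new URL ourselves
    loc = h$value()["Location"]
    if (!is.na(loc)) curlDown(loc)
}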

Now we query a server with a redirect:

# curlDown("http://www.google.com")
# 0 0
# 258 258
# 258 258
# 258 258
# 0 0
# 0 603
# 0 2003
# ... blah blah
# 0 44824
# 0 44824
# 0 44824

When the request is redirected from the main server to the country-specific server, the new server's response has no Content-Length header, so the total is reported as zero (consistent with RCurl's general behaviour).