Why is curl in Ruby slower than command-line curl?

This could be a fitting task for Typhoeus.

Something like this (untested):

    require 'typhoeus'

    def write_file(filename, data)
      file = File.new(filename, "wb")
      file.write(data)
      file.close
      # ... some other stuff
    end

    hydra = Typhoeus::Hydra.new(:max_concurrency => 20)

    batch_urls.each do |url_info|
      req = Typhoeus::Request.new(url_info[:url])
      req.on_complete do |response|
        write_file(url_info[:file], response.body)
      end
      hydra.queue req
    end

    hydra.run

Come to think of it, you might run into a memory problem because of the enormous number of files. One way to prevent that would be to never store the data in a variable but instead stream it to the file directly. You could use em-http-request for that.

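    require 'em-http-request'

    EventMachine.run {
      http = EventMachine::HttpRequest.new('http://www.website.com/').get
      # Each chunk is handed to the block as it arrives instead of being buffered
      http.stream { |chunk| print chunk }
      # ...
    }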
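The same idea of streaming straight to a file can also be sketched with Ruby's standard library. Note that this swaps Net::HTTP in for em-http-request, which is an assumption made to keep the example dependency-free. The helper name `stream_to_file` is hypothetical. It has no error handling and does not follow redirects, so treat it as a rough outline rather than a drop-in solution.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: write the body of `url` to `path` chunk by chunk,
# so the full response is never held in a Ruby variable.
def stream_to_file(url, path)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    # With a block, request_get yields the response before its body is read
    http.request_get(uri.request_uri) do |response|
      File.open(path, 'wb') do |file|
        # read_body with a block yields each chunk as it comes off the socket
        response.read_body { |chunk| file.write(chunk) }
      end
    end
  end
  path
end

# Usage (URL and file name are placeholders):
#   stream_to_file('http://www.website.com/', 'website.com.html')
```

Memory use stays roughly constant per download because only one chunk is in flight at a time.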

ruby http curl download curb

If you don't set an on_body handler, then curb will buffer the whole download in memory. If you're downloading files, you should use an on_body handler. If you want to download multiple files using Ruby Curl, try the Curl::Multi.download interface.

    require 'rubygems'
    require 'curb'

    urls_to_download = [
      'http://www.google.com/',
      'http://www.yahoo.com/',
      'http://www.cnn.com/',
      'http://www.espn.com/'
    ]

    path_to_files = [
      'google.com.html',
      'yahoo.com.html',
      'cnn.com.html',
      'espn.com.html'
    ]

    Curl::Multi.download(urls_to_download, {:follow_location => true}, {}, path_to_files) {|c,p|}

If you just want to download a single file:

    Curl::Easy.download('http://www.yahoo.com/')

Here is a good resource: http://gist.github.com/405779


Benchmarks have been done comparing curb with other methods such as HTTPClient. The winner in almost all categories was HTTPClient. In addition, there are documented scenarios where curb does NOT work with multi-threading.

I've had the same experience as you. I ran curl as a system command in 20+ concurrent threads, and it was 10 times faster than running curb in 20+ concurrent threads. No matter what I tried, this was always the case.

I have since switched to HTTPClient, and the difference is huge. It now runs as fast as 20 concurrent curl system commands, and uses less CPU as well.
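For reference, a setup with 20 concurrent threads like the one described here could be organized roughly as follows, using only Ruby's standard library. This is a minimal sketch, not the code that was actually benchmarked. The helper name `download_concurrently` and the `:url`/`:file` job keys are assumptions. The fetch step is passed in as a block, so a curl system command, HTTPClient, or any other client can be plugged in.

```ruby
# Hypothetical helper: run the given block for every job on a fixed number
# of worker threads and return a hash of job => block result.
def download_concurrently(jobs, workers: 20, &fetch)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # once closed and drained, pop returns nil and workers exit

  results = Queue.new
  threads = Array.new(workers) do
    Thread.new do
      while (job = queue.pop)
        results << [job, fetch.call(job)]
      end
    end
  end
  threads.each(&:join)

  # All workers are done, so the results queue can be drained safely
  Array.new(results.size) { results.pop }.to_h
end

# Usage with command-line curl (URLs and file names are placeholders):
#   jobs = [{ :url => 'http://www.yahoo.com/', :file => 'yahoo.com.html' }]
#   download_concurrently(jobs) do |job|
#     system('curl', '-s', '-o', job[:file], job[:url])
#   end
```

Keeping the fetch step separate from the worker pool makes it easy to time different clients under the same concurrency.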
