Ruby - How to get the name of a file with open-uri?


The filename is stored in the header field named Content-Disposition. However, decoding this field can be a little tricky. See this discussion, for example:

How to encode the filename parameter of Content-Disposition header in HTTP?

For open-uri you can access all the header fields through the meta accessor of the returned object:

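require 'open-uri'
# URI.open replaces the bare open(url), which no longer accepts URLs as of Ruby 3.0
f = URI.open('http://soundcloud.com/stereo-foo/cohete-amigo/download')
f.meta['content-disposition']
# => "attachment;filename=\"Stereo Foo - Cohete Amigo.wav\""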

So in order to decode something like that you could do this:

cd = f.meta['content-disposition']
filename = cd.match(/filename=(\"?)(.+)\1/)[2]
# => "Stereo Foo - Cohete Amigo.wav"

It works for your particular case, and it also works if the quotes are not present. In more complex Content-Disposition cases, such as UTF-8 filenames, you could run into trouble. I'm not sure how often UTF-8 is used, or whether SoundCloud ever uses it, so you may not need to worry about that (neither confirmed nor tested).
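If UTF-8 names do turn up, they are usually sent in the extended filename*= form (RFC 5987 / RFC 6266). Here is a rough sketch of handling both forms. The helper name filename_from_content_disposition is made up for illustration, it is not part of open-uri, and it assumes any filename* value is UTF-8 regardless of the declared charset:

```ruby
require 'uri'

# Hypothetical helper (not part of open-uri): pull a filename out of a
# Content-Disposition header value. Assumes any filename* value is UTF-8.
def filename_from_content_disposition(header)
  return nil if header.nil?

  # Extended form, e.g. filename*=UTF-8''Caf%C3%A9.wav
  if (m = header.match(/filename\*\s*=\s*[^';\s]*'[^']*'([^;\s]+)/i))
    # Protect a literal "+" so it is not decoded as a space.
    return URI.decode_www_form_component(m[1].gsub('+', '%2B'))
  end

  # Plain form, quoted or unquoted: filename="a b.wav" or filename=a.wav
  if (m = header.match(/filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i))
    return m[1] || m[2]
  end

  nil
end
```

Called with f.meta['content-disposition'] from the example above, it should return the same name as the simple regex, and it returns nil instead of raising when the header is missing or has no filename parameter.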

You could also use a more advanced web-crawling framework like Mechanize, and trust it to do the decoding for you:

require 'mechanize'
agent = Mechanize.new
file = agent.get('http://soundcloud.com/stereo-foo/cohete-amigo/download')
file.filename
# => "Stereo_Foo_-_Cohete_Amigo.wav"


File.basename(open(source_url)) won't work because open(source_url) returns an IO-like object (a Tempfile or StringIO), not a string as File.basename expects.

File.basename(source_url)

would have a better chance of working, unless the URL is using some path/to/service/with/parameters/in/line/like/this type encoding.

Ruby's URI library has useful tools to help here though. Something like:

File.basename(URI.parse(source_url).path)

would be a starting point. For instance:

require 'uri'
File.basename(URI.parse('http://www.example.com/path/to/file/index.html').path)
# => "index.html"

and:

File.basename(URI.parse('http://www.example.com/path/to/file/index.html?foo=bar').path)
# => "index.html"
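A couple of edge cases are worth guarding against: a URL with no path (or just "/") has no usable basename, and percent-escapes such as %20 are left undecoded by URI.parse. A small sketch, where filename_from_url is a hypothetical helper name and returning nil for "no name" is just one possible choice:

```ruby
require 'uri'

# Hypothetical helper: derive a filename from the URL path alone.
# Returns nil when the path carries no usable name.
def filename_from_url(url)
  name = File.basename(URI.parse(url).path.to_s)

  # File.basename gives "" for an empty path and "/" for a bare root path.
  return nil if name.empty? || name == '/'

  # Decode percent-escapes; protect a literal "+" so it is not turned into a space.
  URI.decode_www_form_component(name.gsub('+', '%2B'))
end
```

With this, 'http://www.example.com/files/my%20song.wav' should yield "my song.wav", while 'http://www.example.com/' yields nil so the caller can fall back to a header or a default name.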

Do you know if I can retrieve the file size too, and how?

A great way to test HTTP stuff locally is to run gem server from the command line and let RubyGems fire up a little web server for its documentation:

require 'open-uri'
# URI.open replaces the bare open(url), which no longer accepts URLs as of Ruby 3.0
html_doc = URI.open('http://0.0.0.0:8808/') do |io|
  puts io.size
  io.read
end
puts html_doc.size
# => 114350
# => 114350

When you use a block with OpenURI's open method, the block variable gives you access to a lot of information about the connection. It is a Tempfile (or a StringIO for small responses), so you can find out the size of the incoming file using size.

That's OK for small files, but if you're pulling in a big file you might want to investigate using Net::HTTP to send a HEAD request, whose response might include the size. I say might because sometimes the server doesn't know how much will be returned, as with dynamic content, or content returned by a CGI or sub-service that doesn't bother to say.

The advantage of a HEAD request is that the server doesn't return the entire content, just the headers. So, in the past, I've prefaced a request with a HEAD to see if I could get the data I needed. If not, I'd be forced to pull in the full response using a normal GET.
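Something along these lines would do the HEAD-first check. Treat it as a sketch: the helper names content_length_of and remote_file_size are invented here, redirects are not followed, and a nil result simply means the server didn't report a size:

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: read Content-Length from a response, or nil if absent.
def content_length_of(response)
  length = response['content-length']
  length && length.to_i
end

# Hypothetical helper: send a HEAD request and return the advertised size in
# bytes. Returns nil when the request is not successful or the server omits
# the header. Redirects are not followed in this sketch.
def remote_file_size(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess) ? content_length_of(response) : nil
end

# Usage (requires network access):
#   size = remote_file_size('http://www.example.com/path/to/file/index.html')
#   puts size ? "#{size} bytes" : 'size unknown; fall back to a full GET'
```

Splitting out content_length_of keeps the header handling separate from the network call, so it can be reused on the response from a normal GET when the HEAD comes back without a size.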