Count, size, length...too many choices in Ruby?

For arrays and hashes size is an alias for length. They are synonyms and do exactly the same thing.

count is more versatile - it can take an element or predicate and count only those items that match.

    > [1,2,3].count{|x| x > 2 }
    => 1

In the case where you don't provide a parameter to count it has basically the same effect as calling length. There can be a performance difference though.
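As a quick illustration of the forms mentioned so far, here is a minimal sketch using made-up sample data (the variable names `numbers` and `inventory` are just for this example):

```ruby
numbers = [1, 2, 2, 3]

puts numbers.length                 # => 4
puts numbers.size                   # => 4 (same as length)
puts numbers.count                  # => 4 (no argument, no block)
puts numbers.count(2)               # => 2 (elements equal to 2)
puts numbers.count { |x| x > 2 }    # => 1 (elements matching the block)

inventory = { apples: 3, pears: 0 }

puts inventory.length               # => 2
puts inventory.count                # => 2
```

All of the no-argument calls agree on the result; only how that result is computed can differ.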

We can see from the source code for Array that they do almost exactly the same thing. Here is the C code for the implementation of array.length:

    static VALUE
    rb_ary_length(VALUE ary)
    {
        long len = RARRAY_LEN(ary);
        return LONG2NUM(len);
    }

And here is the relevant part from the implementation of array.count:

    static VALUE
    rb_ary_count(int argc, VALUE *argv, VALUE ary)
    {
        long n = 0;

        if (argc == 0) {
            VALUE *p, *pend;

            if (!rb_block_given_p())
                return LONG2NUM(RARRAY_LEN(ary));

            // etc..
        }
    }

The code for array.count does a few extra checks but in the end calls the exact same code: LONG2NUM(RARRAY_LEN(ary)).

Hashes (source code), on the other hand, don't seem to implement their own optimized version of count, so the implementation from Enumerable (source code) is used, which iterates over all the elements and counts them one by one.
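One way to see the Enumerable behaviour for yourself is a small sketch like the following. `TrackedList` is a hypothetical class written only for this demonstration; it records how many elements its `each` method hands out, so we can check whether count walked the whole collection.

```ruby
# Hypothetical collection (not part of Ruby) that records how many
# elements its each method yields.
class TrackedList
  include Enumerable

  attr_reader :yielded

  def initialize(items)
    @items = items
    @yielded = 0
  end

  def each
    @items.each do |item|
      @yielded += 1
      yield item
    end
  end
end

list = TrackedList.new([10, 20, 30])
total = list.count   # Enumerable#count with no argument and no block

puts total           # => 3
puts list.yielded    # => 3, every element was visited to get the total

# Shows which class or module supplies count on the Ruby you are running.
puts Array.instance_method(:count).owner
puts Hash.instance_method(:count).owner
```

The last two lines are a handy way to check the claim above on your own Ruby version rather than taking it on faith.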

In general I'd advise using length (or its alias size) rather than count if you want to know how many elements there are altogether.

With ActiveRecord, on the other hand, there are important differences. Check out this post:

Counting ActiveRecord associations: count, size or length?


There is a crucial difference for applications which make use of database connections.

With many ORMs (ActiveRecord, DataMapper, etc.) the general understanding is that one way of counting loads all of the items from the database ('select * from mytable') and then gives you the number of resulting objects, whereas .count generates a single query ('select count(*) from mytable'), which is considerably faster. In ActiveRecord specifically, .length is the method that always loads the records; .size issues a count query when the collection has not been loaded yet and counts the in-memory records otherwise.

Because these ORMs are so prevalent, I follow the principle of least astonishment. In general, if I have something in memory already, I use .size, and if my code will generate a request to a database (or an external service via an API), I use .count.
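To make the cost difference concrete without needing a database, here is a rough sketch. `FakeRelation` and its methods `count_by_loading` and `count_by_query` are hypothetical names invented for this example, not the API of ActiveRecord or any other ORM; the class only logs the SQL a real ORM might send.

```ruby
# Hypothetical stand-in for an ORM-backed collection. It is NOT a real
# ORM API; it only logs the queries a real one might issue.
class FakeRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend these rows live in a database table
    @queries = []     # log of the "SQL" that would have been sent
    @loaded = nil     # in-memory copy, filled on first load
  end

  # Strategy 1: fetch every row, then count the objects in memory.
  def count_by_loading
    load_rows.length
  end

  # Strategy 2: ask the "database" for just the number.
  def count_by_query
    @queries << "select count(*) from mytable"
    @rows.length
  end

  private

  def load_rows
    @loaded ||= begin
      @queries << "select * from mytable"
      @rows.dup
    end
  end
end

relation = FakeRelation.new(%w[a b c])

puts relation.count_by_query     # => 3
puts relation.count_by_loading   # => 3
puts relation.queries.inspect
# => ["select count(*) from mytable", "select * from mytable"]
```

Both strategies return the same number, but the second query drags every row across the wire, which is where the performance gap comes from on large tables.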


In most cases (e.g. Array or String) size is an alias for length.

count normally comes from Enumerable and can take an optional predicate block. Thus enumerable.count {cond} is [roughly] (enumerable.select {cond}).length -- it can of course bypass the intermediate structure, as it only needs the number of matching elements.
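A small sketch of that equivalence, using made-up sample data (`words` is just an illustrative name):

```ruby
words = %w[apple fig banana kiwi]

# Counts matches directly, without building a new array.
via_count = words.count { |w| w.length > 3 }

# Builds an intermediate array of matches, then measures it.
via_select = words.select { |w| w.length > 3 }.length

puts via_count    # => 3
puts via_select   # => 3

# size and length agree on a String as well.
puts "hello".size     # => 5
puts "hello".length   # => 5

# count works on any Enumerable, for example a Range.
puts (1..10).count { |n| n.even? }   # => 5
```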

Note: I am not sure if count forces an evaluation of the enumeration if the block is not specified or if it short-circuits to the length if possible.

Edit (and thanks to Mark's answer!): count without a block (at least for Arrays) does not force an evaluation. I suppose without formal behavior it's "open" for other implementations, if forcing an evaluation without a predicate ever even really makes sense anyway.
