When to use R, when to use SQL? When to use R, when to use SQL? sql sql

When to use R, when to use SQL?


I usually do the data manipulations in SQL until the data I want is in a single table, and then, I do the rest in R.Only when there is a performance issuedo I start to move some of the computations to the database.This is already what you are doing.

Computations involving timestamps often become unreadable in SQL(the "analytic functions", similar to ddply, are supposed to simplify this,but I think they are not available in MySQL).

However, your example can probably be written entirely in SQL as follows (not tested).

-- Join the tables and compute the maximumCREATE VIEW t1 ASSELECT a.id    AS a_id,        a.group AS a_group,       b.id    AS b_id,       a.time  AS a_time,        a.time - MAX(b.time) AS dtFROM   a, b, abWHERE  a.id = ab.a_id AND b.id = ab.b_idAND    b.time < a.timeGROUP  BY a.id, a.group, b.id;-- Extract the desired rowsCREATE VIEW t2 AS SELECT t1.*FROM t1, (SELECT group, MIN(dt) AS min_dt FROM t1) XWHERE t1.a_id = X.a_id AND   t1.b_id = X.b_id AND   t1.a_group = X.a.group;


Why not both SQL and R — within R using the sqldf package? These examples show how you can either use the sqldf function with either R data frames or through a connection to an existing database. That way, you have the flexibility to do either as the idiom sees fit.