List and description of all packages in CRAN from within R

I actually think you want "Package" and "Title", as the "Description" can run to several lines. So here is the former; just put "Description" in the final subset if you really want it:

    R> ## from http://developer.r-project.org/CRAN/Scripts/depends.R and adapted
    R>
    R> require("tools")
    R>
    R> getPackagesWithTitle <- function() {
    +     contrib.url(getOption("repos")["CRAN"], "source")
    +     description <- sprintf("%s/web/packages/packages.rds",
    +                            getOption("repos")["CRAN"])
    +     con <- if(substring(description, 1L, 7L) == "file://") {
    +         file(description, "rb")
    +     } else {
    +         url(description, "rb")
    +     }
    +     on.exit(close(con))
    +     db <- readRDS(gzcon(con))
    +     rownames(db) <- NULL
    +
    +     db[, c("Package", "Title")]
    + }
    R>
    R>
    R> head(getPackagesWithTitle())               # I shortened one Title here...
         Package              Title
    [1,] "abc"                "Tools for Approximate Bayesian Computation (ABC)"
    [2,] "abcdeFBA"           "ABCDE_FBA: A-Biologist-Can-Do-Everything of Flux ..."
    [3,] "abd"                "The Analysis of Biological Data"
    [4,] "abind"              "Combine multi-dimensional arrays"
    [5,] "abn"                "Data Modelling with Additive Bayesian Networks"
    [6,] "AcceptanceSampling" "Creation and evaluation of Acceptance Sampling Plans"
    R>
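To make the "put Description in the final subset" remark concrete, here is a minimal sketch on a tiny hand-made matrix standing in for the object read from packages.rds. The name `toy_db` and its two rows are illustrative assumptions, not real CRAN data; selecting columns by name should work the same way on the real object.

```r
# Toy stand-in for the object returned by readRDS(); contents are made up
toy_db <- cbind(
  Package     = c("pkgA", "pkgB"),
  Title       = c("A Short Title", "Another Short Title"),
  Description = c("A longer description\nspanning two lines.",
                  "Another longer description.")
)

# Same subsetting as in the function above, with "Description" added
with_desc <- toy_db[, c("Package", "Title", "Description")]

# Descriptions often contain line breaks; collapse whitespace for display
with_desc[, "Description"] <- gsub("\\s+", " ", with_desc[, "Description"])
with_desc
```

Collapsing the whitespace is optional, but it keeps each package on one printed row.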

Dirk has provided a terrific answer. After finishing my solution and then seeing his, I debated for some time whether to post mine for fear of looking silly, but I decided to post it anyway for two reasons:

it is informative to beginning scrapers like myself
it took me a while to do and so why not :)

I approached this thinking I'd need to do some web scraping and chose crantastic as the site to scrape from. First I'll provide the code and then two scraping resources that have been very helpful to me as I learn:

    library(RCurl)
    library(XML)

    URL <- "http://cran.r-project.org/web/checks/check_summary.html#summary_by_package"
    packs <- na.omit(XML::readHTMLTable(doc = URL, which = 2, header = T,
        strip.white = T, as.is = FALSE, sep = ",", na.strings = c("999",
            "NA", " "))[, 1])

    Trim <- function(x) {
        gsub("^\\s+|\\s+$", "", x)
    }

    packs <- unique(Trim(packs))

    u1 <- "http://crantastic.org/packages/"
    len.samps <- 10 #for demo purpose; use:
    #len.samps <- length(packs) # for all of them
    URL2 <- paste0(u1, packs[seq_len(len.samps)])

    scraper <- function(urls){ #function to grab description
        doc   <- htmlTreeParse(urls, useInternalNodes=TRUE)
        nodes <- getNodeSet(doc, "//p")[[3]]
        return(nodes)
    }

    info <- sapply(seq_along(URL2), function(i) try(scraper(URL2[i]), TRUE))

    info2 <- sapply(info, function(x) { #replace errors with NA
            if(class(x)[1] != "XMLInternalElementNode"){
                NA
            } else {
                Trim(gsub("\\s+", " ", xmlValue(x)))
            }
        })

    pack_n_desc <- data.frame(package=packs[seq_len(len.samps)],
        description=info2) #make a dataframe of it all
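The error-handling step (run each scrape inside try() and turn failures into NA) can be seen in isolation with a stand-in function instead of a real HTTP request. `fake_scraper` and its inputs are hypothetical. This sketch also swaps in inherits(x, "try-error") to detect failures, rather than checking the class of the successful result as the code above does.

```r
# Hypothetical stand-in for scraper(): fails for one input, succeeds otherwise
fake_scraper <- function(pkg) {
  if (pkg == "missing") stop("page not found")
  paste("  Description of", pkg, "  ")
}

Trim <- function(x) gsub("^\\s+|\\s+$", "", x)

pkgs <- c("abc", "missing", "abind")

# Run every call inside try() so one failure does not abort the loop
raw <- lapply(pkgs, function(p) try(fake_scraper(p), silent = TRUE))

# Replace failures with NA and trim the successful results
descr <- vapply(raw, function(x) {
  if (inherits(x, "try-error")) NA_character_ else Trim(x)
}, character(1))

pack_n_desc_demo <- data.frame(package = pkgs, description = descr)
pack_n_desc_demo
```

Using vapply() with character(1) guarantees a plain character vector even when every call fails, which sapply() does not.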

Resources:

I wanted to try doing this with an HTML scraper (rvest) as an exercise, since the available.packages() in the OP doesn't contain the package descriptions.

    library('rvest')

    url <- 'https://cloud.r-project.org/web/packages/available_packages_by_name.html'
    webpage <- read_html(url)
    data_html <- html_nodes(webpage,'tr td')
    length(data_html)

    P1 <- html_nodes(webpage,'td:nth-child(1)') %>% html_text(trim=TRUE)  # XML: The Package Name
    P2 <- html_nodes(webpage,'td:nth-child(2)') %>% html_text(trim=TRUE)  # XML: The Description

    P1 <- P1[lengths(P1) > 0 & P1 != ""]  # Remove NULL and empty ("") items
    length(P1); length(P2);

    mdf <- data.frame(P1, P2, row.names=NULL)
    colnames(mdf) <- c("PackageName", "Description")

    # This is the problem! It lists large sets column-by-column,
    # instead of row-by-row. Try with the full list to see what happens.
    print(mdf, right=FALSE, row.names=FALSE)

    # PackageName Description
    # A3          Accurate, Adaptable, and Accessible Error Metrics for Predictive\nModels
    # abbyyR      Access to Abbyy Optical Character Recognition (OCR) API
    # abc         Tools for Approximate Bayesian Computation (ABC)
    # abc.data    Data Only: Tools for Approximate Bayesian Computation (ABC)
    # ABC.RAP     Array Based CpG Region Analysis Pipeline
    # ABCanalysis Computed ABC Analysis

    # For small sets we can use either:
    # mdf[1:6,] #or
    # head(mdf, 6)

However, although this works quite well for a small subset of the data frame, I ran into a display problem with the full list, where the data would be shown either column-by-column or unaligned. It would have been great to have this paged and properly formatted in a new window somehow. I tried using page, but I couldn't get it to work very well.
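One possible way around the display problem, sketched here on a made-up two-row data frame: collapse embedded line breaks, truncate long descriptions, pad the package names to a common width, and write the lines to a temporary file that can be opened in R's pager with file.show(). The width of 60 characters and the toy data are assumptions; I have not tried this against the full list.

```r
# Toy data standing in for the scraped table (contents are illustrative)
mdf <- data.frame(
  PackageName = c("A3", "abc.data"),
  Description = c(
    "Accurate, Adaptable, and Accessible Error Metrics for Predictive\nModels",
    "Data Only: Tools for Approximate Bayesian Computation (ABC)"
  ),
  stringsAsFactors = FALSE
)

# Collapse embedded newlines and runs of whitespace, then cut to 60 characters
descr <- gsub("\\s+", " ", mdf$Description)
descr <- strtrim(descr, 60)

# format() pads character vectors to a common width, left-justified
lines <- paste(format(mdf$PackageName), descr)

# Write to a temp file; file.show() opens it in the pager when interactive
out <- tempfile(fileext = ".txt")
writeLines(lines, out)
if (interactive()) file.show(out)
```

Because each package becomes a single line of bounded width, the pager shows the list row-by-row regardless of how many rows there are.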

EDIT: The recommended method is not the above, but rather Dirk's suggestion (from the comments below):

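    db <- tools::CRAN_package_db()
    colnames(db)
    # Select by name rather than by position (the original used db[,1] and db[,52]);
    # column positions are not guaranteed to stay the same
    mdf <- db[, c("Package", "Description")]
    print(mdf, right=FALSE, row.names=FALSE)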

However, this still suffers from the display problem mentioned...
