I recently spent a very valuable two hours listening to Tim Wilson’s introduction to R at the eMetrics Summit in Berlin. Given that when I last wrote about setting up for using the Reporting API, I ignorantly omitted R completely, and that Tim made it sound pretty easy, I shall try to remedy that mistake.
The goal of this article is to get you up to speed with Analytics and R as quickly as possible, ideally to the point where you can see some data or a chart.
Prerequisites
Before we start pulling data with R, we obviously need to install it.
Head over to the R Project web site, then follow the link to the download page, which is actually an interstitial that asks you to select a mirror. Since I am in Switzerland, I shall use the mirror hosted by ETH Zürich.
I suggest you download a precompiled binary package. I downloaded “R for Windows”, and I also installed R on a spare Raspberry Pi running Raspbian “stretch” by simply issuing the command sudo apt install r-base.
R is just a language. It is a lot easier to work with if you have a bit of padding around it. Tim used “R Studio” in his talk, and so shall I.
Point your browser to the R Studio web site, then download the correct version for you.
Install R first, then R Studio.
When you launch R Studio, it should be able to find R, then give you a user interface with three elements.
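To check that R Studio really did find your R installation, type a command into the Console and press Enter, for example:
R.version.string
The Console should reply with the version of R you just installed.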
Packages
Time to install a couple of packages…
R packages are distributed via “CRAN”, the Comprehensive R Archive Network. There are thousands of packages available to do all sorts of things with R. For our purposes, we need one package: RSiteCatalyst by Randy Zwitch and others.
The easiest way to install that package is to click into the Console, then issue a command:
install.packages('RSiteCatalyst')
R Studio will download the package (plus possibly some dependencies), then install it.
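If you want to make sure the installation worked, ask R for the package version:
packageVersion('RSiteCatalyst')
If that prints a version number rather than an error, you are all set.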
While we’re at it, why not install ggplot2?
Actually, if you haven’t installed “RSiteCatalyst” yet, just install both in one fell swoop:
install.packages(c('RSiteCatalyst', 'ggplot2'))
First Steps
We’re now ready for a couple of baby steps.
In order to use the packages that we installed, we need to load them, like so:
library('RSiteCatalyst')
library('ggplot2')
The Console should hesitate a tiny bit, then give you a new prompt.
Next, we need to authenticate against the Reporting API. With the legacy Web Services credentials (your user name plus company, and the shared secret from the Admin Console), the call looks like this:
SCAuth('jexner:Jan Exner Inc', '12345678901234567890123456789012')
If you let R Studio save your session when quitting, your Console history is preserved, and you can recall this command later by pressing the up arrow key in the Console, so you won’t have to type it again.
You’re now authenticated and you can pull data from Analytics.
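A quick aside: if you plan to share your scripts, you probably do not want the shared secret sitting in them. A minimal sketch of one alternative, assuming you have stored your credentials in environment variables (the names AA_USER and AA_SECRET are my own invention, not anything official):
#Read credentials from environment variables instead of hard-coding them
SCAuth(Sys.getenv('AA_USER'), Sys.getenv('AA_SECRET'))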
As usual, let’s start with a really simple command: GetReportSuites(). It’ll return a list of all the Report Suites that you have access to:
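Here is how that might look (the name report_suites is just my choice, and head() shows the first few rows):
report_suites <- GetReportSuites()
head(report_suites)
The result is a data frame containing the Report Suite IDs along with their friendly names.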
Now let’s pull some actual data. QueueOvertime retrieves a trended report; here I’m asking for daily Page Views over a date range, with anomaly detection switched on:
pageviews_w_forecast <- QueueOvertime('jexnerweb4dev', date.from = "2016-01-01", date.to = "2016-11-13", metrics = "pageviews", date.granularity = 'day', anomaly.detection = 1)
It’ll take some time, then you’ll see the “pageviews_w_forecast” data in the Environment pane on the top right.
If you click the little table icon next to the data in the Environment pane, or use the View(pageviews_w_forecast) command, you’ll get a table on the top left.
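If you prefer the Console, the usual inspection commands work just as well:
head(pageviews_w_forecast)
summary(pageviews_w_forecast)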
Following Randy’s example and tweaking it a bit, I end up with a nice plot of anomalies.
I hope you’ll be having a blast with R and Analytics data, and please feel free to post your results!
Note: Randy has some pretty cool stuff on his site. How about some R code that creates a complete variable map of all your Report Suites in a single Excel file?
And here is the complete code for your perusal:
#Load libraries
library('RSiteCatalyst')
library('ggplot2')

#Authenticate
SCAuth('jexner:Jan Exner Inc', 'xxxxxxxxxxxxxxxxxxxxxxxxxx')

#Get Page View data plus forecast
pageviews_w_forecast <- QueueOvertime('jexnerweb4dev', date.from = "2016-10-01", date.to = "2016-11-13", metrics = "pageviews", date.granularity = 'day', anomaly.detection = 1)

#Plot data using ggplot2
#Combine year/month/day together into POSIX
pageviews_w_forecast$date <- ISOdate(pageviews_w_forecast$year, pageviews_w_forecast$month, pageviews_w_forecast$day)

#Convert columns to numeric
pageviews_w_forecast$pageviews <- as.numeric(pageviews_w_forecast$pageviews)
pageviews_w_forecast$upperBound.pageviews <- as.numeric(pageviews_w_forecast$upperBound.pageviews)
pageviews_w_forecast$lowerBound.pageviews <- as.numeric(pageviews_w_forecast$lowerBound.pageviews)

#Calculate points crossing UCL or LCL
pageviews_w_forecast$outliers <- ifelse(pageviews_w_forecast$pageviews > pageviews_w_forecast$upperBound.pageviews,
  pageviews_w_forecast$pageviews,
  ifelse(pageviews_w_forecast$pageviews < pageviews_w_forecast$lowerBound.pageviews,
    pageviews_w_forecast$pageviews, NA))

#Add LCL and UCL labels
LCL <- vector(mode = "character", nrow(pageviews_w_forecast))
LCL[nrow(pageviews_w_forecast)] <- "LCL"
UCL <- vector(mode = "character", nrow(pageviews_w_forecast))
UCL[nrow(pageviews_w_forecast)] <- "UCL"
pageviews_w_forecast <- cbind(pageviews_w_forecast, LCL)
pageviews_w_forecast <- cbind(pageviews_w_forecast, UCL)

#Create ggplot with actual, UCL, LCL, outliers
ggplot(pageviews_w_forecast, aes(date)) +
  theme_bw(base_family = "Garamond") +
  theme(text = element_text(size = 20)) +
  ggtitle("Page Views for webanalyticsfordevelopers.com\n") +
  geom_line(aes(y = pageviews), colour = "grey40") +
  geom_point(aes(y = pageviews), colour = "grey40", size = 3) +
  geom_point(aes(y = outliers), colour = "red", size = 3) +
  geom_line(aes(y = upperBound.pageviews), colour = "green4", linetype = "dashed") +
  geom_line(aes(y = lowerBound.pageviews), colour = "green4", linetype = "dashed") +
  xlab("\nDate\n\nNote: Upper and Lower Control Limits calculated by Adobe Analytics API") +
  ylab("Page Views\n") +
  geom_text(aes(label = UCL, family = "Garamond"), y = pageviews_w_forecast$upperBound.pageviews, size = 4.5, hjust = -.1) +
  geom_text(aes(label = LCL, family = "Garamond"), y = pageviews_w_forecast$lowerBound.pageviews, size = 4.5, hjust = -.1)
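One last tip: if you want the chart as a file, ggplot2’s ggsave() saves the most recent plot to disk (the file name and dimensions here are just examples):
#Save the last plot to a PNG file
ggsave('pageviews_w_forecast.png', width = 12, height = 6)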