Setting up for using the Reporting API – with R

I recently spent a very valuable two hours listening to Tim Wilson’s introduction to R at the eMetrics Summit in Berlin. Given that when I last wrote about setting up for using the Reporting API, I ignorantly omitted R completely, and that Tim made it sound pretty easy, I shall try to remedy that mistake.

The goal of this article is to get you up to speed with Analytics and R as quickly as possible, ideally to the point where you can see some data or a chart.

Prerequisites

Before we start pulling data with R, we obviously need to install it.

Head over to the R Project web site, then follow the link to the download page, which is actually an interstitial that asks you to select a mirror. Since I am in Switzerland, I shall use the mirror hosted by ETH Zürich.

I suggest you download a precompiled binary package. I downloaded “R for Windows”, and I also installed R on a spare Raspberry Pi running Raspbian “stretch” by simply issuing the command sudo apt install r-base.

R is just a language. It is a lot easier to work with it if you have a bit of padding around it. Tim used “R Studio” in his talk, and so shall I.

Point your browser to the R Studio web site, then download the correct version for you.

Install R first, then R Studio.

When you launch R Studio, it should be able to find R, then give you a user interface with three elements.

[screenshot]

R Studio

On the left, you can see the Console. The right hand side is split. Mine shows some data I loaded earlier on the top, plus a blank space for plots at the bottom. Directly after installation, your top right hand panel should also be empty.

Packages

Time to install a couple of packages…

R has a package repository called “CRAN” (the Comprehensive R Archive Network). There are thousands of packages available to do all sorts of things with R. For our purposes, we need one package: RSiteCatalyst by Randy Zwitch and others.

The easiest way to install that package is to click into the Console, then issue a command:

	install.packages('RSiteCatalyst')

R Studio will download the package (plus possibly some dependencies), then install it.

While we’re at it, why not install ggplot2?

Actually, if you haven’t installed “RSiteCatalyst” yet, just install both in one fell swoop:

	install.packages(c('RSiteCatalyst', 'ggplot2'))
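
If you rerun your setup often, it can be handy to check which packages are already installed before going to CRAN. Here is a minimal sketch using base R only. The helper name missing_packages is my own invention and not part of any package, and the actual install call is commented out so the snippet has no side effects.

```r
# Hypothetical helper (not part of RSiteCatalyst or base R):
# returns the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- rownames(utils::installed.packages())
  setdiff(pkgs, installed)
}

wanted <- c("RSiteCatalyst", "ggplot2")
to_install <- missing_packages(wanted)

# Only reach out to CRAN when something is actually missing.
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to actually install
}
```

Note that the package names are passed as one character vector, which is how install.packages() expects more than one package.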

First Steps

We’re now ready for a couple of baby steps.

In order to use the packages that we installed, we need to load them, like so:

	library('RSiteCatalyst')

The Console should hesitate a tiny bit, then give you a new prompt.

[screenshot]

R Studio loading the RSiteCatalyst library

Next, you need to authenticate. The Reporting API will only give you data if it knows and likes who you are!

	SCAuth('jexner:Jan Exner Inc', '12345678901234567890123456789012')

If you save the R Studio state on quitting, you can get back to this command later by pressing the up key in the Console, so you don’t have to type it again.
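
If you would rather not keep the shared secret in a script, one option is to read it from environment variables. The following is only a sketch under assumptions: the helper read_sc_credentials and the variable names SC_API_USER and SC_API_SECRET are made up for this example and are not a convention of RSiteCatalyst. SCAuth() is only called when both values are set and the package is installed.

```r
# Hypothetical helper: reads the API user name and shared secret from
# environment variables. The variable names are assumptions.
read_sc_credentials <- function(user_var = "SC_API_USER",
                                secret_var = "SC_API_SECRET") {
  user <- Sys.getenv(user_var)
  secret <- Sys.getenv(secret_var)
  if (!nzchar(user) || !nzchar(secret)) {
    return(NULL)  # at least one variable is unset or empty
  }
  list(user = user, secret = secret)
}

creds <- read_sc_credentials()
if (is.null(creds)) {
  message("Credentials not set, skipping authentication.")
} else if (requireNamespace("RSiteCatalyst", quietly = TRUE)) {
  # Same call as above, with the values taken from the environment
  RSiteCatalyst::SCAuth(creds$user, creds$secret)
}
```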

You’re now authenticated and you can pull data from Analytics.

As usual, let’s start with a really simple command, GetReportSuites(). It’ll return a list of all the Report Suites that you have access to:

[screenshot]

R Studio with Report Suite list

Working? Great! Let’s get some data!

	pageviews_w_forecast <- QueueOvertime('jexnerweb4dev', date.from = "2016-01-01", date.to="2016-11-13", metrics = "pageviews", date.granularity = 'day', anomaly.detection = 1)

It’ll take some time, then you’ll see the “pageviews_w_forecast” data on the top right.

If you click the little table icon on the top right next to the data, or use the View(pageviews_w_forecast) command (note the capital V, R is case-sensitive), you’ll get a table on the top left.

[screenshot]

R Studio with Data Visualisation

Now we have some data!

Following Randy’s example and tweaking it a bit, I end up with a nice plot of anomalies.

[screenshot]

R Studio with Plot
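
If you want to see how the anomalies are flagged without pulling any data, here is a small sketch in base R. The column names match the ones used in the complete code at the end of this article, but the values are invented, and I am assuming the metric columns arrive as character, which is why they get converted first.

```r
# Toy data in the shape used in this article; all values are made up.
toy <- data.frame(
  year  = c(2016, 2016, 2016, 2016),
  month = c(11, 11, 11, 11),
  day   = c(10, 11, 12, 13),
  pageviews            = c("120", "310", "95", "20"),
  upperBound.pageviews = c("200", "210", "190", "180"),
  lowerBound.pageviews = c("50", "60", "55", "40"),
  stringsAsFactors = FALSE
)

# Convert the metric columns from character to numeric
metric_cols <- c("pageviews", "upperBound.pageviews", "lowerBound.pageviews")
toy[metric_cols] <- lapply(toy[metric_cols], as.numeric)

# Combine year/month/day into a date-time (ISOdate defaults to 12:00 GMT)
toy$date <- ISOdate(toy$year, toy$month, toy$day)

# Keep the value where it falls outside the bounds, NA otherwise
outside <- toy$pageviews > toy$upperBound.pageviews |
  toy$pageviews < toy$lowerBound.pageviews
toy$outliers <- ifelse(outside, toy$pageviews, NA)

print(toy[, c("date", "pageviews", "outliers")])
```

Rows with NA in the outliers column simply produce no red point in the plot, which is why only the days outside the bounds stand out.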

This visualisation is not the most beautiful, to be honest, but I’m new to R, so all I can do for now is follow easy examples. More complex ones, like the brilliant Visualizing Website Structure With Network Graphs, are way beyond my capacity for now…

I hope you’ll be having a blast with R and Analytics data, and please feel free to post your results!

Note: Randy has some pretty cool stuff on his site. How about some R code that creates a complete variable map of all your Report Suites in a single Excel file?

And here is the complete code for your perusal:

#Load libraries
library('RSiteCatalyst')
library('ggplot2')

#Authenticate
SCAuth('jexner:Jan Exner Inc', 'xxxxxxxxxxxxxxxxxxxxxxxxxx')

#Get Page View data plus forecast
pageviews_w_forecast <- QueueOvertime('jexnerweb4dev', date.from = "2016-10-01", date.to="2016-11-13", metrics = "pageviews", date.granularity = 'day', anomaly.detection = 1)

#Plot data using ggplot2
library(ggplot2)

#Combine year/month/day together into POSIX
pageviews_w_forecast$date <- ISOdate(pageviews_w_forecast$year, pageviews_w_forecast$month, pageviews_w_forecast$day)

#Convert columns to numeric
pageviews_w_forecast$pageviews <- as.numeric(pageviews_w_forecast$pageviews)
pageviews_w_forecast$upperBound.pageviews <- as.numeric(pageviews_w_forecast$upperBound.pageviews)
pageviews_w_forecast$lowerBound.pageviews <- as.numeric(pageviews_w_forecast$lowerBound.pageviews)

#Calculate points crossing UCL or LCL
pageviews_w_forecast$outliers <- ifelse(pageviews_w_forecast$pageviews > pageviews_w_forecast$upperBound.pageviews, pageviews_w_forecast$pageviews,
  ifelse(pageviews_w_forecast$pageviews < pageviews_w_forecast$lowerBound.pageviews, pageviews_w_forecast$pageviews, NA))

#Add LCL and UCL labels
LCL <- vector(mode = "character", nrow(pageviews_w_forecast))
LCL[nrow(pageviews_w_forecast)] <- "LCL"
UCL <- vector(mode = "character", nrow(pageviews_w_forecast))
UCL[nrow(pageviews_w_forecast)] <- "UCL"
pageviews_w_forecast <- cbind(pageviews_w_forecast, LCL)
pageviews_w_forecast <- cbind(pageviews_w_forecast, UCL)

#Create ggplot with actual, UCL, LCL, outliers
ggplot(pageviews_w_forecast, aes(date)) +
  theme_bw(base_family="Garamond") +
  theme(text = element_text(size=20)) +
  ggtitle("Page Views for webanalyticsfordevelopers.com\n") +
  geom_line(aes(y = pageviews), colour = "grey40") +
  geom_point(aes(y = pageviews), colour = "grey40", size=3) +
  geom_point(aes(y = outliers), colour = "red", size=3) +
  geom_line(aes(y = upperBound.pageviews), colour = "green4", linetype = "dashed") +
  geom_line(aes(y = lowerBound.pageviews), colour = "green4", linetype = "dashed") +
  xlab("\nDate\n\nNote: Upper and Lower Control Limits calculated by Adobe Analytics API") +
  ylab("Page Views\n") +
  geom_text(aes(label=UCL, family = "Garamond"), y = pageviews_w_forecast$upperBound.pageviews, size=4.5, hjust = -.1) +
  geom_text(aes(label=LCL, family = "Garamond"), y = pageviews_w_forecast$lowerBound.pageviews, size=4.5, hjust = -.1)