Summit is over, *phew*.
I love Summit in London! It is a great occasion to meet all those people I speak with throughout the year. …
For the past couple of years, I have been involved in preparing the analytics track at Summit, both as a speaker and as the “guy responsible for the content” of the track.
I have spent a lot of time trying to find speakers, looking at suggested subjects and discussing the content of individual sessions. Sometimes, I have coached speakers or just been there for them.
There were occasions when I had to jump in and present, and sometimes that went horribly wrong (in 2012, during the general opening session, the Internet dropped just when I was about to show off Discover Ad Hoc Analytics to about 2500 people, including our CEO, CTO, CFO, CMO, …).
For the official recording, they cut most of my part, but I managed to get hold of it in its raw, uncut form. I haven’t been able to watch it yet, though.
But now for something completely different.
API Best Practices
When the token-based access control system was dropped some time ago, those of us who used the Reporting API had to change how we used it.
When we had tokens, the goal was to get as much data out of the API with as few requests as possible, because the number of free requests per month was limited.
But then tokens went away, and now the goal can (must?) be to get data as quickly as possible. How do you do that? There is an API Best Practices chapter on the Developer Connection that suggests three things:
Reduce Number of Results per Request
Instead of pulling 50000 rows of data in one go, make 10 requests for 5000 rows each.
The elements element of the reportDescription has two attributes that allow you to specify which rows you need: top (how many rows to return) and startingWith (the first row to return).
In your first request, set startingWith=1. On the second request, set startingWith=5001, and so on.
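To make the paging concrete, here is a minimal Python sketch that only builds the elements entries for each page; it does not call the API. The helper name paged_elements and the element id "page" are made up for illustration, and it assumes the row count goes into an attribute called top next to startingWith.

```python
def paged_elements(element_id, total_rows, page_size):
    """Yield one `elements` entry per page of at most `page_size` rows.

    Assumption: the row count is passed as `top` and the first row
    as `startingWith` (1-based), as described in the text above.
    """
    for start in range(1, total_rows + 1, page_size):
        # The last page may be shorter than page_size.
        rows = min(page_size, total_rows - start + 1)
        yield {"id": element_id, "top": rows, "startingWith": start}


# 50000 rows in chunks of 5000 rows each.
pages = list(paged_elements("page", total_rows=50000, page_size=5000))
print(len(pages))                # 10
print(pages[1]["startingWith"])  # 5001
```

Each dictionary would go into the elements list of its own reportDescription, so ten small requests replace one big one.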
Even better: pull only the data you really need. If you display something, that would likely be less than 20 rows, right?
Reduce the Date Range
Rather than pulling data for two years in one go, make 24 requests for one month each.
In other words: if you are using the dateFrom and dateTo attributes in your reports, make sure those dates are not too far apart, or try running multiple reports with shorter ranges instead.
It makes sense that adding another day to the range increases the workload on the back end. The system has to crunch through more data to get you what you need.
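Splitting a long range into months is easy to get slightly wrong around month ends, so here is a small Python sketch of one way to do it. The function name monthly_ranges is hypothetical, and the YYYY-MM-DD strings are an assumption about what you would put into the start and end date attributes of each report.

```python
from datetime import date, timedelta


def monthly_ranges(first_day, last_day):
    """Split [first_day, last_day] into (start, end) pairs, one per calendar month."""
    ranges = []
    start = first_day
    while start <= last_day:
        # First day of the month after `start`.
        if start.month == 12:
            next_month = date(start.year + 1, 1, 1)
        else:
            next_month = date(start.year, start.month + 1, 1)
        # End on the last day of this month, or earlier if the range ends first.
        end = min(next_month - timedelta(days=1), last_day)
        ranges.append((start.isoformat(), end.isoformat()))
        start = next_month
    return ranges


# Two years become 24 one-month requests.
ranges = monthly_ranges(date(2013, 1, 1), date(2014, 12, 31))
print(len(ranges))  # 24
print(ranges[0])    # ('2013-01-01', '2013-01-31')
```

Each pair would then be used as the date range of one report, and the results stitched together in your application.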
Cache Data
Caching is undoubtedly the one technique that will make the most impact. But you’re a developer, so I really don’t have to tell you, right?
The main advantage of caching data in your own application is that should you ever run out of Internet, you can still display the data that you already have retrieved.
The second advantage is that as you show more and more data over time, you still only need to pull it incrementally, say for the current day, which makes your application a lot faster! And as we learned above, making small requests is also faster than making big ones.
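As a rough illustration of the incremental idea, the following Python sketch keeps one result per day in memory and only asks for days it has not seen yet. Everything here is hypothetical: DailyCache, fetch_day and fake_fetch are made-up names, and fake_fetch stands in for whatever code actually calls the Reporting API.

```python
from datetime import date, timedelta


class DailyCache:
    """Keeps one result per day and only fetches days it has not seen."""

    def __init__(self, fetch_day):
        # fetch_day is a hypothetical callable: date -> data for that day.
        self._fetch_day = fetch_day
        self._store = {}

    def get_range(self, first_day, last_day):
        """Return the data for every day in [first_day, last_day], in order."""
        results = []
        day = first_day
        while day <= last_day:
            if day not in self._store:
                # Cache miss: one small request for a single day.
                self._store[day] = self._fetch_day(day)
            results.append(self._store[day])
            day += timedelta(days=1)
        return results


calls = []


def fake_fetch(day):
    # Stand-in for a real API call; records which days were requested.
    calls.append(day)
    return {"date": day.isoformat()}


cache = DailyCache(fake_fetch)
cache.get_range(date(2014, 5, 1), date(2014, 5, 7))  # fetches 7 days
cache.get_range(date(2014, 5, 1), date(2014, 5, 8))  # fetches only May 8
print(len(calls))  # 8
```

A real application would probably persist the store to disk and refresh the current day, since that data is still changing.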
Small & Often
The bottom line is that the Reporting API is very good at returning small(-ish) chunks of data.
Since you don’t have to worry about tokens anymore, you should optimize for this, i.e. pull small chunks more often rather than big chunks that encompass all the data you need.