(Or data centers, for you in the US of A)
(2020 update: some of the below is outdated. You may also want to read Edge Nodes, Data Centres & other updates, it has more recent information)
Adobe Analytics is a “cloud-based solution”, or SaaS (software as a service). That means it doesn’t run on your hardware; it runs on Adobe’s.
When a page is tagged and a visitor looks at that page, her browser will send tracking to the Adobe servers. Same for a tagged app — all tracking will go to Adobe servers.
On those servers, the data will be received, processed and ultimately stored.
As a customer, you can extract the data back out using Data Feeds or a couple of other methods. The main thing with SaaS is, though, that you don’t have to. The data is perfectly happy where it is and your friendly marketer can access it as they want. That’s what the Reports & Analytics UI and the Ad Hoc Analytics tool are for.
Today I’d like to go into a little bit more detail around the architecture of this “cloud”. Call it a geeky article if you wish.
Adobe Analytics Cloud
In the post about the s_code.js file configuration, I mentioned that you have to configure the s.trackingServer variable so the data goes to the right place.
When the system was first built, all data went to one central place in the US. In 2008, when I joined Omniture, there were 2 data centres, one in San Jose, the other in Dallas.
Lately Adobe opened two more: London and Singapore.
Those 4 data centres hold the data for Adobe Analytics, or to be precise: the data for what was formerly known as SiteCatalyst, Data Warehouse and Discover.
Each customer is assigned to one of those 4 data centres.
Where am I?
You can easily find out where your data is.
When your friendly marketer logs into the system, she can see which data centre your data is on: if the URL starts with “sc2.” then you’re on Dallas, “sc3.” means it’s on London and “sc4.” points to Singapore. If you see simply “sc.” then you’re on San Jose.
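If you'd rather not memorise the prefixes, the lookup can be sketched in a few lines of JavaScript. This is only an illustration of the mapping described above: the function name and the example.com host are made up for the example, and only the four prefixes come from the text.

```javascript
// Prefix of the login hostname -> data centre, as listed in the text above.
const LOGIN_PREFIX_TO_DATA_CENTRE = new Map([
  ["sc", "San Jose"],
  ["sc2", "Dallas"],
  ["sc3", "London"],
  ["sc4", "Singapore"],
]);

// Returns the data centre name, or null if the first hostname label
// is not one of the four known prefixes.
function dataCentreFromLoginUrl(loginUrl) {
  const hostname = new URL(loginUrl).hostname;
  const prefix = hostname.split(".")[0].toLowerCase();
  return LOGIN_PREFIX_TO_DATA_CENTRE.get(prefix) ?? null;
}

// Hypothetical URL, used only to show the call.
console.log(dataCentreFromLoginUrl("https://sc3.example.com/login")); // London
```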
For you as a developer, this makes a difference when you implement. As mentioned in the article on configuration in the s_code.js file, you have to set s.trackingServer correctly. The same goes for tracking mobile apps, of course.
The s.trackingServer variable didn’t exist in the beginning. Instead, people used s.dc to specify the data centre. The values for s.dc were “112” for San Jose or “122” for Dallas.
Life was easy back then. Your data was either on San Jose or on Dallas and that’s where you had to send your tracking calls.
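For the curious, here is how you might read such a legacy setting back. It is a sketch only: the helper name is invented, and the object below is a stand-in for the tracking object that a real s_code.js file sets up for you. Only the two values come from the text.

```javascript
// Legacy s.dc values and the data centre each one pointed to.
const LEGACY_DC_VALUES = new Map([
  ["112", "San Jose"],
  ["122", "Dallas"],
]);

// Returns the data centre for a tracking object's s.dc value,
// or null if the value is not one of the two listed above.
function describeLegacyDc(trackingObject) {
  return LEGACY_DC_VALUES.get(String(trackingObject.dc)) ?? null;
}

// Stand-in for the real tracking object (assumption for this example).
const s = { dc: "122" };
console.log(describeLegacyDc(s)); // Dallas
```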
A couple of years ago, the system was extended with an additional layer of servers. Regional Data Collection (RDC) moved the first line of servers, the ones that receive the data, out of the data centres.
The idea: if the browser sends data to a local data collection centre, the response would be faster, helping with a better user experience.
Also, the data collection centres could be spread over the world and act as failovers in case any one of them went down.
The collection servers handle incoming traffic and then pass the data over to whichever data centre will process and store it.
The obvious question: how do I send data to a specific data centre now? Or: how do the data collection centres know where to forward the data?
Part of the URL tells the system which data centre the data ultimately should go to as well as which of the solutions it is for.
jexnerinc.d1.sc.omtrdc.net would be an Analytics call (“.sc” for SiteCatalyst) and the data would ultimately be in San Jose (“d1”).
You can probably guess that “d2” stands for Dallas and “d3” for London.
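To make the naming scheme concrete, here is a small sketch that takes such a hostname apart. The function name and the label "namespace" for the first part are my own choices for the example, and the table only covers the three codes mentioned above; anything else is reported as unknown rather than guessed.

```javascript
// Data centre codes mentioned in the text.
const DATA_CENTRE_CODES = new Map([
  ["d1", "San Jose"],
  ["d2", "Dallas"],
  ["d3", "London"],
]);

// Splits a hostname of the shape <namespace>.<dN>.<solution>.omtrdc.net.
// Returns null if the hostname does not have that shape.
function parseTrackingServer(hostname) {
  const labels = hostname.toLowerCase().split(".");
  if (labels.length !== 5 || labels[3] !== "omtrdc" || labels[4] !== "net") {
    return null;
  }
  const [namespace, dcCode, solution] = labels;
  return {
    namespace,
    dcCode,
    // null when the code is not in the table above
    dataCentre: DATA_CENTRE_CODES.get(dcCode) ?? null,
    solution,
  };
}

console.log(parseTrackingServer("jexnerinc.d1.sc.omtrdc.net"));
// { namespace: 'jexnerinc', dcCode: 'd1', dataCentre: 'San Jose', solution: 'sc' }
```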
R as in Regional
It is not possible to specify which RDC node the browser will contact, and that’s the whole point of failover, of course. So the “d1”, “d2”, etc. above let you specify where the data will be processed and stored, not where it’ll be collected.
Let me illustrate that with an example.
Say you are a Swiss company selling luxury articles all over the world. You have a lot of customers in the US and EMEA and also in China.
Say you are assigned to the Dallas data centre — all processing happens in Dallas and your data is ultimately stored there. When your marketer logs into Adobe Analytics, her browser connects with Dallas.
When a US visitor comes to your site, their browser will send the tracking to either San Jose or Dallas, depending on which one they’re closer to. A French visitor’s browser would likely send the data to London, and the browser of a Chinese visitor would probably send it to Singapore.
The RDC centres would forward the data to Dallas, where it’d be processed and stored for your marketer to analyse.
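The example can be played through with a toy model. To be clear, this is not how Adobe implements it: in reality the choice of collection node happens in the network, not in code you write. The function, its parameters and the idea of passing in a proximity-ordered list are all assumptions made purely to illustrate the two separate roles, collection and storage.

```javascript
// Toy model: a hit is collected by the closest node that is up,
// then forwarded to the data centre the customer is assigned to.
function traceHit(nodesByProximity, unavailableNodes, assignedDataCentre) {
  const down = new Set(unavailableNodes);
  const collectedAt = nodesByProximity.find((node) => !down.has(node));
  if (collectedAt === undefined) {
    return null; // no collection node reachable in this model
  }
  return { collectedAt, storedAt: assignedDataCentre };
}

// French visitor, customer assigned to Dallas.
const nearestFirst = ["London", "Dallas", "San Jose", "Singapore"];
console.log(traceHit(nearestFirst, [], "Dallas"));
// { collectedAt: 'London', storedAt: 'Dallas' }

// Same visitor while London is down: another node takes over.
console.log(traceHit(nearestFirst, ["London"], "Dallas"));
// { collectedAt: 'Dallas', storedAt: 'Dallas' }
```

Note that the storage location never changes in either case; only the collection point does.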
One note on
You might not see omtrdc.net in the tracking call URL. If instead you see a domain your company owns, then your implementation is using 1st-party cookies. The URL you see is likely a CNAME and you can use the host command to find out where it really points to.
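If you prefer a script over the command line, the same lookup can be sketched in Node.js with the built-in dns module. The two function names are mine, metrics.example.com is a placeholder for your own tracking domain, and the check only recognises omtrdc.net because that is the domain named in this article.

```javascript
const dns = require("node:dns").promises;

// True if the hostname is omtrdc.net or a subdomain of it.
function isAdobeCollectionHost(hostname) {
  const name = hostname.toLowerCase().replace(/\.$/, ""); // drop trailing dot
  return name === "omtrdc.net" || name.endsWith(".omtrdc.net");
}

// Looks up the CNAME records of a first-party tracking hostname.
// Returns an empty array if the lookup fails for any reason,
// for example when the name has no CNAME record.
async function findCnameTargets(hostname) {
  try {
    return await dns.resolveCname(hostname);
  } catch (err) {
    return [];
  }
}

// Example (needs network access; the hostname is a placeholder):
// findCnameTargets("metrics.example.com")
//   .then((targets) => console.log(targets.filter(isAdobeCollectionHost)));
```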
The purpose of this article really is to clear up the confusion around the term “data centre”. Given that there are two types of centres, it makes sense for you to know that one processes and stores data (the “data centre”) and the other receives data and forwards it (the “data collection centre”).
From now on, when someone says “have you heard? Adobe opened a new data centre in $city!” you can ask “is that for RDC or a proper data centre?”.
Also, when we talk about getting data out of Adobe Analytics, it is important that you know where exactly you get it from!
If you are using the Reporting API, your endpoint depends on the data centre (“api” points to San Jose, “api2” to Dallas and so on).
If you are getting data via FTP (e.g. via a Data Feed), the actual FTP server you need to connect to depends on the data centre (can you guess where “ftp2” sits?)
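As a reminder of how this might look in a script, here is a deliberately incomplete lookup table. It only contains what this article states or hints at: the “api” and “api2” prefixes, and “ftp2” for Dallas, which is my reading of the question above. The full hostnames and the prefixes for the other data centres are not given here, so the sketch leaves them out rather than guessing; the function name is invented.

```javascript
// Hostname prefixes per data centre, limited to what the article mentions.
const EXPORT_HOST_PREFIXES = new Map([
  ["San Jose", { api: "api", ftp: null }], // FTP prefix not stated in the text
  ["Dallas", { api: "api2", ftp: "ftp2" }], // "ftp2" inferred from the hint above
]);

// Returns the known prefixes for a data centre, or null if the
// article gives no information about it.
function exportHostPrefixes(dataCentre) {
  return EXPORT_HOST_PREFIXES.get(dataCentre) ?? null;
}

console.log(exportHostPrefixes("Dallas")); // { api: 'api2', ftp: 'ftp2' }
```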