How can I deliver reports via sFTP?
Really simple, see documentation
Except it isn’t.
Or at least it doesn’t seem to be simple, judging by the number of questions I see people ask, both internally and externally. So let’s explain how it works.
The setting: some of the data in Adobe Analytics can be delivered to FTP servers as tab-separated files. This is usually scheduled (every day, every week and so on) and the data is either raw hit-level data (“Data Feed”) or partly aggregated (from Data Warehouse).
People use the raw data to create feeds into their own data warehouse solutions, and the aggregated reports often feed into other systems.
There is also an aspect of getting data into Adobe Analytics, and the setting is the same: the data can be delivered to FTP servers where Adobe Analytics will pick it up and ingest it.
Privacy & Encryption
Some customers worry about the transfer of data across the Internet.
The File Transfer Protocol (FTP) is used to pull or push data across the Internet. It has been around for a long time and mainly does what it says on the tin.
But FTP does NOT encrypt or otherwise hide away the data it sends, and some customers are therefore looking for an alternative. The risk is that a malicious third party with access to routers might be able to read the data.
One way to make sure no one can is to use end-to-end encryption, meaning the data is encrypted before it goes on the wire and decrypted once it arrives. One popular protocol is sFTP (the SSH File Transfer Protocol, which moves files over an encrypted SSH connection), but there are others.
If you send your data using sFTP, it is impossible (or at least very difficult) for anyone to eavesdrop.
Imagine you are uploading metadata about your customers, like whether they are high-value customers or have credit issues, and the need for end-to-end encryption becomes obvious.
Login
Encryption is one difference; the other is the way you connect and log in. With FTP, you log in using a username and a password. With sFTP, you can use a key pair instead of a password.
Keys or certificates are used to authenticate both sides.
The server presents its key to the sFTP client so the client knows it has reached the right server (and there is no “man in the middle” pretending to be the server). The client, in turn, proves that it holds the private key matching a public key the server already has on file, which tells the server who is trying to make contact.
The way these things work, both sides can rely on the keys, so nobody has to type or store a password. You still connect under your account name, but the key pair replaces the password, which is obviously a great advantage!
So how can you use sFTP?
How?
Let’s presume you already have access to an FTP account hosted by Adobe. Those would usually be accessed on ftp.omniture.com (or ftp2/ftp3/…, depending on which data centre your data resides on).
All you need to do is add your public key to a file called authorized_keys in a folder called .ssh.
Really, that is it.
Where do you get a public key, you ask?
Well, the easiest way is to make one yourself.
If you’re on Linux or a Mac, a simple ssh-keygen -t rsa would suffice. On Windows, you might want to use PuTTY.
It really is pretty easy.
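To make the two steps concrete, here is a minimal sketch that generates a key pair and stages the .ssh/authorized_keys layout locally, ready to be uploaded to the FTP account. It assumes OpenSSH's ssh-keygen is installed. The 4096-bit key size, the empty passphrase, the comment and the file names are my own choices for illustration, not anything Adobe prescribes.

```shell
#!/bin/sh
# Work in a scratch directory so no existing key in ~/.ssh gets overwritten.
workdir="$(mktemp -d)"
key_file="$workdir/adobe_sftp_key"   # hypothetical file name

# -t rsa : key type, as in the command above
# -b 4096: key size (an assumption, pick what your policy requires)
# -N ""  : empty passphrase, so scheduled jobs can run unattended
# -C     : free-text label stored with the public key
# -q     : quiet
ssh-keygen -q -t rsa -b 4096 -N "" -C "adobe-analytics-sftp" -f "$key_file"

# Stage the layout the server expects: a folder .ssh containing authorized_keys.
mkdir -p "$workdir/.ssh"
cat "$key_file.pub" >> "$workdir/.ssh/authorized_keys"

echo "Private key (keep it secret): $key_file"
echo "File to upload into .ssh on the FTP account: $workdir/.ssh/authorized_keys"
```

Only the public half ever leaves your machine. The private key stays with the client that will connect later.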
All you need now is a client that can use sFTP to down- and upload your data.
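If you are happy on the command line, the OpenSSH sftp client is one such client. The sketch below writes a small batch file and wraps the call in a function. The account name, the key path and the file name report.tsv are placeholders I made up, so treat this as a starting point, not a recipe.

```shell
#!/bin/sh
# Placeholders: swap in your own account name, host, and private key path.
SFTP_USER="myaccount"
SFTP_HOST="ftp.omniture.com"
KEY_FILE="$HOME/.ssh/id_rsa"

# Commands for sftp to run non-interactively (report.tsv is a made-up file name).
batch_file="$(mktemp)"
cat > "$batch_file" <<'EOF'
ls -l
get report.tsv
bye
EOF

# -i: private key to authenticate with
# -P: port (22 is the SSH default)
# -b: batch file with the commands to run
fetch_reports() {
    sftp -i "$KEY_FILE" -P 22 -b "$batch_file" "$SFTP_USER@$SFTP_HOST"
}

# The function is not called here, so running this file opens no connection.
# Call fetch_reports once the placeholders are filled in.
echo "Batch file written to $batch_file"
```

A batch file is handy for scheduled jobs because sftp stops at the first failing command and returns a non-zero exit code.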
Notes
How about sending data to an external FTP server with sFTP, say to your own?
With one exception, that is currently only possible via a custom engagement with the Engineering Services group, meaning there is a cost. Data Feeds are the example that most people ask for here.
You can either work with ES to get them delivered via sFTP to your servers, or you get them from Adobe servers using sFTP.
The exception is Data Warehouse, which can deliver to external FTP servers via sFTP as long as the servers do not require a two-factor challenge.
In order for this to work, you will have to contact Customer Care and get the Adobe public key. That key then has to be uploaded into the ~/.ssh/authorized_keys file on your server.
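On a typical OpenSSH server, installing that key could look roughly like the sketch below. The install_key helper, the file name adobe_dwh.pub and the placeholder key text are all hypothetical. The real key comes from Customer Care, and the demo runs against a scratch directory instead of a real home directory.

```shell
#!/bin/sh
# install_key KEY_FILE HOME_DIR
# Appends the key in KEY_FILE to HOME_DIR/.ssh/authorized_keys, once only.
install_key() {
    key_line="$(cat "$1")"
    ssh_dir="$2/.ssh"
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"

    # Append only if this exact line is not in the file yet.
    if ! grep -qxF -e "$key_line" "$auth_file"; then
        printf '%s\n' "$key_line" >> "$auth_file"
    fi

    # sshd in its default strict mode ignores key files that others can write to.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# Demo against a scratch directory, using a placeholder instead of the real key.
demo_home="$(mktemp -d)"
printf '%s\n' "ssh-rsa AAAAB3PLACEHOLDER adobe-dwh-example" > "$demo_home/adobe_dwh.pub"
install_key "$demo_home/adobe_dwh.pub" "$demo_home"
install_key "$demo_home/adobe_dwh.pub" "$demo_home"   # second call changes nothing
```

On the real server you would pass the home directory of the account that Data Warehouse logs in as.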
When you create the DWH requests, you specify the FTP server including protocol (e.g. sftp://ftp.test.com) and port 22.
Easy.