How to load data from AdWords to Google BigQuery

Blendo Team

This post helps you with loading your data from Google AdWords to BigQuery. If you are looking to get analytics-ready data without the manual hassle you can integrate Google AdWords to BigQuery with Blendo, so you can focus on what matters, getting value out of your business data.

What we will see:

  • An Intro to Google BigQuery and Google AdWords.
  • Extract your data using AdWords and Google AdWords API (the hard way)
  • Prepare your AdWords Data for Google BigQuery
  • Load Data by a source like AdWords to BigQuery
  • The best way you can load from AdWords to BigQuery (the easy way)

Data extraction from Google AdWords

The AdWords API allows applications to interact directly with the AdWords platform. You can build applications to more efficiently manage large or complex AdWords accounts and campaigns. Contrary to the rest of the APIs that we have covered in this series of posts,  the Google AdWords API is implemented only using the SOAP protocol and it doesn’t offer a RESTful web implementation.

Nevertheless, they offer a number of client libraries that you can use for your language or framework of choice. They officially support clients for the following languages

  • Java
  • .Net
  • PHP
  • PERL
  • Python
  • Ruby

The AdWords API is a quite complex product that exposes a lot of functionality to the user, ranging from reporting to do bidding and programmatic advertisement. As the scope of this post is the extraction of data out of it, with the aim of loading any data to a data warehouse for further analysis, we’ll focus only on that part of the Google AdWords API.

You will need more time to read this post than integrating Google Adwords to BigQuery.

Effortlessly Sync All Your Google Adwords Data to BigQuery

There are many ways of interacting with data that adWords API gathers. One way is to link your Google Analytics and AdWords accounts and actually enrich data of your analytics with data coming from AdWords.  The other possible way, if you have the luxury to afford a Google analytics premium account, is to load every data directly to Google BigQuery. From there, you can either do your analysis from BigQuery or export data to another data warehouse.

We’ll assume that you do not have a Google Analytics premium account, to be honest if you had you wouldn’t be looking at this post anyway, but you still want to extract data and load it to your own data warehouse solution. To do that we’ll utilise the Report related functionality of the AdWords API. The API supports a huge number of reports that you can request, and it is possible to change the granularity of your results by passing specific parameters. Defining what kind of data you want to get back as part of your report can be done in two different ways.

  • Using an XML-based report definition.
  • Using an AWQL-based report definition.

If you want to use an XML based report definition you have to include a parameter named “__rdxml” that will contain an XML serialised definition of the report you want to retrieve.

<reportDefinition xmlns="https://adwords.google.com/api/adwords/cm/v201509"> <selector> <fields>CampaignId</fields> <fields>Id</fields> <fields>Impressions</fields> <fields>Clicks</fields> <fields>Cost</fields> <predicates> <field>Status</field> <operator>IN</operator> <values>ENABLED</values> <values>PAUSED</values> </predicates> </selector> <reportName>Custom Adgroup Performance Report</reportName> <reportType>ADGROUP_PERFORMANCE_REPORT</reportType> <dateRangeType>LAST_7_DAYS</dateRangeType> <downloadFormat>CSV</downloadFormat> </reportDefinition>

AWQL is a SQL-like language for performing queries against most common AdWords API services. Any service with a query method is supported; queryable fields for each service are listed here.

As a comparison you can see the difference between using XML and AWQL bellow:

XML

<serviceSelector> <fields>Id</fields> <fields>Name</fields> <predicates> <field>Status</field> <operator>EQUALS</operator> <values>ENABLED</values> </predicates> <ordering> <field>Name</field> <sortOrder>ASCENDING</sortOrder> </ordering> <paging> <startIndex>0</startIndex> <numberResults>50</numberResults> </paging> </serviceSelector>

AWQL

CampaignPage p = campaignService.query("SELECT Id, Name WHERE Status = 'ENABLED' ORDER BY Name DESC LIMIT 0,50");

As we can see, the Google AdWords API has a very expressive way of defining what data we want to get from it and various options to do that. If you feel more comfortable with SQL like languages you can use AWQL, or if you prefer XML you can use that for defining your reports.

Regarding the format of the results you get from the API, there are also multiple options supported.

  • CSVFOREXCEL – Microsoft Excel compatible format
  • CSV – comma separated output format
  • TSV – tab separated output format
  • XML – xml output format
  • GZIPPED-CSV – compressed csv
  • GZIPPED-XML – compressed xml

Google AdWords, exposes a very rich API which offers you the opportunity to get very granular data about your accounting activities and use it for analytic and reporting purposes. This richness comes with a price though, a large number of complex resources that have to be handled through an also complex protocol.

About Google AdWords

Load data from Google Adwords to Google BigQuery

Load data from Adwords to BigQuery

Google AdWords is an online advertising service by Google for businesses wanting to display ads on Google and its advertising network. In its core Google AdWords is a Real Time Bidding system where advertisers compete to display their advertising material to web users who are using Google products like its search engine. Programmatic and instantaneous auctions are performed, similar to how financial markets operate. Among the benefits of AdWords are:

  • Pay-per-click – advertisers pay only for ads that have been clicked by the user

  • Any budget – You can start with any budget, although you have to be aware of the Real Time Bidding nature of AdWords, which means that the effectiveness of your campaigns are linked to what your competitors are also willing to pay.

  • Reach – you can reach billions of people worldwide.

Additionally, Google AdWords, just like every other product from Google has excellent support and it exposes a rich ecosystem of tools and APIs that you can use to get the most out of their services.

Prepare your data to be sent from Google AdWords to Google BigQuery

Before you load any data into BigQuery, you should make sure that it is presented in a format supported by it, so for example if the API you pull data out and returns XML, you have to first transform it into a serialization that BigQuery understands. Currently two data formats are supported:

You also need to make sure that the data types you are using are the ones supported by BigQuery, which are the following:

  • STRING
  • INTEGER
  • FLOAT
  • BOOLEAN
  • RECORD
  • TIMESTAMP

for more information please check the Preparing Data for BigQuery page on the documentation.

About Google BigQuery

 

BigQuery is the data warehousing solution of Google. It’s part of the Google Cloud Platform and it also speaks SQL like Redshift does. Queries are executed against append-only tables using the processing power of Google’s infrastructure. It is also fully managed and is offered as a service over the cloud. You can interact with it through its web UI, using a command line tool while a variety of client libraries exist so you can interact with it through your application.

Load Data from Google AdWords to Google BigQuery

If you want to load Google AdWords data to Google BigQuery, you have to use one of the following supported data sources.

  1. Google Cloud Storage
  2. Sent data directly to BigQuery with a POST request
  3. Google Cloud Datastore Backup
  4. Streaming insert
  5. App Engine log files
  6. Cloud Storage logs

From the above list of sources, 5 and 6 are not applicable in our case.

For Google Cloud Storage, you first must load data you own into it, there are a few options on how to do this, for example you can use the console directly as it is described here and do not forget to follow the best practices. Another option is to post data through the JSON API, as we see again APIs play an important role in both the extraction but also the loading of data into our data warehouse. In it’s simplest case it’s just a matter of one HTTP POST request using a tool like CURL or Postman or Apirise. It should look like the following example.

POST /upload/storage/v1/b/myBucket/o?uploadType=media&name=myObject 
HTTP/1.1 Host: www.googleapis.com 
Content-Type: application/text 
Content-Length: number_of_bytes_in_file 
Authorization: Bearer your_auth_token your Google AdWords data

and if everything went ok, you should get something like the following as a response from the server:

HTTP/1.1 200 Content-Type: application/json { "name": "myObject" }

Working with Curl or Postman, is good only for testing, if you would like to automate the process of loading any data into Google Bigquery, you should write some code to send your own data to Google Cloud Storage. In case you are developing on the Google App Engine you can use the library that is available for the languages that are supported by it:

  1. Python
  2. Java
  3. PHP
  4. Go

If you are using one of the above languages and you are not coding for the Google App Engine, you can use it to access the Cloud Storage from your environment. Interacting such a feature rich product like Google Storage on the Cloud can become quite complicated depending on your use case, for more details on the different options that exist you can check Google Cloud Storage documentation. If you are looking for a less engaged and more neutral way of using it, you can consider a solution like Blendo.

After you have loaded data into Google Cloud Storage, you have to create a Load Job for BigQuery to actually load the data into it, this Job should point to the source data in Cloud Storage that have to be imported, this happens by providing source URIs that point to the appropriate objects.

The previous method described, used a POST request to the Google Cloud Storage API for storing data there and then load it into BigQuery. Another way to go is to do a direct HTTP POST request to BigQuery with the data you would like to query. This approach is similar to how we loaded every data to Google Storage on the Cloud through the JSON API, but it uses the appropriate end-points of BigQuery to load the data there directly. The way to interact with it is quite similar, for more information can be found on the Google BigQuery API Reference and on the page that describes how you can load data into BigQuery using POST. You can interact with it using the HTTP client library of the language or framework of your choice, a few options are:

The best way to load from Google AdWords to BigQuery

So far we just scraped the surface of what you can do with BigQuery and how to load data into it. Things can get even more complicated if you want to integrate data coming from different sources.

Are you striving to achieve results right now?

Instead of writing, hosting and maintaining a flexible data infrastructure use Blendo that can handle everything automatically for you.

Blendo with one click integrates with sources or services, creates analytics-ready data and syncs your Google AdWords to BigQuery right away.

Help your marketing and executive team take ownership of the advertisement data that live inside Google Adwords to transform performance marketing and ROI.

Blendo is the easiest way to automate powerful data integrations.

Try Blendo free for 14 days. No credit card required.