Last Updated on October 8, 2020 by Ritwik B
As you might already know, Python is one of the powerful languages when it comes to data science & analytics. It provides numerous libraries that you can use for analysis & visualization of the data.
In this article, I want to show.. how you can easily get Google Analytics data in pandas so you can use it for data analysis. For more info on pandas, check it here.
To summarize, here’s the whole process:
- Fetch Google Analytics Reporting data using Python (via Google Analytics Reporting API v4)
- Output the data in python pandas data frame. (via Pandas Library)
- Output the data in Google Sheets (via Sheets API) (Optional: You can skip this part)
I won’t be focussing much on the analysis part, but just the setup. Once you get the data in pandas, there’s a whole ocean out there to explore. So let’s go…
Step - 1: Create A Project In The Google Cloud Console.
This is pretty much always the first step when you’re trying to use the Google APIs.
- Go & Login To Google Cloud Console.
- Create a Project with any name.
- Navigate To APIs & Services in the navigation menu & enable the following APIs
- Google Analytics Reporting API (for fetching the data from GA)
- Sheet API (for updating data in sheets)
- Drive API (for updating data in sheets)
Next step is to create a service account key & download the JSON file which contains the private key.
- In APIs & Services, Go To Credential > Create Credential > Service Account Key.
- Select JSON & click create
- A JSON file will be download. (KEEP IT SAFE)
Step - 2: Setup The Environment & The Code.
If you have your own python environment setup on the desktop that’s great. But If you don’t… no worries.
We’ll be running python code on the cloud. Yes.. we’ll be using Google Cloud Shell for the same.
- Go To Google Cloud Shell & Launch the code editor
Once opened,
- Create a file named ‘gaExport.py’ in the cloud shell. Copy the code from the google analytics pandas GitHub page & paste it in the cloud shell gaExport.py file.
- Upload the JSON file which was downloaded in the earlier step. Rename it to ‘client_secrets.json’.
NOTE: Make sure gaExport.py & client_secrets.json are in the same directory.
Now, its time to create a virtual environment.
3.) Open the cloud shell sessions & type the following command.
virtualenv gaToPandas
4.) Once the folder named gaToPandas is created, activate the virtual environment using
source gaToPandas/bin/activate
5.) Once activated, you’ll see the (gaToPandas) prefix. Moving on install these libraries one by one.
pip install --upgrade google-api-python-client
pip install oauth2client
pip install pandas
pip install pygsheets
You can skip the pygsheets installation if you don’t want to export data in google sheets.
Step - 3: Share GA access to Service Email.
1.) Now, you’ll have to share the analytics read & analyze access with the service account client email. You can copy the client_email from the client_secrets.json file & grant google analytics read & analyze access to that email.
The email is of the form ‘[email protected]‘
You can go to view settings & copy the view ID & paste it in the gaExport.py VIEW_ID variable.
Also if you want to export the data in the spreadsheet, make sure to follow these things
- Make spreadsheet access to Pubilc ‘Anyone can edit’ or no-sign in required.
- Copy the spreadsheet ID & paste it as SHEET_ID in gaExport file.
Now its time to run the file. Run the file
python gaExport.py
If everything is fine, you’ll get the data frame as output & also the spreadsheet will be updated
Lastly...
You can modify the DIMENSION & METRIC variables as per your needs. The whole list of 500+ metrics & dimensions for google analytics is available here.
To skip the spreadsheet output process, simply comment the export_to_sheets(df)
line in the main function
Here’s the list of some helpful resources. Do check it out for more info.
- Google Analytics Python Quickstart Guide
- Google Analytics Reporting API v4 – Requests & Responses
- List Of Error Codes – Google Analytics Reporting API v4
- Pygsheets – Python for Sheets Library
Do comment if you face any issues or successfully implement it. 🙂
Okay so this does not seem to work anymore. There is no way to create a JSON file there anywhere. Am I getting something wrong?