Photo by Alexander Shatov / Unsplash

API Scraping Twitter Bot

General Oct 27, 2023


The game Guild Wars 2 has a well-documented API that provides information about players' accounts, achievements, and much more. The API also exposes every achievement in the game, including the achievements in the daily rotation.

Some daily achievements are either faster to complete or significantly more profitable than others. I decided to scrape the API for each day's achievements, for fun and profit.

I started by manually poking and prodding the API. Thanks to the wiki documentation, I quickly found the endpoint for daily achievements. I wrote a Python script (dailyGetter.py) that ran once a day via the task scheduler, scraping all the dailies and collecting their achievement IDs. After collecting three months of daily achievement data, I came up with a list of dailies I wanted to monitor.
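As a rough sketch of what such a scraper could look like (this is not the actual dailyGetter.py): the endpoint URL below is the daily achievements endpoint the wiki documented at the time, and the response shape (a dict of categories, each a list of objects with an "id" key) is an assumption based on that documentation. The function names are made up for illustration, and the endpoint may no longer respond this way.

```python
import json
from urllib.request import urlopen

# Daily achievements endpoint as documented on the GW2 wiki at the time (assumption).
DAILY_URL = "https://api.guildwars2.com/v2/achievements/daily"


def fetch_dailies(url=DAILY_URL):
    """Download the current daily achievements and parse the JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def extract_ids(dailies):
    """Map each category (pve, pvp, ...) to a sorted list of achievement IDs.

    Assumes each category holds a list of objects that carry an "id" key.
    """
    return {
        category: sorted({entry["id"] for entry in entries})
        for category, entries in dailies.items()
    }
```

Running extract_ids(fetch_dailies()) once a day and appending the result to a file would be enough to build up a history of IDs to pick a watchlist from.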

I decided to share the source code and scraped data on GitHub, run the code in AWS Lambda, and tweet the results from a bot Twitter account. I could check Twitter for fun and profitable dailies, and other players could follow the bot as well.

Deploying to cloud

As this was my first time setting up a Twitter bot and using AWS Lambda, there were ample trials and tribulations.

I used Tweepy for the project and turned to its API v2 "Get tweet" example to test the connection between Lambda and Twitter. Presented with a 403 Forbidden error, I quickly realized my default keys were read-only. After a quick and dirty script to test my bot's credentials, I was able to make my first post from Lambda!
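A minimal sketch of that kind of credential test, assuming Tweepy's v2 Client with user-context (OAuth 1.0a) keys. The helper names and the default tweet text are made up for illustration, and the 280-character check is only a rough guard, not Twitter's exact counting rules.

```python
def make_client(api_key, api_secret, access_token, access_secret):
    """Build a Tweepy v2 client from user-context credentials."""
    import tweepy  # imported lazily so post_test_tweet can be tried without it

    return tweepy.Client(
        consumer_key=api_key,
        consumer_secret=api_secret,
        access_token=access_token,
        access_token_secret=access_secret,
    )


def post_test_tweet(client, text="Hello from Lambda!"):
    """Post one tweet and return its ID.

    The app behind the keys needs read/write permissions; read-only keys
    are rejected by the API.
    """
    if len(text) > 280:
        raise ValueError("tweet text exceeds 280 characters")
    response = client.create_tweet(text=text)
    return response.data["id"]
```

Keeping the client as a parameter means the posting logic can be exercised with a stand-in object before any real keys are involved.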

The previous iteration had hard-coded credentials, so I then updated the code to read my keys from the Parameter Store instead. I ran another test to make sure all the variables were referenced correctly.
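Reading keys from the Parameter Store can be sketched roughly as follows, using boto3's SSM client and its get_parameter call. The parameter names are hypothetical placeholders, not the ones the bot used, and the Lambda's execution role would need permission to read them.

```python
# Hypothetical parameter names, for illustration only.
PARAMETER_NAMES = [
    "/gw2bot/api_key",
    "/gw2bot/api_secret",
    "/gw2bot/access_token",
    "/gw2bot/access_secret",
]


def load_secrets(ssm_client, names=PARAMETER_NAMES):
    """Fetch each named parameter (decrypted) and return a name -> value dict."""
    secrets = {}
    for name in names:
        response = ssm_client.get_parameter(Name=name, WithDecryption=True)
        secrets[name] = response["Parameter"]["Value"]
    return secrets


def load_secrets_in_lambda():
    """Convenience wrapper for use inside the Lambda function."""
    import boto3  # available in the Lambda Python runtime

    return load_secrets(boto3.client("ssm"))
```

Passing the client in makes it easy to check that every name is referenced correctly with a fake client before deploying.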

I spent a lot of time debugging my code due to a plethora of issues such as the following:

tweepy.errors.Forbidden: 403 Forbidden

This was a permissions issue with the project app in the Twitter developer portal. Deleting and recreating the app fixed it.

453 - You currently have Essential access which includes access to Twitter API v2 endpoints only. If you need access to this endpoint, you’ll need to apply for Elevated access via the Developer Portal.

Fixed by updating the Twitter API key permissions from read-only to read/write.

"errorMessage": "Unable to import module 'lambda_function': No module named 'tweepy'", import requests import tweepy

I quickly learned that Lambda natively includes only a limited number of Python packages. I tried a GitHub project by Keith Rozario that provides a collection of Python packages as Lambda layers. Unfortunately, the collection did not have the modules and package versions needed by the code I had written locally and wanted to reuse.

Python libraries can be zipped into a folder and manually uploaded in the Layers section of AWS Lambda. I revisited the option to manually upload the source code from a zip file. I repackaged, one by one, every module needed to get my code to work, and rezipped the source repeatedly until, voilà, it worked!
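For the layer route, a packaging step could look roughly like this. It assumes the dependencies were first installed into a local folder (for example with pip install --target), and it relies on Lambda's Python runtimes looking for layer contents under a top-level python/ folder. The function name and paths are illustrative only.

```python
import zipfile
from pathlib import Path


def build_layer_zip(site_packages, zip_path):
    """Zip every file under site_packages beneath a top-level python/ folder.

    Write the zip outside site_packages so it does not end up inside itself.
    """
    source = Path(site_packages)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                arcname = (Path("python") / path.relative_to(source)).as_posix()
                archive.write(path, arcname)
    return zip_path
```

Listing the archive with zipfile.ZipFile(zip_path).namelist() before uploading is a quick way to confirm that a module such as tweepy actually made it in.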

Great success! Now all I had to do was update my Python code to reference the list of IDs I wanted to monitor and integrate it into the newly created Lambda.

Snippet of code containing ID list to check and post a Tweet based on existence
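The check itself can be sketched in a few lines. This is not the bot's actual code: the IDs and labels below are placeholders rather than the achievements the bot tracked, and the function names and tweet wording are made up for illustration.

```python
# Hypothetical watchlist: achievement ID -> label (placeholder values).
WATCHLIST = {
    1001: "Example daily A",
    1002: "Example daily B",
}


def find_matches(todays_ids, watchlist=WATCHLIST):
    """Return labels for watched achievements that appear in today's dailies."""
    present = set(todays_ids)
    return [label for ach_id, label in watchlist.items() if ach_id in present]


def build_tweet(matches):
    """Build the tweet text, or None when nothing on the watchlist is up today."""
    if not matches:
        return None
    return "Easy dailies today: " + ", ".join(matches)
```

Returning None when nothing matches lets the caller skip posting on days with no watched dailies.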

I then configured a daily trigger via Amazon EventBridge, using a rule that invokes the Lambda on a schedule.

Schedule expression: cron(45 0 * * ? *)
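EventBridge cron expressions have six fields (minutes, hours, day of month, month, day of week, year) and are evaluated in UTC, so this one fires every day at 00:45 UTC. The "?" is there because day of month and day of week cannot both be specified. The small helper below is purely illustrative (its name is made up) and just labels the fields:

```python
FIELDS = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def describe_cron(expression):
    """Split an EventBridge cron(...) expression into its six named fields."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError("expected an expression of the form cron(...)")
    parts = expression[len("cron("):-1].split()
    if len(parts) != len(FIELDS):
        raise ValueError("EventBridge cron expressions have six fields")
    return dict(zip(FIELDS, parts))
```

Calling describe_cron("cron(45 0 * * ? *)") gives minutes "45" and hours "0", with every other field left as a wildcard.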

Reference:
Code https://github.com/FranklyFuzzy/GW2Bot
See the bot at https://twitter.com/EZ_Dailies_GW2
https://github.com/keithrozario/Klayers/tree/master/deployments
https://docs.tweepy.org/en/stable/examples.html
https://wiki.guildwars2.com/wiki/API:Main

Update: This project has been discontinued due to changes to the GW2 API.
