Our MathStat.HWDiscuss forum is powered by open-source software called Discourse, which has an API that allows us to scrape data right off of it. Check out the following two pages:
The first is just our standard landing page. The second contains the same information, but formatted as JSON. That highly structured format allows us to manipulate it with a computer quite easily. For example, here is the list of categories currently on our forum.
import requests

# Ask the forum for its category list as JSON
response = requests.get("https://mathstat.hwdiscuss.com/categories.json")
if response.status_code == 200:
    # Parse the JSON body and pull out each category's name
    response_json = response.json()
    print([category['name'] for category in response_json['category_list']['categories']])
else:
    print("Uh-oh")
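Each category in that JSON should also carry a numeric id, which we'll need in a moment. Here is a minimal sketch of building a name-to-id lookup. The helper name category_ids is my own invention, the sample data is made up (only the Statistics id comes from our forum), and I'm assuming each category entry has an 'id' field alongside 'name'.

```python
def category_ids(categories_json):
    """Map each category name to its numeric id."""
    categories = categories_json['category_list']['categories']
    return {category['name']: category['id'] for category in categories}


# Made-up sample shaped like the categories.json response.
# The Calculus entry is hypothetical; it is here only to show the shape.
sample = {
    'category_list': {
        'categories': [
            {'id': 11, 'name': 'Statistics'},
            {'id': 5, 'name': 'Calculus'},
        ]
    }
}

print(category_ids(sample))
```

With the real response, you would pass response_json in place of sample.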
Now, we can dig a little deeper into the individual categories. For example, here is the list of all the topics in the Statistics category, which happens to have id=11:
response = requests.get("https://mathstat.hwdiscuss.com/c/11.json")
response_json = response.json()
topic_titles = [topic['title'] for topic in response_json['topic_list']['topics']]
topic_titles
It turns out that, with an API key, you can automate just about any process that you would do with a browser. In particular, I can automate the process of checking whether you've done your forum homework and received likes on your posts!
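To give a rough idea of what that might look like, here is a sketch, not the actual script I use. It assumes that Discourse accepts the key through Api-Key and Api-Username request headers, that a topic's JSON lives at /t/<topic id>.json with its posts under 'post_stream', and that likes show up in each post's 'actions_summary' with action id 2. The function names and the sample data are made up for illustration.

```python
LIKE_ACTION_ID = 2  # assumption: the action id Discourse uses for a "like"


def auth_headers(api_key, api_username):
    """Build the request headers that (I'm assuming) carry the API key."""
    return {"Api-Key": api_key, "Api-Username": api_username}


def likes_by_user(topic_json):
    """Total the likes each user received across the posts in one topic."""
    likes = {}
    for post in topic_json['post_stream']['posts']:
        username = post['username']
        likes.setdefault(username, 0)  # list posters even if they have no likes
        for action in post.get('actions_summary', []):
            if action.get('id') == LIKE_ACTION_ID:
                likes[username] += action.get('count', 0)
    return likes


def fetch_topic(topic_id, api_key, api_username):
    """Download one topic as JSON (hypothetical helper, needs network access)."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(
        f"https://mathstat.hwdiscuss.com/t/{topic_id}.json",
        headers=auth_headers(api_key, api_username),
        timeout=10,
    )
    response.raise_for_status()  # raise instead of printing "Uh-oh"
    return response.json()


# Made-up sample shaped like the topic JSON described above
sample_topic = {
    'post_stream': {
        'posts': [
            {'username': 'alice', 'actions_summary': [{'id': 2, 'count': 3}]},
            {'username': 'bob', 'actions_summary': []},
            {'username': 'alice', 'actions_summary': [{'id': 2, 'count': 1}, {'id': 3}]},
        ]
    }
}

print(likes_by_user(sample_topic))
```

The counting is kept separate from the download so that it can be tried on sample data without a key. With a real key, you would call likes_by_user(fetch_topic(...)) instead.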