Scrapers & Analytics

AWIS Scraper Feb, 2018

Combined the AWIS API (called from Java) with direct scraping of Alexa.com to collect daily website traffic information covering up to 4 years. Wrote an XML parser and cleaned the data into a formatted CSV file.
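
As a rough illustration of the XML-to-CSV step, here is a minimal Python sketch (the project itself used Java). The element names (TrafficHistory, Data, Date, PageViews, Rank) and the sample values are assumptions made up for this example; they are not the actual AWIS response schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the real AWIS response layout may differ.
SAMPLE_XML = """<TrafficHistory>
  <Data><Date>2018-02-01</Date><PageViews>1200</PageViews><Rank>35</Rank></Data>
  <Data><Date>2018-02-02</Date><PageViews></PageViews><Rank>34</Rank></Data>
</TrafficHistory>"""

# Assumed column names, one per child element of each record.
FIELDS = ["Date", "PageViews", "Rank"]


def xml_to_rows(xml_text):
    """Parse the XML and return one dict per <Data> record."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter("Data"):
        row = {}
        for field in FIELDS:
            # Missing or empty elements become empty strings.
            text = record.findtext(field, default="")
            row[field] = (text or "").strip()
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the cleaned rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling rows_to_csv(xml_to_rows(SAMPLE_XML)) yields a header line followed by one line per day, with blank cells where a value was missing.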


Glassdoor Company Review Scraper May, 2017 – Aug, 2017

Gathers employee reviews of companies and employee salary information for a designated geographic location. View my research page for more details.

https://www.glassdoor.com/Reviews/PwC-Reviews-E8450.htm

Note: this specific attempt has been permitted by Glassdoor for research purposes. Please always refer to and follow Glassdoor’s Terms of Use.


Social Media Bot and Friend Info Analysis Jul, 2017

WeChat is a popular Chinese social app similar to Facebook Messenger. Using the ItChat library, I scraped (public) information from my friend list, including names, sex, locations, and signatures.
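
A minimal sketch of how that extraction might look with ItChat. The helper names summarize_friend and fetch_friends are hypothetical, and the field keys (NickName, Sex, Province, City, Signature) and the sex encoding (1 for male, 2 for female) are assumptions based on how ItChat commonly exposes contact records; this is not necessarily the exact code used for the project.

```python
# Assumed encoding of the "Sex" field; anything else is treated as unknown.
SEX_LABELS = {1: "male", 2: "female"}


def summarize_friend(friend):
    """Reduce one contact record (a dict-like object) to the fields of interest."""
    province = friend.get("Province", "")
    city = friend.get("City", "")
    return {
        "name": friend.get("NickName", ""),
        "sex": SEX_LABELS.get(friend.get("Sex"), "unknown"),
        # Join province and city, skipping whichever one is empty.
        "location": " ".join(part for part in (province, city) if part),
        "signature": friend.get("Signature", "").strip(),
    }


def fetch_friends():
    """Log in via QR code and summarize the friend list (needs the itchat package)."""
    import itchat  # third-party; imported lazily so the helper above works without it

    itchat.auto_login(hotReload=True)
    return [summarize_friend(friend) for friend in itchat.get_friends(update=True)]
```

Keeping the field extraction in a separate function makes it easy to check against hand-written records without logging in to WeChat.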

Distribution of my contacts in the U.S.

Word cloud generated with all the signatures of my friends
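
A word cloud is essentially a picture of word frequencies, so the core step can be sketched as a frequency count over the signatures. The stopword list and the tokenizing pattern below are assumptions for illustration and only handle English text; Chinese signatures would first need a word segmenter such as jieba.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "and", "to", "of", "is", "in"}


def word_frequencies(signatures):
    """Count how often each non-stopword appears across all signatures."""
    counts = Counter()
    for signature in signatures:
        # Lowercase, then keep runs of letters and apostrophes as words.
        for word in re.findall(r"[a-z']+", signature.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts
```

The resulting counts could then be handed to a rendering library, for example the wordcloud package's WordCloud.generate_from_frequencies, to draw the image.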

original image

Distribution of my friends from China

Compared to the population distribution of China


Image/Answer Scraper for Question-Answer sites (Quora, Zhihu) May, 2017

Scrapes all answers by a specific user, or all answers and image files under a given question.
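
The image-collection part can be sketched with Python's standard library alone. The class name ImageCollector and the helper collect_image_urls are hypothetical, and the sketch assumes the page HTML has already been downloaded; it does not reflect the actual markup of Quora or Zhihu, whose terms of use should be checked before scraping.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects absolute URLs of every <img> tag that has a src attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL.
            self.image_urls.append(urljoin(self.base_url, src))


def collect_image_urls(html, base_url):
    """Return image URLs found in the given HTML, in document order."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.image_urls
```

Each returned URL could then be fetched and saved to disk; that download step is left out here to keep the sketch offline.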

https://www.quora.com/Should-artificial-intelligence-be-regulated