Dr. Meng Chen, a green card holder, is a Data Scientist with a PhD in Evolutionary Biology, specializing in modeling the evolutionary ecology of mammalian communities through time. As a data scientist, he is interested in finding business insights by implementing machine learning techniques and in building products that deliver actionable recommendations for a variety of businesses.
School redistricting in the Linganore-Oakdale-Urbana (LOU) area has stirred an outcry from local communities on social media. To understand the communities' concerns, the Frederick County Board of Education provided an online platform for community members to express their opinions and compiled basic statistics on those concerns. However, its analysis lacks detail about what the sentiment of the local communities was and why the communities favor or disfavor the school redistricting plans.
The community feedback on the second-round proposals for new school districts was assembled in a single PDF file that contains numerous tables stretching over 200 pages. The tables store feedback for the elementary, middle, and high school districts of the LOU area in Frederick County, MD. Each school district has tables for options A and B, as well as some comments that are related to the proposals but not specific to either option. I assigned those non-specific comments to option AB.
To scrape the data from the PDF file, I extracted the comments, parsed them into a dataframe, and stored them in a SQL database. To make the analysis more efficient, I created the redistrict module to scrape the data, to perform exploratory data analysis and sentiment analysis, and finally to visualize the insights.
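The storage step described above can be illustrated with a minimal sketch. It is not the redistrict module itself: the PDF extraction is omitted, plain tuples stand in for the dataframe, SQLite stands in for the SQL database, and the table name, column names, placeholder rows, and helper functions are all hypothetical. It only shows how comments without a specific option could be assigned to option AB before being stored.

```python
import sqlite3

# Hypothetical rows standing in for comments already extracted from the PDF tables.
# Fields: school level, option as it appeared in the table, comment text.
rows = [
    ("elementary", "A", "Placeholder comment about option A."),
    ("middle", "B", "Placeholder comment about option B."),
    ("high", "", "Placeholder comment not tied to either option."),
]


def normalize_option(option):
    """Return 'A' or 'B' when the option is specific, otherwise the catch-all 'AB'."""
    option = option.strip().upper()
    return option if option in ("A", "B") else "AB"


def store_comments(conn, rows):
    """Create the (assumed) comments table and insert the normalized rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments ("
        "school_level TEXT NOT NULL, "
        "plan_option TEXT NOT NULL, "
        "comment_text TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?, ?)",
        [(level, normalize_option(opt), text.strip()) for level, opt, text in rows],
    )
    conn.commit()


def count_by_option(conn):
    """Count stored comments per option, a first step toward exploratory analysis."""
    cursor = conn.execute(
        "SELECT plan_option, COUNT(*) FROM comments "
        "GROUP BY plan_option ORDER BY plan_option"
    )
    return dict(cursor.fetchall())


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
store_comments(conn, rows)
counts = count_by_option(conn)
```

Keeping the option label in its own column means later sentiment analysis can be grouped by school level and by option with a simple query.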