
Big Data in Python: PyBank & PyPoll

Writer: Sarah E. Stegall-Rodriguez

Updated: Jul 6, 2023


In today's data-driven world, the ability to handle and analyze large datasets is crucial, and Python is an efficient tool for processing and making sense of big data. Recently, I showcased my Python skills through two mini-projects: PyBank and PyPoll. These projects let me apply Python scripting to real-world (and hypothetical) scenarios and deliver insightful analyses.


PyBank

In the PyBank project, I analyzed financial records from a provided dataset, budget_data.csv. Using a Python script, I calculated the total number of months in the dataset, the net total of profits/losses, the changes in profits/losses over the entire period, and the average change. I also identified the greatest increase and the greatest decrease in profits, along with their corresponding dates. My results matched the provided analysis, confirming that the script handled the financial data accurately. Results were exported to a text file for ease of viewing.
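The calculations described above can be sketched roughly as follows. This is a minimal illustration, not the original script: the column names "Date" and "Profit/Losses", the function name analyze_budget, the month-to-month reading of "changes", and the sample rows are all assumptions made for the example.

```python
import csv
import io


def analyze_budget(rows):
    """Summarize monthly profit/loss records.

    `rows` is an iterable of dicts with "Date" and "Profit/Losses" keys.
    Both column names are assumptions; adjust them to the real CSV header.
    """
    months = 0
    net_total = 0
    previous = None
    changes = []  # (date, change from the previous month)

    for row in rows:
        amount = int(row["Profit/Losses"])
        months += 1
        net_total += amount
        if previous is not None:
            changes.append((row["Date"], amount - previous))
        previous = amount

    # The first month has no predecessor, so there are months - 1 changes.
    average_change = (
        sum(change for _, change in changes) / len(changes) if changes else 0.0
    )

    return {
        "months": months,
        "net_total": net_total,
        "average_change": average_change,
        "greatest_increase": max(changes, key=lambda c: c[1], default=None),
        "greatest_decrease": min(changes, key=lambda c: c[1], default=None),
    }


# Made-up sample rows standing in for budget_data.csv.
sample = io.StringIO(
    "Date,Profit/Losses\n"
    "Jan-10,100\n"
    "Feb-10,150\n"
    "Mar-10,-50\n"
    "Apr-10,200\n"
)
summary = analyze_budget(csv.DictReader(sample))
print(summary)
```

To run this against a real file, the io.StringIO object could be replaced with a file opened via open("budget_data.csv", newline=""), and the summary could be written to a text file with an ordinary open(..., "w") call.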



PyPoll

To support a small rural town in modernizing its vote-counting process, I used my Python skills in the PyPoll challenge. Working from the election_data.csv dataset, I developed a Python script to analyze the votes. The script determined the total number of votes cast, compiled a complete list of candidates who received votes, calculated the percentage and total number of votes for each candidate, and identified the winner based on the popular vote. Combining programming with data analysis gave the town a way to streamline its election process and obtain accurate results. Like PyBank, results were exported to a text file for ease of viewing.
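A vote tally along these lines might look like the sketch below. It is only an illustration under stated assumptions: the "Candidate" column name, the tally_votes function, and the sample ballots (with placeholder candidate names) are hypothetical and not taken from the original script or dataset.

```python
import csv
import io
from collections import Counter


def tally_votes(rows):
    """Count votes per candidate and pick the popular-vote winner.

    `rows` is an iterable of dicts with a "Candidate" key (an assumed
    column name; adjust it to the real CSV header).
    """
    counts = Counter(row["Candidate"] for row in rows)
    total = sum(counts.values())

    # Map each candidate to (vote count, percentage of all votes).
    results = {
        name: (votes, votes / total * 100) for name, votes in counts.items()
    }

    # On a tie, max() returns the first tied candidate encountered in the data.
    winner = max(counts, key=counts.get) if counts else None
    return total, results, winner


# Made-up ballots standing in for election_data.csv.
sample = io.StringIO(
    "Ballot ID,Candidate\n"
    "1,Alice\n"
    "2,Bob\n"
    "3,Alice\n"
    "4,Alice\n"
)
total, results, winner = tally_votes(csv.DictReader(sample))

for name, (votes, percent) in results.items():
    print(f"{name}: {percent:.3f}% ({votes})")
print(f"Total votes: {total}, winner: {winner}")
```

Counter keeps the counting to a single pass over the ballots, which matters more as the file grows; the same printed lines could be written to a text file instead of the console.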



In summary

Completing these Python challenges gave me a platform to demonstrate my proficiency in handling and analyzing big data with Python. They also let me practice essential tasks such as importing modules, manipulating files, storing data, iterating, and debugging. I took a systematic approach, accomplishing each objective by breaking complex tasks into manageable steps.


For a more in-depth look at these projects and additional details, please visit my GitHub. Feel free to explore my other projects as well! These projects were completed as part of UTSA's Data Analysis and Visualization Certification.




© 2013-2023 by Sarah E. Stegall-Rodriguez. Contents may not be used and/or duplicated without explicit written permission from the author (Use the Contact Page). Excerpts and links may be used with full and clear credit given to Sarah E. Stegall-Rodriguez. 
