A long journey to data science

Welcome to the first post in our blog series on data science, politics, and personal lived experiences. Today, I’ll take you through my accidental stumble into data science, a journey that is still an exhilarating work in progress.

The accident

Once upon a time in a small city in Brazil…

There was a postdoc. This postdoc was me, and I was trying to investigate the partisan affiliations of public officers working for the Brazilian central government. My research project focused on bureaucrats and their affiliations with political organisations, including (but not limited to) political parties.

Brazil has over 30 political parties, and downloading their membership databases took work. MS Excel could open the files, but I could not filter party members by name to check for specific bureaucrats, because the filter froze my screen every time.

That led me to the programming language R.

A snowball

I started learning R because Excel could not handle my databases. Then everything snowballed: I started building my graphs with R, learning econometrics, and running regression models and ML algorithms. That’s how I became a data scientist by accident!

That was quite surprising. I have never excelled at maths, and my computer skills were not great. I had only some experience with HTML: at around 12, I taught myself to build webpages from a 1,000-page book because I wanted to make a site about a Japanese cartoon I liked. My relationship with technology has always been instrumental: I love it insofar as it makes my life easier.

This previous experience helped me familiarise myself with computers and languages, so R looked quite intuitive. Still, learning a new skill is challenging, especially when you don’t use it frequently. Thankfully, there are numerous free resources on the Internet. What I lacked was the most valuable resource of all: time. I took many free online courses (one of the best was offered by Harvard University on the edX platform) and soon joined communities around data (PyLadies, for example).

Balancing this learning process with my other responsibilities (teaching, literature reviews, meetings, and data collection) was hard. However, I managed to get what I wanted at the time: to access the databases and analyse the data with R. This led me to a post as a research associate in data at Newcastle University. Then I had to learn other things, such as GIS and data visualisation with Power BI, Tableau, and other tools. The snowball became an avalanche.

Learning is a roller coaster

There is still much to learn. Data science and technology are constantly evolving, and new tools emerge all the time. Now we have machine learning algorithms and generative artificial intelligence, which are still largely black boxes. We have software and additional tools that assist in different stages of research. Keeping pace with all these new technologies can be overwhelming.

Progress is not always linear. If we drew a graph of life, it would not be a single line going up. It’s more like waves, or a roller coaster. There are moments when you feel you are not progressing, and there are peaks of excitement. And there is always something new to learn. It’s an endless journey with many ups and downs.

A life lesson

Data science requires resilience and an ongoing commitment to learning. Learning programming languages without a computer science background is challenging but not impossible. Starting something new is usually scary. But with courage, perseverance, and hope, we can do wonderful things! I’m not a big fan of coaching talks, but if I could give one piece of advice to people starting out with programming, I would say, “Be grateful for the small things”. Every little achievement counts, and we should be proud of even our smallest progress (sometimes, it looks small, but it’s