As data science continues to grow and develop, it’s only natural for new tools to emerge, especially given that data science has historically had significant barriers to entry.
In this article, I want to go over nine libraries that I’ve come across in the past year that are game changers. These libraries have been incredibly useful in my data science journey, and I wanted to share them with you in the hope that they’ll help you with your journey too!
The following libraries are broken down into three categories:
When I was applying to Data Science jobs, I noticed that there was a need for a comprehensive statistics and probability cheat sheet that goes beyond the very fundamentals of statistics (like mean/median/mode).
I just want to say that whether you choose data science or data engineering should ultimately depend on your interests and where your passion lies. However, if you’re sitting on the fence, unsure of which to choose because they are of equal interest, then keep reading!
Data science has been a hot topic for a while, but a new king of the jungle has arrived — data engineers. In this article, I’m going to share with you several reasons why you might want to consider pursuing data engineering over data science.
Note that this IS an opinionated article and take…
An in-depth analysis of the most in-demand skills from webscraping over 15,000 Data Scientist job postings.
I just wanted to start off by saying that this is heavily inspired by Jeff Hale’s articles that he wrote back in 2018/2019. I’m writing this simply because I wanted to get a more up-to-date analysis of what skills are in demand today, and I’m sharing this because I’m assuming that there are people out there that also want to see an updated version of the most in-demand skills for data scientists in 2021.
Take what you want from this analysis — it’s obvious…
I wanted to write this article because I think a lot of people tend to overlook the validation and testing stage of machine learning. Similar to experimental design, it’s important that you spend enough time and use the right technique(s) to validate your ML models. Model validation goes far beyond train_test_split(), which you’ll soon find out if you keep reading!
Model validation is a method of checking how close the predictions of a model are to reality. In practice, model validation means calculating the accuracy (or another evaluation metric) of the model that you’re training.
There are several different methods…
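As one illustration of validation that goes beyond a single train/test split, here is a minimal sketch of k-fold cross-validation using only the Python standard library. The helper names (`k_fold_indices`, `cross_validate_mean_baseline`), the toy target values, and the "predict the training mean" baseline are all assumptions made for this example; in practice you would likely reach for a library implementation instead.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k assigns every index to exactly one fold.
    return [indices[i::k] for i in range(k)]


def cross_validate_mean_baseline(y, k=5, seed=0):
    """Return one mean-absolute-error score per fold for a mean-predicting baseline."""
    scores = []
    for held_out in k_fold_indices(len(y), k, seed):
        held_out_set = set(held_out)
        # "Fit" on the training folds only: the model is just the training mean.
        train = [y[i] for i in range(len(y)) if i not in held_out_set]
        prediction = mean(train)
        # Score on the fold the model has not seen.
        errors = [abs(y[i] - prediction) for i in held_out]
        scores.append(mean(errors))
    return scores


# Toy target values, made up purely for illustration.
y = [float(v) for v in range(1, 21)]
fold_scores = cross_validate_mean_baseline(y, k=5)
print("MAE per fold:", fold_scores)
print("Average MAE:", mean(fold_scores))
```

Each sample is held out exactly once, so the averaged score uses all of the data for evaluation while never scoring the model on rows it was fit on.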
Suppose I wanted to create a list of numbers from 1 to 1000 and wrote the following code…
Can you figure out what’s wrong with it?
numbers = []
for i in range(1, 1001):
    numbers.append(i)
Trick question. There’s nothing wrong with the code above, BUT there’s a better way to achieve the same result with a list comprehension:
numbers = [i for i in range(1, 1001)]
List comprehensions are great because they require fewer lines of code, are easier to comprehend, and are generally faster than an equivalent for loop that appends to a list.
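If you want to check the speed claim on your own machine, a small timing sketch like the one below can help. The function names and the repeat count of 2000 are arbitrary choices for this example, and the exact timings will vary by Python version and hardware, so treat the numbers as indicative only.

```python
import timeit


def build_with_loop():
    """Build the list 1..1000 with an explicit for loop and append."""
    numbers = []
    for i in range(1, 1001):
        numbers.append(i)
    return numbers


def build_with_comprehension():
    """Build the same list with a list comprehension."""
    return [i for i in range(1, 1001)]


# Both approaches should produce an identical list.
assert build_with_loop() == build_with_comprehension()

# Time each approach over the same number of repetitions.
loop_seconds = timeit.timeit(build_with_loop, number=2000)
comprehension_seconds = timeit.timeit(build_with_comprehension, number=2000)

print(f"for loop:           {loop_seconds:.4f} s")
print(f"list comprehension: {comprehension_seconds:.4f} s")
```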
While list comprehensions are not the most difficult concept to grasp, it…
Unlike building ML models, model deployment has been one of the biggest pain points in data science because it leans more on the software engineering side of things and has a steep learning curve for beginners.
In recent years, however, several tools have been built to make model deployment easier. In this article, we’re going to go through five technologies that you can use to deploy your machine learning models.
In case you’re not entirely sure what model deployment is…
Model deployment means integrating a machine learning model into an existing production environment that takes an input and returns output…
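To make the "takes an input and returns output" idea concrete, here is a minimal sketch of the kind of function a deployed model usually sits behind. Everything in it is hypothetical: `ThresholdModel` stands in for a real trained model, and `handle_request` stands in for the endpoint that a web framework or serving tool would expose. It is not tied to any particular deployment technology.

```python
import json


class ThresholdModel:
    """Hypothetical stand-in for a trained model; a real one would be loaded from disk."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, features):
        # Label a row 1 when the sum of its features exceeds the threshold.
        return [1 if sum(row) > self.threshold else 0 for row in features]


# Load the model once at startup rather than on every request.
MODEL = ThresholdModel(threshold=1.0)


def handle_request(body: str) -> str:
    """Take a JSON request body and return a JSON response with predictions."""
    payload = json.loads(body)
    predictions = MODEL.predict(payload["features"])
    return json.dumps({"predictions": predictions})


response = handle_request('{"features": [[0.2, 0.3], [0.9, 0.8]]}')
print(response)  # {"predictions": [0, 1]}
```

Keeping the prediction logic in a plain function like this makes it easy to test without a running server, whichever serving layer ends up wrapping it.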
As the volume of data continues to grow, the need for qualified data professionals grows as well. Specifically, there’s been a growing need for professionals who are fluent in SQL and not just at a beginner level.
With that said, here we go!
If you’ve ever wanted to query a query, that’s when CTEs come into play: a CTE essentially creates a temporary, named result set that exists only for the duration of the query.
Using common table expressions (CTEs) is a great…
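Here is a small sketch of "querying a query" with a CTE, run through Python's built-in `sqlite3` module so it is self-contained. The `orders` table, its columns, and the sample rows are invented for this example. The CTE named `customer_totals` aggregates spending per customer, and the outer query then filters that intermediate result.

```python
import sqlite3

# In-memory database with a made-up orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 20.0), ("ann", 35.0), ("bob", 10.0), ("cy", 50.0), ("cy", 5.0)],
)

query = """
WITH customer_totals AS (
    -- Step 1: total spend per customer
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
-- Step 2: query the result of step 1
SELECT customer, total
FROM customer_totals
WHERE total > (SELECT AVG(total) FROM customer_totals)
ORDER BY customer
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('ann', 55.0), ('cy', 55.0)]
conn.close()
```

Because the CTE has a name, it can be referenced twice in the outer query (once in `FROM`, once in the subquery) without repeating the aggregation logic.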
SQL is the universal language in the data world. If you’re a data analyst, data scientist, data engineer, data architect, etc., you need to write good SQL code.
Learning how to write basic SQL queries isn’t hard, but learning how to write good SQL code is another story. Writing good SQL code isn’t necessarily hard, but it does require learning some rules.
If you’re fluent in Python or another coding language, some of these rules might seem familiar to you, and that’s because they’re very much transferable!
And so, I’m going to share with you five tips to write cleaner…