
3M+ views | Data Scientist | MSc Analytics & MBA student | https://terenceshin.com/

Photo by Douglas Sanchez on Unsplash

As data science continues to grow and develop, it’s only natural for new tools to emerge, especially given the significant barriers to entry the field had in the past.

In this article, I want to go over nine libraries that I’ve come across in the past year that are game changers. These libraries have been incredibly useful in my data science journey, and I wanted to share them with you in hopes that they’ll help you with your journey too!

The following libraries are broken down into three categories:

  1. Model Deployment
  2. Data Modelling
  3. Exploratory Data Analysis

1. Model Deployment

Kedro

It’s…


Important terms and equations for statistics and probability

Photo by Green Chameleon on Unsplash

Table of Contents

  1. About This Resource
  2. Confidence Intervals
  3. Z-statistic vs T-statistic
  4. Hypothesis Testing
  5. A/B Testing
  6. Linear Regression
  7. Probability Rules
  8. Bayes’ Theorem
  9. Interview Practice Questions

About This Resource

When I was applying to Data Science jobs, I noticed that there was a need for a comprehensive statistics and probability cheat sheet that goes beyond the very fundamentals of statistics (like mean/median/mode).

And so, Nathan Rosidi, founder of StrataScratch, and I collaborated to cover the most important topics that commonly show up in data science interviews. …


A new king of the jungle has emerged

Photo by Ryan Harvey on Unsplash

I just want to say that whether you choose data science or data engineering should ultimately depend on your interests and where your passion lies. However, if you’re sitting on the fence, unsure of which to choose because they are of equal interest, then keep reading!

Data science has been a hot topic for a while, but a new king of the jungle has arrived — data engineers. In this article, I’m going to share with you several reasons why you might want to consider pursuing data engineering over data science.

Note that this IS an opinionated article and take…


Results from webscraping over 15,000 Data Scientist job postings

Image created by Author

An in-depth analysis of the most in-demand skills from webscraping over 15,000 Data Scientist job postings.

Introduction

I want to start off by saying that this is heavily inspired by the articles Jeff Hale wrote back in 2018/2019. I’m writing this because I wanted a more up-to-date analysis of which skills are in demand today, and I’m sharing it because I assume there are others who also want to see an updated version of the most in-demand skills for data scientists in 2021.

Take what you want from this analysis — it’s obvious…


A comprehensive guide to four popular cross validation techniques

Man photo created by karlyukav — www.freepik.com

I wanted to write this article because I think a lot of people tend to overlook the validation and testing stage of machine learning. Similar to experimental design, it’s important that you spend enough time and use the right technique(s) to validate your ML models. Model validation goes far beyond train_test_split(), which you’ll soon find out if you keep reading!

But first, what is model validation?

Model validation is a method of checking how close the predictions of a model are to reality. In practice, that means calculating the accuracy (or another evaluation metric) of the model that you’re training on data it was not trained on.
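To make the idea concrete, here is a minimal sketch of k-fold cross validation in plain Python. K-fold is my choice of illustration here, not necessarily one of the four techniques covered in the article, and the helper names (k_fold_indices, cross_validate_mean_model) and the toy "predict the training mean" model are hypothetical stand-ins for a real model and library.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true values and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate_mean_model(y, k):
    """Score a toy model (always predict the training mean) on each held-out fold."""
    scores = []
    for train, test in k_fold_indices(len(y), k):
        prediction = sum(y[i] for i in train) / len(train)
        y_test = [y[i] for i in test]
        scores.append(mean_absolute_error(y_test, [prediction] * len(test)))
    return scores


# Made-up target values, purely for illustration.
y = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0, 7.0, 9.0]
fold_scores = cross_validate_mean_model(y, k=5)
print(fold_scores)                          # one error score per fold
print(sum(fold_scores) / len(fold_scores))  # average error across folds
```

Every sample lands in a test fold exactly once, so the averaged score depends less on one lucky or unlucky split than a single train/test split does.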

There are several different methods…


Practice Problems for List Comprehensions, Dictionary Comprehensions, and Nested List Comprehensions

Background vector created by rawpixel.com — www.freepik.com

Introduction

Suppose I wanted to create a list of numbers from 1 to 1000 and wrote the following code…

Can you figure out what’s wrong with it?

numbers = []
for i in range(1, 1001):
    numbers.append(i)

Trick question. There’s nothing wrong with the code above, BUT there’s a better way to achieve the same result with a list comprehension:

numbers = [i for i in range(1, 1001)]

List comprehensions are great because they require fewer lines of code, are easier to read once you know the syntax, and are often faster than the equivalent for loop.
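As a quick sketch of the three forms named in the title, the snippet below checks that the loop and the comprehension build the same list, then shows a filtered list comprehension, a dictionary comprehension, and a nested list comprehension. The variable names (evens, squares, grid) are mine and are only for illustration.

```python
# The for-loop version and the list comprehension build the same list.
numbers_loop = []
for i in range(1, 1001):
    numbers_loop.append(i)

numbers = [i for i in range(1, 1001)]
print(numbers == numbers_loop)  # True

# A condition at the end filters items: keep only the even numbers.
evens = [i for i in numbers if i % 2 == 0]
print(len(evens))  # 500

# Dictionary comprehension: map each number to its square.
squares = {i: i * i for i in range(1, 6)}
print(squares)  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Nested list comprehension: a 2 x 3 grid of row * col products.
grid = [[row * col for col in range(3)] for row in range(2)]
print(grid)  # [[0, 0, 0], [0, 1, 2]]
```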

While list comprehensions are not the most difficult concept to grasp, it…


A dive into modern ML deployment tools

Photo by SpaceX on Unsplash

Unlike building ML models, model deployment has been one of the biggest pain points in data science because it leans more on the software engineering side of things and has a steep learning curve for beginners.

In recent years, however, several tools have been built to make model deployment easier. In this article, we’re going to go through five technologies that you can use to deploy your machine learning models.

In case you’re not entirely sure what model deployment is…

Model deployment means integrating a machine learning model into an existing production environment, where it takes an input and returns an output…


Take your SQL skills to the next level

Clouds vector created by vectorjuice — www.freepik.com

As the volume of data continues to grow, the need for qualified data professionals grows as well. Specifically, there’s been a growing need for professionals who are fluent in SQL and not just at a beginner level.

And so, Nathan Rosidi, founder of StrataScratch, and I collaborated to go over what I think are the 10 most important and relevant intermediate to advanced SQL concepts.

With that said, here we go!

1. Common Table Expressions (CTEs)

If you’ve ever wanted to query the result of another query, that’s where CTEs come into play: a CTE essentially creates a temporary, named result set that exists only for the duration of the query.
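Here is a minimal sketch of a CTE, run through Python’s built-in sqlite3 module so it can be tried without a database server. The orders table, its columns, and the sample rows are made up for illustration, and the WITH syntax shown is the standard form that SQLite and most other SQL databases accept.

```python
import sqlite3

# In-memory database with a small, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 120.0), ("ana", 80.0), ("ben", 40.0), ("cho", 300.0)],
)

# The CTE (customer_totals) is defined once with WITH, then queried like a table.
query = """
WITH customer_totals AS (
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT customer, total
FROM customer_totals
WHERE total > 100
ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('cho', 300.0), ('ana', 200.0)]
conn.close()
```

The same result could be written with a subquery in the FROM clause, but naming the intermediate step tends to make longer queries easier to read.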

Using common table expressions (CTEs) is a great…


Level up your SQL code with these five tips!

Photo by The Creative Exchange on Unsplash

Introduction

SQL is the universal language in the data world. If you’re a data analyst, data scientist, data engineer, data architect, etc., you need to write good SQL code.

Learning how to write basic SQL queries isn’t hard, but learning how to write good SQL code is another story. Writing good SQL code isn’t necessarily hard, but it does require learning some rules.

If you’re fluent in Python or another coding language, some of these rules might seem familiar to you, and that’s because they’re very much transferable!

And so, I’m going to share with you five tips to write cleaner…
