July 13, 2018

Today’s articles turned out to roughly follow the very general phases of a data science project.
First comes data collection, then feature engineering, modelling, and finally evaluating the performance. Of course, this is only a very general outline. There is much more to it: e.g. formulating a business problem, formalizing it in mathematical terms (so that it can be modeled), EDA, data preparation and cleaning, implementation issues, and maintenance and control. Additionally, the phases are not strictly linear, as feature engineering and modelling are repeated iteratively to try different approaches and combine them.

Piotr Piękos

Attempting to Collect Unbiased Data About the Player Base of Overwatch (PC)

Often the data is provided by the client, so there’s no data collection phase. When there is one, though, the problems described in the article arise even in this seemingly trivial task.

Machine Learning with Kaggle: Feature Engineering

Feature engineering is one of the keys to successful modeling. Here is a very basic tutorial to get a grasp of what it is.

Gradient Boosting explained

There are many ML models, but gradient boosted trees now dominate the classic modelling contests on Kaggle. This is the best explanation I know of on the internet (and in the next article on this page there’s a cool interactive gradient boosting simulation).

Beyond Accuracy: Precision and Recall

Although neither precision nor recall is usually optimized directly, they are often used to get an overview of the performance and to explain the results.
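To make the two metrics concrete, here is a minimal sketch in plain Python. It is not taken from the linked article: the helper name `precision_recall` and the toy labels are made up for illustration, and it assumes binary labels where 1 is the positive class.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Convention chosen for this sketch: return 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy, imbalanced data (illustrative only): 2 positives out of 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
print(accuracy, precision, recall)
```

On this toy data the accuracy looks high even though the model misses half of the positive cases, which is the kind of situation where looking at precision and recall separately helps.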

Bartosz Cłapa

Goodbye Microservices: From 100s of problem children to 1 superstar

Microservices are not always as good as they are made out to be.

How JavaScript works: the internals of Shadow DOM

How does one of the Web Components standards work? Read about what is under the hood and how to use it when creating reusable components.

Web Architecture 101

Basic architecture concepts you should be familiar with as a web developer.