BrainsToBytes - page 2

Hello There Reader, It's Been a Minute

It’s been a while since I last posted on this blog (almost 4 years!), as life has been demanding a lot of attention. I may be able to start writing with a bit more frequency, or at least that’s what I hope. So, what have I been up to? Well, a bunch of things. I started a consulting business! I finally decided to take the plunge and start my o...

It's Fine, Nobody Can Remember Everything

A couple of days ago I had a conversation with a friend who is learning to program. We were talking about the difficulty of remembering what each concept means and what every keyword does. The conversation eventually led to this question: "Ok, but when will I stop needing the docs?" I (and probably most people) had the same feeling when I was learn...

On Abstraction and Coupling

This article is about the second group of concepts I wanted to talk about after re-reading Clean Architecture. I want to try something different this time: Instead of elaborating each idea in long, continuous prose, I’ll just list them as separate chunks. So, here it goes: We already know that tight coupling is bad. It binds together softw...

On Shape and Behavior

I recently started re-reading Bob Martin’s Clean Architecture and found two other ideas I wanted to share. One of them (the topic of this article) is the dual nature of the way software developers provide value through code. When you implement (or modify) a feature in your system you are creating value by altering or expanding its behavior. Mos...

Domain-Driven Design

This article is a summary of what I consider to be the most important concepts of the book Domain-Driven Design, by Eric Evans. I tried to condense the key ideas into a single article for anyone interested in the topic. I attempted to pack in as much information as possible, but it was not an easy task: The book is a very condensed work...

Hands-on Pandas(11): The apply function

We have already covered most of the fundamentals of working with data using the Pandas library. There is one more topic I’d like to discuss before concluding the series: the apply function. In the previous article, we learned how to create subgroups of data using the groupby function. This is quite useful when you want to gain a better understa...

Hands-on Pandas(10): Group Operations using groupby

Sometimes you need to perform operations on subsets of data. Your rows might have attributes in common or somehow form logical groups based on other properties. Finding the average, maximum, count, or standard deviation of values from groups of data is a really common task, and Pandas makes it easy to accomplish. ...

Hands-on Pandas(9): Merging Dataframes

Merge/join operations in Pandas let you gather information from many tables into a single dataframe for further processing or analysis. This is another important skill that you will probably use a lot when working with data. If you have some experience with relational databases, you will recognize the analogy with table joins. In this ...

Hands-on Pandas(8): Cleaning Data

In an ideal world, all the data you need is available in the right format and with complete content. In the real world, you will probably need to scrape data from lots of different and incomplete sources. That’s why it’s important to learn how to clean your data before analyzing it or feeding it into an ML algorithm. Data cleaning might not be the...

Hands-on Pandas(7): Loading data from files

Data analysis usually starts by loading data into the structures of your library/tools of choice. Almost always this data will come from a database, the web, or a collection of files. The files that contain your data can come in many different formats: Comma-separated values in a text file, JSON files, Excel files, or files with values s...

Hands-on Pandas(6): Descriptive Statistics

Pandas provides many options for calculating descriptive statistics and other reduction operations with just a simple function call. You might want to calculate these values as part of an ML/Data Analysis pipeline, or just because you want to get a better understanding of the data you are dealing with. Most of these operations are similar to Num...

Hands-on Pandas(5): Mapping, apply and applymap

In this article, we will learn about mapping and the apply and applymap functions. These techniques will help you manipulate your data in very convenient ways, and are another important addition to your toolbox. As always, we will explore the topic with examples that will help you understand what’s going on. Great, let’s get started! Mapping M...

Hands-on Pandas(3): Reindexing and Deletion

Today we will deal with two techniques we need to cover before moving to more advanced Pandas topics: Reindexing and element deletion. It will be a bit shorter than the first two articles in the series, but that doesn’t mean it’s not important. Both techniques are very useful, and you will probably use them in your day-to-day work if you become...

Hands-on Pandas(2): Selection, Filtering, loc and iloc

In the last article, we learned about the two basic Pandas data structures: Series and DataFrames. We also built a couple of them on our own and learned the basics of indexing and selection. Today we will learn a bit more about selecting and filtering elements from Pandas data structures. This might seem like an incredibly basic topic, but it’s...

Hands-on Pandas(1): Series and Dataframes

In a previous series we covered the fundamentals of NumPy; now it’s time to deal with another important tool frequently used in data analysis: Pandas. Pandas is a library for data manipulation and analysis that lets you manipulate heterogeneous data in tabular form (in contrast to NumPy, designed to work with homogeneous numerical data in array...

Hands-on NumPy(VI): Linear Algebra

Linear algebra has many useful applications in science and engineering. If you are doing scientific computing, it’s very likely that sooner or later you will need to use linear algebra to solve problems. If your linear algebra is a bit rusty, you can take a look at Khan Academy’s linear algebra path: it’s free and it does a great job at explain...

Hands-on NumPy(V): Reductions/Aggregations

Reductions (or aggregations) are a family of NumPy functions that operate over an array and return a result with fewer dimensions. Many of these functions perform typical statistical operations on arrays, while others perform dimensionality reductions. In this article, we will learn about some of the most common aggregations, but before we get ...
