THE RISKS OF BIG DATA

What is an algorithm?

Algorithms are a series of commands that are built from approximate models of reality and that work with data

Algorithms are written in programming languages that computers can understand.

Barcelona. The term algorithm is used far more today than in the time of the 9th-century Persian mathematician from whose name it derives, Muhammad ibn Musa al-Khwarizmi, who devised methods for solving first- and second-degree equations. The reasons are clear: there is now more data and more computing power. But algorithms have always existed; they are nothing more than a set of instructions for carrying out a task. The recipe for a chocolate cake or a Valencian paella is an algorithm. The way to light a fire with two sticks is an algorithm. The hunting technique used by some groups of killer whales, which expel air while circling beneath their prey to trap it in a cylinder of bubbles, is an algorithm, in this case generated and transmitted pre-linguistically. So are the courtship dances of the bird of paradise, and lullabies.

Algorithms are everywhere today. When we press the button for a floor in an elevator, an algorithm runs: the elevator uses the information it already has, such as the floor it is currently on, to move up or down and to stop at the right moment. Almost everything we do on the Internet is mediated by algorithms, some of them very sophisticated. However, every algorithm must meet a number of conditions, the most important of which is that its instructions can be carried out in a finite number of steps. Otherwise, the algorithm is useless. If making a chocolate cake takes longer than the age of the universe, we had better choose another recipe.
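
The elevator can be written down as a very small program. What follows is only a sketch, not how any real elevator controller works: the function name `elevator_moves` and the one-floor-at-a-time rule are assumptions made for illustration. It shows the condition that matters most, because every pass through the loop brings the cabin one floor closer, so the instructions always finish in a finite number of steps.

```python
def elevator_moves(current_floor: int, target_floor: int) -> list[str]:
    """Return the moves that take a (hypothetical) elevator to the requested floor."""
    moves = []
    floor = current_floor
    # Each pass moves exactly one floor towards the target, so the loop
    # ends after |target_floor - current_floor| passes: a finite number of steps.
    while floor != target_floor:
        if floor < target_floor:
            floor += 1
            moves.append("up")
        else:
            floor -= 1
            moves.append("down")
    moves.append("stop")
    return moves


print(elevator_moves(1, 4))  # ['up', 'up', 'up', 'stop']
```

If the loop moved the cabin away from the target instead of towards it, the program would never end, which is the chocolate cake that takes longer than the age of the universe.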

Today's complex algorithms, which depend on large amounts of data, are built from models, which are nothing more than abstractions and simplifications of how a real system works. Some models are simple and rely on data that is easy to obtain; others are complicated, and often incomplete, and rely on whatever data can be obtained. The case Michael Lewis describes in the book Moneyball belongs to the first kind: using historical data on baseball players to optimise a team's performance. The data exists and is relevant for predicting the performance of a player and, by extension, of the whole team.
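
A model of the simple kind can fit in a few lines. The sketch below is not the method described in Moneyball: the `Season` record, the function `predicted_on_base_rate` and the figures in the example are all invented for illustration. It simply assumes that a player will reach base next season as often as he has done in the past, which is exactly the sort of simplification a model makes.

```python
from dataclasses import dataclass


@dataclass
class Season:
    """One season of (invented) historical data for a player."""
    plate_appearances: int
    times_on_base: int


def predicted_on_base_rate(history: list[Season]) -> float:
    """Predict next season's on-base rate as the player's historical average."""
    total_appearances = sum(s.plate_appearances for s in history)
    if total_appearances == 0:
        # Without data there is nothing to build the model from.
        raise ValueError("no historical data for this player")
    total_on_base = sum(s.times_on_base for s in history)
    return total_on_base / total_appearances


# Made-up numbers, not real statistics.
history = [Season(600, 210), Season(400, 130)]
print(predicted_on_base_rate(history))  # 0.34
```

The model ignores age, injuries and everything else about the player. It works only because the data it needs exists and is relevant to the question being asked.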

However, when the aim is to predict whether someone will repay a loan, much of the relevant data does not exist. The models are then more approximate, because they rely on data from similar situations that occurred in the past. To judge whether the store the applicant wants to open with the loan money will do well, it would be ideal to have information on many comparable stores that the same person had already opened under similar conditions, as we do with baseball players. But what if this is the first store he or she wants to open?

The models behind algorithms that work with large amounts of data can have gaps that introduce uncertainty into their predictions. And uncertainty means a higher probability of error in any specific prediction. It is therefore especially important to know how algorithms are designed and what data they are fed when they are refined.
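
How much the amount of data matters can be shown with a short calculation. The sketch below assumes that the model estimates a repayment rate from comparable past loans and uses the textbook standard error of a proportion as a rough measure of uncertainty. The function name `repayment_estimate` and the figures are invented for illustration; real credit models are considerably more elaborate.

```python
import math


def repayment_estimate(repaid: int, total: int) -> tuple[float, float]:
    """Return (estimated repayment rate, standard error of that estimate)."""
    if total <= 0:
        raise ValueError("need at least one comparable past loan")
    rate = repaid / total
    # Standard error of a proportion: it shrinks as the number of cases grows.
    std_error = math.sqrt(rate * (1 - rate) / total)
    return rate, std_error


# Made-up figures: the same repayment rate, seen in 4 cases and in 400 cases.
rate_few, error_few = repayment_estimate(3, 4)
rate_many, error_many = repayment_estimate(300, 400)
print(rate_few, rate_many)
print(error_few / error_many)
```

Both estimates give the same repayment rate, but the one based on four cases carries about ten times the uncertainty of the one based on four hundred. The prediction looks identical; the confidence it deserves does not.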

10 algorithms that will go down in history
  • Google: The successor to PageRank, the first search algorithm Google launched in 1999, now takes into account whether a website contains plagiarised or offensive content. The world's most widely used search engine is updated almost daily.
  • Amazon: The success of the platform, which accounts for 40% of online sales, rests on an algorithm that scores sellers and on a system of dynamic prices and recommendations produced by deep learning.
  • Netflix: The platform that has changed the way we consume series and films has a recommendation system based on each user's search history and on how they have interacted with the content (how they rated it, whether they watched it to the end, etc.).
  • Social media: Although they can be very diverse, all social media algorithms aim to show each user relevant, personalised content in order to optimise advertising. They do this with content scoring systems based on the actions of millions of users.
  • Spotify: The app that has revolutionised the way people listen to music uses user actions and extensive tagging of artists and songs to suggest music the listener has never played on the platform.
  • Dating: Apps such as Tinder and Match.com have used both user feedback and user ratings to change the way people find a partner.
  • Distribution: Delivery companies use algorithms that rate riders and give the best-rated ones preference in choosing orders and schedules, which may optimise service but increasingly penalises some riders.
  • Uber: The pioneer of on-demand passenger transport operates with an algorithm that generates dynamic fares and, according to complaints from some drivers, uses opaque criteria to evaluate drivers and decide whether to keep or fire them.
  • Crime prediction: So-called crime prediction programmes use algorithms based on neural networks to predict where crimes will be committed and whether a person is likely to reoffend. They have been criticised by a group of more than 2,000 artificial intelligence experts for containing racial and socio-economic biases.
  • Finance: The trading of financial securities on stock exchanges around the world is partially automated thanks to sophisticated algorithms. In the United States, 70% of transactions are carried out by machines. In Spain, 40%.