**Mathematics of Matchmaking**

I'm not sure I can cover all of the math behind the science of matchmaking, so I thought it best to describe an example from Netflix. Netflix wants to make choosing movies easier for its customers, so it developed an algorithm to match customers with movies likely to interest them. In fact, it even decided to farm out an improvement to the algorithm in a worldwide contest. So how does the Netflix algorithm work? There is a lot of math behind it, but it essentially comes down to finding common features in the customer and movie data. Customers give Netflix clues about the features they want by rating the movies they enjoy; those ratings become the dependent variables in the algorithm's formulation. The algorithm then churns out likely matches based on common feature sets.
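To make the idea concrete, here is a minimal sketch of matching customers by the features hidden in their ratings, using plain cosine similarity between rating vectors. The ratings matrix and customer indices are invented for illustration, not Netflix's actual method:

```python
import numpy as np

# Rows are customers, columns are movies; entries are star ratings (0 = unrated).
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (1.0 = identical taste)."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Which other customer rates movies most like customer 0?
sims = [cosine_similarity(ratings[0], ratings[i]) for i in (1, 2)]
best = (1, 2)[int(np.argmax(sims))]
print(best)  # -> 1: customers 0 and 1 share a feature set (they like the same movies)
```

A real system would then recommend to customer 0 the movies its closest matches rated highly but customer 0 has not seen.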

Perhaps one of the best writings on this subject comes from Simon Funk, who blogged about his Netflix Contest adventures. Simon thought a creative way to find features would be to use the matrix factorization technique of Singular Value Decomposition (SVD), a method long used in digital signal processing. He wrote up a simple SVD-based approach to matching movie features, which spurred a wave of enthusiasm among Netflix Contest entrants.
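The core of the idea can be sketched with numpy's exact SVD on a tiny dense matrix (Funk's actual method used iterative gradient descent to handle the huge, sparse Netflix data; the toy ratings below are invented). Factoring the ratings matrix and keeping only the largest singular values leaves a small set of "features" that still reproduce the ratings well:

```python
import numpy as np

# Toy ratings matrix: rows are customers, columns are movies.
ratings = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
])

# Factor the matrix: ratings = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

# Keep only the top k "features" (largest singular values) and rebuild.
k = 2
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The rank-2 reconstruction stays close to the original ratings,
# so two features capture most of what the customers have in common.
print(np.round(approx, 1))
```

The rows of `U` describe customers in feature space and the rows of `Vt` describe movies in the same space, which is exactly what makes the matchmaking possible.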

Finding feature sets is not exclusively the realm of linear algebra. There are also methods such as clustering, regression, support vector machines, neural networks, Bayesian networks, and decision trees, just to name a few. The science of matchmaking is closely related to artificial intelligence and is commonly referred to as machine learning: using algorithms and mathematical methods to learn behaviors from data in order to solve a problem.
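As a taste of one method from that list, here is a toy k-means clustering loop in plain numpy (the data points are invented). It groups observations by iteratively assigning each point to its nearest center and moving each center to the mean of its points:

```python
import numpy as np

# Four points forming two obvious groups.
points = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.1, 4.9]])
centers = points[[0, 2]].copy()  # start the two centers at two of the points

for _ in range(10):
    # Assign each point to its nearest center...
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # ...then move each center to the mean of its assigned points.
    centers = np.array([points[labels == j].mean(axis=0) for j in (0, 1)])

print(labels)  # -> [0 0 1 1]: the two tight pairs land in separate clusters
```

In a matchmaking setting, each cluster would correspond to a group of customers with similar tastes.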

**Processing the Matchmaking Data**

The science of matchmaking would not be complete without the data. The advent of the internet has opened up many new enterprises that make use of millions of data observations. These internet companies process their data on huge server arrays that would make even ENIAC envious. So how do these companies run their matchmaking algorithms over all of this data? The basic answer is to break it down into manageable chunks. Perhaps no greater example exists than Google and its MapReduce method. MapReduce is a software framework that takes a large computation and distributes it across a network of machines, usually a huge cluster, where it becomes more manageable. In the Map step, the data is split into chunks and handed out to the computing nodes, each of which applies a function to its own chunk. In the Reduce step, the intermediate results from the nodes are grouped and combined into a final answer. For an iterative learning algorithm, each pass gives something like a local optimum, and the process is repeated until a global optimum is reached. This is a very cut-and-dried description, but you get the idea.
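The pattern can be sketched in a single process: map emits key/value pairs, a shuffle groups them by key, and reduce combines each group. Real frameworks run these steps across a cluster; the movie-tag records below are invented for illustration:

```python
from collections import defaultdict

# Pretend these tags were scraped from millions of movie records.
records = ["comedy", "drama", "comedy", "action", "comedy", "drama"]

# Map: emit a (key, 1) pair for each record.
mapped = [(tag, 1) for tag in records]

# Shuffle: group the pairs by key.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce: combine each group's values into one result per key.
counts = {key: sum(values) for key, values in groups.items()}
print(counts)  # -> {'comedy': 3, 'drama': 2, 'action': 1}
```

In a real cluster the mapped pairs are produced on many machines in parallel, and each reducer handles only the keys routed to it.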

Google's MapReduce implementation is proprietary, but that has not stopped software enthusiasts. An open source implementation called Hadoop was created and has grown a strong user-supported community.

So what can the science of matchmaking be used for? Really, anything the heart desires (okay, again, that was bad). Amazon.com uses recommendation algorithms for its books and products. Online dating sites (how appropriate) use matchmaking methods to pair up interested daters. Search engines like Google use ranking algorithms, such as PageRank, to match search keywords with relevant websites. As you can tell, these enterprises are doing very well thanks to the science of matchmaking.

*This article is part of the INFORMS Online blog challenge. February's blog challenge is Operations Research and Love.*
