Thursday 25 August 2016

Matrix Factorization: initial values

The initial distribution of feature values affects the results of the matrix factorization (SVD) algorithm (this implementation). In this post, let's look at the performance of the SVD algorithm with different distributions of initial values. To conduct the experiments, I used the Lenskit framework and the MovieLens100K dataset. The experiments include three distributions:
  1. Fixed values (0.1) (Fixed)
  2. Random values (Random)
  3. Popularity distribution for item features and random for user features (POP)
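
The three initialization strategies listed above can be sketched in plain Python. This is only an illustration, not the Lenskit implementation: the function names, the `scale` parameter, and in particular the way popularity is turned into a feature value (rating count divided by the largest rating count) are assumptions made for the example.

```python
import random
from collections import Counter


def init_fixed(n_rows, n_features, value=0.1):
    """Fixed: every feature of every row starts at the same value."""
    return [[value] * n_features for _ in range(n_rows)]


def init_random(n_rows, n_features, rng, scale=0.1):
    """Random: every feature is drawn uniformly from [0, scale]."""
    return [[rng.uniform(0.0, scale) for _ in range(n_features)]
            for _ in range(n_rows)]


def init_popularity(item_ids, ratings, n_features, scale=0.1):
    """POP (assumed form): item features proportional to item popularity.

    Popularity here is the number of ratings an item received, divided by
    the count of the most rated item. Unrated items start at 0.
    """
    counts = Counter(item for _user, item, _value in ratings)
    max_count = max(counts.values(), default=1)
    return {item: [scale * counts[item] / max_count] * n_features
            for item in item_ids}


# Toy data: (user, item, rating) triples, hypothetical.
toy_ratings = [(1, "a", 5.0), (2, "a", 3.0), (1, "b", 4.0)]
rng = random.Random(42)

fixed_items = init_fixed(n_rows=3, n_features=2)
random_users = init_random(n_rows=2, n_features=2, rng=rng)
pop_items = init_popularity(["a", "b", "c"], toy_ratings, n_features=2)
```

In the POP setting only the item features come from `init_popularity`; the user features would still come from `init_random`.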

Wednesday 1 June 2016

The difference between Venn Diagram and Euler Diagram

Both diagrams are based on set theory. The main difference between Venn and Euler diagrams is that a Venn diagram shows all possible logical relationships between sets, while an Euler diagram shows only the relationships that actually exist. In other words, in a Venn diagram you have to depict every possible intersection of the sets, even if the intersection is empty, while in an Euler diagram you depict only the intersections that are not empty.
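
To make the distinction concrete, here is a small Python sketch that lists the regions each kind of diagram would need. The sets `A`, `B` and `C` are hypothetical values chosen for the illustration, and the function names are my own.

```python
from itertools import combinations


def venn_regions(sets):
    """Return every region a Venn diagram must draw.

    There is one region per non-empty combination of set names (2^n - 1 in
    total). A region holds the elements that belong to exactly the sets in
    the combination and to no others, so it may be empty.
    """
    names = sorted(sets)
    universe = set().union(*sets.values())
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            regions[combo] = {
                x for x in universe
                if all((x in sets[name]) == (name in combo) for name in names)
            }
    return regions


def euler_regions(sets):
    """Return only the non-empty regions, which is all an Euler diagram draws."""
    return {combo: members
            for combo, members in venn_regions(sets).items() if members}


# Hypothetical example: A and B overlap, C is disjoint from both.
example = {"A": {1, 2, 3}, "B": {3, 4}, "C": {5}}
venn = venn_regions(example)
euler = euler_regions(example)
```

With these sets the Venn diagram needs 7 regions, while the Euler diagram needs only 4: A alone, B alone, C alone, and the overlap of A and B.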
Suppose we have three sets:

Tuesday 17 May 2016

Latex: Citation Order for elsarticle

To order citations in your paper, you might want to use the cite package like this:
\usepackage{cite}
However, as described here, the package conflicts with the natbib package embedded in elsarticle.

Saturday 14 May 2016

Summary: Cross-Domain Recommendations with Overlapping Items

Full text available: proceedings | direct link.


D. Kotkov, S. Wang, and J. Veijalainen. Cross-domain recommendations with overlapping items. In Proceedings of the 12th International Conference on Web Information Systems and Technologies, pages 131-138, 2016.

Thursday 12 May 2016

Summary: Challenges of Serendipity in Recommender Systems

Full text available: proceedings | direct link.

D. Kotkov, J. Veijalainen, and S. Wang. Challenges of serendipity in recommender systems. In Proceedings of the 12th International Conference on Web Information Systems and Technologies, pages 251-256, 2016.

Saturday 20 February 2016

Lenskit: Recommendation browser

Recommendation Browser was developed to inspect the recommendations that users receive in the Lenskit framework. The graphical user interface looks as follows:

Thursday 18 February 2016

Lenskit: data split in evaluation

Lenskit 2 has an embedded evaluation module. In this post I am going to describe how it splits datasets.

Lenskit: Popular items evaluation exception solution

In Lenskit 2.2.1, the framework throws an exception during evaluation when algorithms rank popular items. The problem has been reported to the authors of Lenskit and will be fixed soon. In this post I offer a workaround for those who use version 2.2.1.

Tuesday 16 February 2016

Bash: screen reminder

The screen command lets you run processes in the background and return to them later. Here are a few commands that might be useful.

Git: quick reminder

This post collects sets of frequently used commands.

Friday 8 January 2016

Lenskit: Popularity baseline (Learning to Rank)

This post is dedicated to the popularity baseline in the Lenskit 2 framework. The framework lacks this baseline, so I provide an implementation and demonstrate its results.
The implementation includes three classes:
PopItemScorer - item scorer, which provides the actual scores for items
PopModel - model that contains the popularity of each item
PopModelBuilder - builder that calculates the popularity of each item and puts it into the model.
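
The division of work between the three classes can be sketched in Python as follows. This is not the Java code from the post and it does not use the Lenskit API. The class names mirror the ones above, but the method names, the `recommend` helper, and the definition of popularity as the number of ratings an item received are assumptions made for the illustration.

```python
from collections import Counter


class PopModel:
    """Holds the popularity of each item."""

    def __init__(self, popularity):
        self._popularity = dict(popularity)

    def get_popularity(self, item_id):
        # Items never seen in the training data have popularity 0.
        return self._popularity.get(item_id, 0)


class PopModelBuilder:
    """Computes item popularity from (user, item, rating) triples."""

    def __init__(self, ratings):
        self._ratings = ratings

    def build(self):
        # Assumption: popularity = number of ratings the item received.
        counts = Counter(item for _user, item, _value in self._ratings)
        return PopModel(counts)


class PopItemScorer:
    """Scores items by popularity; the score is the same for every user."""

    def __init__(self, model):
        self._model = model

    def score(self, user_id, item_ids):
        return {item: float(self._model.get_popularity(item))
                for item in item_ids}


def recommend(scorer, user_id, candidates, n):
    """Hypothetical helper: top-n candidates by score, ties broken by id."""
    scores = scorer.score(user_id, candidates)
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Toy data, hypothetical.
toy_ratings = [(1, "a", 4.0), (2, "a", 5.0), (3, "a", 2.0),
               (1, "b", 3.0), (2, "b", 4.0), (3, "c", 5.0)]
scorer = PopItemScorer(PopModelBuilder(toy_ratings).build())
top_two = recommend(scorer, user_id=1, candidates=["a", "b", "c", "d"], n=2)
```

Because the scores ignore the user, every user gets the same ranking, which is what makes popularity a non-personalized baseline.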

Lenskit code example

Recently, I started using the Lenskit framework, which is designed for building recommender systems. It contains a few useful recommendation algorithms, such as item-item collaborative filtering and matrix factorization. However, documentation and examples for the framework are lacking.
I needed to use the SimpleEvaluator class and could not find relevant documentation on the class or a good example of how to use it.