Homepage of Aditya Krishna Menon

Code for Linking losses for density ratio and class-probability estimation, ICML 2016

This MATLAB code aims to replicate the tables of results and figures from the paper "Linking losses for density ratio and class-probability estimation", which appeared at ICML 2016.

Unzipping the code should reveal four subfolders:

weight_function/: weight function experiments (Sec 8.1).
covariate_shift/: covariate shift experiments (Sec 8.2).
rtb/: ranking the best experiments (Sec 8.3).
helper/: miscellaneous helper files (see below).

Below, we describe how to run the experiments for each of Sections 8.1 to 8.3.
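Since the experiment scripts rely on the code in helper/, that folder has to be visible to MATLAB. The scripts may already arrange this themselves; if they do not, a minimal sketch is the following. It assumes MATLAB was started in the unzipped top-level folder, and the variable name helperDir is ours, not part of the released code.

```matlab
% Assumption: the current folder is the unzipped top-level folder, i.e. the
% one containing weight_function/, covariate_shift/, rtb/ and helper/.
helperDir = fullfile(pwd, 'helper');

if exist(helperDir, 'dir')
    % genpath also adds every subfolder, e.g. the bundled third-party libraries.
    addpath(genpath(helperDir));
else
    warning('helper folder not found at %s; check the current folder.', helperDir);
end
```

Adding the path once per MATLAB session is enough; it does not need to be repeated before each script.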

Weight function analysis

For the weight function analysis, in the weight_function folder, simply run:


	>> loss_regret_script;

You should see an output such as:


>> loss_regret_script;
reg = 0.3614 [lambda = 10^-8, gamma = 0; 1.8 secs]
max regret = 0.3614 [gamma = 0, lambda = 10^-8]

reg = 0.3821 [lambda = 10^-8, gamma = 0; 1.5 secs]
max regret = 0.3821 [gamma = 0, lambda = 10^-8]

reg = 0.5460 [lambda = 10^-8, gamma = 0; 1.1 secs]
max regret = 0.5460 [gamma = 0, lambda = 10^-8]

A plot mimicking Figure 1 of the paper should also be displayed.

Covariate shift adaptation

For the covariate shift experiments on the poly dataset, in the covariate_shift folder, simply run:


	>> poly_script;

The script goes through each of the losses considered in Sec 8.2 and trains a kernel model to estimate the density ratio. It reports the normalised mean squared error (NMSE) on the test sample. You should see output that mimics Table 2(a), such as:


             Uniform & 1.2723 $\pm$ 0.0302 \\
               KLIEP & 0.6916 $\pm$ 0.0136 \\
                LSIF & 0.7742 $\pm$ 0.0217 \\
               uLSIF & 0.7038 $\pm$ 0.0102 \\
               ...
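For readers unfamiliar with the metric, the NMSE figures in this output can be understood through a small sketch. It uses one common definition, the mean squared error divided by the variance of the true targets; the exact normalisation used by poly_script may differ, and the data below are made-up toy numbers.

```matlab
% Toy targets and predictions (made-up numbers, not from the poly dataset).
y    = [1.0; 2.0; 3.0; 4.0];
yhat = [1.1; 1.9; 3.2; 3.8];

% Assumption: NMSE = mean squared error / variance of the true targets.
mse  = mean((yhat - y).^2);
nmse = mse / var(y, 1);      % var(y, 1) normalises by n rather than n - 1

fprintf('NMSE = %.4f\n', nmse);   % prints NMSE = 0.0200
```

Under this definition, a predictor that always outputs the mean of the targets has an NMSE of 1, which makes values comparable across differently scaled targets.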

For the experiments on the amazon dataset, in the covariate_shift folder, simply run:


	>> amazon_script;

The script goes through each of the losses considered in Sec 8.2 and trains a kernel model to estimate the density ratio. It reports the pairwise disagreement on the test sample. Once the feature mappings have been generated (after TF-IDF and SVD projection), you should see output that mimics Table 2(b), such as:


generating data trial #doing svd...done

...


             Uniform & 0.1582 $\pm$ 0.0018
               KLIEP & 0.1500 $\pm$ 0.0018
                LSIF & 0.1500 $\pm$ 0.0019

Note that the file amazon.mat contains the processed Amazon data as provided here.
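As an illustration of the pairwise disagreement metric reported by this script, the sketch below computes the fraction of (positive, negative) pairs that a scorer orders incorrectly, counting ties as half an error. This is one standard reading of the term for binary labels (it equals one minus the AUC); the definition used in amazon_script may be more general, and the scores and labels here are toy values.

```matlab
% Toy scores and binary labels (made-up; not taken from amazon.mat).
scores = [0.9; 0.8; 0.3; 0.6; 0.1];
labels = [1; 1; 1; -1; -1];

pos = scores(labels == 1);
neg = scores(labels == -1);

% All pairwise differences: entry (i, j) is pos(i) - neg(j).
diffs = bsxfun(@minus, pos, neg');

% A pair is misordered when the negative outscores the positive.
disagreement = mean(diffs(:) < 0) + 0.5 * mean(diffs(:) == 0);

fprintf('pairwise disagreement = %.4f\n', disagreement);   % prints 0.1667
```

Here only one of the six pairs is misordered (the positive scored 0.3 against the negative scored 0.6), giving 1/6.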

Ranking the best

For the ranking the best experiments, in the rtb folder, simply run:


	>> rtb_script;

The display window will then fill with the results of cross-validation and training each of the methods on each of the datasets. The script proceeds by taking each dataset and then each method in turn. The script will output, for each train-test split, the performance of a method according to all the performance criteria listed in Appendix H. Sample output:


= Dataset german [n = 1000, d = 24] =
unknown proper_logistic
unknown proper_p-classification
unknown proper_lsif
	fold 1	2	3	4	5	
Proper_Logistic	0.7845	0.0346	0.1827	0.5188	0.0000	0.6000	(0.0 secs; lambda 1.953125e-03, pPush 4, lPush 4)
	fold 1	2	3	4	5	
Proper_Logistic	0.7936	0.0342	0.1815	0.5876	0.0100	0.6000	(0.0 secs; lambda 2.441406e-04, pPush 4, lPush 4)
	fold 1	2	3	4	5	
Proper_Logistic	0.8011	0.0436	0.1911	0.6632	0.0490	0.8000	(0.0 secs; lambda 1.220703e-04, pPush 4, lPush 4)
...

Once the script completes, it outputs the LaTeX source for Table 5 in the appendix. Be warned that this script is likely to take a long time.

While it runs, the script saves, for each trial, the results of cross-validation as well as the final predictions. These can be used subsequently either to skip cross-validation and just perform learning, or to skip both and just produce formatted tables of results. To just print out the results of a previous run, set


	PRINT_ALL = 1;

on line 42 of rtb_script.m.
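The general pattern of reusing saved results instead of recomputing them can be sketched as follows. This is a hypothetical illustration only: apart from the PRINT_ALL flag, none of the names, file locations or values below come from rtb_script.m, and the actual script may organise its saved files quite differently.

```matlab
% Hypothetical illustration: only the flag name PRINT_ALL is from the text.
PRINT_ALL = 0;                                          % 1 = only print cached results
cacheFile = fullfile(tempdir, 'rtb_demo_results.mat');  % made-up cache location

if PRINT_ALL && exist(cacheFile, 'file')
    % Skip cross-validation and learning; reuse what an earlier run saved.
    cached  = load(cacheFile);
    results = cached.results;
else
    % Stand-in for the expensive cross-validation and training step.
    results = [0.78 0.79 0.80];
    save(cacheFile, 'results');
end

fprintf('%.4f\n', results);
```

Running the sketch once with PRINT_ALL = 0 writes the cache; a second run with PRINT_ALL = 1 then prints the same numbers without redoing the stand-in computation.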

Third-party libraries

The code relies on certain third-party MATLAB code for various operations. For convenience, the code is included in the ZIP file as part of the helper folder. The libraries are:

cprintf: prints coloured text
minFunc: L-BFGS optimisation
liblinear: LIBLINEAR (note: you may need to compile binaries for your architecture.)
liblinear-weights: LIBLINEAR with instance weights (note: you may need to compile binaries for your architecture.)
sampleError: computes the AUC of a scorer
tfidf2: computes a TF-IDF matrix
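If the bundled LIBLINEAR binaries do not match your architecture, they can usually be rebuilt from within MATLAB using the make.m script that ships with LIBLINEAR's MATLAB interface. The sketch below assumes that interface sits in helper/liblinear/matlab; this location is a guess at the layout of the ZIP file, so adjust libDir as needed. The same steps would apply to liblinear-weights.

```matlab
% Assumption: LIBLINEAR's MATLAB interface lives in helper/liblinear/matlab;
% change libDir to wherever make.m actually is in your copy.
libDir = fullfile('helper', 'liblinear', 'matlab');

if exist(fullfile(libDir, 'make.m'), 'file')
    oldDir = cd(libDir);   % cd returns the previous folder
    make;                  % LIBLINEAR's own build script; needs a configured MEX compiler
    cd(oldDir);
else
    warning('make.m not found in %s; locate the LIBLINEAR matlab folder first.', libDir);
end
```

If the build succeeds and the folder is on the MATLAB path, exist('train', 'file') should return 3, which indicates a MEX-file. If no compiler is configured, running mex -setup first is the usual remedy.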