Research Abstract (written in 30 minutes :))

To be re-written.



Some of my own IT notes:

TAO/PETSc/LAPACK Installation on Cygwin:  [link]  (updated on August 5, 2007)

Using Matlab C/C++ Math Library without Installing Matlab or its Compiler:  [link]  (updated on August 7, 2007)

Accessing remote computer via ssh without keying in password every time:  [link]

TAO/PETSc Installation on SoC@NUS computing clusters:  [link]

Practical examples for manipulating Vectors and Matrices in PETSc:  [c file]


Some research notes I wrote for presentation at reading groups:

Max-margin Methods for Structured Outputs: [my notes]

Survey Propagation: [my notes]  [original paper]

Loopy BP for Max b-matching:  [my notes]  [original paper]

Message Passing Formula in Gaussian MRF [notes]

Proof of Factorization of Tree-structured Distributions [notes]

A Very Gentle Note on the Construction of Dirichlet Process [notes]



The text below was written when I was in Singapore.

OUT OF DATE ALREADY.  Will be updated.

Note: Here is some background information on the area I study.  For my OWN work, please see Publications.

Semi-Supervised Learning

    I planned to write something about semi-supervised learning.  However, after reading the PhD dissertation of Zhu Xiaojin (now an Assistant Professor at the University of Wisconsin), I find it far better to simply read his thesis.  He also maintains a web page that brings together historical and cutting-edge research in semi-supervised learning.  The work is up to date (as of May 2005), clear, and comprehensive.  I am proud to have received my Bachelor's degree from the same university as Prof. Zhu, Shanghai Jiao Tong University (though he graduated and left the university 3 years before I was admitted).

Learning on Structured Data
    I also planned to write something about structured data.  However, after reading Ben Taskar's PhD dissertation, I find it far better to simply read his thesis.  It contains the work that won the Student's Award at NIPS 2003.  The work is up to date (as of December 2004), clear, and comprehensive.

    Here is a list (necessarily incomplete) of recent papers (up to early 2005) on machine learning with structured data.

Optimization tools and miscellaneous Linux programming skills

    A huge number of optimization tools and software packages are available online.  The one I like to use is TAO, which is built on PETSc.  The package uses MPI, so it is well suited to maximum entropy models and CRFs.  In fact, we used 30 processors to compute the objective function value and gradient (expectations) in parallel, by distributing the data examples uniformly across all available processors and then assembling their contributions to calculate the gradient.
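
    The data-parallel scheme described above can be illustrated with a small sketch.  This is not the TAO/PETSc code we ran: it is a serial Python simulation, assuming a two-class maximum entropy (logistic) model, and the names partial_objective_and_gradient, split_evenly and objective_and_gradient are made up for the illustration.  Each "worker" computes the objective and gradient on its own shard of the data, and the contributions are then summed, which is the step a reduction across processes would perform under MPI.

```python
import math


def partial_objective_and_gradient(shard, w):
    """Negative log-likelihood and its gradient over one worker's shard."""
    value = 0.0
    grad = [0.0] * len(w)
    for x, y in shard:  # y is 0 or 1
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))  # model probability that y = 1
        value -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj  # expected minus observed feature value
    return value, grad


def split_evenly(data, n_workers):
    """Deal examples round-robin so shard sizes differ by at most one."""
    return [data[r::n_workers] for r in range(n_workers)]


def objective_and_gradient(data, w, n_workers):
    """Sum the per-shard contributions (a reduction step under MPI)."""
    total_value = 0.0
    total_grad = [0.0] * len(w)
    for shard in split_evenly(data, n_workers):
        value, grad = partial_objective_and_gradient(shard, w)
        total_value += value
        total_grad = [a + b for a, b in zip(total_grad, grad)]
    return total_value, total_grad


# Toy data (hypothetical): feature vectors with a bias term, binary labels.
data = [
    ([1.0, 0.5], 1),
    ([1.0, -1.5], 0),
    ([1.0, 2.0], 1),
    ([1.0, -0.3], 0),
    ([1.0, 0.1], 1),
]
w = [0.1, -0.2]

serial = objective_and_gradient(data, w, 1)
parallel = objective_and_gradient(data, w, 3)
```

    Because the objective and the gradient are both sums over examples, the result does not depend on how the examples are divided: serial and parallel agree up to floating-point rounding, and only the per-worker cost shrinks as more workers are added.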

Nature and Nature's laws lay hid in night:
God said, Let Newton be! and all was light.
      ----Alexander Pope