I am researching semantic analysis of short texts. Latent semantic analysis (LSA) has been used to analyze text passages of 100 words or more. However, LSA breaks down on short passages of text. That is why LSA has been used successfully to grade essays but does not do well with short-answer questions.
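
As a rough illustration of the short-passage problem, here is a minimal LSA sketch in Python using NumPy. The toy corpus, the choice of k = 2, and the helper names (build_term_doc_matrix, lsa_fit, fold_in) are my own assumptions for illustration, not a standard implementation. It uses raw word counts and the usual fold-in projection, so a short passage that shares no words with the training corpus ends up as an all-zero vector with nothing to compare.

```python
import numpy as np

def build_term_doc_matrix(docs):
    """Return a (terms x documents) raw count matrix and a word -> row index map."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            counts[index[w], j] += 1
    return counts, index

def lsa_fit(counts, k):
    """Truncated SVD: keep the top-k left singular vectors and singular values."""
    U, s, _ = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k], s[:k]

def fold_in(text, index, U_k, s_k):
    """Project a new passage into the k-dimensional space: q^T U_k S_k^{-1}."""
    q = np.zeros(len(index))
    for w in text.lower().split():
        if w in index:  # words outside the training vocabulary are dropped
            q[index[w]] += 1
    return (q @ U_k) / s_k

# Hypothetical toy corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stocks fell as markets closed",
    "markets rose as stocks rallied",
]
counts, index = build_term_doc_matrix(docs)
U_k, s_k = lsa_fit(counts, k=2)

longer = fold_in("the cat sat", index, U_k, s_k)  # overlaps the corpus vocabulary
short = fold_in("feline", index, U_k, s_k)        # no overlap: all-zero vector
```

With only one or two words in a passage, the chance of this kind of empty or nearly empty projection is much higher than with a 100-word essay.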

My research focuses on various ways to approach this problem.

Below are links to papers that are useful to my research:

Barzilay, R. and Lee, L. (2004). Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization. HLT-NAACL 2004: Proceedings of the Main Conference, 113-120. Click here for paper.

Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993-1022. Click here for paper.

Salton, G. and Buckley, C. (1987). Term Weighting Approaches in Automatic Text Retrieval. Technical Report, UMI Order Number: TR87-881, Cornell University. Click here for paper.

Hofmann, T. (1999). Probabilistic Latent Semantic Analysis. Uncertainty in Artificial Intelligence. Click here for paper.

Landauer, T. K. and Dumais, S. T. (1997). A solution to Plato's problem: the Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104(2), 211-240. Click here for paper.