I am researching semantic analysis of short texts.
Latent semantic analysis (LSA) has been used to analyze text passages
of 100 words or more. However, LSA breaks down when analyzing short
passages of text. That is why LSA has been used successfully to grade
essays but does not do well with short-answer questions.
My research focuses on various ways to approach this problem.
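As a rough illustration of why short passages are hard, here is a minimal sketch. It uses plain bag-of-words cosine similarity, not LSA itself; LSA would additionally apply a truncated SVD to a term-document matrix built from a large corpus. The example sentences and the function names are hypothetical, invented only for this sketch. The working assumption is that very short texts share too few words for count-based comparison to be reliable.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical short answers, invented for illustration only.
reference = "plants make food from sunlight"
paraphrase = "vegetation produces nourishment using solar energy"
overlap = "plants make food at night"

# A correct paraphrase that shares no words with the reference.
sim_paraphrase = cosine(bag_of_words(reference), bag_of_words(paraphrase))
# An incorrect answer that happens to reuse three of the reference's words.
sim_overlap = cosine(bag_of_words(reference), bag_of_words(overlap))

print(sim_paraphrase, sim_overlap)
```

In this toy case the correct paraphrase scores 0.0 because it shares no words with the reference, while the incorrect answer scores about 0.6. With only five or six words per passage, a single word choice dominates the comparison.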
Below are links to papers that are useful to my research:
Barzilay, R. and Lee, L. (2004). Catching the Drift: Probabilistic
Content Models, with Applications to Generation and Summarization.
HLT-NAACL 2004: Proceedings of the Main Conference, 113-120.
Click here for paper.
Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet
Allocation. Journal of Machine Learning Research, 3:993-1022.
Click here for paper.
Salton, G. and Buckley, C. (1987). Term Weighting Approaches in
Automatic Text Retrieval. Technical Report TR87-881 (UMI Order Number:
TR87-881), Cornell University.
Click here for paper.
Hofmann, T. (1999). Probabilistic Latent Semantic Analysis. Uncertainty
in Artificial Intelligence.
Click here for paper.
Landauer, T. K. and Dumais, S. T. (1997). A solution to Plato's problem:
the Latent Semantic Analysis theory of acquisition, induction and
representation of knowledge. Psychological Review, 104(2), 211-240.
Click here for paper.