A multi-word expression (MWE) is a lexeme (term) made up of a sequence of two or more lexemes whose properties are not predictable from the properties of the individual lexemes or from their normal mode of combination. In short, an MWE behaves like an idiom. MWE dictionaries have already been compiled for several languages. In this project, we aim to build a dictionary (lexicon) of MWEs for Turkish.
Text summarization is the task of producing a summary of a given document. There are two types of summaries: extracts, which select some of the important sentences of the document as the summary, and abstracts, which generate new sentences from scratch. In this project, we deal only with extractive summarization. The standard metric for evaluating a summarization system is ROUGE (Recall-Oriented Understudy for Gisting Evaluation), which measures the n-gram overlap between a system summary and a reference summary.
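To make the evaluation metric concrete, the following is a minimal sketch of ROUGE-N recall, assuming whitespace tokenization, lowercasing, and a single reference summary. The function name rouge_n_recall and the example sentences are illustrative assumptions, not part of the project; a real evaluation would normally rely on an established ROUGE implementation and on tokenization suited to Turkish.

```python
from collections import Counter


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """ROUGE-N recall: share of reference n-grams also found in the candidate.

    Assumption: tokens are obtained by lowercasing and splitting on whitespace.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand_counts = ngrams(candidate)
    ref_counts = ngrams(reference)
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clip each match at the number of times the n-gram occurs in the candidate.
    overlap = sum(min(count, cand_counts[gram]) for gram, count in ref_counts.items())
    return overlap / total


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))  # 5 of 6 reference unigrams match
print(rouge_n_recall(candidate, reference, n=2))  # 3 of 5 reference bigrams match
```

For an extractive system, the candidate would be the concatenation of the selected sentences, so a higher score indicates that the selected sentences cover more of the reference summary's wording.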
Comparative analysis is the task of identifying comparison sentences in documents, analyzing the word groups and structures in these sentences, and extracting the compared objects and their properties. For instance, in the comparison sentence “the sound quality of CD player X is better than that of CD player Y”, CD players X and Y are the compared objects, and the result of the comparison is that X is better than Y in terms of sound quality. In this project, we aim to perform comparative analysis on Turkish sentences.
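The kind of structured output that comparative analysis aims for can be sketched on the English example above. This is a toy illustration only: the Comparison record, its field names, and the single hand-written pattern are assumptions made for this sketch, not the project's method. Turkish comparison sentences would need morphological analysis and a far richer set of patterns.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Comparison:
    entity_1: str   # first compared object
    entity_2: str   # second compared object
    aspect: str     # property on which the objects are compared
    relation: str   # comparative word linking the two objects


# Hypothetical pattern covering only sentences shaped like the example:
# "the <aspect> of <entity_1> is <relation> than that of <entity_2>"
PATTERN = re.compile(
    r"^the (?P<aspect>.+?) of (?P<e1>.+?) is (?P<rel>\w+) than that of (?P<e2>.+?)\.?$",
    re.IGNORECASE,
)


def extract_comparison(sentence: str) -> Optional[Comparison]:
    """Return the compared objects and aspect, or None if the pattern does not apply."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return Comparison(
        entity_1=match.group("e1"),
        entity_2=match.group("e2"),
        aspect=match.group("aspect"),
        relation=match.group("rel"),
    )


result = extract_comparison(
    "the sound quality of CD player X is better than that of CD player Y"
)
print(result)
```

Sentences that do not fit the pattern return None, which corresponds to the first step of the task: separating comparison sentences from the rest before any objects are extracted.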