CRL
Abstract:Machine transliteration is a method for automatically converting words in one language into phonetically equivalent ones in another language. Machine transliteration plays an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Four machine transliteration models -- grapheme-based transliteration model, phoneme-based transliteration model, hybrid transliteration model, and correspondence-based transliteration model -- have been proposed by several researchers. To date, however, there has been little research on a framework in which multiple transliteration models can operate simultaneously. Furthermore, there has been no comparison of the four models within the same framework and using the same data. We addressed these problems by 1) modeling the four models within the same framework, 2) comparing them under the same conditions, and 3) developing a way to improve machine transliteration through this comparison. Our comparison showed that the hybrid and correspondence-based models were the most effective and that the four models can be used in a complementary manner to improve machine transliteration performance.
Abstract:We have developed a new method for Japanese-to-English translation of tense, aspect, and modality that uses an example-based method. In this method the similarity between input and example sentences is defined as the degree of semantic matching between the expressions at the ends of the sentences. Our method also uses the k-nearest neighbor method in order to exclude the effects of noise; for example, wrongly tagged data in the bilingual corpora. Experiments show that our method can translate tenses, aspects, and modalities more accurately than the top-level MT software currently available on the market can. Moreover, it does not require hand-craft rules.
Abstract:In this paper, we present a method of estimating referents of demonstrative pronouns, personal pronouns, and zero pronouns in Japanese sentences using examples, surface expressions, topics and foci. Unlike conventional work which was semantic markers for semantic constraints, we used examples for semantic constraints and showed in our experiments that examples are as useful as semantic markers. We also propose many new methods for estimating referents of pronouns. For example, we use the form ``X of Y'' for estimating referents of demonstrative adjectives. In addition to our new methods, we used many conventional methods. As a result, experiments using these methods obtained a precision rate of 87% in estimating referents of demonstrative pronouns, personal pronouns, and zero pronouns for training sentences, and obtained a precision rate of 78% for test sentences.
Abstract:A noun phrase can indirectly refer to an entity that has already been mentioned. For example, ``I went into an old house last night. The roof was leaking badly and ...'' indicates that ``the roof'' is associated with `` an old house}'', which was mentioned in the previous sentence. This kind of reference (indirect anaphora) has not been studied well in natural language processing, but is important for coherence resolution, language understanding, and machine translation. In order to analyze indirect anaphora, we need a case frame dictionary for nouns that contains knowledge of the relationships between two nouns but no such dictionary presently exists. Therefore, we are forced to use examples of ``X no Y'' (Y of X) and a verb case frame dictionary instead. We tried estimating indirect anaphora using this information and obtained a recall rate of 63% and a precision rate of 68% on test sentences. This indicates that the information of ``X no Y'' is useful to a certain extent when we cannot make use of a noun case frame dictionary. We estimated the results that would be given by a noun case frame dictionary, and obtained recall and precision rates of 71% and 82% respectively. Finally, we proposed a way to construct a noun case frame dictionary by using examples of ``X no Y.''
Abstract:Question answering task is now being done in TREC8 using English documents. We examined question answering task in Japanese sentences. Our method selects the answer by matching the question sentence with knowledge-based data written in natural language. We use syntactic information to obtain highly accurate answers.