RISS Academic Research Information Service

      • Research on On-line Uyghur Handwritten Character Recognition Technology Based on Modified Center Distance Feature

        Askar Hamdulla, Wujiahemaiti Simayi, Mayire Ibrayim, Dilmurat Tursun. Security Engineering Research Support Center (SERSC), 2014. International Journal of Signal Processing, Image Vol.7 No.5

        To further improve the recognition rate, this paper analyzes the unique characteristics of Uyghur characters and extends the Center Distance Feature (CDF) into a Modified Center Distance Feature (MCDF). Combined in experiments with several low-dimensional features, including the stroke number feature, the additional part's location feature, the shape feature, and the bottom-up and left-right density feature (BULR), MCDF achieved a recognition accuracy of 98.77% for the 32 isolated forms of Uyghur characters. MCDF increased the recognition accuracy by 4.51 points compared with the combination of CDF and the same low-dimensional features, which reached 94.16%. The samples were collected from 400 different volunteers. The recognition system was trained on 70 percent of the 12,800 samples from the 400 writers and tested on the remaining 30 percent.
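
        The data-set figures in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative, and whether the 70/30 split was made per sample or per writer is not stated, so a plain per-sample split is assumed. Note that the two reported accuracies differ by 4.61 points, whereas the abstract states 4.51; the abstract alone does not say which figure is off.

```python
# Figures quoted in the abstract; names are illustrative, not from the paper.
total_samples = 12800
writers = 400
train_percent = 70  # assumed to be a per-sample split

# 12800 samples / 400 writers: one sample per isolated character form.
samples_per_writer = total_samples // writers

# Integer arithmetic keeps the split sizes exact.
train_size = total_samples * train_percent // 100
test_size = total_samples - train_size

# Difference between the two reported accuracies, in percentage points.
gain = round(98.77 - 94.16, 2)

print(samples_per_writer, train_size, test_size, gain)
```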

      • On-line Handwritten Uyghur Word Recognition Using Segmentation-Based Techniques

        Mayire Ibrayim, Askar Hamdulla. Security Engineering Research Support Center (SERSC), 2015. International Journal of Signal Processing, Image Vol.8 No.6

        This paper presents an approach to online handwritten word recognition using segmentation-based techniques. The approach is called lexicon-driven because an optimal segmentation is generated for each string in the lexicon. The word recognition problem is thus transformed into a matching optimization problem between each dictionary entry and the handwritten word image. Segmentation proceeds by removing delayed strokes, analyzing the shape of the stroke trajectory, reconstructing the delayed strokes, and combining adjacent fragments. Dynamic matching is used to rank the lexicon entries and find the best match. A match score is assigned to a segmentation and a string by matching each segment to the corresponding character in the string with a character recognition algorithm that returns a confidence value for each character class. For lexicons of size 10, 100, 500, and 1000, the performance is 93.17%, 70.33%, 59.79%, and 51.20% with the adding distance, and 94.85%, 79.75%, 74.42%, and 62.19% with the normalizing distance.
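
        The lexicon-driven matching described in this abstract can be illustrated with a small dynamic program. This is a minimal sketch under stated assumptions, not the authors' implementation: `match_score`, `rank_lexicon`, `max_merge`, and the toy recognizer `toy_conf` are all hypothetical names, the score is a confidence to maximize rather than a distance to minimize, and the `normalize` flag only loosely mirrors the abstract's "adding" versus "normalizing" variants.

```python
from typing import Callable, List, Sequence, Tuple

NEG_INF = float("-inf")

def match_score(segments: Sequence[str], word: str,
                char_conf: Callable[[Tuple[str, ...], str], float],
                max_merge: int = 3, normalize: bool = True) -> float:
    """Best score for splitting `segments` into len(word) consecutive groups."""
    n, m = len(segments), len(word)
    if m == 0:
        return NEG_INF
    # best[i][j]: best total confidence using the first i segments
    # to form the first j characters of the word.
    best = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for j in range(1, m + 1):
        for i in range(1, n + 1):
            # The j-th character may cover the last k segments (1..max_merge).
            for k in range(1, min(max_merge, i) + 1):
                prev = best[i - k][j - 1]
                if prev == NEG_INF:
                    continue
                cand = prev + char_conf(tuple(segments[i - k:i]), word[j - 1])
                best[i][j] = max(best[i][j], cand)
    total = best[n][m]
    if total == NEG_INF:
        return NEG_INF
    return total / m if normalize else total

def rank_lexicon(segments: Sequence[str], lexicon: Sequence[str],
                 char_conf: Callable[[Tuple[str, ...], str], float]) -> List[str]:
    """Sort lexicon entries from best to worst match."""
    return sorted(lexicon, key=lambda w: match_score(segments, w, char_conf),
                  reverse=True)

# Toy stand-in for a character recognizer: known fragment groups map to a letter.
TOY_SHAPES = {("a",): "a", ("b1", "b2"): "b", ("c",): "c"}

def toy_conf(group: Tuple[str, ...], ch: str) -> float:
    return 1.0 if TOY_SHAPES.get(group) == ch else 0.1

segments = ["a", "b1", "b2", "c"]
ranking = rank_lexicon(segments, ["ac", "abc", "abcc"], toy_conf)
print(ranking[0])
```

        Dividing by the word length keeps long lexicon entries from being favored simply because they accumulate more terms, which is one plausible reading of why a normalized score would help on larger lexicons.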

      • Uyghur Stemming Using Conditional Random Fields

        Abdurahim Mahmoud, Akbar Pattar, Askar Hamdulla. Security Engineering Research Support Center (SERSC), 2015. International Journal of Signal Processing, Image Vol.8 No.8

        Stemming is a natural language processing task that removes all derivational affixes from a word. This task has proved harder for languages with complex morphology, such as Uyghur. This paper presents a new stemming method for Uyghur words based on Conditional Random Fields (CRFs). In the proposed method, all words in the training corpus are segmented into syllables, and each syllable is tagged as part of the stem or part of an affix. We evaluated the method experimentally on five test files of 100 sentences each. The results show that the method performs well: average stemming precision, recall, and F-score in the open test reached 98.42%, 98.34%, and 98.38%, respectively.
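
        The syllable-tagging formulation and the reported scores can be sketched as follows. This is an illustration only: the tag names `S` (stem) and `A` (affix), the helper `stem_from_tags`, and the example word are assumptions, and the tags are supplied by hand where the paper would use a trained CRF. The F-score is the usual harmonic mean of precision and recall; the paper reports averages over five test files, so recomputing it from the averaged precision and recall is only an approximate consistency check.

```python
from typing import List

def stem_from_tags(syllables: List[str], tags: List[str]) -> str:
    """Join the syllables tagged 'S' (stem), dropping those tagged 'A' (affix)."""
    if len(syllables) != len(tags):
        raise ValueError("one tag per syllable is required")
    return "".join(s for s, t in zip(syllables, tags) if t == "S")

def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hand-tagged toy example: stem "kitab" followed by a suffix syllable.
stem = stem_from_tags(["ki", "tab", "lar"], ["S", "S", "A"])

# Precision and recall as quoted in the abstract, in percent.
f1 = f_score(98.42, 98.34)

print(stem, round(f1, 2))
```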

      • A Survey of Uyghur Person Name Recognition

        Tashpolat Nizamidin, Palidan Tuerxun, Askar Hamdulla, Muhtar Arkin. Security Engineering Research Support Center (SERSC), 2016. International Journal of Signal Processing, Image Vol.9 No.3

        The Uyghurs are one of the most populous and civilized Turkic ethnic groups and live mainly in the Xinjiang Uyghur Autonomous Region of China. The Uyghur language belongs to the Karluk branch of the Turkic language family in the Altaic language system and is agglutinative in its morphological structure. Named Entity Recognition (NER) is an Information Extraction task that has become an essential part of Natural Language Processing (NLP) tasks such as Machine Translation and Information Retrieval. This paper treats Uyghur Person Name Recognition (UPNR) as a subtask of NER: it demonstrates the importance of the UPNR task, highlights the main characteristics of the Uyghur language, and illustrates the aspects of standardization in annotating named entities. It also explains the approaches used in the UPNR field and describes the features of the common tools used for UPNR. A brief review of the state of the art in UPNR research follows. Finally, we present our conclusions. Illustrative examples are used throughout for clarification.

      • The SVM based Uyghur Text Classification and its Performance Analysis

        Palidan Tuerxun, Fang Dingyi, Askar Hamdulla. Security Engineering Research Support Center (SERSC), 2015. International Journal of Multimedia and Ubiquitous Vol.10 No.4

        This paper explores the use of Support Vector Machines (SVMs) for Uyghur text classification. It presents the text categorization process (text preprocessing, feature dimensionality reduction, representation, and classification of text features), discusses the SVM classification algorithm as applied to Uyghur text classification, and focuses on the construction of the text categorization model and its procedures. Experimental results show that training on selected training data, while preserving classifier performance, is more efficient than the k-nearest neighbor (KNN) and Naive Bayes (NB) classifiers and yields higher accuracy.

      • Morpheme Segmentation and Concatenation Approaches for Uyghur LVCSR

        Mijit Ablimit, Tatsuya Kawahara, Askar Hamdulla. Security Engineering Research Support Center (SERSC), 2015. International Journal of Hybrid Information Techno Vol.8 No.8

        This paper thoroughly investigates various kinds of sub-word lexica within the framework of a Uyghur LVCSR system. Experimental results show that modeling directly on word units, or on small units such as morphemes or even syllables, is inefficient. An optimal sub-word unit set lying between word and morpheme units fits the ASR system better. To select the best unit set, we investigated several unit segmentation and concatenation approaches and their ASR performance. For segmentation, we investigate a supervised approach that splits words into the smallest functional units, the linguistic morphemes, and an unsupervised approach that extracts pseudo-morphemes (or statistical morphemes). In the supervised model, a learning algorithm is trained on a manually prepared training corpus, and morpho-phonetic changes are analyzed. In the unsupervised model, the Morfessor tool is used to extract pseudo-morphemes from a raw text corpus. For concatenation, several approaches based on linguistic morphemes are investigated. The first is a data-driven approach that concatenates morpheme sequences based on measures such as co-occurrence frequency or mutual probability. The second is a model-based approach that merges units using global statistical criteria; in this study, the Morfessor program is revised and turned into a concatenation program by controlling the segmentation points. The third is a two-layer-lexica-based concatenation approach that extracts an optimal sub-word unit set by aligning and comparing the ASR results of the word and morpheme lexical layers. This method uses both speech and text, produced the best results in terms of WER and lexicon size, and proved to be very stable. The optimal lexicon, obtained entirely on the basis of an HMM-based acoustic model, outperformed all baseline lexica. When all these lexica were directly incorporated with a deep neural network (DNN) based acoustic model, without changing the speech and text training corpora or the language models, the optimal lexicon not only drastically improved ASR accuracy but also outperformed the other units, demonstrating the generality of the two-layer-lexica-based approach.
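
        The first, data-driven concatenation approach mentioned in this abstract can be sketched with a simple co-occurrence count. This is a toy illustration under stated assumptions, not the paper's procedure: the functions `count_pairs` and `concatenate`, the threshold `min_count`, the greedy left-to-right merge, and the three-word corpus are all hypothetical.

```python
from collections import Counter
from typing import List

def count_pairs(corpus: List[List[str]]) -> Counter:
    """Count how often each pair of adjacent morphemes occurs inside a word."""
    counts: Counter = Counter()
    for morphemes in corpus:
        counts.update(zip(morphemes, morphemes[1:]))
    return counts

def concatenate(morphemes: List[str], counts: Counter, min_count: int) -> List[str]:
    """Greedily merge adjacent morphemes whose pair count reaches min_count."""
    units: List[str] = []
    i = 0
    while i < len(morphemes):
        pair_is_frequent = (
            i + 1 < len(morphemes)
            and counts[(morphemes[i], morphemes[i + 1])] >= min_count
        )
        if pair_is_frequent:
            units.append(morphemes[i] + morphemes[i + 1])
            i += 2  # both morphemes are consumed by the merged unit
        else:
            units.append(morphemes[i])
            i += 1
    return units

# Toy corpus of words already split into morphemes (illustrative only).
corpus = [["kitab", "lar", "da"], ["bala", "lar", "da"], ["kitab", "ni"]]
pair_counts = count_pairs(corpus)
units = concatenate(["kitab", "lar", "da"], pair_counts, min_count=2)
print(units)
```

        Merging only frequent pairs yields units larger than morphemes but smaller than full words, which is the kind of intermediate unit set the abstract argues for.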

      • A Survey on Uyghur Ontology

        Hankiz Yilahun, Seyyare Imam, Askar Hamdulla. Security Engineering Research Support Center (SERSC), 2015. International Journal of Database Theory and Appli Vol.8 No.4

        Ontology has become a hot research topic in fields of artificial intelligence such as knowledge representation, knowledge engineering, and natural language processing (NLP). Motivated by the application requirements of an intelligent Uyghur information retrieval system, this paper briefly describes ontology and its construction rules, methods, tools, and description languages, gives a contrastive analysis of the current state of ontology research at home and abroad, and then summarizes some key issues in Uyghur ontology construction procedures and some early achievements. Finally, directions for further research are proposed.

      • An Extractive Approach for Uyghur Text Summarization

        Turdi Tohti, Hankiz Yilahun, Askar Hamdulla. Security Engineering Research Support Center (SERSC), 2016. International Journal of Hybrid Information Techno Vol.9 No.4

        This paper studies Uyghur single-text summarization and proposes several new or improved approaches to keyword extraction and evaluation, sentence selection and redundancy removal, and readability improvement. It proposes an improved frequent pattern-growth approach to extract semantic strings that are complete in both semantics and structure, evaluates these strings with a multi-feature fusion approach, and selects the most important ones as keywords that describe the text theme effectively. For sentence similarity and redundancy removal, it proposes the idea of a theme including degree, which removes redundant sentences effectively and improves summary quality significantly. It also introduces sentence alignment between the stemmed text and the original text to counter the decline in summary naturalness, coherence, and comprehensibility caused by the stemming process.

      • The Acoustical Analysis of Vowel Harmony Issues in Uyghur Words

        Seyyare Imam, Gulnur Arkin, Askar Hamdulla. Security Engineering Research Support Center (SERSC), 2016. International Journal of Multimedia and Ubiquitous Vol.11 No.1

        This paper first introduces the vowel harmony phenomena in Uyghur words, their forms of representation, and their main types. Then, from the viewpoint of experimental phonetics, it verifies the feasibility and effectiveness of acoustical analysis for studying harmony. Finally, as the application part, acoustic features such as formant frequency, resonance peak value, vowel duration, vowel pitch, and sound intensity are collected for all disyllabic, trisyllabic, and quadrisyllabic Uyghur words from the "Acoustical Database for Uyghur Language" established by the Laboratory of the Institute of Ethnology and Humanities of the Chinese Academy of Social Sciences and Xinjiang University. Statistical analysis is carried out on these acoustic feature values, and some conclusions and rules are summarized.
