
Research Seminar of the Information Systems Division (Zakład Systemów Informacyjnych)


The Information Systems Division of the Institute of Computer Science invites you to a seminar on December 3 of this year at 12:15 in the Central Auditorium (Audytorium Centralne, AC). Professor Evangelos Milios of Dalhousie University will give a lecture entitled "Exploiting Semantic Analysis of Documents". The author's abstract of the talk and a short biographical note on the speaker follow below.

Abstract: Many document organization tasks, such as a student writing the related work chapter of a thesis, a professor surveying the state of the art in a proposal or planning a reading course, or a conference chair organizing sessions, would be performed more efficiently through the use of document clustering. Fully unsupervised document clustering does not always yield clusters that are relevant to the user’s point of view. In this work, we pursue document clustering algorithms that allow the interactive engagement of the user in the clustering process. The main challenge is how to obtain useful clusters with minimal user effort. To address this challenge, we propose (1) a user-supervised double clustering algorithm, designed in three stages, and (2) a novel approach for mapping documents to entities and concepts.
The user-supervised double clustering algorithm was demonstrated to be competitive with state-of-the-art clustering algorithms. It was further extended into an ensemble algorithm that incorporates Wikipedia concepts in the document representation. User supervision was introduced into these algorithms in the form of term supervision (term labelling) and document supervision. A visual interface was designed to make the algorithms accessible to real domain users. The work received the Best Student Paper award at ACM DocEng 2014.
To obtain succinct and intuitive representations of documents in terms of entities and concepts, we have pursued two directions of research. (1) We designed a system that performs entity recognition and disambiguation using the Wikipedia category structure in multiple languages; this system won first prize in the ERD challenge at ACM SIGIR 2014, and we are currently extending it to concept recognition and disambiguation. (2) We proposed a simple but very effective approach for computing semantic relatedness between words and documents based on the Google n-gram corpus, which is competitive with human performance on standard word pair data sets.
The clustering work is joint with H. Nourashraf and D. Arnold, the ERD work with Marek Lipczak and Arash Koushkestani, and the Google n-gram-based semantic relatedness work with Aminul Islam and Vlado Keselj.

Speaker’s Bio. Evangelos Milios received a diploma in Electrical Engineering from the NTUA, Athens, Greece, and Master's and Ph.D. degrees in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. Since July 1998 he has been with the Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, where he served as Director of the Graduate Program (1999-2002) and has been Associate Dean - Research since 2008. He is a Senior Member of the IEEE. He was a member of the ACM Dissertation Award committee (1990-1992) and of the AAAI/SIGART Doctoral Consortium Committee (1997-2001), and he is co-editor-in-chief of Computational Intelligence. He was a member (2008-2010) and Group Chair (2011-2013) of the Computer Science Evaluation Group of the Natural Sciences and Engineering Research Council of Canada. At Dalhousie, he held a Killam Chair in Computer Science (2006-2011). He has published on the interpretation of visual and range signals for landmark-based navigation and map construction in robotics. He currently works on modelling and mining the content and link structure of Networked Information Spaces, text mining, and visual text analytics.

Last modified: Wednesday, November 26, 2014 - 12:45:34 PM, Bożenna Skalska
