
Piotr Lasek defended his PhD thesis


Piotr Lasek

On May 15, 2012, Piotr Lasek defended his PhD thesis, "Efficient Density-Based Clustering", supervised by Professor Marzena Kryszkiewicz.

This thesis is concerned with efficient clustering using density-based algorithms such as DBSCAN and NBC, and with the use of indices and the triangle inequality property to improve the efficiency of these algorithms.
A new index, the LVA-Index, is proposed, together with methods for building it and for searching for nearest neighbors. The LVA-Index combines features of the VA-File and the NBC algorithm: it uses the idea of approximation vectors and the layer approach for determining nearest neighbors. A characteristic feature of the LVA-Index is that it does not require all cells to be checked in order to determine nearest neighbors. Unlike the NBC approach, the LVA-Index is adapted to search for nearest neighbors within layers whose level number is greater than 1. Another key feature is that, while the index is being built, each cell stores representations of the l closest layers that contain non-empty cells. This significantly speeds up the nearest-neighbor search because only the closest layers are scanned. These layers are kept in memory so that they can be accessed very quickly.
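The layer approach can be illustrated with a much-simplified sketch in Python. This is not the LVA-Index itself: it uses a plain uniform grid, omits approximation vectors, and computes layers on the fly instead of storing the l closest non-empty layers in each cell. The names `cell_of`, `build_grid`, `layer_cells`, `knn_by_layers` and the cell width `w` are assumptions made only for this illustration. Layer n of a cell is taken here to be the set of cells at Chebyshev distance exactly n from it.

```python
import itertools
import math
from collections import defaultdict


def cell_of(p, w):
    """Map a point to the integer coordinates of its grid cell (cell width w)."""
    return tuple(math.floor(x / w) for x in p)


def build_grid(points, w):
    """Group point indices by grid cell; only non-empty cells are stored."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[cell_of(p, w)].append(i)
    return grid


def layer_cells(center, n):
    """Yield the cells whose Chebyshev distance from `center` is exactly n."""
    if n == 0:
        yield center
        return
    for off in itertools.product(range(-n, n + 1), repeat=len(center)):
        if max(abs(o) for o in off) == n:
            yield tuple(c + o for c, o in zip(center, off))


def knn_by_layers(points, grid, w, q, k):
    """Return indices of the k nearest points to q, scanning layer by layer.

    Assumes a non-empty grid. A point in layer n + 1 is farther than n * w
    from q, so the scan stops once the k-th best distance is at most n * w.
    """
    center = cell_of(q, w)
    max_layer = max(
        max(abs(a - b) for a, b in zip(cell, center)) for cell in grid
    )
    found = []  # (distance to q, point index)
    for n in range(max_layer + 1):
        for cell in layer_cells(center, n):
            for i in grid.get(cell, ()):
                found.append((math.dist(points[i], q), i))
        found.sort()
        if len(found) >= k and found[k - 1][0] <= n * w:
            break
    return [i for _, i in found[:k]]
```

The early exit is what keeps the search from visiting every cell: once enough neighbors have been found close to the query, the outer layers are never scanned.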
In this thesis, we also present a proposal to use the triangle inequality property to increase the efficiency of density-based clustering algorithms. We report the results of experiments that examine the proposed solution with respect to the number of dimensions, the number of data objects, and the number of reference points used for determining distances between data points. The experiments showed that, compared to density-based clustering algorithms using spatial indices such as R-Tree or VA-File, the proposed algorithms, which use the triangle inequality property, can efficiently cluster data even when the number of dimensions is large.
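The way the triangle inequality can cut down distance computations can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's algorithm: it uses a single reference point, Euclidean distance, and a hypothetical radius parameter `eps`, and the function names are made up for the example. For a reference point r, the triangle inequality gives |d(p, r) - d(q, r)| <= d(p, q), so any point p with |d(p, r) - d(q, r)| > eps cannot lie within eps of q and can be skipped without computing d(p, q).

```python
import bisect
import math


def build_reference_order(points, ref):
    """Sort point indices by their distance to the reference point `ref`."""
    keyed = sorted((math.dist(p, ref), i) for i, p in enumerate(points))
    ref_dists = [d for d, _ in keyed]  # ascending distances to ref
    order = [i for _, i in keyed]      # point indices in the same order
    return ref_dists, order


def eps_neighbors(points, ref, ref_dists, order, q, eps):
    """Return (sorted neighbor indices of q within eps, number of candidates).

    Only points whose distance to `ref` lies in [d(q, ref) - eps,
    d(q, ref) + eps] are candidates; the rest are pruned by the
    triangle inequality.
    """
    dq = math.dist(q, ref)
    lo = bisect.bisect_left(ref_dists, dq - eps)
    hi = bisect.bisect_right(ref_dists, dq + eps)
    candidates = order[lo:hi]
    neighbors = sorted(i for i in candidates if math.dist(points[i], q) <= eps)
    return neighbors, len(candidates)
```

The second return value shows how many real distance computations were needed; the pruning step itself costs one distance to the reference point and two binary searches, regardless of the number of dimensions.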

Last modified: Wednesday, June 12, 2013 - 8:40:31 AM, Bożenna Skalska
