Wednesday, 13 February 2013

Ontologies Mining


As a personal project, I developed a system that searches separate articles from a web site. It extracts the most important words that describe a concept across different writers and automatically builds an ontology (an RDF graph) that links each concept to the sentences and articles where it was found.
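A minimal sketch of what such a graph could look like, using only the Python standard library. The namespace `http://example.org/ontology/` and the predicate names `foundInArticle` and `foundInSentence` are placeholders invented for illustration; the vocabulary used in the actual project is not described here.

```python
from urllib.parse import quote

# Hypothetical namespace: the real project's vocabulary is not known.
BASE = "http://example.org/ontology/"


def iri(kind, name):
    """Build an IRI such as <http://example.org/ontology/concept/rdf>."""
    return "<" + BASE + kind + "/" + quote(name) + ">"


def literal(text):
    """Encode a plain string as an N-Triples literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def build_triples(findings):
    """Turn (concept, sentence, article_id) records into RDF triples."""
    triples = set()
    for concept, sentence, article_id in findings:
        subject = iri("concept", concept)
        # Assumed predicates: one links to the article, one to the sentence.
        triples.add((subject, iri("prop", "foundInArticle"), iri("article", article_id)))
        triples.add((subject, iri("prop", "foundInSentence"), literal(sentence)))
    return triples


def to_ntriples(triples):
    """Serialize the triples as N-Triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(triples))


findings = [
    ("ontology", "An ontology describes concepts.", "a1"),
    ("ontology", "We mine an ontology.", "a2"),
]
print(to_ntriples(build_triples(findings)))
```

Each concept becomes a node, and every sentence or article that mentions it becomes an edge from that node, so the same concept used by different writers ends up as a single hub in the graph.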

I developed the system in Python. To select the most important words that describe a concept, I used the bag-of-words (BoW) model: I built a histogram of the words used across the articles and the number of times each one occurs. The histogram entries were then clustered by count with the k-means algorithm into three groups, “highly repeated”, “normally repeated”, and “not repeated”, and only the words classed as “normally repeated” were kept.
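The selection step can be sketched roughly as follows. This is an illustration under assumptions, not the original code: the tokenizer, the centroid initialization, and the function names `word_histogram`, `kmeans_1d`, and `normally_repeated` are all choices made for this example, and k-means is written out by hand on the raw counts so the snippet needs no external library.

```python
import re
from collections import Counter


def word_histogram(articles):
    """Bag of words: count how often each word occurs across all articles."""
    counts = Counter()
    for text in articles:
        # Assumed tokenizer: lowercase runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def kmeans_1d(values, k=3, iterations=100):
    """Lloyd's k-means on scalars (k >= 2); returns centroids in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed initialization: evenly spaced positions in the sorted counts.
    centroids = [float(ordered[i * (n - 1) // (k - 1)]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def normally_repeated(counts):
    """Keep the words whose count falls in the middle of the three clusters."""
    centroids = kmeans_1d(list(counts.values()), k=3)
    kept = []
    for word, n in counts.items():
        nearest = min(range(3), key=lambda i: abs(n - centroids[i]))
        if nearest == 1:  # 0 = not repeated, 1 = normally repeated, 2 = highly repeated
            kept.append(word)
    return sorted(kept)


counts = Counter({"the": 50, "of": 48, "ontology": 10, "graph": 9,
                  "rdf": 11, "paris": 1, "february": 1, "blog": 2})
print(normally_repeated(counts))
```

The idea behind keeping only the middle cluster is that very frequent words tend to be function words such as "the" and "of", very rare words are often incidental, and the words in between are the likeliest to name a concept.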

Paris - France, February 2013
