Monday, August 12, 2013

A predictive model for fraud detection in e-transactions

The work consisted of implementing several existing machine learning classification methods, and of implementing classification algorithms from several scientific papers that are not yet available in any free library. We used the R language with the Eclipse IDE.

Once we had chosen the algorithm that gave the best results, the work consisted of optimizing and evaluating it using several algorithm design techniques, such as correctness analysis and time complexity reduction. We used dynamic programming and heuristics.

The third task was to integrate it with the MapReduce algorithm, for deployment on a computer cluster with RHadoop, and to implement it for multi-core execution in the Julia programming language.
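As an illustration of the MapReduce idea only, here is a minimal sketch in Python rather than in the R and Julia code used in the project. It assumes that the statistic to compute is a table of counts per class, feature and value, which decomposes naturally over data partitions. The function names and the record layout are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_partition(records):
    """Map step: count (label, feature index, value) triples in one partition."""
    counts = Counter()
    for features, label in records:
        for j, value in enumerate(features):
            counts[(label, j, value)] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge the partial counts of two partitions."""
    return left + right


def distributed_counts(partitions):
    """Run the map step on every partition, then fold the partial results."""
    return reduce(reduce_counts, map(map_partition, partitions), Counter())
```

On a real cluster each partition would be processed on a different node; here `map` runs sequentially, which keeps the sketch self-contained.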

After building the model and its implementations, we carried out a critical analysis of its performance, optimized its parameters using the ROC space, and finally compared it with the models on the market using the confusion matrix.
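The evaluation step can be sketched as follows, in Python for illustration. The sketch computes a confusion matrix for a binary fraud classifier at a given score threshold, maps it to a point in ROC space, and picks the threshold closest to the ideal corner (0, 1). That selection criterion is an assumption made for this example; the labels (1 = fraud) and function names are hypothetical.

```python
def confusion_matrix(y_true, scores, threshold):
    """Return (tp, fp, tn, fn) when scores >= threshold are flagged as fraud."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_point(y_true, scores, threshold):
    """Return (false positive rate, true positive rate) at a threshold."""
    tp, fp, tn, fn = confusion_matrix(y_true, scores, threshold)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return fpr, tpr


def best_threshold(y_true, scores):
    """Pick the observed score whose ROC point is closest to (0, 1)."""
    def distance_to_ideal(threshold):
        fpr, tpr = roc_point(y_true, scores, threshold)
        return fpr ** 2 + (1.0 - tpr) ** 2

    return min(sorted(set(scores)), key=distance_to_ideal)
```

The same confusion matrix can then be used to compare two models on identical test data.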


We also developed an interface in Java (J2SE), using the Swing, AWT and Prefuse libraries, to visualize the model and its statistics.

Lille - France, April - September 2013

Monday, April 1, 2013

Optimization of observation sequences for automatic diagnosis

For the final project of the second year of my master's degree, I worked on a project based on "Decision Theoretic Troubleshooting".

It consists of an interactive application with a user interface in which a system is represented with its components. For each component we have its probability of running, and the whole system is represented by a Bayesian network.

The application helps the user generate a diagnosis for repairing the system by finding the states with anomalies, their dependencies, and their repercussions on the other components.
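To give an idea of the decision-theoretic part, here is a minimal sketch of the classic simplified setting: exactly one component is faulty, each component has a known fault probability and a known inspection cost, and components are checked one after another until the fault is found. Under these assumptions, checking in decreasing order of probability divided by cost minimizes the expected cost. The costs, the component names and the function names are hypothetical; in the application the probabilities would come from the Bayesian network.

```python
def repair_order(fault_prob, cost):
    """Order components by decreasing fault probability per unit of cost."""
    return sorted(fault_prob, key=lambda c: fault_prob[c] / cost[c], reverse=True)


def expected_cost(order, fault_prob, cost):
    """Expected total cost of checking components in the given order."""
    total = 0.0
    not_found_yet = 1.0  # probability that the fault has not been found so far
    for component in order:
        total += not_found_yet * cost[component]
        not_found_yet -= fault_prob[component]
    return total
```

For example, with fault probabilities `{"pump": 0.5, "valve": 0.3, "sensor": 0.2}` and costs `{"pump": 10, "valve": 2, "sensor": 1}`, the cheap sensor and valve are checked before the expensive pump even though the pump is the most likely culprit.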

The entire application was developed in Python, with the PyQt and gnuplot libraries for the user interface and the PyAgrum library for representing the Bayesian network.

Paris - France, January - March 2013

Wednesday, February 13, 2013

Ontology Mining


As a personal project, I worked on the development of a system that searches separate articles from a web site. The system extracts the most important words that describe a concept across different writers and automatically builds an ontology (an RDF graph) containing all of the concepts, together with the sentences and the articles where they were found.

I developed the system in Python. To select the most important words describing a concept, I used the bag-of-words (BoW) algorithm: I built a histogram of the words used in the different articles and their number of occurrences. Each histogram element was then clustered with the k-means algorithm into “highly repeated”, “normally repeated”, and “not repeated”, and only the words classed as “normally repeated” were kept.
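The word selection step can be sketched roughly like this. It is a minimal reconstruction, not the original code: the tokenizer, the way the three centroids are initialized (minimum, median and maximum count) and the function names are assumptions made for the example.

```python
import re
from collections import Counter


def word_histogram(articles):
    """Bag of words: count how often each word appears across all articles."""
    counts = Counter()
    for text in articles:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def nearest(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(values, centroids, iterations=20):
    """Plain k-means on one-dimensional data, starting from given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            clusters[nearest(value, centroids)].append(value)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids


def normally_repeated(articles):
    """Keep the words whose count falls in the middle of three clusters."""
    counts = word_histogram(articles)
    ordered = sorted(counts.values())
    start = [ordered[0], ordered[len(ordered) // 2], ordered[-1]]
    centroids = sorted(kmeans_1d(list(counts.values()), start))
    return {word for word, n in counts.items() if nearest(n, centroids) == 1}
```

Dropping the top cluster removes very common words such as articles and prepositions, while dropping the bottom cluster removes words that appear too rarely to characterize a concept.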

Paris - France, February 2013

Thursday, January 31, 2013

Recommendation system


As an academic project, I used Python to develop a recommendation system that takes as input the preferences and class of several users of a web site.

I used the naive Bayes algorithm, which assumes that all variables are independent given the class, with both the maximum likelihood and the prior knowledge approaches to determine the probability of belonging to each class. This way, the system could predict the class of a new user for whom not all preferences are known, and furthermore predict the missing preferences.
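A minimal sketch of this classifier follows. It assumes categorical preferences, with `None` marking a missing preference. The `alpha` parameter is an assumption of the sketch: `alpha=0` gives the maximum likelihood estimate, while `alpha>0` adds pseudo-counts, one simple way to encode prior knowledge. Function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels, alpha=1.0):
    """Count classes and per-class preference values, ignoring missing ones."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, feature index) -> value counts
    domain = defaultdict(set)            # feature index -> values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            if value is not None:
                value_counts[(label, j)][value] += 1
                domain[j].add(value)
    return class_counts, value_counts, domain, alpha


def predict(model, row):
    """Return the most probable class, skipping missing preferences."""
    class_counts, value_counts, domain, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)
        for j, value in enumerate(row):
            if value is None:
                continue  # a missing preference contributes no evidence
            counts = value_counts[(label, j)]
            numerator = counts[value] + alpha
            denominator = sum(counts.values()) + alpha * len(domain[j])
            if denominator == 0:
                continue  # nothing known about this preference for this class
            if numerator == 0:
                score = -math.inf  # value never seen with this class (alpha=0)
                break
            score += math.log(numerator / denominator)
        scores[label] = score
    return max(scores, key=scores.get)
```

Because of the independence assumption, a missing preference is simply left out of the product, so a new user with a partial profile can still be classified.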

I also coded an approach using the tree-augmented naive Bayes (TAN) algorithm, building a Bayesian network learned from the mutual information between the variables, to predict the class to which a user belongs.
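The structure learning step of TAN can be sketched as follows, assuming the standard formulation: compute the mutual information between each pair of variables conditioned on the class, then keep a maximum-weight spanning tree over the variables. The sketch uses Prim's algorithm rooted at variable 0, which is an arbitrary choice; function names are hypothetical, and parameter estimation is left out.

```python
import math
from collections import Counter
from itertools import combinations


def conditional_mutual_information(rows, labels, i, j):
    """Estimate I(X_i; X_j | C) from counts."""
    n = len(rows)
    n_c = Counter(labels)
    n_ic = Counter((r[i], y) for r, y in zip(rows, labels))
    n_jc = Counter((r[j], y) for r, y in zip(rows, labels))
    n_ijc = Counter((r[i], r[j], y) for r, y in zip(rows, labels))
    total = 0.0
    for (xi, xj, y), count in n_ijc.items():
        ratio = count * n_c[y] / (n_ic[(xi, y)] * n_jc[(xj, y)])
        total += count / n * math.log(ratio)
    return total


def tan_tree(rows, labels):
    """Return (parent, child) edges of a maximum-weight spanning tree."""
    d = len(rows[0])
    weight = {
        frozenset((i, j)): conditional_mutual_information(rows, labels, i, j)
        for i, j in combinations(range(d), 2)
    }
    in_tree = {0}  # variable 0 is taken as the root
    edges = []
    while len(in_tree) < d:
        candidates = (
            (p, c) for p in in_tree for c in range(d) if c not in in_tree
        )
        parent, child = max(candidates, key=lambda e: weight[frozenset(e)])
        edges.append((parent, child))
        in_tree.add(child)
    return edges
```

In the resulting network every variable has the class as a parent, plus at most one other variable given by the tree edges.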

Paris - France, January 2013