Wallis, CH
I'm a software engineering graduate based in Switzerland. My expertise lies in software development and data-related technologies such as databases and AI models. I work across the full stack of IT projects and adapt quickly to the fast-evolving field of computer science.
This project uses neural network architectures to classify document images automatically. The RVL-CDIP dataset was used to train the AI models and analyze their performance.
Convolutional Neural Networks (CNNs) were used in the experiments. These models perform well at processing images and learning the patterns in them. Several well-known models, such as VGG-16 and ResNet50, were used and their performance compared.
Another approach was to classify the documents from their extracted text, using neural networks suited to that task. The text was obtained by applying OCR to the document images and then passed to the models for training. The networks used here are Transformers, which perform well on text-processing tasks. The well-known BERT architecture and some of its variants were used and compared.
The last type of architecture used in this project is the multimodal architecture, which combines image data and textual data. This approach requires building a system that pairs an image model with a text model and combines their predictions. The architecture was tested with different parameter combinations and achieved better accuracy on the dataset than the single-modality models.
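One common way to combine the predictions of two models is late fusion, where the per-class probabilities of each model are averaged with a weight. The sketch below illustrates that idea only; it is not necessarily how the project combined its models. The function names, the `image_weight` parameter and the three-class scores are hypothetical.

```python
from typing import List, Sequence


def fuse_predictions(
    image_probs: Sequence[float],
    text_probs: Sequence[float],
    image_weight: float = 0.5,
) -> List[float]:
    """Weighted average of two per-class probability vectors (late fusion)."""
    if len(image_probs) != len(text_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be between 0 and 1")
    text_weight = 1.0 - image_weight
    return [
        image_weight * p_img + text_weight * p_txt
        for p_img, p_txt in zip(image_probs, text_probs)
    ]


def predict_class(probs: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Hypothetical scores for three classes; real models would produce these.
image_probs = [0.50, 0.40, 0.10]  # assumed output of the image model
text_probs = [0.10, 0.80, 0.10]   # assumed output of the text model

fused = fuse_predictions(image_probs, text_probs, image_weight=0.4)
print(predict_class(image_probs), predict_class(fused))
```

In this toy case the image model alone picks class 0, while the fused scores pick class 1, because the text model is confident and carries more weight. The weight is one of the parameters that could be tuned in such a setup.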
This small project was carried out during my Bachelor's studies in Fribourg. The goal is to try different word embedding algorithms for recommending books based on their title, category and description. To achieve this, a CSV dataset is used to train word embedding models with Word2Vec, Doc2Vec and FastText. Only a subset of 1000 books is used for training, to keep the computation time manageable. A Python script produces book recommendations for a given book and lets the user specify the number of recommendations and the preferred algorithm.
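The recommendation step can be sketched as ranking books by cosine similarity between their vectors. This is a minimal illustration, not the project's script: the `embed` function is a plain word-count stand-in rather than Word2Vec, Doc2Vec or FastText, and the book data, function names and parameters are all hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (NOT a trained model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(
    title: str,
    books: Dict[str, str],
    top_n: int = 2,
    embed_fn: Callable[[str], Counter] = embed,
) -> List[str]:
    """Return the top_n titles most similar to the given book."""
    query = embed_fn(title + " " + books[title])
    scored = [
        (cosine(query, embed_fn(other + " " + text)), other)
        for other, text in books.items()
        if other != title
    ]
    # Highest similarity first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]


# Hypothetical catalogue: title -> category and description.
books = {
    "Deep Space": "science fiction a crew explores a distant galaxy",
    "Star Voyage": "science fiction a crew travels between the stars",
    "Garden Basics": "gardening how to grow vegetables at home",
    "Galaxy Wars": "science fiction empires fight across the galaxy",
}

print(recommend("Deep Space", books, top_n=2))
```

The `top_n` and `embed_fn` arguments mirror the two options described above, the number of recommendations and the choice of algorithm. Swapping in a trained embedding model would only require a different `embed_fn` and a matching similarity function.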
- HTML / CSS / JavaScript
- Vue.js
- Dart (Flutter)
- JavaScript (Node.js, Express.js)
- SQL (MySQL, PostgreSQL)
- REST APIs
- Web Authentication