Gitlab search engine
Metadata
creation year: 2017
This repository contains the work done for the dissertation module of my MSc in computer science at Oxford Brookes University. It includes a search engine written in Haskell using the Scotty web framework, and several documents created alongside the engine.
Documentation
The documents are:
- Dissertation proposal: Created before the development stage, describes the planned project.
- Interim report: A mid-term document explaining the state of the project at that point.
- Presentation: Supporting material for the oral presentation.
- Short paper: A short document that focuses on the scientific side of the project.
- Final report: The longest document, explaining the creation process of the entire project.
Each of these documents can be viewed in its raw format (LaTeX or Markdown).
Abstract
The goal of this dissertation is to create a search engine capable of autonomously gathering information about open source projects hosted on several Gitlab servers, and of retrieving this information in response to a user's search query as accurately and quickly as possible.
A literature review is conducted on the Trie and B-tree data structures and their efficiency within the scope of search engines, along with a study of the current state of web Git servers.
Finally, the main decisions taken during the development process are explained.
Search Engine
The search engine is composed of the following scripts:
- CustomTypes.hs: Contains every type declaration and the functions related to those types.
- DbSql.hs: Contains every CRUD function for the SQL database.
- JSONParser.hs: Contains functions for JSON parsing (the JSON files are created from Gitlab API responses).
- Main.hs: Contains the Scotty routing and template functions.
- Query.hs: Contains functions for creating JSON files by querying Gitlab's API.
- Research.hs: Contains the search functions used to compare SQL and Trie performance.
- Session.hs: Contains functions for creating and managing Scotty sessions.
- TestDb.hs: Contains functions for creating benchmarks.
- Trie.hs: Contains every CRUD function for the Trie.
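To illustrate what CRUD functions on a Trie can look like, here is a minimal sketch. It is not the code from Trie.hs: the type `Trie` and the functions `empty`, `insert`, `lookupKey`, `delete` and `withPrefix` are hypothetical names chosen for this example, and the sketch assumes keys are plain strings and that the `containers` package is available.

```haskell
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)

-- Hypothetical sketch, not the repository's Trie.hs.
-- Each node optionally stores the value of the key ending at that node.
data Trie a = Trie
  { nodeValue :: Maybe a
  , children  :: Map.Map Char (Trie a)
  }

-- A trie with no keys.
empty :: Trie a
empty = Trie Nothing Map.empty

-- Create or update: store a value under a key.
insert :: String -> a -> Trie a -> Trie a
insert []     v t = t { nodeValue = Just v }
insert (c:cs) v t =
  let child = fromMaybe empty (Map.lookup c (children t))
  in  t { children = Map.insert c (insert cs v child) (children t) }

-- Read: exact-match lookup.
lookupKey :: String -> Trie a -> Maybe a
lookupKey []     t = nodeValue t
lookupKey (c:cs) t = Map.lookup c (children t) >>= lookupKey cs

-- Delete: remove a key and prune branches that become empty.
delete :: String -> Trie a -> Trie a
delete []     t = t { nodeValue = Nothing }
delete (c:cs) t = t { children = Map.update prune c (children t) }
  where
    prune child =
      let child' = delete cs child
      in  if isEmpty child' then Nothing else Just child'

-- A node is empty when it has neither a value nor children.
isEmpty :: Trie a -> Bool
isEmpty t = case nodeValue t of
  Nothing -> Map.null (children t)
  Just _  -> False

-- Read: every (key, value) pair whose key starts with the given prefix.
withPrefix :: String -> Trie a -> [(String, a)]
withPrefix prefix = go prefix
  where
    go []     t = collect prefix t
    go (c:cs) t = maybe [] (go cs) (Map.lookup c (children t))
    collect key t =
      [ (key, v) | Just v <- [nodeValue t] ]
      ++ concat [ collect (key ++ [c]) child
                | (c, child) <- Map.toList (children t) ]
```

Loaded in GHCi, `withPrefix "git" (insert "gitlab" 1 (insert "git" 2 empty))` returns both entries, which is the kind of prefix query that makes a Trie attractive for a search engine compared with an exact-match index.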