Data Mining and its Importance
Data Mining is the process of extracting patterns and information from large data sets using methods from Machine Learning, Statistics and Database Systems. The extracted information is then transformed into an understandable structure that helps businesses make better-informed decisions.
A well-known retail company based in Dubai, with operations across the entire Middle East, added an online form to its website so that applicants could fill it in and upload their CVs for current and future job vacancies.
The client wanted the following information to be extracted from each form and CV and added to a final Excel spreadsheet: Name, Email, Mobile Number, Gender, whether the person is a UAE national, Language of the CV (Arabic or English), the Emirate in which the person lives, Functional Expertise and Years of Experience.
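As an illustration of how a few of these fields might be pulled from a CV once its text is available, here is a minimal Python sketch. It is not B.O.T's actual system. The patterns, the function names and the assumption that mobile numbers follow a UAE-style format (+971 5x, 00971 5x or 05x) are all hypothetical, and real CVs would need broader rules.

```python
import re

# Hypothetical patterns for illustration only; real CVs need broader rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumption: UAE-style mobile numbers such as +971 50 123 4567 or 050-123-4567.
MOBILE_RE = re.compile(r"(?<!\d)(?:\+971|00971|0)[\s-]?5\d(?:[\s-]?\d){7}")
ARABIC_RE = re.compile(r"[\u0600-\u06FF]")  # main Arabic Unicode block
LATIN_RE = re.compile(r"[A-Za-z]")


def detect_language(text: str) -> str:
    """Label text as Arabic or English by whichever script has more letters."""
    arabic = len(ARABIC_RE.findall(text))
    latin = len(LATIN_RE.findall(text))
    return "Arabic" if arabic > latin else "English"


def extract_contact_fields(text: str) -> dict:
    """Return the first email and mobile number found, plus the CV language."""
    email = EMAIL_RE.search(text)
    mobile = MOBILE_RE.search(text)
    return {
        "Email": email.group(0) if email else None,
        # Strip spaces and dashes so numbers are stored in one format.
        "Mobile Number": re.sub(r"[\s-]", "", mobile.group(0)) if mobile else None,
        "Language": detect_language(text),
    }
```

Fields such as Gender, Functional Expertise and Years of Experience are much harder to capture with simple patterns and are left out of this sketch.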
The main issue at hand was that there were no restrictions of any kind on the file types uploaded. Instead of being limited to Document or PDF format, people could upload their CVs as an MP4 or JPEG, or as a PowerPoint or Excel file. And in the case of a corrupt PDF, the client’s system wouldn’t recognize it as an error, so the file would upload normally.
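One common way to triage unrestricted uploads like these is to look at a file's leading "magic bytes" instead of trusting its extension. The Python sketch below shows the idea. The function names and labels are hypothetical, and it is not a description of the system B.O.T built. The PDF check is deliberately cheap: it only confirms the `%PDF-` header and an `%%EOF` marker near the end, so it catches truncated files but not every kind of corruption.

```python
# Well-known leading signatures for the formats mentioned above.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    # DOCX/XLSX/PPTX are ZIP containers, so this also matches any other ZIP.
    (b"PK\x03\x04", "office-zip"),
    # Legacy DOC/XLS/PPT use the OLE compound file format.
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "office-legacy"),
]


def sniff_type(data: bytes) -> str:
    """Guess the file type from its first bytes."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    # MP4 and related video containers carry "ftyp" at byte offset 4.
    if data[4:8] == b"ftyp":
        return "mp4-like"
    return "unknown"


def looks_like_complete_pdf(data: bytes) -> bool:
    """Cheap sanity check: PDF header at the start, %%EOF near the end."""
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


def triage(data: bytes) -> str:
    """Classify an upload, flagging PDFs that appear truncated."""
    kind = sniff_type(data)
    if kind == "pdf" and not looks_like_complete_pdf(data):
        return "corrupt-pdf"
    return kind
```

A check of this kind could run at upload time to reject unsupported formats, or afterwards to route each file to automatic or manual processing.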
B.O.T’s team processed 20,000+ CVs in less than 5 days.
CVs in PNG and JPEG formats couldn’t be processed automatically, since no text could be extracted from them, so they had to be processed manually by the B.O.T workforce. The workforce went through each of these CVs, checked whether the file could be opened, and extracted all required fields.
The remaining files needed to be processed automatically.
B.O.T’s Data Scientist overcame these obstacles by building custom systems to handle the unrestricted file types and corrupt uploads.
The end result was an Excel sheet containing all required fields for each valid profile, as requested by the client. The client can then filter the list of profiles as desired.
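As a rough sketch of this final step, extracted profiles can be written out with one fixed column per required field. The example below uses Python's standard `csv` module and produces CSV text that Excel can open. It is a stand-in for a native Excel workbook, and the function name and column labels are assumptions made for illustration.

```python
import csv
import io

# One column per field the client requested (labels are illustrative).
COLUMNS = [
    "Name", "Email", "Mobile Number", "Gender", "UAE National",
    "CV Language", "Emirate", "Functional Expertise", "Years of Experience",
]


def profiles_to_csv(profiles: list[dict]) -> str:
    """Render profiles as CSV text with a fixed header row.

    Missing fields become empty cells; unexpected keys are ignored.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(profiles)
    return buffer.getvalue()
```

When saving the result to disk, opening the file with `encoding="utf-8-sig"` and `newline=""` helps Excel display Arabic text correctly.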