Date: 11 Oct 2024
In the first part of this series, we introduced how Software Brio helped a US-based client predict salaries based on various factors, including industry, location, experience, and gender.
We broke down the solution into three major steps: data collection and storage, model generation, and building a mobile tool.
In this blog, we'll explore the technical details of each step, offering an inside look at how we crafted, implemented, and refined every component.
One of the initial tasks was to ensure the client could easily upload salary data, whether in small batches or large volumes. This required a system that was user-friendly for manual uploads, yet robust enough to handle bulk data processing. Here’s how we accomplished this:
We built a ReactJS front-end that served as the interface for the client’s HR team to upload data. The tool had two primary modes: manual upload for small batches and bulk upload for large volumes.
In both cases, the data was sent through an API to a Flask backend, where it underwent preliminary validation.
Once the data was received on the backend, we performed data validation using Pandas, ensuring that it was properly formatted, cleaned, and checked for missing or inconsistent values.
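To make the validation step concrete, here is a minimal Pandas sketch of the kind of checks described above. The column names (`experience_years`, `salary`, and so on) and the specific rules are assumptions for illustration; the actual schema and checks used in the project are not shown in this post.

```python
import pandas as pd

# Hypothetical column names; the post only lists the kinds of features collected.
REQUIRED_COLUMNS = ["industry", "location", "experience_years", "gender", "salary"]
TEXT_COLUMNS = ["industry", "location", "gender"]
NUMERIC_COLUMNS = ["experience_years", "salary"]


def validate_salary_data(df: pd.DataFrame) -> tuple[pd.DataFrame, list[str]]:
    """Return the rows that pass validation plus a list of problems found."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        # Without the required columns, no row-level check is possible.
        return df.iloc[0:0], [f"missing columns: {missing}"]

    problems = []
    cleaned = df[REQUIRED_COLUMNS].copy()

    # Normalise text fields so " Tech" and "tech" are treated alike.
    for col in TEXT_COLUMNS:
        cleaned[col] = cleaned[col].astype("string").str.strip().str.lower()

    # Coerce numeric fields; values that cannot be parsed become NaN.
    for col in NUMERIC_COLUMNS:
        cleaned[col] = pd.to_numeric(cleaned[col], errors="coerce")

    # Rows with any missing (or unparseable) value.
    incomplete = cleaned.isna().any(axis=1)
    if incomplete.any():
        problems.append(f"{int(incomplete.sum())} row(s) with missing or non-numeric values")

    # Rows whose values are present but make no sense.
    inconsistent = (cleaned["experience_years"] < 0) | (cleaned["salary"] <= 0)
    if inconsistent.any():
        problems.append(f"{int(inconsistent.sum())} row(s) with negative experience or non-positive salary")

    return cleaned[~(incomplete | inconsistent)], problems
```

Returning the list of problems alongside the clean rows lets the upload interface report what was rejected instead of silently dropping data.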
The validated data was then stored in Amazon S3. This cloud-based storage system was ideal for managing large volumes of data with high availability and scalability. Each dataset was organized by month and indexed by industry, location, and other features for easy retrieval.
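One way to organize datasets by month and index them by feature is to encode those values in the S3 object key. The sketch below shows a hypothetical key layout; the `salary-data/` prefix, the `month=`/`industry=`/`location=` segments, and the function name are assumptions, not the project's actual naming scheme.

```python
from datetime import date


def build_dataset_key(upload_date: date, industry: str, location: str, filename: str) -> str:
    """Build a month-partitioned object key (hypothetical layout)."""

    def slug(value: str) -> str:
        # Lowercase and join words with hyphens so keys stay predictable.
        return "-".join(value.strip().lower().split())

    return (
        f"salary-data/month={upload_date:%Y-%m}/"
        f"industry={slug(industry)}/location={slug(location)}/{filename}"
    )


key = build_dataset_key(date(2024, 10, 11), "Health Care", " New York ", "batch-001.csv")
print(key)
```

With a layout like this, listing everything under a prefix such as `salary-data/month=2024-10/` retrieves one month's uploads. The key itself would then be handed to whichever S3 client performs the upload.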
Once the data was securely stored, we moved on to generating the salary prediction model. This process required efficient data processing and model training to ensure timely and accurate predictions.
Given the size of the datasets, which included thousands of records across different industries and regions, we needed a system capable of handling large-scale data. For this, we used Apache Spark. Spark is designed for distributed computing, which means it can process huge amounts of data efficiently across multiple nodes.
Each month, Spark retrieved the latest dataset from S3 and ran the data processing pipeline.
After training the model, the next task was to ensure that it could be accessed by various client tools and applications. We hosted the trained model on a Flask-based API that acted as the interface between the model and the client’s Android app.
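A prediction endpoint of this kind can be sketched in a few lines of Flask. Everything specific here is an assumption made for illustration: the `/predict` route, the JSON field names, and `PlaceholderModel`, which stands in for the trained model and uses made-up arithmetic rather than any real salary formula.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical request fields, mirroring the features named in the post.
REQUIRED_FIELDS = ("industry", "location", "experience_years", "gender")


class PlaceholderModel:
    """Stand-in for the trained model, which is not described in the post."""

    def predict(self, features: dict) -> float:
        # Illustrative arithmetic only, not a real salary formula.
        return 40000.0 + 2500.0 * float(features["experience_years"])


model = PlaceholderModel()


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True yields None instead of raising on a malformed JSON body.
    payload = request.get_json(silent=True) or {}

    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400

    try:
        salary = model.predict(payload)
    except (TypeError, ValueError):
        return jsonify({"error": "experience_years must be numeric"}), 400

    return jsonify({"predicted_salary": salary})
```

A client such as the mobile app would POST a JSON body containing the four fields and read `predicted_salary` from the response. Rejecting incomplete requests with a 400 keeps input errors separate from model failures.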
Finally, the client needed a user-friendly mobile app that let HR teams enter features such as years of experience, industry, and gender, and receive salary predictions in real time.
We developed an Android app using Java that features a simple, intuitive interface. Users can input the required information and send the data to the API. The app is designed to be clean and efficient, ensuring easy navigation through the input fields.
Given the importance of speed and security in a mobile app:
In Part 2, we delved into the technical side of how Software Brio helped the client build a salary prediction tool. By combining cloud storage (S3), large-scale processing (Spark), and mobile development (Android), we were able to create a seamless, scalable solution.
In Part 3, we’ll discuss the challenges we encountered during development and how we addressed them, along with feedback from the client’s end-users on how the solution transformed their salary prediction process. Stay tuned!
Software Brio is a software consultancy company in India that develops custom AI solutions for clients. Follow us for more case studies like this. Explore our portfolio here: Portfolio Link