Salary Prediction: Part 1

Date: 8 Oct 2024

In this blog series, we’ll walk through a real-world project in which Software Brio helped a US-based client predict salaries from a range of variables. The series has three parts: the first presents an overview, the second explores the technical aspects in depth, and the third covers the challenges we encountered and how we resolved them.

HR professionals use the AI salary prediction tool to help decide what salary to propose.


Our client, an HR consulting firm, approached us with a complex problem: they wanted to create a predictive tool that could estimate salaries across various industries, regions, experience levels, and demographic factors such as gender. We used the power of AI to solve this problem for them.

Our general approach was to gather client requirements, propose a solution, take feedback, iterate, and ultimately deliver a solution that satisfied the customer.

Depiction of how Software Brio divides client requests into steps and milestones, resulting in deliverables.

Client Requirements

The client was looking for a sophisticated solution that could process the following inputs:

  • Industry: The sector the individual works in (e.g., Tech, Healthcare, Finance).
  • Location: The city or country, since salaries vary regionally.
  • Years of Experience: The level of expertise and career stage of the individual.
  • Gender: To account for gender-based salary discrepancies.
  • Additional Variables: The client wanted the flexibility to include more variables over time.
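To make the input list concrete, here is a minimal Python sketch of how one prediction request could be represented. The class name `SalaryQuery`, its field names, and the `extra_` prefix are hypothetical choices for illustration, not the schema used in the project. The `extras` mapping is one possible way to leave room for the additional variables the client wanted to add over time.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class SalaryQuery:
    """One prediction request; all names here are hypothetical."""

    industry: str
    location: str
    years_of_experience: float
    gender: str
    # Open-ended slot for variables the client may add later.
    extras: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject obviously invalid input before it reaches a model.
        if self.years_of_experience < 0:
            raise ValueError("years_of_experience must be non-negative")

    def to_features(self) -> dict[str, Any]:
        """Flatten core fields and extras into one feature mapping."""
        features: dict[str, Any] = {
            "industry": self.industry,
            "location": self.location,
            "years_of_experience": self.years_of_experience,
            "gender": self.gender,
        }
        # Extras are prefixed so they cannot overwrite a core field.
        features.update({f"extra_{k}": v for k, v in self.extras.items()})
        return features
```

Keeping the four core inputs as explicit fields and everything else in `extras` means a new variable can be introduced without changing the class signature.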

They wanted an automated solution to:

  • Collect and manage large datasets on individual salaries.
  • Predict salary levels for individuals based on the inputs listed above.
  • Integrate the predictions into a mobile tool that HR teams could use easily.

We broke down the solution into three distinct steps:

Step 1: Data Collection and Storage:

The first step focused on creating a streamlined data collection and storage solution. We designed an intuitive tool that let the client upload salary records either one at a time or in bulk. The tool was integrated into their existing systems, so HR personnel could enter data manually or use batch uploads for large datasets.

  • Cloud Storage (S3): On the backend, the uploaded data was stored securely in Amazon S3 buckets. S3 provided a scalable, cost-effective way to store large volumes of structured and unstructured salary data.
  • Data Organization: Each dataset was organized by month, region, and industry, which made it easier to retrieve for later processing.

Step 2: Data Processing and Model Generation:

The next step was to build the predictive model based on the collected data. We designed a Spark-based job to process the data and generate the salary prediction model.

  • Spark Jobs: Spark was chosen for its ability to handle large-scale data efficiently. Each month, the latest batch of data was processed and an updated model was generated.
  • Model Training: The Spark job took the previous month’s salary data and trained a predictive model using machine learning algorithms such as linear regression or random forests, depending on the data distribution and complexity.
  • API Deployment: Once the model was ready, we hosted it behind an API that could be accessed by various client tools and applications. Because the model was regenerated monthly, the salary predictions remained relevant to current market trends.

Step 3: Building the Mobile Tool:

In the final step, we developed a mobile tool (an Android app) that let HR professionals enter an individual’s details (e.g., years of experience, location, industry) and get an instant salary prediction.

  • User Interface: The app was designed to be user-friendly, with a simple input form where users could add variables like industry, experience, and gender.
  • API Integration: The app was integrated with the backend API we developed, allowing real-time salary predictions. Once the user provided the necessary inputs, the app fetched the latest prediction model and generated an estimated salary.
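The exchange between the app and the API can be sketched as a request body going in and an estimate coming back. The example below is written in Python only to illustrate the contract; the real client was an Android app, and the field names, the `estimated_salary` response key, and both function names are hypothetical. It also assumes the model is evaluated on the server side, which is one possible reading of the integration described above.

```python
import json
from typing import Callable

REQUIRED_FIELDS = ("industry", "location", "years_of_experience", "gender")


def build_request_body(form: dict) -> str:
    """Client side: serialize the form values into a JSON request body."""
    missing = [name for name in REQUIRED_FIELDS if name not in form]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return json.dumps(form, sort_keys=True)


def handle_prediction_request(body: str, predict: Callable[[dict], float]) -> str:
    """Server side: decode the body, call the current model, return JSON."""
    features = json.loads(body)
    estimate = predict(features)
    return json.dumps({"estimated_salary": round(float(estimate), 2)})
```

Passing `predict` in as a parameter lets the handler stay unchanged when a newly trained model replaces the previous one each month.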

A comprehensive overview of our initial implementation of big data technologies and machine learning to create a sophisticated HR salary prediction pipeline.

Challenges and Future Modifications

Throughout the project, we faced several significant challenges that had to be tackled to achieve a successful outcome. One of the primary difficulties was ensuring the quality and availability of data, as we required comprehensive, reliable, and current data for all variables. Developing the predictive model was challenging because many factors affect salaries and the model had to account for the interactions among them. Another hurdle was user adoption: the tool needed to be user-friendly and straightforward for the HR team, minimizing the learning curve. Finally, the solution had to scale to handle future data growth while maintaining good performance.

In the future, several enhancements could improve the solution. Connecting it with existing HR software could simplify data entry and improve the user experience. Incorporating advanced analytics and visualization tools would help users better understand salary information. Implementing more sophisticated machine learning algorithms, such as neural networks or ensemble methods, could improve prediction accuracy. Finally, establishing a continuous feedback loop with users would ensure that the tool evolves to meet their changing needs.

Conclusion

In Part 1 of our series, we outlined the client’s requirements, the problem breakdown, the challenges faced, and future opportunities for enhancement in developing a salary prediction tool.

The insights gained during this phase laid a strong foundation for subsequent technical development and implementation, which we will explore in the following parts of the series.

Stay tuned for a closer look at the technical solutions and results achieved through this collaboration with our client! Software Brio is a consultancy firm headquartered in Bangalore that leverages AI to simplify clients’ lives by providing custom-made solutions. Explore our comprehensive projects.
