The project manager’s role is both critical and challenging. They are responsible for the project’s plan and its execution. At the beginning of the project, they help define the plan and set deadlines based on stakeholders’ requests and the technical team’s capacities. Throughout the project, they constantly monitor progress. If the actual state of tasks or deliveries deviates from the plan, they need to raise a flag and coordinate with the teams. As a result, they spend most of their time communicating with different teams, higher-level managers, and business stakeholders. Two major challenges in their job are:
- Interdependency between Technical Teams: The outputs from one team (e.g., data engineers ingesting the data) serve as inputs to another team (e.g., data scientists consuming the data), so any delay or change in the first step impacts the second. Project managers, though not typically super technical, need to be aware of these changes and ensure proper communication between teams.
- Competing Business Priorities: Business stakeholders often change their priorities, or there may be competing priorities across different teams that need to be aligned. Project managers must navigate these changes and align the various teams to keep the project on track.
By effectively managing these challenges, project managers play a pivotal role in the successful delivery of machine learning projects.
Fraud analysts’ domain expertise and knowledge are crucial for the development and evaluation of fraud prediction models. From the beginning of the project, they provide insights into active fraud trends, common fraudulent scenarios, and red flags, as well as exceptions or “green flags.” Data scientists incorporate this knowledge during the feature creation/engineering phase. Once the model is running in production, constant monitoring is required to maintain or improve performance. At this stage, fraud analysts are essential in identifying the model’s true or false positives. This identification can result from a thorough investigation of the customer’s history or by contacting the customer for confirmation. The feedback from fraud analysts is integral to the feedback loop process.
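The feedback loop described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Alert` record, the `"fraud"`/`"legit"` verdict labels, and both helper functions are hypothetical names, not part of any specific tool. The idea is that analyst verdicts on flagged transactions become both a quality signal (how many alerts were real fraud) and labels for future retraining.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    """A transaction the model flagged (hypothetical record layout)."""
    transaction_id: str
    model_score: float
    # Filled in by a fraud analyst after investigation: "fraud" or "legit".
    analyst_verdict: Optional[str] = None


def alert_precision(alerts):
    """Share of reviewed alerts that analysts confirmed as fraud."""
    reviewed = [a for a in alerts if a.analyst_verdict is not None]
    if not reviewed:
        return None  # nothing has been reviewed yet
    confirmed = sum(1 for a in reviewed if a.analyst_verdict == "fraud")
    return confirmed / len(reviewed)


def to_training_labels(alerts):
    """Turn reviewed alerts into (transaction_id, label) pairs for retraining."""
    return [
        (a.transaction_id, 1 if a.analyst_verdict == "fraud" else 0)
        for a in alerts
        if a.analyst_verdict is not None
    ]


alerts = [
    Alert("t1", 0.97, "fraud"),   # true positive
    Alert("t2", 0.91, "legit"),   # false positive
    Alert("t3", 0.88, "fraud"),   # true positive
    Alert("t4", 0.85),            # still waiting for review
]

print(alert_precision(alerts))
print(to_training_labels(alerts))
```

Unreviewed alerts are deliberately left out of both calculations, since treating them as either fraud or legitimate would bias the feedback.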
High-level managers and C-level executives play a crucial role in the success of ML/AI fraud projects. Their support is essential for removing obstacles and building consensus on the project’s strategic direction. Therefore, they need to be regularly updated about the project’s progress so that they can champion investments in the necessary teams, tools, and processes based on the project’s specific requirements and ensure appropriate resources are allocated. Additionally, they are responsible for holding internal and external parties accountable for data privacy and compliance with industry standards. By fostering a culture of accountability and providing clear leadership, they help ensure that the project meets its goals and integrates smoothly with the organization’s overall strategy. Their involvement is vital for addressing any regulatory concerns, managing risk, and driving the project toward successful implementation and long-term sustainability.
Data engineers provide the data needed for us (data scientists) to build models, which is an essential step in any ML project. They are responsible for designing and maintaining data pipelines, whether for real-time data streams or batch processes in data warehouses. Involved from the project’s inception, data engineers identify data requirements, sources, processing needs, and SLA requirements for data accessibility.
They build pipelines to collect, transform, and store data from various sources, essentially handling the ETL process. They also manage and maintain these pipelines, addressing scalability requirements, monitoring data quality, optimizing queries and processes to improve latency, and reducing costs.
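To make the ETL idea concrete, here is a toy extract-transform-load pass using only the Python standard library. Everything in it is an assumption for illustration: the column names, the in-memory CSV standing in for a source system, and the dictionary standing in for a warehouse table. Real pipelines would typically run on an orchestration or streaming platform, but the three steps and the data quality check have the same shape.

```python
import csv
import io

# Hypothetical raw export from a source system (one row has a missing amount).
RAW_CSV = """transaction_id,amount,currency
t1,120.50,USD
t2,,USD
t3,89.99,usd
"""


def extract(raw_text):
    """Read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Clean rows and set aside the ones that fail a basic quality check."""
    clean, rejected = [], []
    for row in rows:
        if not row["amount"]:
            rejected.append(row)  # data quality issue: missing amount
            continue
        clean.append({
            "transaction_id": row["transaction_id"],
            "amount": float(row["amount"]),
            "currency": row["currency"].upper(),  # normalize currency codes
        })
    return clean, rejected


def load(rows, store):
    """Write cleaned rows to the target store, keyed by transaction id."""
    for row in rows:
        store[row["transaction_id"]] = row


warehouse = {}  # stand-in for a warehouse table
clean_rows, rejected_rows = transform(extract(RAW_CSV))
load(clean_rows, warehouse)

print(f"loaded={len(warehouse)} rejected={len(rejected_rows)}")
```

Keeping rejected rows instead of silently dropping them gives the data quality monitoring mentioned above something to count and alert on.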
On paper, data scientists create machine learning algorithms to predict various types of information for the business. In reality, we wear many different hats throughout the day. We start by identifying the business problem, understanding the data and available resources, defining a solution, and translating it into technical requirements.
Data scientists collaborate closely with data engineers and MLOps engineers to implement solutions. We also work with business stakeholders to communicate results and receive feedback. Model evaluation is another critical responsibility, which involves selecting proper metrics to assess the model’s performance, continuously monitoring and reporting on it, and watching for any decay in performance.
Continuous improvement is central to a data scientist’s role because it keeps models accurate and relevant over time.
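As a rough sketch of what watching for performance decay can look like, the snippet below computes precision and recall from weekly confusion counts and flags weeks where recall falls noticeably below a baseline. The weekly numbers, the 0.05 tolerance, and the function names are made up for illustration. Precision and recall are used here because fraud is usually rare, which makes plain accuracy a misleading metric.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def has_decayed(baseline, current, tolerance=0.05):
    """True when the metric dropped by more than the tolerance (assumed threshold)."""
    return (baseline - current) > tolerance


# Hypothetical weekly counts built from analyst-confirmed outcomes.
weekly_counts = [
    {"week": "W1", "tp": 80, "fp": 20, "fn": 20},
    {"week": "W2", "tp": 78, "fp": 22, "fn": 22},
    {"week": "W3", "tp": 60, "fp": 25, "fn": 40},
]

# Use the first week as the baseline the model is compared against.
first = weekly_counts[0]
_, baseline_recall = precision_recall(first["tp"], first["fp"], first["fn"])

flagged_weeks = []
for week in weekly_counts:
    _, recall = precision_recall(week["tp"], week["fp"], week["fn"])
    if has_decayed(baseline_recall, recall):
        flagged_weeks.append(week["week"])

print(flagged_weeks)
```

Which metric and tolerance to track is a business decision: missing fraud (low recall) and blocking good customers (low precision) carry different costs.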
Once data engineers and data scientists build the data pipelines and model, it’s time to put the model into production. MLOps engineers play a crucial role in this phase by bridging the gap between development and operations. In the context of fraud prediction, timing is critical: the business needs to prevent fraud before it happens, which requires the whole pipeline to run in less than a second. MLOps engineers therefore ensure that models are seamlessly integrated into production environments while maintaining reliability and scalability. They design and manage the infrastructure needed for model deployment, implement continuous integration and continuous deployment (CI/CD) pipelines, and monitor model performance in real time. They also handle version control, automate testing, and manage model retraining processes to keep models up to date. By addressing these operational challenges, MLOps engineers enable the smooth and efficient deployment of machine learning models, ensuring they deliver consistent and valuable outcomes for the business.
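The sub-second requirement can be turned into an automated check. The sketch below times a scoring call and tests whether a high percentile of the measured latencies stays within a one-second budget. The `score_transaction` function is a hypothetical stand-in for a real model call, and the choice of the 99th percentile is an assumption; a real check would measure the full path, including feature lookup and network time.

```python
import time

LATENCY_BUDGET_SECONDS = 1.0  # "less than a second" from the requirement above


def score_transaction(features):
    """Hypothetical stand-in for a real model call; returns a score in [0, 1]."""
    return min(1.0, features["amount"] / 10_000)


def timed_score(features):
    """Score one transaction and report how long the call took in seconds."""
    start = time.perf_counter()
    score = score_transaction(features)
    elapsed = time.perf_counter() - start
    return score, elapsed


def within_budget(latencies, budget=LATENCY_BUDGET_SECONDS, quantile=0.99):
    """Check that the chosen latency quantile stays within the budget."""
    ordered = sorted(latencies)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index] <= budget


# Time 100 scoring calls with varying amounts.
latencies = [timed_score({"amount": amount})[1] for amount in range(100, 1100, 10)]

print("within budget:", within_budget(latencies))
```

Checking a high percentile rather than the average matters because a handful of slow requests can let fraud through even when typical latency looks fine.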
These are the roles I have identified in my working experience. They interact differently depending on the stage of the project and on the specific company. In my experience, at the beginning of the project, fraud analysts, high-level managers, and data scientists work together to define the strategy and requirements. Data scientists play a significant role in identifying the business problem, and they collaborate with MLOps and engineering teams to translate it into a technical solution. Data engineers need to join these discussions to plan the required pipeline development. One common challenge is a disconnect between these teams that only surfaces at execution time, which can hurt both the timeline and the quality of the deliverable. Therefore, the better aligned these teams are, the smoother the implementation and delivery will be.
Comment below about the roles in your company. How are things different in your experience?