4 Agile Machine Learning

4.1 Machine Learning’s Waterfall

As discussed in the previous chapter, Agile is a response to the Waterfall methodology that was widely adopted in the eighties and nineties. Many projects following this methodology failed because of their long duration. The world had moved on while the process-heavy steps were being completed, and either the plug was pulled before the product was finished or the finished product had limited value because it no longer fit the changed world. In machine learning, as far as I am aware, there are no such formal methodologies that are followed by many practitioners. However, there are ample testimonies of projects that never reached production, and I think this is partially due to a suboptimal workflow. Just like Waterfall projects, machine learning projects can take many months or even years before the results are productionised or the plug is pulled. The data scientist might want to optimise many aspects of the project to give the best predictions possible before sharing the results with stakeholders. The code might be poorly organised, leading to a lot of time lost merging different scripts. Or it might be unclear what to predict in the first place, due to a lack of communication between stakeholders, business people and data scientists. Whatever the reason, adhering to the principles of Agile can make you more productive and efficient. Here we take the time to interpret the twelve principles in the machine learning context.

4.2 The Twelve Principles in the Machine Learning Context

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.

Just as Waterfall prescribes a complete and fault-free product delivered at once, data scientists might be inclined to release a machine learning model to production only once they are confident its predictions are spot on. This principle is a revolutionary break from Waterfall: you should not wait to release software until it is perfect, but instead get it out in the open when it is just good enough. A common term for this is the MVP (Minimal Viable Product). After the MVP is released, the team closely monitors how users interact with it and where the biggest room for improvement is. The biggest possible improvement is then tackled first and a new version is released. This cycle of release, monitor, improve, release is repeated many times, so that the product gets better and better. There is no clear definition of done; instead there is debate about whether the software can be further improved and whether the required investments are worth the effort.

The machine learning equivalent of this would be a Minimal Viable Model (MVM), a model that is just good enough to put into action. This might be scary and run counter to the high standards you have for yourself, but it is preferable to long optimisation before releasing for at least the following reasons:

It will keep stakeholders excited. Managers and users of the model who commissioned the machine learning project are impatient to see results. As the project drags on without any output, they are likely to lose interest and lose confidence that the project will end well. Eventually they might pull the plug or put it on hold before anything has reached production. If they can interact with the results soon, even if these are imperfect, it will create a totally different dynamic.
You will fail fast. There is a wide array of reasons a machine learning project might fail: the problem turns out not to be translatable into a model in the first place, the data is not of the quality needed, or the relationship between the features and the target is not strong enough. The sooner you implement the model, the sooner lurking problems surface.
You will get feedback sooner. This is the main reason Agile wants to implement quickly and then iterate. Let’s say you build a churn model which the sales department uses for customer retention. As soon as they start acting on your MVM, they find out that the interval over which you predict is too short: many of the flagged customers have already cancelled their subscription. Instead of further optimising this model, you focus on predicting a longer time ahead.

What an MVM looks like is project-dependent of course, but in many cases it would probably make sense to define it by a regular statistical measure. The machine learning model might be replacing a business rule that has been in place for many years; the MVM is then ready as soon as the model outperforms the business rule. Another way to build an MVM is to release the model for only a subset of your target audience, for example a certain geographical area or users of a certain age.
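The "outperform the business rule" criterion can be made concrete with a small check. The following is only a sketch under assumptions: the record fields (`days_since_last_login`, `support_tickets`), the legacy rule, the stand-in model and the toy holdout data are all hypothetical, and accuracy is just one of many measures you could plug in.

```python
from typing import Callable, Sequence

Customer = dict  # hypothetical record type: one dict per customer


def accuracy(predict: Callable[[Customer], bool],
             customers: Sequence[Customer],
             churned: Sequence[bool]) -> float:
    """Share of customers for which the prediction matches what happened."""
    hits = sum(predict(c) == y for c, y in zip(customers, churned))
    return hits / len(churned)


def business_rule(customer: Customer) -> bool:
    # Hypothetical legacy rule: flag customers inactive for more than 30 days.
    return customer["days_since_last_login"] > 30


def candidate_model(customer: Customer) -> bool:
    # Stand-in for the predict function of a fitted model (assumption).
    return (customer["days_since_last_login"] > 30
            or customer["support_tickets"] >= 3)


def is_minimal_viable(model, baseline, customers, churned, metric=accuracy) -> bool:
    """The MVM is ready once the model beats the rule it replaces on holdout data."""
    return metric(model, customers, churned) > metric(baseline, customers, churned)


# Toy holdout set, invented purely for illustration.
holdout = [
    {"days_since_last_login": 45, "support_tickets": 0},
    {"days_since_last_login": 5, "support_tickets": 4},
    {"days_since_last_login": 10, "support_tickets": 0},
    {"days_since_last_login": 40, "support_tickets": 1},
]
churned = [True, True, False, False]

ready = is_minimal_viable(candidate_model, business_rule, holdout, churned)
```

Passing the metric as an argument keeps the release criterion explicit, so stakeholders can discuss which measure defines "good enough" without touching the model code.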

Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.

This principle comes naturally to machine learning, since the outcome of a project is at least partially dependent on the relationships discovered in the data. The Waterfall approach, in which every step of the project is planned in advance, would appear hideous to even the biggest lover of process. Keep in mind that flexibility should not only be exercised towards assumptions about your data or the models and algorithms you use. Requirements can also change in the framing of the business problem or in the way the model predictions are exposed. Whatever it is, don’t be lazy and be prepared to steer in a different direction as soon as the situation requires it.

Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.

Whereas the first principle is about the philosophy of early deployment and iteration, this one is about the frequency of deploying updates. The Scrum framework is really strict about the amount of time that can be spent until the next release. The team commits itself to making certain changes to the product in a period of typically two weeks. At the end of this period the improvements, small as they might be, are deployed. The Scrum mindset is not totally applicable to machine learning, as we will explore in the next chapter. It is typically not feasible to commit to a fixed time interval for model improvement, because we are dependent on the relationships in the data and we don’t know beforehand whether the next road we enter is a dead end or not. However, it is good to keep in mind that every improvement to the model should be deployed as soon as it is ready. This creates momentum and excitement among customers, stakeholders, your teammates and yourself.

Business people and developers must work together daily throughout the project.

Machine learning cannot be done in isolation by a data scientist. Navigating through the data cannot be done without knowing about the underlying business process, and often additional information is needed from business colleagues. Stakeholders need to be managed: they must be kept informed about the progress and presented with important choices. Also, the customers might be business colleagues who should act upon the predictions. Not involving the business is a recipe for disaster for every machine learning project. Within the Scrum methodology the role of Product Owner is crucial for the alignment of the team with the business. Having such a representative is also very welcome in a machine learning project; he or she is then the translator between the modelling and the business. Keeping this person informed at all times is essential for decision making.

Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.

This principle is the antithesis of Waterfall in optima forma. Instead of meticulously describing how the job should be done, just set the goals of the project and leave it up to the team how these goals should be attained. Machine learning practitioners typically already enjoy this type of freedom, for the sheer reason that stakeholders often don’t really understand how the predictions are made. It can happen that business people get overly involved in the process; they can have a strong opinion on which predictors should be used or how the target should be defined. Take their advice to heart but trust your instincts. If you feel a different approach will yield better results, then rely on your expertise. You know about overfitting, multicollinearity, non-converging algorithms and many other topics the business cannot be expected to grasp. Take the time to explain why you think a different approach is better (in layman’s terms of course) and thank them for their input.

The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.

A machine learning project is rarely done end-to-end by one single person. Data might be made available by a DBA, a backend developer might expose the model results within the website, a frontend developer builds the interface for interacting with the results, etc. If possible, working with these people directly will speed up decision making and improve alignment. Communication by email or chat is often slow and lacks interaction. Make an effort to be in the same room with your direct colleagues for at least a part of the project time.

Working software is the primary measure of progress.

As long as an improvement is not part of the modelling pipeline, you have not reached any results yet. Only when the update is fully implemented and the predictions are ready to be consumed by the business has there been true improvement. All too often the improvement in accuracy reported in research scripts does not hold when it is implemented in the full model pipeline. Sometimes the research was done on just a subset of the data that was conveniently available. Or the new feature was tested in isolation, so there is not yet a sense of its multicollinearity with the predictors already in the model. There is only one true measure of how well we are currently doing, and it is the pipeline. This implies that as long as there is no end-to-end pipeline in place, we cannot tell how well we are doing.
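One way to treat the pipeline as the single measure of progress is to route every candidate change through the same end-to-end function, so that cleaning, splitting, fitting and scoring are always identical. The sketch below is an illustration under assumptions: the step names, the `segment` and `y` fields, the toy data and the choice of mean absolute error are all hypothetical.

```python
from statistics import mean


def prepare(rows):
    """Cleaning step: drop records with a missing target."""
    return [r for r in rows if r["y"] is not None]


def split(rows, holdout_every=3):
    """Deterministic split: every n-th row goes to the validation set."""
    train = [r for i, r in enumerate(rows) if i % holdout_every != 0]
    valid = [r for i, r in enumerate(rows) if i % holdout_every == 0]
    return train, valid


def fit_mean(train):
    """Current 'model': always predict the mean of the training target."""
    overall = mean(r["y"] for r in train)
    return lambda r: overall


def fit_group_mean(train):
    """Candidate: predict the mean per segment, falling back to the overall mean."""
    overall = mean(r["y"] for r in train)
    by_segment = {}
    for r in train:
        by_segment.setdefault(r["segment"], []).append(r["y"])
    means = {seg: mean(ys) for seg, ys in by_segment.items()}
    return lambda r: means.get(r["segment"], overall)


def run_pipeline(rows, fit):
    """End-to-end run: clean, split, fit, and score with mean absolute error."""
    train, valid = split(prepare(rows))
    model = fit(train)
    return mean(abs(model(r) - r["y"]) for r in valid)


# Toy data, invented for illustration: two segments with different targets,
# plus one record with a missing target that the cleaning step removes.
rows = [{"segment": "a" if i % 2 == 0 else "b",
         "y": 10.0 if i % 2 == 0 else 20.0} for i in range(12)]
rows.append({"segment": "a", "y": None})

current_error = run_pipeline(rows, fit_mean)
candidate_error = run_pipeline(rows, fit_group_mean)
```

Only when `candidate_error` is lower than `current_error` on the full pipeline does the candidate count as an improvement; a gain measured in a separate research script would not.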

Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.

The deadline way of doing a machine learning project goes as follows. The stakeholders and the developers meet and agree to have a first version of the model ready at a set moment in the future. The sponsors forget about the project until right before the deadline, busy with meetings and memos. The developer goes to work; with ample time before the deadline, there are many things that can be explored. The result is an array of research scripts and intermediate results. Suddenly, as the deadline comes near, all this separate research has to come together. Pulling an all-nighter, the team is able to deliver a result, which is presented to the sponsors. The project is then continued, a new deadline is set, and the cycle starts over.

Don’t - do - deadlines. They are a recipe for hastily created, nonreproducible results. They promote a workflow of taking it easy at first, stressing out when the deadline comes near and being exhausted after it. Instead, set small goals that are attainable in a short timespan, update the model if the results are favorable and set a new small goal. This will result in better quality code, a better grip on the model results and a happier team. Moreover, it will result in a model that is constantly updated, which excites sponsors and users.

Continuous attention to technical excellence and good design enhances agility.

Machine learners have much to learn from software engineers when it comes to standards and rigor. In machine learning much of the code that is used to produce the predictions is not shipped as part of the product. Cleaning the training data, splitting it into training and validation sets, running the algorithms that produce the models, doing research on relationships in the data and many more steps are for the machine learning practitioner’s eyes only. It is tempting to cut corners when you are the sole user of your own code. Why go to the trouble of writing unit tests and documentation for your functions? As soon as the code does not do what it is supposed to do, you are right there to fix it.

At the moment of writing your code it is very obvious what it is supposed to do, and as you run the code against the data you are then working with, it is straightforward to see if the program indeed does what it is supposed to do. However, three months from now you will have completely forgotten the reason you wrote that part, and you will have no clue why it fails against the refreshed data. You never work alone on a project, even if you are the only person working on it. Always consider future you as a separate person whom you respect very much and whom you want to help do their job as well as possible. The result of many parts of code of poor quality is that they don’t click together to make the bigger, complex system that large machine learning projects are. Trying to create this system to produce final predictions is then a bit like … Each time you fix one part, another part comes tumbling down. To produce reproducible, reliable predictions it is essential that you can completely trust the code you wrote.
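To make the point about future you tangible, here is a minimal sketch of a documented and tested cleaning helper. The function `clean_age`, its plausibility bounds and the test cases are hypothetical examples, not taken from any particular project.

```python
import unittest


def clean_age(raw):
    """Convert a raw age field to an int, or None when it is unusable.

    Why this exists: source systems may deliver ages as text, with stray
    whitespace, placeholders such as "n/a", or implausible values. The
    bounds of 0 and 120 are an assumption made for this illustration.
    """
    if raw is None:
        return None
    try:
        age = int(str(raw).strip())
    except ValueError:
        return None
    return age if 0 <= age <= 120 else None


class CleanAgeTest(unittest.TestCase):
    def test_plain_number(self):
        self.assertEqual(clean_age("42"), 42)

    def test_surrounding_whitespace_is_ignored(self):
        self.assertEqual(clean_age(" 37 "), 37)

    def test_missing_or_placeholder_becomes_none(self):
        self.assertIsNone(clean_age(None))
        self.assertIsNone(clean_age("n/a"))

    def test_implausible_values_become_none(self):
        self.assertIsNone(clean_age(-1))
        self.assertIsNone(clean_age(150))
```

Saved in a test module, these cases can be run with `python -m unittest`. When refreshed data arrives in a new format, a failing test points to the exact assumption that no longer holds.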

Simplicity–the art of maximizing the amount of work not done–is essential.

A machine learning project’s goal is often straightforward: predict y as best you can, such that some business goals can be achieved. Unlike in software development, there is not much choice in which features should and should not be included in the final product (features as in characteristics, not as in predictors). The options for how to arrive at predicting y, however, are abundant. The biggest challenge is often “what should I explore next?”. Should we explore another database in which we might find new predictors, or should we try a different model on the current predictors, which involves some additional preprocessing of the data?

We can roughly estimate the amount of work needed to explore both options. It is, however, very hard to predict how much value the new part will add. A good rule of thumb is, when in doubt, to choose the option with the fewest unknown components. Choose an algorithm you know well over one you have never used in practice. Only tap into a new data source if you are convinced that the options in the current database are exhausted. Machine learning is a field with rapid developments, and it is often tempting to seize the opportunity to dive into a new technique or algorithm. Be critical before doing so: is there really no way to obtain similar results with something already familiar to you?

The best architectures, requirements, and designs emerge from self-organizing teams.

This is another principle that is a clear antidote to Waterfall. Instead of meticulously planning every aspect of the project upfront, let the developers come up with the most important project designs as they go. It is impossible to foresee all the aspects of a software project before implementing it, so trying to come up with all designs before writing code is a guarantee for going back and forth between the planning and implementation stages. Due to the iterative nature of building predictive models and the uncertainty we have about the relationships in the data, this principle seems quite natural in the machine learning context.

At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

In the Scrum methodology a retrospective is held after each two-week sprint. The team discusses what went well in the past sprint and what deserves attention. Every development process has its inefficiencies, whether they are unclear communication or not having the right priorities. Having to reflect on the process forces you to look critically at all aspects of the project. Inefficiencies can quickly become project features when they exist for a while; the sooner they are tackled the better.

Even when you are not in a team following an official methodology such as Scrum or Kanban, you do best to plan regular reflection meetings. Even when you are the only data scientist, or even the only developer on the team, you should raise your technical issues here as well. Maybe you have been wanting to refactor a certain part of the project for a while but are unsure if it is worth the time. Even though your business colleagues don’t understand the technical aspects of the problem, they can still challenge you on the pros and cons of both options.

We have reflected on the twelve principles with machine learning in mind. Some principles appear to be not too interesting for us, but many can be a great guide in delivering results quicker and with more joy and confidence. In the previous chapter we briefly discussed the methodologies Scrum and Kanban. Now it is time to see what an Agile methodology for machine learning might look like. We are going to explore this in the next chapter.