In this interview, AZoM talks to Dr. John Sosa, CEO and Co-Founder of MIPAR Image Analysis, about micrograph analysis and deep learning.
How have you seen deep learning and automation change and improve micrograph analysis?
We introduced deep learning into MIPAR around July of last year, and it has had a profound impact on the types of problems that we can reach with the software and the accessibility of those solutions to the everyday user.
We have realized that in many labs and facilities around the world there are approaches to quantifying microstructure today, many of which involve a manual component. Sometimes they are purely manual.
What we have seen is that deep learning allows us to take either archived, manually annotated micrographs, or a little more of the user's ongoing manual work, and use that input to evolve intelligent automated solutions that quickly replace the manual method.
This has been a real shift in how challenging automation is approached, especially in MIPAR.
What are the challenges introduced by progressing to an automated system powered by deep learning?
It is often appropriate to ask whether, at the end of the day, automation, or the attempt to achieve automation, returns significant value versus spending the time doing it by hand. The immense potential of automation is clear to everyone, from the orders-of-magnitude time reduction to the reduction in user variability, and the ability to obtain more complex measurements than are possible by hand.
Ideally, you would put in work early on to develop an automated approach to solving a repetitive task or project, and after that initial work, you would perfect the automation and would no longer have to do that tedious task anymore and would be able to move on to something else.
However, in reality, whether you are writing your own image processing scripts or trying to piece together a pipeline in a commercial product, you get off to a good start and then quickly realize it wasn't so simple; there are many iterations required, there are unforeseen challenges, and you end up getting lost in this pursuit of automation with no time left to do the task you set out to automate at all.
At MIPAR Software, we are sensitive to this. We understand that many real problems struggle to get over this activation barrier: teams spend time trying to automate a process, only to realize it could have been completed more quickly by hand.
How can this particular problem concerning the time-saving potential of automation be overcome?
When considering automation over a current manual or even semi-automated approach, we're aware of the benefits: we get immense speed improvements and a much more objective approach to the problem, paired with a reduction in user-to-user bias; to boot, our results tend to be much more traceable, auditable, and defensible over time.
But unless three requirements are met in developing an automated solution, it is going to be hard to realize those benefits. First, we almost always need an automated solution that is as accurate as, or better than, what we have been doing by hand. Second, that solution cannot just work for one image or one condition.
It has to handle at least a reasonable level of lighting, exposure, and sample prep variation. Third, it cannot require a computer science expert to put the solution together and maintain it. It has to be something that those who have been characterizing samples historically can use to develop a solution that can be deployed throughout an organization.
Historically, even before the advent of more graphical processing applications, we were writing code for each problem, and in many cases that is still the approach. If we consider for a moment a very rudimentary plot of two axes, where a solution ranges from requiring tons of experience to very little experience on the x-axis, and from not robust at all to very robust on the y-axis, it is reasonable to place many of our hard-coded solutions in the high-experience, low-robustness corner.
They require a wealth of expertise to develop and operate, and typically they are tightly tuned to a specific problem with hard-coded settings and thresholds and other features that make them fairly inflexible or at least very difficult to adjust to different conditions.
How has MIPAR helped with this complexity and inflexibility in coding for specific conditions?
What we have been able to achieve with MIPAR over the years is to move into a realm in which recipes, graphically built to be highly interactive and infinitely adjustable algorithms that require no code to develop, lessen the learning curve and the experience required to build and operate, and often get us to a more robust solution. This is all possible because of that interactivity and the ability to add numerous steps that can accommodate different conditions very quickly.
Schematic comparing the relative ease of use and robustness of different approaches to image analysis algorithm development.
For simpler problems, this often works very well. Since they are simpler, you usually need even less experience to build the solutions, and because the problems are simpler, the solution does not need to be very robust; landing somewhere in the middle of the robustness scale is perfectly acceptable. However, many common real-world problems still fall outside of this solution space: they need a significant level of robustness, and they are simply too challenging for the average user to automate effectively.
This is where we feel deep learning places us. It is a real privilege to be working at a time when we have seen this rare breakthrough technology: something that allows you to develop even more sophisticated solutions while needing less experience to do so. That sounds idealistic, but we have seen through our deployment of it that it is very much possible.
Could you give some insight into how deep learning training is carried out with MIPAR?
Start by dragging in reference images. Users can do their tracing directly in the training application if they want to, adding layers, taking a pen tool, and beginning to trace. Once boundary tracings have been done, at least two classes are needed for deep learning training.
There is a handy tool for this: if users have only traced one class and would like to train on that class and its inverse, they can add an inverse of all the layers too. When that is done, everything is set up for training.
One particularly useful setting allows users to split an image into several subfields. We found this not only reduces the memory demand while training but also produces better results, since deep learning training tends to be fairly image-hungry, although we have gotten great results even with just three to five images.
The size factor of the image needs to be chosen; it is simply a degree of downsampling, applied, again, to improve performance and reduce memory load. Next, users need to choose the processor.
For deep learning training, a GPU is strongly recommended. It is possible to train and apply on the CPU, but the speed-up on a GPU is usually eight to ten times what you get on the CPU. For training, something that would have taken a whole workday on the CPU can be done in under an hour on the GPU, while for applying, a model could take 10 seconds on the CPU versus a second on the GPU.
Once training is initiated, the image will be broken up into tiles, and an ETA will be given.
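To make the tiling and size-factor settings concrete, here is a minimal sketch of that kind of preprocessing in NumPy. The function name, the strided downsampling, and the default values are illustrative assumptions, not MIPAR's actual implementation.

```python
import numpy as np

def prepare_training_tiles(image, grid=(5, 5), size_factor=0.5):
    """Downsample an image, then split it into a grid of subfield tiles.

    Downsampling (the "size factor") reduces memory load, and tiling
    multiplies the number of training samples per annotated image.
    """
    # Strided subsampling stands in for proper resizing here.
    step = max(1, int(round(1 / size_factor)))
    small = image[::step, ::step]

    rows, cols = grid
    h, w = small.shape[0] // rows, small.shape[1] // cols
    tiles = [
        small[r * h:(r + 1) * h, c * w:(c + 1) * w]
        for r in range(rows)
        for c in range(cols)
    ]
    return tiles

# A single 1000x1000 "micrograph" becomes 25 tiles of 100x100 each.
micrograph = np.random.rand(1000, 1000)
tiles = prepare_training_tiles(micrograph, grid=(5, 5), size_factor=0.5)
```

Splitting one annotated micrograph into a five-by-five grid yields 25 training tiles, which is one reason a handful of annotated images can be enough to train on.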
Historically, it has been challenging to recognize grain boundaries in heavily twinned grain structures and EBSD has been used to achieve this. What other challenges arise in this particular application?
Recognizing grain boundaries but ignoring twins in heavily twinned grain structures is one of the most challenging grain recognition cases we have encountered, where it is almost easier to find the twins than it is the grain boundaries.
You have a very strong dependence on local pattern difference that tells your brain when to move from one grain to another. If you are experienced in looking at this type of microstructure, you can recognize the grain boundaries, but you have to do some mental gymnastics behind the scenes and piece together a lot of the striped features into a typical grain shape before your brain recognizes that it is looking at an entire grain. We have tried all kinds of approaches before deep learning, and we have never come close to getting the software to do what our brains are doing.
Historically, this has been done purely manually. To obtain a mean grain size, users lay down a set of random lines, count off the intercepts, and get a mean size that way. We have had clients that needed a full size distribution, and even though going with EBSD would have been more hands-off, the time cost and sample prep challenge involved caused them to sit down and trace all of the boundaries by hand so that they could get a full grain-by-grain size distribution.
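The manual lineal-intercept procedure just described can be sketched in a few lines. This is a generic illustration of the intercept method (in the spirit of ASTM E112) using horizontal test lines on a binary boundary map; the function and the synthetic example are hypothetical, not part of MIPAR.

```python
import numpy as np

def mean_intercept_length(boundary_mask, n_lines=10, pixel_size_um=1.0):
    """Estimate mean grain size with the lineal-intercept method.

    boundary_mask: 2D boolean array, True on grain-boundary pixels.
    Horizontal test lines are laid across the image, boundary crossings
    are counted, and mean intercept length = total line length / crossings.
    """
    h, w = boundary_mask.shape
    rows = np.linspace(0, h - 1, n_lines).astype(int)
    crossings = 0
    for r in rows:
        line = boundary_mask[r]
        # Count entries into boundary segments (including the first pixel).
        crossings += int(line[0]) + np.count_nonzero(line[1:] & ~line[:-1])
    total_length = n_lines * w * pixel_size_um
    return total_length / max(crossings, 1)

# Synthetic check: vertical boundaries every 50 px give 50 um intercepts.
boundaries = np.zeros((200, 200), dtype=bool)
boundaries[:, ::50] = True
mean_size = mean_intercept_length(boundaries, n_lines=5)
```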
Other challenges include low contrast in some areas and very challenging local morphological changes; on top of that, whatever the brain is picking up on to differentiate these grains, we somehow have to teach the computer to do.
How does MIPAR deep learning compare to previous state-of-the-art software?
In work with previous clients, we have used a five-by-five grid with no downsampling and 500 epochs; it took about 30 minutes to train. The best we could do with MIPAR before deep learning involved a combination of edge-finding filters, artifact rejection, and trying to absorb some of the twins into their parent grains if they were very elongated.
But we were not able to come up with anything better than that, where the software could find all the boundaries fairly well; we want to find only the grain boundaries, not all of the various twin boundaries. Quite honestly, I never thought I would see an automated solution to a problem like this.
MIPAR does a remarkable job detecting the grain boundaries while ignoring the twins, and it blew us away when we first saw it. Grains on the edge of the image contact the border and are therefore not used in measurement; that is the classification of edge versus full grains. You can see the mean diameter pop up as it is computed as part of the recipe.
Automated deep learning detection of grains in twinned brass.
There can be a need for some minor corrections, and it is possible to make those edits if the recipe is set up for those kinds of changes. To fill in a boundary, for example, users can make the edits quickly and MIPAR will clean them up for you. Users can simply draw a line and it will fill that in and clean up the edges.
If there was a false boundary, users can take the erase tool, cross it out, and it will disappear. If users find themselves doing that frequently, they can take those corrected results and update the model with them, and retrain with those corrections so that the algorithm learns from those corrections.
So that covers generating some measurements and processing a single image. What about batching? And how does this recipe perform not only on the other training images but also on one that wasn't part of the training set?
When batch processing is needed, we have found that MIPAR easily applies a recipe built from the training images to images that were not part of the training process. When it comes to generating data, there have been times when users thought they had to stitch fields together just to collect enough area for their statistics or to adhere to a standard.
This often means they end up with an unwieldy stitched image, and sometimes the stitching is not perfect at the boundaries, when all they really needed was a series of fields, as long as the data from those fields could be collected into one distribution. In MIPAR, the measure features tool can be used to choose the diameter measurement from the size panel, and by hitting view measurements, the software will measure only the complete grains.
It will measure each field, group all grains into one table, and then users can plot a grain size distribution across all fields in a single plot with a single mean and the other statistics. Users can then generate a report from this data, click print, and get the representative image, which will be stored back in the post-processor with the rest of the results.
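The pooling step is simple in principle: per-field measurements are concatenated into one distribution before any statistics are computed, so no stitched image is ever needed. A toy sketch with made-up diameters:

```python
import statistics

# Hypothetical equivalent-circle diameters (um) measured in three fields;
# only complete (non-edge) grains are included in each list.
field_diameters = {
    "field_1": [12.4, 15.1, 9.8, 13.0],
    "field_2": [11.2, 14.7, 10.5],
    "field_3": [13.9, 12.1, 16.3, 10.9, 12.8],
}

# Pool every grain into one distribution across all fields.
all_diameters = [d for diameters in field_diameters.values() for d in diameters]
mean_diameter = statistics.mean(all_diameters)
print(f"{len(all_diameters)} grains, mean diameter {mean_diameter:.2f} um")
```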
It is important to remember that the engine is only going to learn what it is provided with, so we recommend that users show it extremes if they have very different microstructural cases, different lighting conditions, or different prep conditions. It’s important to make sure that it is aware of the potential variation it can see.
What is one of the most complex types of measurement that the MIPAR software is now able to recognize more easily than previous state-of-the-art software?
Additive manufacturing applications, such as recognizing and measuring melt pools, embody many of the same challenges that I just described: complex, often subtly defined features that software can have difficulty recognizing. On top of that comes a measurement challenge: estimating the degree of overlap between neighboring pools.
For example, we have done this in the past without any prior annotations, meaning that melt pool sizing was simply done by drawing a couple of line chords through the major and minor axes to get an estimate of dimensions, and overlap was not plausible since the particular client was looking for an areal overlap. Guessing where the pool continued was impractical for many reasons.
To approach this annotation, we used a semi-automated approach. We could get very close to an accurate selection with a traditional recipe that did not use deep learning: we can often find 80 to 90 percent of the features fully automatically and then facilitate cleaning up the mistakes to produce what we will use for training.
That is what we did in this case. We dragged in the semi-automated recipe that had been set up, the software did the best it could to find the boundaries with the pre-deep-learning state of the art, and it then provided us with a manual edit intervention step.
With some additional cleaning, you are left with something to use for training far quicker than it would have taken to trace all of these features purely by hand. After this, it’s possible to train just as before to produce a model.
For each class that was trained, users will get a probability map, which shows which pixels are most likely to belong to a selected class; you can think of it as the computer's prediction of what the segmentation should be. But what we tend to do in MIPAR is still allow you the flexibility to turn this into the final, most accurate result, adding further steps to accommodate your particular problem. It could be that users want to set a certain minimum size for the melt pools or ignore ones that touch the edge. The typical approach is to accept that probability map and then add those few additional steps to fine-tune it into the ultimate segmentation.
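As an illustration of that post-processing, the sketch below thresholds a probability map, drops components under a minimum size, and discards features touching the image edge. It uses generic NumPy/SciPy connected-component labeling; the function name and parameter values are assumptions for illustration, not MIPAR's recipe steps.

```python
import numpy as np
from scipy import ndimage

def refine_probability_map(prob_map, threshold=0.5, min_size=20, drop_edges=True):
    """Turn a per-pixel class-probability map into a cleaned segmentation.

    Threshold the map, remove components below a minimum size, and
    optionally discard features touching the image edge.
    """
    mask = prob_map >= threshold
    labels, n = ndimage.label(mask)
    keep = np.zeros(n + 1, dtype=bool)
    for i in range(1, n + 1):
        component = labels == i
        if component.sum() < min_size:
            continue  # too small: likely noise
        touches_edge = (component[0].any() or component[-1].any()
                        or component[:, 0].any() or component[:, -1].any())
        if drop_edges and touches_edge:
            continue  # incomplete feature at the image border
        keep[i] = True
    return keep[labels]

# Toy probability map: one large interior blob, one tiny speck, one edge blob.
prob = np.zeros((20, 20))
prob[5:12, 5:12] = 0.9    # large interior feature: kept
prob[15, 15] = 0.9        # single-pixel speck: removed (below min_size)
prob[0:4, 0:4] = 0.9      # touches the edge: removed
cleaned = refine_probability_map(prob, threshold=0.5, min_size=20)
```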
For an advanced measurement, users can load a second recipe that has been set up not only with the melt pool detection but also with a workflow that lets the user choose a melt pool and have it mirrored over its estimated central axis, so that its overlap with the neighboring melt pool can be estimated. For instance, one particular user wanted an interactive workflow that allowed users to spot-check particular melt pools and have their overlap estimated.
After the melt pool detection is done, the user is presented with an interactive window where they can simply click the pool they want to measure. This will estimate where its axis of symmetry is, mirror the melt pool to the right of that symmetry over that axis, and then compute its intersection area with neighboring melt pools to its left, and then measure that area of overlap as a percentage of the entire estimated melt pool shape. This is a specific measurement setup example, but MIPAR can be configured for that need without any coding, which is what it excels at.
Automated deep learning detection of melt pool boundaries in additively manufactured parts.
The results show a reconstructed pool, estimated out to where it might continue, with a boundary separating the visible from the hidden portion, along with a readout of the overlap percentage and the estimated dimensions of the entire pool feature.
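The mirror-and-intersect idea can be illustrated with binary masks. In this sketch the symmetry axis is supplied as a known column index (how the axis is estimated is left out), and the geometry is a toy example, not an actual melt pool or MIPAR's implementation.

```python
import numpy as np

def mirrored_overlap_percent(visible_mask, neighbor_mask, axis_col):
    """Reconstruct a partially hidden melt pool and estimate its overlap.

    The visible portion of the pool to the right of its estimated vertical
    symmetry axis is mirrored across that axis to approximate the hidden
    left portion; overlap with the neighboring pool is then reported as a
    percentage of the reconstructed full-pool area.
    """
    h, w = visible_mask.shape
    reconstructed = visible_mask.copy()
    for c in range(axis_col, w):
        mc = 2 * axis_col - c  # mirror column about the axis
        if 0 <= mc < w:
            reconstructed[:, mc] |= visible_mask[:, c]
    overlap = reconstructed & neighbor_mask
    return 100.0 * overlap.sum() / reconstructed.sum()

# Toy geometry: the visible pool occupies columns 10-19; its left side is
# assumed hidden beneath a neighbor occupying columns 0-8.
visible = np.zeros((20, 30), dtype=bool)
visible[5:15, 10:20] = True
neighbor = np.zeros((20, 30), dtype=bool)
neighbor[5:15, 0:9] = True
overlap_pct = mirrored_overlap_percent(visible, neighbor, axis_col=12)
```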
We've been in discussions with some folks who have rightly pointed out that mirror symmetry may not be physically ideal, because there could be a liquid-powder interaction, or a more physics-based way to estimate the hidden portion of the pool. We're looking at how we can incorporate that.
Could you give an example of how MIPAR can significantly reduce the time it takes for users to process samples and generate measurements?
One example saw our team configure a very customized measurement workflow for images with strongly overlapping features. Powered by deep learning, the features were accurately detected, and it took our team, working with this client, less than an hour to configure this solution and deliver it for their use.
This was a colony-versus-basketweave case, which embodied many of the same challenges, such as needing to find complex real-world features accurately for automation to be viable. In this case, the existing approach was point-counting; it was simply too overwhelming to try to manually outline everything. Point-counting was going to take about 10 minutes per image, and there were about 50 fields of view per sample.
This particular user was looking at about eight hours per day, every day, to process their samples. And with point-counting, if you sit three other people down and have them count the same grid of points, they will probably classify some of them a little differently, because there is a continuum in morphology between organized colony and chaotic basketweave. This means that two or three people can come up with different designations for the same point.
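For reference, the point-counting procedure itself is straightforward to express: overlay a regular grid of test points and report the fraction landing on a given morphology. The sketch below classifies points by lookup in a labeled mask; in the manual workflow, that classification is exactly the subjective step done by eye.

```python
import numpy as np

def point_count_fraction(class_mask, grid=(10, 10)):
    """Estimate the area fraction of a phase by systematic point counting.

    A regular grid of test points is overlaid on the image; the fraction
    of points landing on the phase estimates its area fraction.
    """
    h, w = class_mask.shape
    rows = np.linspace(0, h - 1, grid[0]).astype(int)
    cols = np.linspace(0, w - 1, grid[1]).astype(int)
    hits = class_mask[np.ix_(rows, cols)]  # look up each grid point
    return hits.sum() / hits.size

# Synthetic field: left half "colony", right half "basketweave".
mask = np.zeros((100, 100), dtype=bool)
mask[:, :50] = True
fraction = point_count_fraction(mask, grid=(10, 10))
```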
The approach to tracing in this particular case was quick and simple: the user just highlighted some patches of the different morphologies, and the software learned from that and did the rest of the image.
In this case, rather than fully outlining all features of interest, and rather than using a semi-automated approach, the training was set up by simply assigning some classes. The image was broken up into a three-by-three tiling, with a size factor that downsampled to around a 3k-by-2k image; downsampling sped up performance while still leaving plenty of pixels to work with.
It’s important to mention that users can stop training at any time and the software will still preserve where it was in the training process. This can be a way to test the saturation point with those epochs, and you can sometimes get a well-trained result after just 100 epochs and not have to sit and wait for a full 500, as the process can be halted and restarted before it reaches 100 percent. That cut-off point does vary with the number of images that are used and the type of microstructure that is being trained.
In this particular case, what was produced after training was the full segmentation of colony and basketweave. Run as part of a batch process, it took about two seconds per image, reducing this user's processing time from about eight hours per sample to about two and a half minutes per sample.
How would you summarize the main benefits of MIPAR for users looking to move from manual processing to automation through deep learning?
Each of the cases I have described represented real challenges that communities had struggled for years to automate. They were all cases where researchers could take an approach not that different from what they had been doing anyway, manually teaching the software how to interpret the microstructure, but now have that manual annotation carried forward to develop an automated solution that can take over from there.
In the case of the twinned grains, the user was able to avoid EBSD for fully automated distribution measurements, which saves between 500,000 and 600,000 per year in microscope costs. In addition to orders-of-magnitude time savings in the melt pool analysis, the automation opened up a whole new opportunity for overlap estimation that was simply not tenable with pure hand analysis. The basketweave case that demanded eight hours a day from the user was reduced to two and a half minutes, with the bonus of less user subjectivity.
Automated deep learning detection of colony and basketweave morphologies in titanium.
Those examples were just three of the hundreds of applications that we, and our users, have handled with MIPAR. But the challenges are fairly consistent across applications and embody a lot of what we see in real-world microstructure analysis.
MIPAR is a very powerful technology that makes expert input highly accessible: metallurgists and microscopists continue doing what they have done for so long, manually marking up their images, and the deep learning training engine then takes over and automates what used to be impossible problems.
How does the MIPAR team support its users throughout their projects?
Our team works very hard to transition and carry through expert support from the early evaluation stages to the long-term client relationship stages. It often begins with an initial set of solutions aimed at addressing the user’s immediate needs, but then commonly we continue to be available into the future as new projects emerge and they may need additional solutions.
The software can be configured as much by the users as by us. It depends on your in-house availability and your ideal workflow. Whether you'd like to take our self-learning courses to become more comfortable with the product or gain expert training from our team, those are certainly available for the users that want to be more self-sufficient with the platform. But, if they are the type of user that wants to have our experts on demand into the future to develop more solutions as needed, that is also a perfectly acceptable model.
Where can our readers go to find out more?
If you'd like to explore whether these solutions and this technology are a good fit for your applications, have a look at our applications and deep learning pages on our website. We also have a brief 90-second tour video that is worth a look. Readers may also submit an image through our website; it will go directly to our application specialist team, and someone will be in contact to discuss in more detail whether MIPAR might be a good fit. Your readers can also always email us directly at [email protected], or give us a call.
About Dr. John Sosa
Dr. John Sosa is the CEO and co-founder of MIPAR Software. He received his Ph.D. in materials science and engineering from The Ohio State University, focusing on 2D and 3D microstructural characterization of titanium alloys.
Disclaimer: The views expressed here are those of the interviewee and do not necessarily represent the views of AZoM.com Limited (T/A) AZoNetwork, the owner and operator of this website. This disclaimer forms part of the Terms and Conditions of use of this website.