How to Prepare Data for Advanced Mining Technologies
Alyson Cartwright and Rudy Moctezuma explain how mining companies can ensure their data is ready, not just for today’s digital technologies, but for whatever the future holds.
It’s a widely accepted fact that for mining companies to remain profitable in the face of the significant socio-economic, operating and environmental challenges that will characterize the coming decades, they will need to invest heavily in advanced, digital mining technologies.
Technologies like automation and predictive analytics not only create safer, more efficient operations but also open up new opportunities in mine optimization.
For years, miners have been working to optimize the performance of their assets retrospectively: processes are run, data is gathered, and then it is analyzed days, weeks or even a month later to understand what happened, and why, across different operational functions.
While this has enabled improvements, it’s clear that if miners wish to increase their operational control and prepare for whatever new technologies and challenges the future might bring, then extra attention needs to be paid to data preparation and management today.
“Mining operations tend to work in silos,” Rudy Moctezuma, Chief Business Development Officer at Eclipse Mining Technologies, explained. “With new technologies and equipment generating so much data, it can take a while to collect, integrate, prepare and analyze it, which is why information is normally received after the event.
“However, if we want to take our performance to the next level using advanced technologies like automation, artificial intelligence or machine learning, and break down some of those silos, then we need to streamline the data management process and make it quicker and easier to understand what the data is telling us.”
Alyson Cartwright, Eclipse’s Chief Innovation and Services Officer, agreed: “Without good data, it’s impossible to understand what’s happening in mines, let alone make decisions that might improve the operation.
“For autonomous mining, data needs to be very reliable. Also, any technologies that use machine learning or AI… if the data is poor quality or if it’s not in an open format, then there’s no way these technologies can be applied. The more data you can feed to these kinds of systems, the better the results.”
Preparing data for integration
Fully enabling these technologies requires data to be prepared and integrated into a single, centralized repository or platform, the benefits of which center on cross-domain correlation.
Rather than mines trying to learn from events after they have already happened, processing and analyzing data in a centralized system allows correlations to be identified between datasets, and actions to be taken in real time or in anticipation of certain conditions arising.
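To picture what that cross-domain correlation can look like in practice, consider a rough sketch in Python. The feeds, column names and values below are invented for illustration and are not tied to any particular platform; the point is simply that once two datasets from different silos sit in one place on a shared key, relating them becomes trivial.

```python
# A rough, invented illustration of cross-domain correlation: joining haulage
# telemetry with plant throughput in one place and checking how they move together.
import pandas as pd

# Hypothetical hourly feeds that would normally sit in separate silos.
haulage = pd.DataFrame({
    "hour": [0, 1, 2, 3, 4, 5],
    "avg_payload_t": [215, 220, 198, 230, 225, 210],
})
plant = pd.DataFrame({
    "hour": [0, 1, 2, 3, 4, 5],
    "mill_throughput_tph": [1450, 1480, 1390, 1510, 1495, 1440],
})

# Once both live in the same repository, correlating them is a one-liner.
combined = haulage.merge(plant, on="hour")
corr = combined["avg_payload_t"].corr(combined["mill_throughput_tph"])
print(f"Payload vs. throughput correlation: {corr:.2f}")
```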
For advanced mining technologies to perform optimally and cross-functionally, they also require data to be in an ‘open format’.
“That’s one of the bigger technical issues that’s sometimes glossed over in innovation projects,” said Cartwright. “Advanced technologies and data analytics techniques are no use if the data you have is not open enough to be accessed by those systems.”
Is your data good enough?
Let’s go back to basics then. What makes for good quality, truly open data?
“The term ‘open’ means different things to different people and vendors,” Cartwright explained. “There are lots of things to consider. For instance: how the data was imported into the system. Can it be read? Does it have context? Is it restricted by licensing, or by the version of the application the user has? Who owns the data?
“’Open’ might mean something very different in the future to what it does today. Our philosophy at Eclipse Mining is that, to call data truly open, it has to be open in a generic way and it has to be vendor neutral so that it can be centralized and accessible regardless of what technologies that mine might be using.”
“Exactly,” said Moctezuma. “And all the data must be open, not just selective data as determined by system vendors.”
Good quality, open data has a standardized, flexible format that can be easily exchanged between systems. It also needs context (associated data and files that tell users how and where the original data was created), it needs history, and it must be cleaned and prepared properly before being imported into the platform.
“Good quality data has lots of context,” Cartwright explained. “The context will be different depending upon the type of data and what it relates to, but all it means is that, when random data comes into the system, the context tells you whether that data represents a schedule, reserves, plant information or any other type of mining data. The more context the data has, the more useful it is.
“Context also allows us to validate the provenance of data. When validating data, it should be immediately obvious if there are any outliers or if something doesn’t make sense.”
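A minimal sketch, with invented field names and thresholds, of what context-carrying data and a simple validation pass might look like: each record carries metadata about where it came from, and a basic range check flags values that don’t make sense. This is an illustration of the idea only, not any vendor’s implementation.

```python
# Invented example: data that carries its own context, plus a simple outlier check.
from dataclasses import dataclass, field

@dataclass
class AssayRecord:
    hole_id: str
    depth_m: float
    grade_gpt: float                               # grade in grams per tonne
    context: dict = field(default_factory=dict)    # where and how the value was produced

records = [
    AssayRecord("DH-001", 12.0, 1.8, {"lab": "Lab A", "method": "fire assay"}),
    AssayRecord("DH-001", 14.0, 2.1, {"lab": "Lab A", "method": "fire assay"}),
    AssayRecord("DH-002", 10.0, 45.0, {"lab": "Lab B", "method": "fire assay"}),  # suspicious value
]

# Flag anything outside a plausible range for this (imaginary) deposit.
PLAUSIBLE_GRADE_GPT = (0.0, 30.0)
low, high = PLAUSIBLE_GRADE_GPT
for r in records:
    if not low <= r.grade_gpt <= high:
        print(f"Check {r.hole_id} at {r.depth_m} m: {r.grade_gpt} g/t "
              f"(source: {r.context.get('lab', 'unknown')})")
```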
Moctezuma added: “In most cases, with projects related to advanced technologies, mining companies will hire a specialist firm to clean and integrate their data. That’s the hard part and the expensive part.”
Understanding data types and characteristics
One aspect that has limited the success of off-the-shelf data platforms in mining thus far is the variety of data types and characteristics involved, not to mention the sheer amount of data.
Some types, for instance geospatial data used in process or plant modeling, are common to other industries like construction and civil engineering. However, other data, such as that relating to the orebody or to blast intensity and propagation, is more specialized.
“Some data is very unique to mining, for instance, drillhole data or block model data,” said Cartwright. “And some have diverse characteristics, like the size of the datasets or the frequency of change. There’s also a lot of extrapolation; not all mining data is measured.
“We are only able to sample about a trillionth of the orebody, so we have to make extrapolations based on that data and plan designs and schedules around that.”
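To illustrate the point about extrapolation, here is a toy example in Python: a grade estimate for a single block-model cell derived from a handful of nearby drillhole samples using inverse-distance weighting. The coordinates and grades are invented, and real estimation relies on far more rigorous geostatistics (kriging, for example); the sketch simply shows that most block values are interpolated rather than measured.

```python
# Toy inverse-distance-weighted grade estimate for one block-model cell,
# based on a few invented drillhole samples near the block centre.
import math

samples = [  # (x, y, z, grade in g/t)
    (100.0, 200.0, -50.0, 1.9),
    (140.0, 210.0, -55.0, 2.4),
    ( 90.0, 260.0, -48.0, 1.2),
]
block_center = (120.0, 230.0, -52.0)

weights, weighted_grades = [], []
for x, y, z, grade in samples:
    dist = math.dist(block_center, (x, y, z))
    w = 1.0 / dist ** 2          # inverse-distance-squared weighting
    weights.append(w)
    weighted_grades.append(w * grade)

estimated_grade = sum(weighted_grades) / sum(weights)
print(f"Estimated block grade: {estimated_grade:.2f} g/t (interpolated, not measured)")
```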
Moctezuma agreed: “That’s one of the difficulties that technology vendors from outside the industry struggle with.
“Often, they can import the results from the data, say the mine plan report, but they cannot import the data the plan is based upon, because it is not easy to understand. That might solve an immediate problem, maybe to share the report of a plan with a colleague, but the data itself will not be available to be used by other mining technologies.”
Of course, this isn’t just a problem in data management. The complexity of the mining business and its various facets is a key reason why the industry has stood alone for so long.
Data transfer and APIs
Adding to operational complexity is the number of different pieces of equipment and systems, from multiple vendors, that mines generally run. Mines are long-life operations and, as such, it’s common for them to run older (even second-hand) equipment and to upgrade legacy systems over time.
To facilitate data exchange from one system to the next, technology providers often use application programming interfaces (APIs), essentially intermediary software, supplied as part of a software development kit (SDK).
However, the problem with APIs is that they often transfer data in only one direction (data cannot be written back into the source system) and, sometimes, strip out the context that makes the data so valuable.
Until relatively recently, this was ‘good enough’ for most mines’ requirements. However, using multiple APIs to transfer data through an already labyrinthine management system is not only inefficient; it also means that, depending on how the APIs were created and what they are capable of, mines could be losing valuable information they might need years down the line.
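The contrast can be sketched in a few lines of Python. Nothing here corresponds to a real vendor SDK; the function names, fields and values are hypothetical. The first function mimics a one-way, context-free export, while the second carries the inputs, assumptions and history along with the results, so downstream systems could actually reuse them.

```python
# Hypothetical sketch: a one-way, context-free export versus a
# context-preserving exchange. No real vendor API is represented here.

def export_plan_report(plan: dict) -> dict:
    """One-way export: a flattened report; context and history are dropped,
    and nothing can be written back to the source system."""
    return {"total_tonnes": plan["total_tonnes"],
            "mine_life_years": plan["mine_life_years"]}

def exchange_plan(plan: dict) -> dict:
    """Context-preserving exchange: results travel with the inputs,
    assumptions and history that produced them."""
    return {
        "results": {"total_tonnes": plan["total_tonnes"],
                    "mine_life_years": plan["mine_life_years"]},
        "context": plan["context"],    # e.g. block model version, cut-off grade
        "history": plan["history"],    # who produced it, when, and from what
    }

plan = {
    "total_tonnes": 48_000_000,
    "mine_life_years": 12,
    "context": {"block_model": "BM_2024_Q1", "cutoff_gpt": 0.4},
    "history": [{"author": "planning team", "date": "2024-03-01"}],
}
print(export_plan_report(plan))   # report only; the underlying data is unavailable downstream
print(exchange_plan(plan))        # results plus the context that makes them reusable
```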
Preserving data quality
Most off-the-shelf data platforms use an initial extract, transform and load (ETL) process to normalize and import different types of data into the system. That can also generate data quality issues, because data is fragile: it’s easy to lose context, or the data itself, in the transfer process.
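To make that fragility concrete, here is a deliberately naive sketch of a transform step. The field names are invented; the point is that a normalization step which only keeps the columns it recognizes silently throws away exactly the context described above.

```python
# Illustrative only: a naive ETL "transform" step that normalizes records to a
# fixed schema and silently drops anything it doesn't recognize - which is how
# context gets lost. Field names are invented.

TARGET_SCHEMA = {"sample_id", "grade_gpt", "depth_m"}

def naive_transform(record: dict) -> dict:
    # Keep only the columns the target schema knows about.
    return {k: v for k, v in record.items() if k in TARGET_SCHEMA}

source_record = {
    "sample_id": "DH-001-014",
    "grade_gpt": 2.1,
    "depth_m": 14.0,
    "assay_method": "fire assay",   # context...
    "lab": "Lab A",                 # ...that the naive transform
    "qc_batch": "QC-2024-117",      # ...throws away
}

loaded = naive_transform(source_record)
dropped = set(source_record) - set(loaded)
print(f"Loaded: {loaded}")
print(f"Context lost in transform: {dropped}")
```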
“With some systems, mines may have to make a choice between performance and openness,” said Cartwright. “For instance, they may choose to load the data in a way that enables good system performance, but then find it’s not open enough to use for analytics.
“Or maybe they decide to load the data in a more accessible format, but then performance suffers because many off-the-shelf solutions are not designed to handle mining-specific data.
“That’s not the case with SourceOne®.”
A new paradigm
For the past three years, the team at Eclipse has been working on SourceOne, a centralized data platform created from scratch specifically to address the needs of mining operations both today and in the future.
The platform can quickly import and centralize access to any kind of data (there are no limitations), with no need for separate cleaning or data processing.
“At a very simple level, SourceOne uses a distributed, ACID-compliant database structure,” said Cartwright, “which means that you never lose data and the system is incredibly fault-tolerant.
“There are lots of different components. One is the very rich attribution (the data brought in includes lots of context), and the platform can also take in data that it doesn’t ‘understand’. That data participates in an ecosystem, which means it includes history and auditing: you can tell who has handled it and what past versions look like.”
This also helps to remove uncertainty related to multiple people handling the data and variations in the quality of their work.
“If a random piece of data is uploaded, that’s fine,” said Cartwright. “You can open it using whatever your Windows default setting is. But if it’s mining-specific data, then the platform will know that, and it will automatically offer appropriate visualization, reporting and tracking options based on that.”
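As a generic illustration of what history and auditing on a piece of data can mean, the short sketch below keeps every version of a record along with who changed it and when. This is emphatically not SourceOne’s implementation, simply a way to picture data that carries its own audit trail; the identifiers and values are invented.

```python
# Generic illustration (not SourceOne): a record whose every change is appended,
# never overwritten, so past versions and authorship remain visible.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Version:
    author: str
    timestamp: str
    payload: dict

@dataclass
class AuditedRecord:
    record_id: str
    versions: list = field(default_factory=list)

    def update(self, author: str, payload: dict) -> None:
        # Append a new version rather than overwriting the old one.
        self.versions.append(
            Version(author, datetime.now(timezone.utc).isoformat(), payload)
        )

    def current(self) -> dict:
        return self.versions[-1].payload if self.versions else {}

record = AuditedRecord("block_model/BM_2024_Q1")
record.update("geologist_a", {"blocks": 1_200_000, "cutoff_gpt": 0.4})
record.update("planner_b", {"blocks": 1_200_000, "cutoff_gpt": 0.5})

print(record.current())
for v in record.versions:
    print(f"{v.timestamp}: edited by {v.author}")
```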
Moctezuma concluded the conversation neatly.
“Our focus with SourceOne is not only providing a platform that can handle any type of data, but also one that can do the data preparation and validation for mining companies.
“In doing so, we’re futureproofing both them and their data.”