
Consistency and Connectedness of Source Data Quality

The consistency and connectedness of source data are also important dimensions when evaluating source data. Consistency refers to the frequency of updates or new values in a time series data stream, while connectedness indicates the ability to trace a thread of connections for a well across all of the source data. Tracing that thread may appear simple, but it’s surprising how many different naming schemes can exist across teams and systems.
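One way to put a number on consistency, as defined above, is to measure the time between consecutive readings and flag unusually long gaps. The following is a minimal sketch, not a description of any particular system: the function names, the `tolerance` threshold, and the sample timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from statistics import median

def update_intervals(timestamps):
    """Return the gaps between consecutive readings, in seconds."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]

def find_gaps(timestamps, tolerance=1.5):
    """Flag gaps longer than `tolerance` times the median update interval.

    The 1.5 default is an arbitrary illustrative threshold.
    """
    ordered = sorted(timestamps)
    intervals = update_intervals(ordered)
    typical = median(intervals)
    return [
        (ordered[i], ordered[i + 1])
        for i, gap in enumerate(intervals)
        if gap > tolerance * typical
    ]

# Hypothetical readings: one every 5 minutes, with six readings missing.
start = datetime(2024, 1, 1)
readings = [start + timedelta(minutes=5 * i) for i in range(12)]
readings = readings[:4] + readings[10:]

typical_seconds = median(update_intervals(readings))
gaps = find_gaps(readings)

print(f"Typical update interval: {typical_seconds:.0f} s")
for gap_start, gap_end in gaps:
    print(f"Gap from {gap_start:%H:%M} to {gap_end:%H:%M}")
```

Using the median rather than the mean keeps a few long dropouts from distorting the estimate of the typical update interval.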

When considering consistency across source data, the figure above shows a signal that has great continuity, and the data look very reasonable over a five-day period. There is some variability in the signals, mostly in the tubing pressure.

However, if the signal were only being collected on an hourly basis, changes and issues would be masked by the collection frequency. For this well, that is exactly what is occurring. If we look at this same time period with the signal at a 5-minute interval, as in the figure below, the issues become apparent.

There is a tremendous amount of variability in this signal. In this case, the issue was a problem with the back-pressure regulation that resulted in foaming in the tube. The higher collection frequency completely changes our picture of the behavior that the well is experiencing.
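The masking effect can be reproduced with a synthetic signal. The sketch below is purely illustrative and does not use the well's data: `synthetic_pressure`, its 500 psi baseline, and its 20-minute oscillation are assumptions, with the oscillation period deliberately chosen so that hourly samples always land on the same point of the cycle (a worst case for masking).

```python
import math

MINUTES_IN_FIVE_DAYS = 5 * 24 * 60

def synthetic_pressure(minute):
    """Hypothetical tubing pressure in psi: slow drift plus a fast oscillation."""
    drift = 0.01 * (minute / 60)                          # 0.01 psi per hour
    oscillation = 40 * math.sin(2 * math.pi * minute / 20)  # 20-minute cycle
    return 500 + drift + oscillation

def sample(interval_minutes):
    """Collect the signal at a fixed interval over the five-day window."""
    return [
        synthetic_pressure(m)
        for m in range(0, MINUTES_IN_FIVE_DAYS, interval_minutes)
    ]

def spread(values):
    """Range of the collected values (max minus min)."""
    return max(values) - min(values)

hourly_spread = spread(sample(60))
five_minute_spread = spread(sample(5))

print(f"Hourly collection:   spread of {hourly_spread:.1f} psi")
print(f"5-minute collection: spread of {five_minute_spread:.1f} psi")
```

In this contrived case the hourly series shows only the slow drift, while the 5-minute series exposes swings of roughly 80 psi. Real signals will rarely align this neatly, but the same principle applies: variation faster than the collection interval cannot be seen reliably in the collected data.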

When it comes to the connectedness of source data, the question is how the well design ties to the signal stream and to the asset management systems. Those ties are what provide us with connectedness. In the example below, sampled from actual well names, nearly every source of data for a well uses a slightly modified version of the name. We might be able to quickly guess that some of these are the same well.

During the early stages of this project, the various names caused confusion as teams looked in their respective systems for the “other” names of the wells.
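A simple first step toward reconnecting names like these is to normalize them before comparing. The sketch below assumes that the variations are limited to letter case, spacing, and punctuation. The well names and system labels are made up for illustration and are not the names from this project.

```python
import re
from collections import defaultdict

def normalize_well_name(name):
    """Uppercase the name and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", name.upper())

def group_by_well(records):
    """Group (system, name) pairs that share a normalized name."""
    groups = defaultdict(list)
    for system, name in records:
        groups[normalize_well_name(name)].append((system, name))
    return dict(groups)

# Hypothetical records: (source system, well name as stored there).
records = [
    ("signal_stream", "SMITH 12-4H"),
    ("well_design", "Smith #12-4H"),
    ("asset_management", "smith_12_4h"),
    ("signal_stream", "JONES 7-1"),
]

groups = group_by_well(records)
for key, members in groups.items():
    print(key, "<-", [name for _, name in members])
```

Stripping separators this aggressively can also merge wells that are genuinely different (for example, "1-24H" and "12-4H" normalize to the same key), so a scheme like this is better suited to proposing candidate matches for review than to linking records automatically. Where a stable well identifier exists across systems, joining on that identifier is the more reliable option.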

Consistency and connectedness are important dimensions to evaluate when considering the quality of source data. To consider the impacts of these source data issues, we suggest that you request our white paper, entitled “Data Quality Fuels the AI Race.” Feel free to comment or ask questions about the white paper in the comments section below. We would love to hear your thoughts and begin a conversation.