Artificial intelligence

Get started with data and analytics

13 June 2017 by Michael Link
Getting started in the world of data and analytics can seem more complicated than it is. In this blog post, I will describe how you can get your data projects started on the right track.

Start off a data-driven project – or data study – with a hypothesis and some data that you anticipate will be able to confirm or reject it. The data might be large or small, time-series, images, text, or sound.

Get the experts you need for your data study

You need to involve three competence areas to ensure that your data study produces useful results that you can apply in the processes, products, and services of your organisation:

  • Domain expertise refers to specialised knowledge within your business area. You need someone with this knowledge in order to understand how the data is captured, what the desired results of an experiment might be, and how the learnings from an experiment can be applied in practice.
  • Software engineering is an engineering discipline that is concerned with all aspects of software production. Software engineering is integral to data exploration projects, be that harvesting the data, processing the data through a pipeline, managing access to data, or creating tools for users. For this reason, you need experts in this area for all parts of the process.
  • Data science is a broad set of skills that spans analytics, statistics, machine learning, and deep learning. You need someone with a toolkit of data science skills to find and explain patterns in data, leading to new learnings.


Let your data study follow an iterative process

When you have a team with these competences in place, your data study should follow this tried and true iterative process:

1. Firm up the desired outcome(s). Know what questions you hope to answer or what results you would like to achieve. If the question is not clear at the outset, you risk spending a lot of time pursuing an uninteresting trail.

2. Collect data. The data may already exist, stored in different systems across the organisation. If it does not, you may need to generate data through surveys or by collecting new or external data sets.

3. Data cleaning and processing. The raw data is the ground truth but may contain faults, gaps, or invalid values. Some fields may be processed to convert units, synchronise timestamps, normalise ranges, or transform them to new representations.

4. Exploration and model development. Use different algorithms to understand the data and arrive at a model that will test the hypothesis. At this stage, you might need to develop new algorithms or find ways to implement them in software.

5. Frontline deployment. Once you are satisfied with the model, you should provide your frontline users with tools that will enable them to make decisions based on the learnings of the study. Such tools could range from new features in an existing system, to new control systems, documentation, and training.
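
To make step 3 more concrete, here is a minimal Python sketch of the kind of cleaning described there: dropping gaps and invalid values, converting units, synchronising timestamps to UTC, and normalising ranges. The sensor readings, field names (time, value, unit), helper functions, and the valid range are all hypothetical and chosen only for illustration. A real study would adapt each of them to its own data.

```python
from datetime import datetime, timezone


def fahrenheit_to_celsius(value_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (value_f - 32.0) * 5.0 / 9.0


def to_utc(timestamp_text):
    """Parse an ISO 8601 timestamp with a UTC offset and express it in UTC."""
    return datetime.fromisoformat(timestamp_text).astimezone(timezone.utc)


def clean_readings(raw_readings, valid_range_c=(-50.0, 150.0)):
    """Return readings in Celsius and UTC, without gaps or invalid values.

    The valid range is an assumption made for this example only.
    """
    low, high = valid_range_c
    cleaned = []
    for reading in raw_readings:
        value = reading.get("value")
        if value is None:
            continue  # gap: no measurement was recorded
        if reading.get("unit") == "F":
            value = fahrenheit_to_celsius(value)  # convert units
        if not (low <= value <= high):
            continue  # invalid value: outside the plausible range
        cleaned.append({"time": to_utc(reading["time"]), "celsius": value})
    cleaned.sort(key=lambda r: r["time"])  # order on the shared UTC clock
    return cleaned


def min_max_normalise(values):
    """Rescale values linearly to the range 0..1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical raw data from two sensors using different units and time zones.
raw = [
    {"time": "2017-06-13T12:00:00+02:00", "value": 68.0, "unit": "F"},
    {"time": "2017-06-13T09:00:00+00:00", "value": 10.0, "unit": "C"},
    {"time": "2017-06-13T09:30:00+00:00", "value": None, "unit": "C"},
    {"time": "2017-06-13T11:00:00+00:00", "value": 9999.0, "unit": "C"},
    {"time": "2017-06-13T11:30:00+00:00", "value": 30.0, "unit": "C"},
]

cleaned = clean_readings(raw)
normalised = min_max_normalise([r["celsius"] for r in cleaned])
print(normalised)
```

Note that the sketch returns a cleaned copy and leaves the raw list untouched. Since the raw data is the ground truth, you can then revisit the cleaning rules later without having to collect the data again.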


Data can lead to additional valuable outcomes

We know that an analytics project can determine whether data can be used to confirm or disprove a hypothesis. Over the years, I have learned that the same data can result in other valuable outcomes:

Continuous improvement. Steps 1–5 are repeated continuously as the results of a study are tested in practice.

New avenues. During the project, new hypotheses or ideas emerge that are later selected for further investigation.

Organisational changes. The project learnings might lead to larger organisational changes, process improvements, or other projects intended to reinforce continuous improvement.

New technology. Techniques developed for working with large volumes of data and detecting patterns contribute to increasing an organisation’s technological wealth. This competence can be applied to other opportunities.

At Kongsberg Digital, I am fortunate to lead a team of highly skilled analysts with world-class engineering and scientific backgrounds. In my experience, a digital platform in combination with a skilful software engineering team can facilitate the data exploration process in many of the areas mentioned above and will enable you to:

  • collect data from sensors and systems
  • transform and clean data
  • store data, control access, and manage a retention policy
  • use a data lake for offline data exploration
  • use advanced machine learning algorithms for analysing data

Finally, the one thing that I find most exciting: applications developed and running on a digital platform can use real-time data, machine learning, and advanced analytics to continuously empower end-users and take autonomous actions.

Recommended reading:

CRISP-DM Cross Industry Standard Process for Data Mining

Data Science Process by Joe Blitzstein, Harvard professor, from the CS109 Data Science course at Harvard

Microsoft Team Data Science Process (TDSP)

About the writer
Michael Link
Michael Link has been Vice President of Advanced Analytics and Machine Learning at Kongsberg Digital since May 2017. He holds a master’s degree in engineering of information systems from the University of Surrey and has dedicated his career to technology, science, and engineering, previously at employers such as Telenor, Schibsted, and Opera Software. Michael is in charge of Kongsberg Digital’s efforts within advanced analytics, machine learning, and autonomous systems.