Meet Jono Yeo
This is Jono Yeo from ACE Finance.
In this module, you will learn the skills and knowledge that enable Jono to perform his Data Analyst role by presenting big data insights. This involves knowing how to:
- present insights relating to transactional and non-transactional big data
- select a format appropriate to the audience
- consider presentation requirements for different job roles in a broad range of industries.
We will be following Jono as he performs these tasks.
Jono’s role
Jono reports to the Chief Data Officer (CDO) as a Data Analyst at ACE Finance.
As described in this video, presenting insights into big data analysis helps an organisation make decisions and take action.
What is big data?
Big data is more than just large amounts of data. It includes data of different types from different sources and in large volumes.
The five characteristics of big data, often referred to as the 5 Vs of big data, are:
- Volume – the amount of data large enough for the task at hand
- Velocity – quick data capture and analysis for timely business decisions
- Variety – related data from different data types and sources
- Veracity – reliable and trustworthy data
- Value – data analysis that provides value
Importance of big data presentation insights
Big data is a term that describes large, hard-to-manage volumes of data – both structured and unstructured – that inundate businesses on a day-to-day basis. But it’s not just the type or amount of data that’s important, it’s what organizations do with the data that matters. Big data can be analyzed for insights that improve decisions and give confidence for making strategic business moves. (SAS Institute Inc n.d.)
Big data analysis and the presentation of insights help businesses make decisions. Important reasons include:
- Decisions can be data-driven
- Patterns and trends are more easily identified
- Personal biases and misinterpretations are avoided
- Decisions are consistent with business processes and procedures
- Data can be presented efficiently to a wide audience
- Complexity can be shown and explained in an understandable and processable format
Watch this video that shows the importance of visualisation in presenting big data insights and complete the activity.
Knowledge check
Complete the following task.
Benefits of big data presentation
Organisations that understand the importance of big data insights can take advantage of this knowledge. Insights from big data analysis provide decision pathways with measurable benefits.
Acting on the presentation insights can yield increased efficiencies and competitive advantage. Decision-making is easier and more fruitful when based on evidence.
A recommended way to realise these benefits is a method called data storytelling. Data storytelling presents data as a narrative, much like traditional storytelling, and improves audience engagement.
This video introduces the method of telling stories with data.
As you can see, data storytelling effectively provides data and presents evidence to support actions. Data storytelling promotes data-driven decision-making.
More detailed benefits of data storytelling and some examples can be found on this Microsoft Power BI page.
Knowledge check
Complete the following task.
Requirements for presenting big data
Strategies for determining presentation requirements
Developing a strategy to determine all the requirements for presenting big data insights is an important step. All the requirements must be collected and evaluated to ensure they meet the stated guidelines to achieve the desired outcomes. The requirements must meet organisational policies, procedures and protocols.
We will now explore the different strategies for determining the presentation requirements.
Consultation with stakeholders
Stakeholders are either interviewed or surveyed. The interviews can be personal (one on one) or as a focus group. Personal interviews take more time, but more dialogue will give answers to more targeted questions. Interviews are generally more appropriate with executives or key stakeholders. Focus groups can obtain requirements from company departments or functional work teams, but the requested requirements may be more generic.
Surveys can be performed online as a series of open and closed questions in the form of an automated questionnaire. An advantage of an automated questionnaire is that feedback from more staff can be processed. Specific questions can also be collated and sent via email. This method allows targeted questions to be put to targeted stakeholders, who can write their responses when it suits them.
Set SMART goals
A common approach to setting goals and requirements is to use a SMART approach. SMART is an acronym for:
- Specific: Goals and requirements need to be specific. (To answer who, what, where?)
- Measurable: The goals need to be measurable to determine if they have been achieved.
- Achievable: The requirements need to be realistically achievable.
- Relevant: The results will be relevant to your initial requirements.
- Time-bound: The goals should be written to be completed within a set time frame.
Further details are available here.
SWOT analysis
A common strategy is a formal analysis of a company’s objectives and how well they are being achieved. SWOT analysis looks at Strengths, Weaknesses, Opportunities and Threats that apply to the business’s objectives, with the opportunities helping form requirements for presenting big data insights.
Improvement analysis
Strategies to analyse an organisation’s processes and procedures and find areas for improvement can take a few forms.
A common method is MOST analysis, which reviews the mission, objectives, strategies and tactics to determine whether they align and to highlight areas for improvement.
Big data can be used to model different environments and this can be a useful method to find ways to do things better, or improve company functions. This video introduces how this can be done.
Confirm business requirements for big data presentation
With the business requirements for big data presentation established, confirmation should be obtained before proceeding. Stakeholders and executives should review the requirements and provide feedback, including confirmation.
The context for the presentation of big data insights needs to be understood as it frames how well the analysis is received.
Presentation context
When considering context, the following factors may affect the effectiveness of the presentation: (Pal 2010)
- audience size and location
- communication environment, including sound and lighting
- personal appearance and manner
- use of visual aids or props
- presentation flow and organisation
- language and word choice
The business’s consideration of other environmental factors can also influence how to frame insights. These could include external pressures from the market, economic or political influences and social or ethical considerations.
Assistance to interpret the context may be required by asking stakeholders or executives questions about sensitivities or areas requiring emphasis.
Target audience
The target audience includes key stakeholders such as executives. Other staff or groups may also view the presentation. Key stakeholders may be involved in providing initial feedback on draft presentations to ensure the presentation meets business requirements.
Relevant big data needs to be sourced to build the presentation. The big data will need to meet organisational guidelines and provide relevant insights to meet business requirements.
Identify required big data
Big data can originate from many sources. The issues for an analyst will be to identify datasets containing relevant data to meet the business requirements. An analyst will need to identify the kinds of data required to build the presentation. Big data sources are categorised based on how they are sourced.
In-house sources of big data are those generated, owned and controlled by the organisation.
Some examples of in-house data sources include:
- Transactional data and point of sale (POS) information – which includes both current and historical data that relates to the organisation’s own purchases and details of its customers’ purchasing trends.
- Customer relationship management system (CRM) – this holds details of customers, which can be further analysed to understand where customers are located geographically and their company affiliations.
- In-house data systems – Enterprise resource planning (ERP), finance systems or other organisation databases.
- In-house documents – Today, in-house documents are mostly held in digital formats (PDFs, Word documents, spreadsheets etc.), and they provide a wealth of information about business activities, policies and procedures.
- Archives – Historical data can contain a wealth of information on how the business used to operate and help further investigate patterns or to resolve issues.
- Device sensors – Any Internet of Things (IoT) device used within the organisation can generate data on network connectivity as it relates to the business. For example, a transport company can use sensor data received from its vehicles to source valuable information such as mileage, fuel consumption, travel route etc.
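The transport example above can be sketched in a few lines of Python. This is a minimal illustration only: the `SensorReading` fields, the vehicle identifier and the readings are all hypothetical, and a real fleet system would supply its own data format.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    """One hypothetical reading sent by a vehicle sensor."""
    vehicle_id: str
    odometer_km: float  # cumulative distance travelled, in kilometres
    fuel_used_l: float  # cumulative fuel used, in litres


def summarise_trip(start: SensorReading, end: SensorReading) -> dict:
    """Derive distance and fuel consumption between two readings."""
    distance_km = end.odometer_km - start.odometer_km
    fuel_l = end.fuel_used_l - start.fuel_used_l
    # Avoid dividing by zero when the vehicle has not moved.
    litres_per_100km = fuel_l * 100 / distance_km if distance_km > 0 else None
    return {
        "vehicle_id": start.vehicle_id,
        "distance_km": distance_km,
        "fuel_l": fuel_l,
        "litres_per_100km": litres_per_100km,
    }


# Made-up readings taken at the start and end of a trip.
start = SensorReading("TRUCK-01", 12000.0, 3400.0)
end = SensorReading("TRUCK-01", 12250.0, 3480.0)
trip = summarise_trip(start, end)
print(trip)
```

Derived values such as litres per 100 km are often more useful in a presentation than the raw cumulative readings.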
External sources of big data are those generated outside of the organisation.
Data obtained from external sources may include:
- public data (e.g. publicly available data for machine learning, official statistics, forecasts)
- unstructured data (e.g. social media posts)
- other types of data gathered by third-party organisations (e.g. peak body data, government data, data collected from Google)
An organisation may obtain permission to use these sources but does not own or control them.
Knowledge check
Complete the following two (2) tasks. Click the arrows to navigate between the tasks.
Exploration and discovery methodology
A strategy to identify the organisational requirements is important when consulting stakeholders. It can be useful when identifying key requirements to ask a set of questions to ensure the correct datasets have been sourced.
The analysis aims to answer questions, but with big data, this is complicated by five factors: (Scott n.d.)
- Familiarity: analysts may be stuck in their familiar data and domain boundaries.
- Data volume: dealing with huge datasets can make initial steps difficult.
- Where to look: with large, seemingly disparate datasets, searching for answers may be complicated.
- More than past performance: answers require looking for future patterns, not just past trends.
- Results-oriented: linking events to outcomes is complex.
Considering these complications, a best practice approach is to break the analysis into two parts, that is, data exploration and data discovery. Let’s look at each of these methodologies. (Scott n.d.)
Data exploration
Once the datasets have been loaded and prepared, the data can be explored to see what areas may reveal answers. Some holistic answers may be apparent, but not answers to specific questions. Exploration involves understanding the nature of the data, such as the field types and the type of data that is held. This task allows the testing of hypotheses and narrows down the field for finding answers.
Key requirements for big data exploration include: (Scott n.d.)
- Look at the entire dataset, as patterns and trends may be hidden.
- Look across, not just down, as many discovery tools allow drill down but do not explore sideways across data very well.
- Explore anything outside standard search paths and search beyond known patterns.
- Explore different paths quickly and efficiently, as it may take many searches to find what is required.
- The software platform needs to perform quickly enough to keep up with the analyst’s thinking.
- The exploration tools need statistical functionality to widen the scope of finding answers.
- The transformed, usable dataset produced by exploration needs to transition easily to the discovery step.
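As a rough illustration of exploration, the following Python sketch profiles every column of a small dataset, reporting its field type, missing values and basic statistics. The sample data, column names and the `infer_type` and `profile` helpers are assumptions made for this example. In practice a BI tool or a data analysis library would usually do this work.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a loaded and prepared dataset.
SAMPLE_CSV = """customer_id,region,amount,channel
C001,NSW,120.50,online
C002,VIC,80.00,branch
C003,NSW,,online
C004,QLD,300.25,online
"""


def infer_type(values):
    """Classify a column as 'numeric', 'text' or 'empty' from its non-empty values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "empty"
    try:
        for v in non_empty:
            float(v)
    except ValueError:
        return "text"
    return "numeric"


def profile(rows):
    """Summarise every column: type, missing count and basic statistics."""
    report = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        kind = infer_type(values)
        summary = {"type": kind, "missing": sum(v == "" for v in values)}
        if kind == "numeric":
            numbers = [float(v) for v in values if v != ""]
            summary["mean"] = statistics.mean(numbers)
            summary["min"] = min(numbers)
            summary["max"] = max(numbers)
        else:
            summary["distinct"] = len({v for v in values if v != ""})
        report[column] = summary
    return report


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = profile(rows)
for column, summary in report.items():
    print(column, summary)
```

Profiling all columns at once, rather than drilling into one, reflects the advice to look across the entire dataset before narrowing the search.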
Data discovery
The discovery process is more targeted, digging deep into the data to find answers. This step is more likely to uncover patterns and trends linking sequences of events to outcomes. Data discovery is the step that will help answer the required questions and the step before building a presentation.
Data discovery is the process of navigating or applying advanced analytics to data to detect informative patterns that could not have been discovered otherwise. (Morris 2021)
In summary, data discovery: (Morris 2021)
- is an iterative process for extracting valuable insights
- uses those insights to support better decision-making
- blends data from multiple sources
- democratises data analysis, allowing all business units to gain an advantage
- includes preliminary preparation steps that often block dirty data
- enables complex analytics through artificial intelligence and machine learning
According to Morris, data discovery involves the following five steps:
- Identifying needs involves a clear purpose and reflection on company KPIs.
- Combining data from multiple sources is imperative as no single data stream tells the full picture.
- Cleaning and preparing data reduces noise and increases efficiency.
- Analysing the data reveals trends and insights.
- Recording learnings helps decide what else could be investigated.
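The five steps above can be sketched on a toy example. This is an illustration under stated assumptions, not Morris's own method: the two data sources, the field names and the business question are hypothetical, and the helper functions `clean`, `combine` and `analyse` are invented for this sketch.

```python
from collections import defaultdict

# Hypothetical in-house sources: transactional records and CRM customer details.
transactions = [
    {"customer_id": "C001", "amount": "120.50"},
    {"customer_id": "C002", "amount": "80.00"},
    {"customer_id": "C001", "amount": "n/a"},    # dirty value
    {"customer_id": "C999", "amount": "45.00"},  # no matching CRM record
    {"customer_id": "C003", "amount": "300.00"},
]
crm = {
    "C001": {"region": "NSW"},
    "C002": {"region": "VIC"},
    "C003": {"region": "NSW"},
}


def clean(records):
    """Step 3: keep only records whose amount parses as a number."""
    cleaned = []
    for record in records:
        try:
            cleaned.append({**record, "amount": float(record["amount"])})
        except ValueError:
            continue  # drop the dirty record
    return cleaned


def combine(records, customers):
    """Step 2: attach the CRM region to each transaction, skipping unmatched ones."""
    return [
        {**record, "region": customers[record["customer_id"]]["region"]}
        for record in records
        if record["customer_id"] in customers
    ]


def analyse(records):
    """Step 4: total sales per region."""
    totals = defaultdict(float)
    for record in records:
        totals[record["region"]] += record["amount"]
    return dict(totals)


# Step 1 (the need) is assumed here: which region generates the most sales?
totals = analyse(combine(clean(transactions), crm))
top_region = max(totals, key=totals.get)

# Step 5: record what was learned and decide what to investigate next.
print(totals, "top region:", top_region)
```

Neither source answers the question on its own: the transactions hold the amounts and the CRM holds the regions, which is why combining sources is one of the steps.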
Knowledge check
Complete the following task.
Tools to discover insights
A quick search on “business intelligence tools” will yield many product offerings, each promising better decision-making possibilities. To make sense of this, we need to understand the two main functions needed to discover insights.
Business intelligence is the collection of relevant data and the conversion of that data into actionable information to support decision-making. Visualisation tools help users interpret the data in a more meaningful format. These tools can be aligned with an organisation’s key performance indicators. (Yellowfin n.d.)
Modern business intelligence (BI) tools are evolving rapidly. BI software providers continually improve their product performance, add features and improve accessibility. The more popular products, such as Microsoft’s Power BI and Salesforce’s Tableau BI, have become multifunctional, providing a suite of BI and visualisation features and tools.
Many other software providers have offerings across the spectrum of BI tools, each providing unique solutions. Consequently, there are many players in the BI space.
A summary comparing business intelligence and data visualisation can be found here.
Business intelligence tools
Business intelligence combines business analytics, data mining, data visualization, data tools and infrastructure, and best practices to help organizations make more data-driven decisions. (Tableau n.d.)
Ranking different BI products based on capability and performance is complicated by continual upgrades, new entrants in the market and shifting demands. Some research companies regularly review the market and produce results comparing available products. Gartner has created a review of Analytics and Business Intelligence Platforms. At the time of writing, this review showed Microsoft’s Power BI and Salesforce’s Tableau BI products as the largest in the market.
The following table shows some commonly used BI tools and briefly describes their function.
Product | Feature/tool | Description |
---|---|---|
Tableau | Einstein Discovery | Dashboard extension that generates interactive predictions |
Tableau | Ask Data | Answers a common language question with visualisations |
Power BI Desktop | Q&A | Answers common language questions with charts and graphs |
Power BI Desktop | Analyze | Fast and automated insights |
Power BI Desktop | Anomaly detection | Discovers abnormalities within the data to help find root causes |
Power BI Desktop | Key Influencers | A visual to understand the importance of driving factors |
Power BI Desktop | Decomposition tree | A visual that works across multiple dimensions and allows drill down for root cause analysis |
Power BI Desktop | Smart narratives | Automatically summarises visuals to provide innovative insights |
Visualisation software
BI tools and standalone visualisation software allow a user-friendly presentation of underlying trends and insight. The data can be presented in many different ways, including bar charts, histograms, meter charts, scatter plots and treemaps. The selection of visualisation type depends on what is being shown and the question to be answered.
This page from Khan Academy showcases some methods of presenting datasets to find patterns.
Most visualisations fit into six categories: (Yellowfin n.d.)
- Comparison: compares metrics over a dimension such as time
- Composition: shows the makeup of values
- Distribution: shows how values are spread over time
- KPI: shows the current status of an organisation’s metric
- Relationships: highlights the strength of the relationship between two variables
- Location: shows the physical location of data points
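One way to apply these categories is to start from the question being asked and look up chart types that suit it. The sketch below is illustrative only: the chart suggestions for each category are assumptions for this example rather than part of the Yellowfin classification, and `suggest_charts` is a hypothetical helper.

```python
# Chart suggestions per category are illustrative assumptions, not a fixed standard.
CHART_SUGGESTIONS = {
    "comparison": ["bar chart", "line chart"],
    "composition": ["stacked bar chart", "treemap"],
    "distribution": ["histogram"],
    "kpi": ["meter chart"],
    "relationships": ["scatter plot"],
    "location": ["map"],
}


def suggest_charts(category):
    """Return candidate chart types for one of the six visualisation categories."""
    key = category.strip().lower()
    if key not in CHART_SUGGESTIONS:
        raise ValueError(f"Unknown category: {category!r}")
    return CHART_SUGGESTIONS[key]


# Example: choosing a chart to show how two variables relate.
print(suggest_charts("Relationships"))
```

A lookup like this only narrows the options. The final choice still depends on the audience and the question to be answered.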
Microsoft’s Excel is a very commonly used product for basic analysis, with some powerful features. Shiny is an R package that allows a developer/programmer approach to analysis.
A selection of commonly used visualisation tools and features are shown in the following table.
Product | Feature/tool | Description |
---|---|---|
Tableau | Explain Data | Explores data to provide insights and describes relationships in data. |
Power BI Desktop | Get Data | Allows connection across different data sources |
Power BI Desktop | Power Query Editor | Tool to shape, transform and combine data |
Power BI Desktop | Quick Insights | Feature that looks for better business insights in a dashboard, report or dataset |
Excel | Excel | Popular tool for manual exploration with pivot capability |
Shiny | Shiny | Web based visualisation package based on R |
Preparing for demonstrations with Microsoft Power BI
In this module, you will be introduced to Microsoft Power BI Desktop, which is both a single tool and a technology platform with multiple capabilities. Using it will enable you to learn the hands-on skills required for this module.
Watch the following video as an introduction to Microsoft Power BI.
Power BI terminology and basic concepts
Common terms used in Power BI for data analysis and visualisation are as follows.
- Workspaces
- Dashboards
- Reports
- Datasets
- Visualisations
Watch the following video to learn more about the terminology and basic concepts in Power BI.
Refer to the following to familiarise yourself with the basic terminology and concepts for using Power BI.
- The Power BI service - basic concepts for beginners - Power BI | Microsoft Docs
- Glossary for Power BI business users - Power BI | Microsoft Docs
What is Power BI Desktop?
Refer to the following articles from Microsoft Learn to learn more about Power BI Desktop and how to get started.
- What is Power BI Desktop? - Power BI | Microsoft Learn
- Get started with Power BI Desktop - Power BI | Microsoft Learn
- You can download the freely available Microsoft Power BI Desktop from Downloads | Microsoft Power BI
Watch the following video as an introduction to Microsoft Power BI Desktop and how to get started with using this tool.
Using Power BI Desktop on a macOS computer
Note: You are required to perform a variety of tasks using Power BI Desktop in this module. If your personal computer runs the Windows operating system (e.g. Windows 10), ignore the following set-up guidelines as they do not apply to you.
However, if your personal computer runs macOS, you will not be able to install Power BI Desktop directly on your computer. As an alternative, you will be provided with additional guidelines on how to set up a virtual Windows environment on your macOS computer so that you can follow the activities in this module.
Steps:
- Download the VM - Windows 10 (.zip) file to a location on your macOS computer from this link.
- Download the Virtual Machine instructions for macOS users document and read through the instructions for setting up the virtual environment.
- Watch the following video demonstration on how to follow through the steps using the files you have downloaded.
Install Power BI Desktop
You will be using Power BI Desktop to carry out the demonstration tasks in this module. Complete the following steps to prepare your computer with the required software before proceeding any further in the module content.
Refer to the article Introduction to Power BI - Learn | Microsoft Docs
Watch the following demonstration video that will take you through the installation steps in detail.
Topic summary
Knowledge check
Complete the following four (4) tasks. Click the arrows to navigate between the tasks.