Achieving Decision Agility via Better Data Integration and Visualization – A Practitioners’ View

In this paper we report a case study showing how important it is to capture real-time data and present it effectively to the management in order to improve decision-making and achieve decision agility. We present a GIS-based platform that integrates data from various sources and presents the required data in real time to assist the decision-making process. In combination with GIS techniques, the data can be presented vividly, which enables the user to understand its contents and trends more effectively. With the real-time data, KPI reports, and analytical functionality provided by the platform, the management is able to make more reasonable business decisions to meet the challenges and demands of the market, and to execute those decisions efficiently.


Introduction
In today's data-rich business environment, corporate management teams around the world want effective access to useful and relevant information or knowledge from huge volumes of data spread across different sources to serve their analytical and decision-making purposes. The management of any company faces challenges such as improving the services and products offered to customers, increasing productivity, reducing costs, keeping pace with fierce competition, and meeting new as well as fast-changing requirements from the market. The acquired data should provide insight and assist the management in understanding customer needs, diagnosing the health of the company's operations, predicting market trends, creating key performance indicator (KPI) reports, etc. Furthermore, the real-time availability of the relevant data is key for the management to react rapidly to any crucial events and to stand out from the competition.
Nevertheless, management is at the same time overwhelmed by data because of its enormous volume and complexity. Without being processed, cleaned, and analyzed, raw data is nearly useless to a decision maker. Data integration, data warehouse, and data mining techniques are becoming more attractive these days because of their ability to overcome this dilemma.
Business intelligence (BI) is playing a greater role in the decision-making and data-analytical procedures of an enterprise. According to Watson and Wixom (2007), BI has been a top priority for many organizations to assist decision-making and to ensure that an organization runs in the most economical way. Furthermore, real-time BI for business decision agility is also emphasized. Tong et al. (2008) proposed a BI framework for the retail industry. The system collects data from different sources and organizes the collections as a data warehouse for decision-making purposes. Wu (2010) suggested employing modern computational intelligence in BI systems to make them smart enough to solve optimization problems such as forecasting, clustering, etc., though no real application examples were given. Other researchers have studied real-time BI and real-time BI frameworks and mechanisms, for instance Stonebraker et al. (2005). Grabova et al. (2010) presented a light-weight OLAP designed particularly for small and middle-sized companies, aiming to create more appropriate BI tools for these companies' data analyses and decision-making. In their book, Han and Kamber (2001) covered concepts and techniques for data mining, which is the backbone of most BI systems. Recently, Oliveira et al. (2012) gave an overview of data mining techniques and their applications in the service industry. An interesting case study was conducted to demonstrate the value of the data mining approaches. Linoff and Berry (2011) presented a more comprehensive survey of a variety of data mining techniques and their applications in the real world.
While non-spatial data attributes are still important factors in the data analysis and decision-making of a BI system, spatial attributes and their relationships with non-spatial data have attracted more attention from researchers and practitioners. Zhou et al. (2010) discussed how to utilize GIS/GPS to monitor, analyze, and optimize the business activities of a logistics company in China. Rivest et al. (2005) proposed a methodology called SOLAP (spatial on-line analytical processing) that takes advantage of the spatial components in datasets to provide better tools for users to access, visualize, and analyze data, which in turn facilitates the entire data-analytical procedure. Dzenana and Orucevic (2008) reported the integration of spatial data within a business intelligence system in order to better maintain telecommunication networks, resolve failures more effectively, and better control the telecommunication network systems. In the book edited by Miller and Han (2009), methodologies and applications were presented on how to incorporate geographic data in data warehousing, data mining, and knowledge discovery, which pointed out an interesting research domain.
The paper is organized as follows. The business background of this case study is given in Section 2. The detailed description of the platform and its architecture is presented in Section 3. Section 4 presents some applications and benefits of the platform. Finally, Section 5 concludes the paper.

Business Background
A Fortune Global 500 company in China operates a nationwide network of more than 45 auto-dealers. These auto-dealers sell various cars and provide the corresponding services (repair, maintenance, finance, etc.) to their customers. Thanks to the fast economic development in China, the company is experiencing rapid business growth while facing fierce competition and challenging requests from customers. In order to make reasonable and timely decisions on its business operations, achieve sustainable growth, and provide more satisfactory services to its customers, the management of the company often needs answers to questions such as the following (the list is not exhaustive):
- What is the market trend or pattern?
- What is the prospective market potential?
- Where are my customers, and who are they?
- Where are the hot spots (generating higher revenue and profit)?
- What is the current or daily revenue (or profit)?
- What is the revenue of each auto-dealer?
- What are my customer profiles (gender, age, income, buying behavior, preference, etc.)?

The management wants to learn the market trend and potential so that it can put the right brands and products on the market to attract more customers. Hot spots identify the areas where customers are concentrated and/or profit margins are higher. Based on hot-spot information, the management may assign additional personnel, open new dealerships, and/or promote more of the necessary services and products. By knowing where its customers are, the company can identify whether the current service or dealer network is reasonable, or whether there is a need to open a new auto-dealer or shut down a certain auto-dealer to match the circumstances. Real-time information on revenue and profit helps the management team monitor the operational health of the company and take action as needed. The customer profitability and profile information assists the management in determining the best product mix and quantity to be promoted. It also provides a solid basis for the management to design personalized or targeted services and/or promotions for different customer segments. Furthermore, accurate and timely data on each dealership's revenue and profit gives the management clearer insight for setting a more reasonable or achievable goal for an individual dealership, rather than relying merely on experience or emotion.
Nevertheless, while the datasets required to answer the above questions are crucial for the management to perform its job better, the data were not well integrated because of the previous rigid information infrastructure. Past multi-year IT projects had resulted in various application systems, heterogeneous system architectures, and databases. The datasets, particularly the real-time ones, were not easy to access, let alone readily available as an easy-to-understand dashboard or deliverable to any (desktop or mobile) device.
In order to provide the management of the company with the information needed to perform more effective analytical tasks and make more reasonable decisions, we initiated a project with the objective of building a platform similar to the "Smart Operation Center" mentioned in IBM (2011), which is able to incorporate and integrate data from different sources and heterogeneous IT platforms, and from internal as well as external data providers.
A project team consisting of the company's managers, business analysts, and software/solution vendors was formed to carry out the tasks of implementing the platform. To ensure the success of the project, the team is led directly by the CIO of the company. The project team determined the project objectives, created the project plans, acquired the business requirements, and conducted the system/database analyses and designs. The agile software development approach was adopted to offer the company a flexible and low-risk way of getting the desired system functionality. Based upon the project team's decision, the ultimate objective of this platform is to create a one-stop portal for enterprise analytics and decision-making. The platform will integrate data from a variety of sources, supply the management team with relevant information for decision-making, and provide the necessary data-analytical and decision-making tools. This is a multi-year and multi-phase project. The first phase of the project, which has been completed, mainly serves the following purposes:
- The prototype for the final system
- Proof of concept (case study)
- Study and evaluation of technologies and mechanisms to answer the questions listed above

In the following sections, we discuss the case study in detail. The system was built upon the Esri ArcGIS product suite, which provides the foundation for data display, data integration, and spatial and business analysis. ArcGIS Server hosts all the backend functions, while the ArcGIS API for Silverlight was used to develop friendly user interfaces. The company had built a financial management system that gathers, in real time, various data such as car purchasing activities, information on the customers who purchase the cars, sales prices, the types of cars sold, etc. Though it also contains some noise and certain data that is irrelevant to our business analytics, this treasure had unfortunately been sitting there for a long time without being utilized effectively.

System Framework
In order to perform the data visualization and analytical tasks more effectively, the data from different sources must be cleaned and processed to fit the schema used by the data warehouse of the system. To this end, we built a middleware called the data retrieving adaptor (not shown in Figure 1) that is able to extract and load data from different sources, both internal and external, clean the data, and convert it according to the schema required by the data warehouse. As a matter of fact, the data schema of this data warehouse is defined by its metadata, which specifies the data contents, applications, sources, etc.
The data retrieving adaptor pulls the required data from the data sources and updates the data warehouse periodically according to pre-defined and adjustable parameters (the updating interval, for instance). As a result, the data warehouse contains not only the historical data but also the most recent data. The management is able to view the fresh data and, if necessary, get reports based on the most recent datasets. In addition to pulling data from internal resources, the data retrieving adaptor utilizes the interfaces provided by some third parties, for instance financial web sites, to load data from external resources such as stock market data, currency exchange data, macroeconomic data published by government agencies, etc. The management may use these macroeconomic datasets to assist its decision-making. The data retrieving adaptor is capable of pulling data from heterogeneous data resources, removing irrelevant components, cleaning the data, and transforming the data into a form suitable for the data warehouse. Demographic data such as population and its distribution, gender, and educational background is also a crucial piece of information stored in the data warehouse. Furthermore, to meet the spatially enabled analysis requirements, some important geographic features including polygons (provinces, cities, counties, and districts), lines (streets, railways), and points (POIs, dealerships, customer locations, competitors) are incorporated into the system. For map-creation purposes, other features such as rivers, mountains, building footprints, etc. are included as well. The data warehouse provides a solid base for the data display and analytical tasks as well as for decision-making.
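The extract, clean, and transform cycle described above can be illustrated with a minimal sketch. It is not the platform's actual middleware: the `SaleRecord` schema, the raw field names (`id`, `dealer`, `date`, `brand`, `price`), and the in-memory dictionary standing in for the data warehouse are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class SaleRecord:
    """Hypothetical warehouse schema for one sale (not the paper's schema)."""
    dealer_id: str
    sale_date: date
    brand: str
    price: float


def clean_row(raw: dict) -> Optional[Tuple[int, SaleRecord]]:
    """Convert one raw source row to the warehouse schema.

    Returns None when the row is noise (missing field, malformed value,
    non-positive price), so the caller can skip it.
    """
    try:
        row_id = int(raw["id"])
        dealer = str(raw["dealer"]).strip().upper()
        brand = str(raw["brand"]).strip().title()
        price = float(str(raw["price"]).replace(",", ""))
        sale_date = date.fromisoformat(str(raw["date"]).strip())
    except (KeyError, ValueError):
        return None
    if not dealer or price <= 0:
        return None
    return row_id, SaleRecord(dealer, sale_date, brand, price)


def run_adaptor(sources: Iterable[Iterable[dict]],
                warehouse: Dict[int, SaleRecord]) -> int:
    """One refresh cycle: pull rows from every source, clean them, and
    insert or overwrite them in the warehouse keyed by row id.

    Returns the number of rows loaded in this cycle.
    """
    loaded = 0
    for source in sources:
        for raw in source:
            cleaned = clean_row(raw)
            if cleaned is None:
                continue  # irrelevant or malformed row is dropped
            row_id, record = cleaned
            warehouse[row_id] = record
            loaded += 1
    return loaded
```

In a periodic setup, `run_adaptor` would be called once per updating interval; because rows are keyed by id, repeated pulls of the same row overwrite rather than duplicate it.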
The data display or visualization is built with the help of GIS components. Spatial-temporal data has been playing an increasingly important role in data mining and decision-making. Cao and Glover (2010) described how valuable it is to apply geographic data in creating more balanced and realistic clusters to solve logistics problems. The results obtained by utilizing the underlying geographic data are superior to those based merely on Euclidean or Manhattan distances. In fact, a lot of operational and business data has spatial or location characteristics. Spatially enabling the data makes possible richer analyses of the positions, shapes, extents, orientations, and geographic distribution of phenomena. In practice, geographic factors impact our daily decisions as well. For instance, other conditions being equal, people tend to shop for groceries at the stores closer to their homes because of proximity and convenience. The demographic factors of an area (such as gender, income, and educational background) correlate strongly with the living habits and shopping behaviors of the people in that area. By blending regular business or attribute data with geographic data, an analyst or decision maker is able to identify the trends or patterns buried in the original data. For example, using GIS and business data, an analyst can find out where the customers are located and where the hot-selling areas (where people buy more cars than the average) are. With the help of proper data visualization, a user can see, for a given dealership, where 50% of its customers are located and where 85% of its customers are located. This is a very important piece of information when the management decides how to configure the existing dealership network and/or open new dealerships in order to match market needs. Certain critical information is organized as a dashboard where KPIs are displayed and updated periodically, based upon predefined parameters, for monitoring the company's operations.
The sales data is updated using the most recent available information. For each dealership, the management is able to know whether the store meets its predefined revenue goal and, if not, what the gap is. The management may then take action accordingly. By comparing against available marketing data such as nationwide auto sales data, the platform provides the management with a tool to estimate market share and penetration, which is valuable information for market exploration and development.
While the platform is still being developed iteratively, it already provides certain data-analytical functionality that helps the management of the company make reasonable decisions. The relationship between analytics and decision-making in the current system corresponds to loosely coupled analytics and decisions as defined by Davenport (2013). A wide range of information is made available, and the use of the information and analytics for possible decisions is still not mandatory.
Based upon the purchasing information stored in the data warehouse, the system generates the relationships between customers and the dealerships where they purchased cars or services. Furthermore, these relationships are displayed on the screen upon request. The system employs the underlying geographic data, such as street networks and administrative areas (polygons), to perform some advanced analyses. For instance, a user of the platform can find the service area of a given dealership (e.g., 5-minute travel time from the dealership) and the customers within that area. The management can then easily recognize the radius covered by a dealership. This information is crucial for the management to make strategic decisions on the optimal configuration of a dealership (such as square footage and manpower, the services to offer, the product mix, etc.). The system combines the historical sales data with seasonal factors and demographic data to perform the sales forecasting task. The result can be utilized to set more reasonable and achievable revenue objectives for the individual dealerships operated by the company. One of the analytical models built into the system conducts clustering analysis and finds customer groups using both spatial and non-spatial attributes. Based upon the customer classifications or segments, the company may make proper decisions to provide the best products to meet customer demands and to offer excellent customer care.
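The "where 50% and 85% of the customers are located" view can be approximated numerically as the smallest radius around a dealership that contains a given share of its customers. The sketch below is only an illustration: it assumes customer locations are available as latitude/longitude pairs and uses straight-line (great-circle) distance rather than the street network used by the platform; the function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def catchment_radius_km(dealer, customers, share):
    """Smallest radius around `dealer` containing at least `share`
    (0 < share <= 1) of the customers.

    `dealer` and each customer are (lat, lon) tuples.
    """
    if not customers:
        raise ValueError("no customers given")
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    distances = sorted(
        haversine_km(dealer[0], dealer[1], c[0], c[1]) for c in customers
    )
    # Number of customers that must fall inside the radius.
    needed = max(1, math.ceil(share * len(distances)))
    return distances[needed - 1]
```

Calling `catchment_radius_km(dealer, customers, 0.5)` and `catchment_radius_km(dealer, customers, 0.85)` gives the two radii that could be drawn as rings on the map.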
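As a small worked illustration of the goal gap and market share figures mentioned above, the following sketch computes them for one dealership. The function name, argument names, and the numbers in the usage note are hypothetical and are not taken from the case study.

```python
def dealership_kpis(actual_revenue, revenue_goal, company_units, market_units):
    """Return the revenue gap, goal attainment, and market share.

    gap          = amount still missing to reach the goal (0 if goal met)
    attainment   = actual revenue as a fraction of the goal
    market_share = company unit sales as a fraction of total market sales
    """
    gap = max(revenue_goal - actual_revenue, 0.0)
    attainment = actual_revenue / revenue_goal if revenue_goal else None
    market_share = company_units / market_units if market_units else None
    return {"gap": gap, "attainment": attainment, "market_share": market_share}
```

For example, a dealership with 800,000 in revenue against a goal of 1,000,000 has a gap of 200,000 and an attainment of 0.8; selling 120 of the 2,400 cars sold in its market gives a share of 0.05.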

System Functions and Benefits
In this section we present some of the platform functionality already built and outline the benefits of utilizing the outcomes of the first phase of the project.

System Functionality
Figure 2 shows the dashboard of the system, which summarizes the real-time operational data for the company. The dashboard contains the sales data for all dealerships operated by the company; the number of cars sold and the corresponding customer information are updated dynamically. It is quite handy for the management of the company to learn the current status of each dealership nationwide without sending e-mails or making phone calls to inquire about it. Some external data such as stock market data and currency exchange rates are captured here as well. As mentioned earlier, this macroeconomic data is very useful for making better and faster decisions. The content of this dashboard may vary with the user's privileges; for instance, the manager of a dealership may not have as broad a view as the top managers of the company do. The user is able to choose the sales data for months, weeks, or days. The historical and real-time sales for each car brand (monthly, weekly, or daily) are presented on the dashboard as well.

By knowing where its dealerships and competitors are (Figure 3), the management team can make proper marketing decisions to promote its products and services, with the aim of increasing market share and exploring more of the market's potential. The current market gaps can be easily identified via this view. The tool provided by the system enables the user to gather information on the proximity of competitors. The visualization of the dealership and competitor locations also helps the management reorganize its dealership network nationwide (opening or shutting down dealerships) to offer better customer service and market penetration as well as to meet the challenges.

Figure 4 presents a result of querying the data warehouse, which can be queried spatially or non-spatially. The user can use this tool to find other POIs (points of interest) such as banks, restaurants, or dealerships within a particular area. The user is able to use regular SQL statements embedded in the querying tools to get the desired data, for instance, all dealerships whose revenues are greater than a certain dollar amount. However, more powerful spatial querying tools are provided by the system as well. A user may be interested in learning how many males younger than 35 live within a 5-mile radius of a Volkswagen store. The information can be utilized to explore the market and to promote the company's products as well as services. From a customer's perspective, this function helps find the closest dealership that sells the cars he or she is interested in (usually the call center may use this function to answer customer inquiries). Similar to the service territory analysis tool described below, when the management finds via the query that two or more of its own dealerships are too close to each other, it might consider reconfiguring the dealership network so that the dealerships operate more economically instead of competing with each other.

Figure 5 shows vividly and clearly to the user of the system where the customers of a particular dealership come from. From the query window (upper left) the user can select a dealership for study; the platform will gather all customers who made purchases at this dealership. At the same time, the corresponding reports can be obtained. Based upon the information obtained from this presentation or report, the management could find potential problems that may not be uncovered by merely studying non-spatial data. For instance, the management may ask why some customers came from so far away. If this is a pattern for certain areas, it might indicate that the sales network is not built properly and that some important areas are not covered efficiently by the network. If it is possible and economically reasonable, the management may decide to open a new dealership in an area that is currently not serviced properly. This analysis can be drilled down further based upon other criteria or parameters.
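The two kinds of query described above, a plain attribute query in SQL and a spatial radius query, can be sketched as follows. This is an illustration only: the `dealership` table, its columns, and the resident tuples are hypothetical, and the radius test uses great-circle distance instead of the platform's GIS query tools.

```python
import math
import sqlite3


def dealers_above(conn, min_revenue):
    """Attribute query: names of dealerships whose revenue exceeds a threshold.

    Assumes a hypothetical table dealership(name TEXT, revenue REAL).
    """
    rows = conn.execute(
        "SELECT name FROM dealership WHERE revenue > ? ORDER BY name",
        (min_revenue,),
    )
    return [row[0] for row in rows]


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    earth_radius_miles = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_miles * math.asin(math.sqrt(a))


def count_residents_near(residents, store, radius_miles, max_age, gender):
    """Spatial query: count residents of the given gender who are younger
    than `max_age` and live within `radius_miles` of the store.

    Each resident is a tuple (age, gender, lat, lon); store is (lat, lon).
    """
    return sum(
        1
        for age, sex, lat, lon in residents
        if sex == gender
        and age < max_age
        and miles_between(store[0], store[1], lat, lon) <= radius_miles
    )
```

The question from the text would then read `count_residents_near(residents, store, 5, 35, "M")`, where `store` is the location of the Volkswagen dealership.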
Figure 6 presents a very interesting scenario in which the platform is used to analyze the service area of a dealership.

Figure 6. Service territory analysis
It is worthwhile for a dealership's manager to know how large a service area (radius) the dealership can cover under different criteria, for example, within 5 or 10 minutes of driving. The analysis utilizes the underlying street network, which incorporates considerations such as one-way streets, divided streets, residential areas, geographic obstacles, etc. By analyzing the service area, a better service or sales strategy can be determined. The user of the system is able to know what percentage of the customers in the entire service area (say, 1-hour driving time) fall within the 10-minute, 30-minute, and 60-minute rings, respectively. The tool helps the dealership operate more effectively. Furthermore, this analytical tool can be employed to identify whether there is any self-competition within a certain area, i.e., whether other dealerships of the same corporation are located within the service area. The result of this analysis helps the management adjust the sales network and the sales (product promotion) and service strategies to keep this kind of competition from compromising the business, or to minimize its negative impact if a physical network adjustment is impossible or not economical. Because real street networks and the associated demographic data are used, this analysis is more realistic and credible.
Figure 7 presents the results of analyses using spatial and business data. The result shows the number of cars of each brand registered within a selected geographic area. The user of the platform is thus able to see which car brand (color coded) is the most popular in a given area. The company can adopt a proper marketing strategy accordingly.
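A drive-time service area of this kind is typically computed as a shortest-path search over the street network. The sketch below is a simplified stand-in for the GIS network analysis used by the platform: it assumes the street network is given as a directed graph whose edge weights are travel minutes (a one-way street is simply an edge with no reverse edge), that customers are attached to network nodes, and that travel time is measured from the dealership outward. All names are hypothetical.

```python
import heapq


def travel_times(graph, origin):
    """Dijkstra's algorithm on a directed graph.

    graph maps a node to a list of (neighbour, minutes) pairs.
    Returns a dict of the shortest travel time from origin to every
    reachable node.
    """
    best = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, minutes in graph.get(node, ()):
            candidate = t + minutes
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return best


def ring_shares(times, customer_nodes, rings=(10, 30, 60)):
    """Cumulative share of customers within each drive-time ring.

    Only customers inside the largest ring count as the service area,
    so the share for the largest ring is 1.0 whenever any customer is
    inside it.
    """
    limit = max(rings)
    inside = [times[n] for n in customer_nodes
              if n in times and times[n] <= limit]
    if not inside:
        return {ring: 0.0 for ring in rings}
    return {ring: sum(t <= ring for t in inside) / len(inside)
            for ring in rings}
```

If travel toward the dealership is of interest instead, the same search can be run on the graph with every edge reversed, which matters wherever one-way streets are present.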

Figure 7. Analysis results
Using spatial analysis tools such as the spatial overlay provided by the system, the user can get useful information, including the sales activity within a spatially selected area such as a street block, a city, or a province. Furthermore, based on the attributes obtained by the spatial selection, the user may drill down further to find interesting trends or patterns in the collected data. For the selected geographic feature, the user can classify customer groups, find the hot-selling products, analyze profitability, and so on. The integrated data stored in the data warehouse of the platform provides a wealth of opportunities for learning about markets and customers and for targeting offers to customers. The acquired information is again very helpful for the management to make better and faster decisions on its marketing and service strategies.
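The spatial selection step can be illustrated with a point-in-polygon test followed by a per-brand aggregation. This is a minimal sketch under simplifying assumptions, not the platform's overlay tool: coordinates are planar, the selected area is a single simple polygon, and each sale is a hypothetical tuple `(x, y, brand, amount)`.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order; points exactly on
    the boundary may be classified either way.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sales_in_area(sales, polygon):
    """Aggregate the sales that fall inside the polygon by brand.

    Returns {brand: (number_of_sales, total_amount)}.
    """
    summary = {}
    for x, y, brand, amount in sales:
        if point_in_polygon(x, y, polygon):
            count, total = summary.get(brand, (0, 0.0))
            summary[brand] = (count + 1, total + amount)
    return summary
```

The per-brand counts returned by `sales_in_area` correspond to the kind of brand popularity summary for a selected area described above, and the result can be drilled down further by filtering on other attributes.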

Benefits of the System Applications
To facilitate the analytical and decision-making procedures, more sophisticated models and algorithms were embedded in the system. The management and business analysts of the company can benefit from applying these models and algorithms, which help the user of the system understand what factors drive demand for products and services and how to react to these factors. Here are two examples.
The manager of the company or of a dealership needs to understand which factors impact auto sales the most, so that he or she can pay particular attention to these factors and take the associated actions if necessary. The multiple linear regression model (MLM) in the system may address the management's concerns. As is well known, a general MLM can be represented as follows:

$$y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_n x_n + \varepsilon$$

Here $y$ is the auto-sale volume, $x_i$ is an explanatory variable (a factor that could impact the auto-sale), $a_i$ is the regression coefficient of $x_i$, $a_0$ is the intercept, and $\varepsilon$ is the error term. According to the available information and the sales staff's experience, for a given city we investigated the following potential impacting factors for the auto-sale:
- Population
- Average annual salary
- Population employed
- Average household saving
- Infrastructure investment
- Number of public buses per 10,000 people
- Average car sales price

Table 1 lists some sample (10-year) data for the selected city (please note that the data listed here is out of date; we use it only to illustrate how the model is applied).
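Fitting a multiple linear regression of this kind amounts to an ordinary least-squares problem. The sketch below solves it through the normal equations using only the Python standard library; it is an illustration of the technique, not the model code embedded in the platform, and the function name is hypothetical. No data from Table 1 is used.

```python
def fit_linear_model(X, y):
    """Ordinary least squares for y = a0 + a1*x1 + ... + an*xn.

    X is a list of observations, each a sequence of n explanatory
    values; y is the list of observed responses.
    Returns [a0, a1, ..., an].
    """
    rows = [[1.0] + [float(v) for v in obs] for obs in X]  # add intercept column
    p = len(rows[0])
    # Augmented matrix of the normal equations (A^T A) a = A^T y.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("explanatory variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(p):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][p] / m[i][i] for i in range(p)]
```

With factors on very different scales (population versus a price, for example) the normal equations can become poorly conditioned, so standardizing the variables first, or using a dedicated statistics library, would be the more robust choice in practice.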
From this result we recognize that average household saving, average car sales price, and the number of public buses per 10,000 people have a significant influence on auto sales. Although the management does not control these macroeconomic factors, it may keep monitoring these variables and adjust the business strategy if needed. The macroeconomic data in the system is updated in step with the data published by the government agencies, and the model is adjusted accordingly. Using this model together with the other forecasting models built into the system as guidance, the management is able to set more reasonable revenue and profit goals for the dealerships operated by the company.
The clustering algorithm proposed by Cao and Glover (2010) is also implemented in the system to analyze or profile the (potential) customer segments and to propose targeted product or service offers to customers based on the features of each cluster or group. The historical sales data stored in the data warehouse consists of the following elements:
- Age
- Gender
- Income
- Purchasing price
- Car's brand
- Car's color

The generalized Euclidean distance between customer $i$ and customer $j$ is defined as follows:

$$d_{ij} = \sqrt{\sum_{k} \left( x_{ik} - x_{jk} \right)^2}$$

where $x_{ik}$ is the value of element $k$ of customer $i$ and all the element values are standardized for consistency. The clustering algorithm is then applied to the dataset to create the desired customer segments. Because of the significant differences between male and female preferences, we built the clusters separately for each gender. The result of this study provides insight into the customer preferences of each segment; for instance, male customers with an annual income of more than $20K and aged between 30 and 40 usually prefer to purchase white Nissan sedans. The demographic and geographic datasets reveal more interesting information that can assist the management in making effective business decisions. Figure 8 illustrates a clustering result for potential customers in a selected area based on the street network using the same clustering algorithm, where the coordinates are standardized for plotting purposes.
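The standardize-then-cluster workflow can be sketched as follows. Note that this sketch substitutes plain k-means for the Cao and Glover (2010) algorithm, whose details are not given here, and it handles numeric attributes only (for example age and income); categorical attributes such as brand or color would need an encoding first. The function names and the deterministic choice of initial centers are assumptions for illustration.

```python
import math


def standardize(rows):
    """Rescale every column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def euclidean(a, b):
    """Euclidean distance between two standardized attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iterations=50):
    """Plain k-means; returns one cluster label per point.

    The first k points serve as initial centers so that the result is
    reproducible.
    """
    centers = [list(p) for p in points[:k]]
    labels = None
    for _step in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Following the per-gender approach described above, the customer records would first be split by gender and `kmeans(standardize(rows), k)` run on each subset separately.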

Figure 8. Potential customer clusters
Overlaying the clusters with the demographic data shows that the customers living on the outskirts of the city are generally young and single, because it is a newly developed area, while in the center of the city most people are middle-aged and married with children. This result and the customer segments described above help the management decide on the best products and services to offer to people from different areas and segments. Again, the combination of GIS and non-spatial data demonstrates great power in data analysis and decision-making.

The platform has been deployed for more than six months and is mainly used by the managers and business analysts of the company. The system is being gradually rolled out corporate-wide so that the managers of all dealerships will have access to the system and its assistance tools to drive sales and growth for the dealerships. While more functions and models are being developed iteratively, some preliminary benefits of applying this platform can be summarized as follows:

Decision agility: in order to make better and faster business decisions, the management requires knowledge of and insight into what happened in the past, what is happening now, and what will happen in the future. The platform provides comprehensive, high-quality, and actionable information that makes the decision-making procedure more effective. The platform contains a consistent source of data for leading indicators, patterns, and trends. The KPI reports and built-in data-analytical models facilitate decision-making significantly.
Better access to data and analytical/decision tools for business users: unlike traditional IT systems, where transaction data and other operational data were kept in separate databases that were difficult for business users, particularly decision makers, to access, the platform offers one easily accessible resource, i.e., the spatially enabled built-in data warehouse, for business analysts and decision makers. The data and tools supplied by the system allow the users (business analysts and decision makers) to do their jobs more efficiently.
Comprehensive and high-quality information: using the system as a portal or single connection, the user is able to consolidate spatial and non-spatial data from various data sources, whether internal or external, to form high-level overviews or detailed views of the business. The integrated data, which crosses traditional silos, provides the user with a single, real-time, and complete source of the information needed for analytics and decision-making.
More reasonable business objectives: the system supplies the management with not only historical but also real-time information on sales, customers, and markets. Instead of setting revenue and profit goals based on experience alone, the management can set more reasonable and achievable revenue and profit objectives for individual dealerships, which in turn reduces risk in execution. The information stored in the data warehouse of the system is used to validate the objectives dynamically and to improve the accuracy of the forecasts. Furthermore, the goals can be tracked and monitored, and actions may be taken if necessary.
Easier identification of business opportunities: combining GIS and business data provides the management with a more complete overview of the markets, their trends, and the customers. The user of the system (either a business analyst or a decision maker) is able to identify and act on opportunities more effectively thanks to the integrated key datasets. As mentioned earlier, the management is able to identify an area where no good product or service is offered and decide to open a corresponding dealership to fill the gap.

Conclusions
In this paper we present a GIS-based platform built for a Fortune Global 500 company in China that employs GIS, data integration, data mining (analysis), and optimization techniques to support daily and strategic data analytics and decision-making. The system works as an analytical and decision-making center for the company.
Although it is being built iteratively, the system finished so far is in operation and provides the users with tools for data collection, data visualization, and data analysis. The management and business analysts are using the system to assist their daily operational decisions, their medium- and long-term decisions, and their data-analytical tasks. The system provides the management with a solid base to set more reasonable revenue and profit goals, to offer customers better services and products, and to stand out amid fierce competition.
The outcomes of the first phase, i.e., this case study, are quite positive and promising. The experience gained and lessons learned during this phase provide guidance for the future development phases. Future research and development work includes developing a more efficient data integration middleware and creating more data mining and analytical tools to support operational and strategic tasks. One interesting topic might be to better cluster customers by analyzing customer habits and behaviors and the associated impacting factors. Based on such results, the company would be able to make more effective decisions on the best offers for personalized services and advertisement.
In summary, the application of the system presented in this case study demonstrates that, by employing GIS, better (real-time) data integration, visualization, and analytics can facilitate the decision-making procedure and improve the company's operational efficiency and profitability. Business decision agility is achieved by employing the system.

Figure 1
Figure 1 depicts the overall platform framework, which is SOA-based and provides the following major functionality (though some of these functions are still being developed and enhanced iteratively): data mining, data visualization/reporting, and data analytics/decision-making.

Figure 3. Dealership and competitor locations

Figure 5
Figure 5 displays the function of establishing the relationships between customers and the dealerships from where they purchased cars and/or obtained services.

Figure 5. Relationships between customers and dealerships

Table 1. Sample data

If the data listed in Table 1 is used to run the model, the system outputs the following result after model fitting: