Modern Data Connectivity: The Challenges of Data Integration Demand Simplification
The backbone of mid- to large-scale enterprises today is applications – on premises, in the cloud and oftentimes both. The lifeblood? Application data flowing throughout operational units and beyond. This data enables visibility across geographies and supply chains, customers and consumers that previously was unimaginable. This wealth of information is used to run operations and steer organizations with the use of analytics and BI applications.
At the same time, the rapid growth and adoption of applications have given rise to an overflow of data storage requirements and disparate data formats. Not only have data stores become much more complex, with many different types and formats, but also organizations have more choices than ever as to what to store and where to keep their data assets. Architectures are evolving from homogeneous on-premises infrastructures to heterogeneous on-prem and cloud (hybrid) deployments.
This presents a major challenge: the true value of data can only be realized when all relevant data – regardless of location or format – is fully accessible. Data connectivity has become a strategic imperative!
In a recent DATAVERSITY® interview, we talked with Craig Chaplin, Senior Product Manager at Simba, a Magnitude Company, and Karen McPhillips, Magnitude’s Vice President of Marketing, who weighed in on this challenge and updated us on connectivity at Simba.
What are the most common complications that arise during the evolution from homogeneous “on-prem” applications to heterogeneous hybrid deployments?
Craig Chaplin: Our customers, be they enterprise architects or software companies, need to connect data from a number of different applications, not just the corporate ERP system. Plus, they pull data from other third-party sources that help with decision making, such as Facebook, LinkedIn or other cloud services. The tremendous size of customers’ data sets makes accessing them through a single data warehouse prohibitively expensive and sometimes impossible.
How did data become so hard to manage?
Chaplin: Organizations face daunting data connection problems today, especially when it comes to the volume, fragmentation, and quality of that data. Go back 20 years, and data sources were pretty well consolidated, with a few players dominating the market for a long time. This made data connectivity predictable; data assets were on-premises, easy to find, store and access. Back then, organizations had monolithic data stores, like ERPs, that lived on-site and were considered IT’s responsibility to manage. Then the number of data sources exploded as a greater variety of data applications emerged to solve different data input and output challenges. Companies purchased these “best of breed” solutions to manage data more efficiently and effectively.
Has data storage evolved alongside application growth?
Chaplin: Yes, the types of data stores have changed as well. Traditionally they were made up of relational architectures with maybe some OLAP (Online Analytical Processing) for the complex computations. However, newer Big Data sources have been introduced with new data storage and access architectures to handle large data sets. Graph and time-series databases are other examples of data stores built to handle a specific way of structuring and analyzing data.
The variability as to where the data is deployed has also increased. Data resources now can be deployed equally on-site and in the cloud, often provided and maintained by a third-party vendor. So, collected data can live in many different places, literally around the world, making whole data sets highly fragmented.
How can data retrieval be simplified?
Karen McPhillips: One approach toward simplifying retrieval of data assets that’s been common for years is a standardized data store such as a data warehouse or data lake. These environments allow vast amounts of data of various structures to be ingested, stored, assessed, and analyzed in one place. But it’s a cautionary tale: data lake technologies can be difficult to use and awkward for various business operations, especially where there’s demand for live data or business-user self-service, so they are not a universal solution for every use case.
For both data lakes and traditional data warehouse solutions, IT needs to be involved due to the technical complexity of collecting, storing and accessing the data.
In today’s on-demand data world, businesses require direct access to the data they need, when they need it. No matter the format, system variability, or data volume, the business needs to access that data reliably and with high quality for reporting, insights and decision making. Getting to such “Big Data” requires a different type of solution. It requires a level of data connectivity expertise that most solutions just don’t have.
With Simba’s technologies, we can provide a non-proprietary and well understood view of a data store to these applications. We ensure that capabilities of the data store’s query engine are still accessible but also provide any missing capabilities through our own query engine technology.
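The idea of supplying a missing capability on top of a data store's own engine can be illustrated with a minimal sketch. This is not Simba's implementation; `KeyValueStore`, `scan`, and `sum_by` are hypothetical names invented for the example. The assumed store can only filter rows, so filtering is delegated to it while grouping and summing, which it lacks, are done in a thin layer above it.

```python
from collections import defaultdict


class KeyValueStore:
    """Hypothetical data store whose native engine can only filter rows."""

    def __init__(self, rows):
        self._rows = list(rows)

    def scan(self, predicate):
        # The one capability the store offers natively: filtered reads.
        return [row for row in self._rows if predicate(row)]


def sum_by(store, predicate, group_key, value_key):
    """Hypothetical connector-side helper that adds GROUP BY / SUM."""
    totals = defaultdict(float)
    # Filtering is delegated to the store; aggregation happens here.
    for row in store.scan(predicate):
        totals[row[group_key]] += row[value_key]
    return dict(totals)


store = KeyValueStore([
    {"region": "EMEA", "year": 2020, "amount": 100.0},
    {"region": "EMEA", "year": 2020, "amount": 50.0},
    {"region": "APAC", "year": 2020, "amount": 70.0},
    {"region": "APAC", "year": 2019, "amount": 30.0},
])

totals = sum_by(store, lambda r: r["year"] == 2020, "region", "amount")
print(totals)
```

To the calling application, the result looks as if the store itself supported aggregation, which is the general shape of the approach described above.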
Don’t Application Programming Interfaces (APIs) solve for this?
McPhillips: Application Programming Interfaces are specific to a particular application and tend to be oriented towards programmatically initiating application workflows such as adding a CRM contact, viewing an order, etc. and less about querying the application’s data store as a whole. This makes them less than ideal for the general-purpose data access needed by analytic applications, for example.
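A small sketch may make the distinction concrete. It assumes a hypothetical `CrmApi` class with workflow-style methods (`add_contact`, `get_contact`) backed by an in-memory SQLite table; none of these names come from a real product. Answering an analytic question through the API takes one call per record plus client-side aggregation, while a query interface answers it in a single statement.

```python
import sqlite3
from collections import Counter


class CrmApi:
    """Hypothetical workflow-oriented application API."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
        )

    def add_contact(self, name, country):
        cur = self._conn.execute(
            "INSERT INTO contacts (name, country) VALUES (?, ?)", (name, country)
        )
        return cur.lastrowid

    def get_contact(self, contact_id):
        return self._conn.execute(
            "SELECT id, name, country FROM contacts WHERE id = ?", (contact_id,)
        ).fetchone()

    def connection(self):
        # Stand-in for query-level access to the application's data store.
        return self._conn


api = CrmApi()
ids = [api.add_contact(n, c) for n, c in [("Ana", "PT"), ("Ben", "US"), ("Cy", "US")]]

# API route: one call per contact, then aggregate in the client.
via_api = Counter(api.get_contact(i)[2] for i in ids)

# Query route: the data store does the aggregation in one statement.
via_sql = dict(
    api.connection().execute("SELECT country, COUNT(*) FROM contacts GROUP BY country")
)

print(dict(via_api), via_sql)
```

Both routes produce the same counts here, but the per-record route scales poorly as the number of contacts grows, which is why analytic tools tend to prefer query-level access.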
How does Simba solve for data connectivity?
McPhillips: Simba’s technology solutions focus on enabling data connectivity, regardless of where the data sits and in spite of the differences among applications. Simba’s technology enables accelerated adoption of all reporting, analytics, BI, and other enterprise solutions. Four primary strengths allow Simba to make connecting to data much easier.
- Focus: Simba has had and continues to have an entire business built around data access and integration.
- Expertise: Simba worked with Microsoft to develop the Open Database Connectivity (ODBC) standard back in the 1990s. The API provides a standard interface for applications to query a data source using SQL rather than individual proprietary interfaces specific to the data source.
- Partnerships: Simba works with customers to solve for their needs, whether that is through commercially available drivers, custom drivers, integration services or some combination. Simba also works with many Big Data source and systems providers such as Tableau, Qlik, Google, Amazon, and Microsoft. This translates into hundreds of person-years of experience.
- Market Vantage Point: Simba keeps up with the changing requirements and standards for data connectivity. The team believes that data access is an absolutely critical pivot point to running a business. Part of Simba’s job is to keep up with not just the most adopted, but also many of the niche data sources and applications.
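The value of a standard SQL interface such as ODBC, mentioned under Expertise above, can be sketched in a few lines. ODBC itself needs a driver manager and an installed driver, so this example swaps in Python's DB-API 2.0 with the built-in `sqlite3` module as a stand-in for the same idea. The `orders` table and the `revenue_by_customer` function are assumptions made up for illustration.

```python
import sqlite3


def revenue_by_customer(conn):
    """Application code written against the generic connection interface.

    Only cursor(), execute() and fetchall() are used, so nothing here
    is specific to SQLite.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


# Stand-in data source; a real deployment would open a connection
# through the driver for whatever store holds the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 10.0), ("acme", 5.0), ("zen", 7.5)],
)

rows = revenue_by_customer(conn)
print(rows)
```

Because the function depends only on the standard interface, the same code could in principle be handed a connection opened through an ODBC bridge such as pyodbc, provided the source exposes an `orders` table and accepts this SQL.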
Chaplin: At Simba we have a continuous commitment to work with our customers and partners to understand their data access needs. If it’s a data source vendor then we make sure our solutions take advantage of features unique to their data source. If it’s an application vendor wanting to provide connectivity as part of their application, we work with them to ensure that what we provide best fits how they deliver data access capabilities to their customers.
Our partners also rely on us to stay current with the latest technology in play in the data and analytics space. Simba’s early investment in NoSQL is one example where we developed leading-edge connectivity that our partners would soon include in their own products.
Data assets should not be “clogging” an organization’s data ecosystem just because the diverse types and possible locations have dramatically expanded – and will continue to do so. Reliable data connectivity is a necessity for modern data-driven enterprises, but such solutions need to be able to address a vast and ever-growing collection of platforms and sources.
With their focus on connectivity, the team at Simba have developed technologies based on the expertise that comes with decades of experience partnering with software leaders and the enterprises that depend on them to run a modern enterprise.
For a deeper dive and more firsthand examples, watch the DMRadio Webinar Fully Connected – Enabling the Modern Data Ecosystem with Craig Chaplin and host Eric Kavanagh, the CEO of The Bloor Group.