At the simplest level, big data is data that is too big to be processed on a conventional relational database system.
The Vs of big data, originally coined by Gartner as volume, velocity and variety, usefully enhance this definition.
We work with our customers to select the best technologies from leading vendors across business intelligence, data warehouse, big data and analytics.
A data lake is a storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured and unstructured data. The data structure and requirements are not defined until the data is needed. Data is stored once, in one place, but can then be used and accessed in many different ways: for example, it can be queried, searched or visualised. It complements, rather than completely replaces, the enterprise data warehouse.
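To make the idea of defining structure only when the data is needed more concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular data lake product works: the sample records and the helper names `read_with_schema` and `search` are assumptions invented for this example.

```python
import json

# Raw data kept exactly as it arrived; nothing is validated or reshaped on write.
# (Hypothetical sample records, for illustration only.)
RAW_EVENTS = [
    '{"user": "ana", "action": "login", "ts": "2020-01-01T09:00:00"}',
    '{"user": "ben", "action": "purchase", "amount": 42.5}',
    'free-text note: customer called about invoice',
]


def read_with_schema(raw_lines, fields):
    """Apply a schema at read time: parse JSON records and project the given fields."""
    rows = []
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Unstructured text stays in the store but is skipped by this reader.
            continue
        if not isinstance(record, dict):
            continue
        # Fields missing from a record come back as None rather than failing.
        rows.append({field: record.get(field) for field in fields})
    return rows


def search(raw_lines, term):
    """A second way to use the same stored data: a plain text search."""
    return [line for line in raw_lines if term.lower() in line.lower()]


if __name__ == "__main__":
    print(read_with_schema(RAW_EVENTS, ["user", "amount"]))
    print(search(RAW_EVENTS, "invoice"))
```

The same stored lines serve two different consumers here: one imposes a structure when it reads, the other treats everything as text. Neither requires the data to be copied or reshaped in advance.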
Azure Data Catalog is an Azure cloud service that allows for enterprise-wide metadata discovery and data source access across an organisation's data stores. It’s a fully managed service and lets any user—from analyst to data scientist to developer—register, enrich, discover, understand, and consume data sources.
Supported objects that can be catalogued include:
Metadata captured in Azure Data Catalog includes the data source, asset definitions (tables, views, reports) and data profiles (types, row counts, max/min). Users can enrich the metadata by adding documentation, notes, tags or common terms to build up a business glossary.
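As a rough illustration of what a data profile (types, row counts, max/min) and user-added tags might look like, the following Python sketch builds one for a small in-memory table. It does not use or represent the Azure Data Catalog API; the `SALES` sample data, the function names and the shape of the resulting dictionary are all assumptions made for this example.

```python
# Hypothetical sample table: a list of rows, each a dict of column -> value.
SALES = [
    {"region": "north", "amount": 120.0},
    {"region": "south", "amount": 80.5},
    {"region": "east", "amount": None},
]


def profile_column(values):
    """Summarise one column: inferred type, row count, and min/max of non-null values."""
    non_null = [v for v in values if v is not None]
    return {
        "type": type(non_null[0]).__name__ if non_null else None,
        "row_count": len(values),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


def profile_table(rows):
    """Group values by column name, then profile each column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return {name: profile_column(values) for name, values in columns.items()}


def describe_asset(name, rows, tags=(), notes=""):
    """Combine the computed profile with user-supplied enrichment (tags, notes)."""
    return {
        "asset": name,
        "profile": profile_table(rows),
        "tags": list(tags),
        "notes": notes,
    }


if __name__ == "__main__":
    entry = describe_asset("sales", SALES, tags=["finance"], notes="Monthly sales by region")
    print(entry)
```

The split between `profile_table` and `describe_asset` mirrors the distinction in the text: the profile is derived from the data itself, while tags and notes are enrichment added by people.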
Users can work with data in the tool of their choice, e.g. Power BI, Reporting Services or Excel. The data stays where the organisation wants it, and the Data Catalog user portal helps users discover it and work with it where they want. It's also possible to integrate with existing tools and processes through open REST APIs.
Implementing Azure Data Catalog will enable users to get more value out of the BI environment and encourage self-service reporting and analysis.
Big data and analytics can now be implemented cost-effectively and deliver a high return on investment. Contact Theta's data specialists to learn more about extracting maximum value and business impact from your data assets, and making the most of big data and data-driven insights.