Discover frequently asked questions about Data Factory in Microsoft Fabric, Microsoft's latest innovation in data management.
In an insightful Q&A session during the Microsoft Build 2023 event, the product team delves into the core aspects of Fabric, touching upon SQL, Spark, data integration, and more. Their comprehensive responses offer a rich understanding of Fabric's capabilities, making it an enlightening resource for all.
The questions and answers below are compiled from the session and edited for clarity and conciseness.
Fabric already includes connectors to SAP systems such as ECC, HANA, and SAP Datasphere. The Microsoft Fabric team is always striving to improve SAP integration and provide more capabilities for SAP data integration. Customer feedback plays a crucial role in shaping their roadmap for SAP-related features and connectors in Fabric.
Domains in Fabric are a higher-level concept than workspaces. They let users group related workspaces together, for instance, finance, HR, or sales domains.
This structure supports a data mesh architecture, allowing data assets within a domain to be shared and accessed by users who have access to that domain. It's a useful way to organize and manage data assets within a larger data ecosystem.
As they move forward, the Microsoft Fabric team is enhancing the concept of domains in Fabric and aiming to provide features and capabilities specific to domains in future updates.
Power BI Premium is evolving into Fabric and will encompass all the different workloads available in Fabric. Power BI is already a part of Fabric, and admins can opt to enable Fabric within Power BI Premium.
The Microsoft Fabric team wants to assure existing Power BI customers that they will continue to be supported and that there's no immediate need for migration. The team's focus is on providing a smooth path forward, maintaining compatibility, and building on the robust foundation of Power BI within the Fabric platform. The team is also developing tools and capabilities for migration to Fabric while preserving existing investments in Power BI.
The Fabric team has a long roadmap ahead for data science capabilities in Fabric. They are actively developing features like the managed feature store and other enhancements. The Fabric team recommends checking the regularly updated roadmap, on-demand sessions, and blog posts for more detail on data science capabilities in Fabric.
Fabric enhances the copy capability by amalgamating the best of Power Query and Azure Data Factory. The compute engine in Fabric optimizes data movement and transformation for high speed and scale. The Microsoft Fabric team is consistently making improvements to the copy engine to ensure the best performance.
Existing customers using Data Factory and Synapse will continue to be supported, and there's no immediate need to migrate. The Data Factory team is developing tooling for migration and ensuring a smooth transition for customers who want to move to Fabric. The team is committed to ongoing investment in Data Factory and Synapse, although future feature updates may be concentrated on Fabric.
The Microsoft Fabric team recommends that customers start exploring Fabric and consider it for their analytics needs. Fabric represents a new approach to simplifying analytics, one that combines best-of-breed capabilities.
The Microsoft Fabric team is committed to long-term investment in Fabric and is always working to improve it based on customer feedback. They are dedicated to offering a path forward for customers and ensuring that their investments in ADF and Synapse are future-proof.
Fabric brings together the best of Power Query and Azure Data Factory to optimize the speed of data movement and transformation. The Microsoft Fabric team continues to improve the copy engine and is simplifying the process so that the best performance is the default.
The Microsoft Fabric team maintains transparency in their roadmap for Fabric. They’ve already published their planned investments for the next 6 to 12 months. They also plan to release monthly blogs to communicate updates and new features. The Microsoft Fabric team is dedicated to regularly engaging with customers to meet their needs.
Synapse Data Engineering empowers data engineers to build a lakehouse architecture and perform large-scale transformations with Spark. Azure AI Studio (Azure ML) caters to a wider range of machine learning use cases and offers additional capabilities, such as training models on non-Spark compute. Although some functionality overlaps, each platform has a different focus.
The Microsoft Fabric team is integrating with Purview and Azure Machine Learning registries to govern machine learning models across an entire estate. Their goal is to provide end-to-end lineage across all model assets and ensure centralized governance.
The team is collaborating with partners in the master data management space and integrating with their offerings to provide a comprehensive solution.
The Microsoft Fabric team is working to integrate Purview with Microsoft Fabric so that the two products work better together. The experience will be accessible through Fabric, but the full spectrum of Purview's data governance features may not be immediately available.
The Microsoft Fabric team is actively working on integration and improving data asset management and lineage capabilities.
Data can be shared across tenants in OneLake, but the native data-sharing capabilities are still in the works. OneLake is compatible with ADLS Gen2 APIs, enabling data access and integration with other Azure services. More native data-sharing capabilities will be introduced in the future.
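Because OneLake is compatible with ADLS Gen2 APIs, tools that already speak ADLS can usually address OneLake data by URI. The following is a minimal Python sketch of how such a URI might be assembled. It assumes the `onelake.dfs.fabric.microsoft.com` endpoint and the workspace/item.itemtype/path layout described in Microsoft's OneLake documentation. The workspace, lakehouse, and file names are hypothetical, and the helper function is not part of any SDK.

```python
from posixpath import join

# Assumed OneLake endpoint for ADLS Gen2-style (DFS) access.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str, path: str) -> str:
    """Build an ABFS-style URI for a file or folder stored in OneLake."""
    # The item (for example, a lakehouse) is addressed as "<item>.<itemtype>".
    relative = join(f"{item}.{item_type}", path.lstrip("/"))
    return f"abfss://{workspace}@{ONELAKE_HOST}/{relative}"


# Hypothetical names, for illustration only.
uri = onelake_uri("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Files/raw/orders.csv")
print(uri)
```

A URI built this way could then be handed to any ADLS Gen2-aware client or Spark reader, subject to the caller having access to the workspace.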
Fabric already supports web scraping using Data Factory. Web scraping can be performed with Dataflow Gen2, where users provide examples and train the system to extract structured data. Features are also available for handling semi-structured text files and extracting relevant data.
Source code management, including integration with GitHub, is part of Microsoft Fabric’s roadmap. The underlying representations of data pipelines in Fabric are JSON-based, which makes integration with Git repositories possible.
Features like code reviews and pull requests will be available, and the team will adopt a consistent approach to DevOps and CI/CD across all workloads in Fabric.
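Since pipelines are represented as JSON, they can be stored in Git like any other text file. One practical concern is keeping diffs stable from one commit to the next. The Python sketch below shows one possible way to do that by serializing with sorted keys and fixed indentation. The pipeline definition and its field names are hypothetical and only loosely modeled on a typical pipeline document. This is not the format or tooling Fabric itself uses.

```python
import json


def canonical_json(definition: dict) -> str:
    """Serialize a pipeline definition with sorted keys and fixed indentation."""
    return json.dumps(definition, indent=2, sort_keys=True) + "\n"


# Hypothetical pipeline definition; the field names are illustrative only.
pipeline_a = {
    "name": "CopyOrders",
    "properties": {"activities": [{"name": "CopyFromSap", "type": "Copy"}]},
}
# Same content with a different key order, as an editor or export might produce.
pipeline_b = {
    "properties": {"activities": [{"type": "Copy", "name": "CopyFromSap"}]},
    "name": "CopyOrders",
}

# Both serialize to identical text, so Git would show no spurious diff.
print(canonical_json(pipeline_a) == canonical_json(pipeline_b))
```

Normalizing before each commit keeps pull requests focused on real changes to activities and parameters rather than on key reordering.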
Data scraping and web scraping capabilities in Data Factory can be integrated with source code management tools like GitHub. The underlying representation of data pipelines in Data Factory is JSON, which can be stored and version-controlled in Git repositories.
Documentation, best practices, and learning paths are available for Fabric. The Microsoft Fabric Data Factory team encourages you to explore these tutorials and resources to understand how to use Fabric effectively. They will be publishing customer stories and case studies in the future to provide real-world use cases and practical guidance.
The Microsoft Fabric team is working on several investments related to data quality. Improvements are being made in data transformation, such as the ability to define quality rules and metadata-driven transformations. The team is also collaborating with organizations specializing in master data management and aims to integrate with their offerings.
Discussions and plans are underway to integrate the feature store functionality from Azure ML into Microsoft Fabric. The Microsoft Fabric team is working closely with the Azure ML team and leveraging the technology and infrastructure already developed in Azure ML. Feature store capabilities will be a part of the roadmap for Fabric.
The Fabric Data Factory product team provided an enlightening insight into the future of data management and analytics. In this comprehensive Q&A session, they not only explained the various facets of Fabric but also assured ongoing support for existing Microsoft products.
Furthermore, they emphasized their dedication towards enhancing Fabric based on customer feedback, and their commitment to providing transparent updates through regular blogs. As Fabric continues to evolve, it indeed signifies an exciting new chapter in Microsoft's journey to simplify analytics and data management.