Microsoft Fabric

From Resco's Wiki
Jump to navigation Jump to search

Microsoft Fabric is a unified data analytics platform that provides tools for data movement, processing, ingestion, transformation, real-time event routing, and report generation.

If you are using Resco Inspections, you can leverage Fabric to store and transform collected questionnaire answers and use the data to generate reports.

components of ms fabric

Questionnaire data in Fabric

Questionnaires are digital forms, usually running in Resco mobile apps, that allow you to collect data in the field. Microsoft Fabric is a tool designed for data management, and there are two main reasons to consider its integration:

The amount of the collected data
  • The amount of questionnaire data can grow substantially, increasing data storage expenses. Compared to Dataverse, Fabric offers a cheap alternative for storing large data.
Need for structured questionnaire data (AI and BI ready)
  • To save storage, questionnaire answers are stored in a serializedanswer column in a JSON format. First, we have to transform this format to create structured data. Here, we can utilize the Fabric's notebooks.
Tip Notebook is a multi-language interactive programming tool that executes Spark jobs to transform, process, and visualize data.

We at Resco have developed a custom notebook (script) that crunches the data for you. This notebook is available on demand. The current script supports template-dependent questionnaires with flexible or minimal JSON.

How to import questionnaire data

To import data, we have to create a Lakehouse (collection of files, folders, and tables that represent a database).

  1. Go to Microsoft Fabric and select Synapse Data Engineering experience.
  2. Create a Lakehouse from the landing page or go to Create in the left panel and select Lakehouse there. To this lakehouse, we import raw questionnaire data; it's a bronze data layer.
    Homepage lakehouse.png
  3. In the lakehouse, click New Dataflow Gen2.
    Getdatain lakehouse.png
  4. When Dataflow loads, click Get Data in the top panel and select Dataverse as a new source.
    Getdata toppanel dataflow.png
  5. Fill out the required information to connect to your Dataverse.
  6. Once connected, select resco_ questionnaire and resco_question (plus tables you need for reporting later).
    Dataflow choosedata.png
  7. Check the data destination and Publish the dataflow.
    Publishdataflow.png

How to import notebook

Currently, the Notebook that handles the questionnaire data transformation is available on demand. Contact Resco support for more information. The notebook is a script file in .ipynb format.

  1. Open the workspace while in Data Engineering experience.
  2. Click New in top panel and then Import notebook.
    Importnotebook.png
  3. Click Upload and select the file.

Questionnaire data processing in notebook

Once the notebook is imported, the data processing can start. Questionnaire data processing consists primarily of parsing JSON in the serializedanswers column. The result of this transformation is a structured table where each column is one question, and each row is one questionnaire. There are multiple variables that need to be defined before the script can be used.

Initial variables
  • source_lakehouse = name of bronze-layer lakehouse where we import raw questionnaire data
  • path = path to bronze lakehouse .../Tables/
  • destination_lakehouse = name of the silver-layer lakehouse where we save transformed data
  • save_path = path to silver lakehouse ... /Tables/
  • mergeVersions = True/False
  • neutralizeDataTypes = True/False
Saving methods
  • mergeVersions = False
    Each questionnaire is saved to a table specific to their version.
  • merge versions = True
    Questionnaires with the same name are merged and saved into one table, despite the version (possibility of evolution conflict).
Evolution conflicts

Adding/Removing questions behavior:

  • Added question = new column
  • Removed question = column stays but null value in new questionnaires.
  • Merging questionnaire versions can cause evolution conflicts, primarily due to differing data types in one column. If this issue occurs, use the fallback neutralizeDataTypes = True. This will cast all columns into strings.

Once initial variables are defined, click Run all in the top panel.

RescoProcessingDemo.png

Results and availability

By completed these steps, you can achieve the following:

  1. Structured table for each questionnaire.
    Structuredquestionnaires.png
  2. Structured data is AI and BI ready.
  3. You can delete transferred data from Dataverse, reducing total storage cost.

The notebook is currently on-demand. For more information please contact Resco Support portal.

See also

Alternatively, check out the following video that walks you through a very similar scenario.