Business Intelligence with PowerQuery: First steps getting JSON into Microsoft Excel

There are many business intelligence solutions large and small for knowledge workers to choose from. But due to its ubiquity, I assume many (myself included) just use Microsoft Excel to interactively analyze data.

A lot of data on the web is available via RESTful APIs returning JSON, e.g. the REST countries service I’ll use in this example.

In the 2016 release Microsoft has vastly expanded the data analysis capabilities of Excel compared to previous versions. And with the free PowerQuery plugin, these capabilities are available to Excel 2013 users as well.

Getting JSON into Excel

One of these capabilities is to retrieve JSON data from the web and turn it into an Excel table. Just go to Data > From Web and enter the URL, e.g.

http://restcountries.eu/rest/v2/all

image

Excel will figure out whether there is a regular web page at the given URL (and offer to extract HTML tables from it) or JSON. In the latter case, a list of records is displayed in the Query Editor.

There is a Convert To Table button conveniently placed at the top left. But with every JSON document I’ve come across, this has required several additional steps to create a proper table.

image

Making a table

Instead I recommend going to View > Advanced Editor and add a manual conversion step by changing the query to this.

let
    Source = Json.Document(Web.Contents("http://restcountries.eu/rest/v2/all")),
    Table = Table.FromRecords(Source)
in
    Table

The same can be achieved by adding a manual step in the Query Settings sidebar and adding the bold text as the function.

Voilà, you now have that JSON as a table with one record per country.

You may want to add some more steps such as right-click the topLevelDomain column and Extract Values… to get the list of domains comma-separated in a single cell.

image

Great usability for non-programmers

This is just one of the many conversion and transformations available via easy to use (but numerous) commands in the context-menu and ribbon.

The great thing about Excel’s handling of all of these conversion steps is that they are applied non-destructively. In the Query Settings sidebar, you can see each one and click it to see the intermediate results it produces.

Not so great: M language

There is one, thing, however, that is a bit disappointing: the programming language behind all of this is the M language inspired by F#. This is unfortunate as it means any previous skills you’ve had working with data in Excel macros is useless, as even basic things such as if statements are syntactically so different I needed to look it up.

Advertisements

Data Analytics and Minimalist Art

For 31 days starting in the beginning of July I posted one new picture each afternoon. All of these pictures were created using the same basic shapes following the rules set out here.

It was interesting to see, how despite their similarities some pictures scored substantially more likes than others. One thing that was abundantly clear from the start was that using many popular hashtags significantly increased my chances of getting likes. Every Instagram guide will tell you that.

What I am interested in is whether there is a way to forecast how many likes a given picture might get. I’m planning on writing a little application to analyze the dataset gathered in July for correlations. This would be an interesting opportunity to try out TensorFlow or Azure Machine Learning and Cognitive Services.

Reading up on Blockchain and Distributed Ledger Technology

Last year, I was reading up on Bitcoin, blockchain and beyond. Since then, there have been several interesting developments in distributed ledger technology (DLT).

If you’re new to the technology, I highly recommend this introductory, plain English guide to blockchain.

I also came cross this great article defining criteria to avoid pointless blockchain projects and its follow-up on four genuine blockchain use cases.

R3 Corda

For one, R3, which I thought then and still think today shows a lot of promise, has released the code for Corda, its distributed ledger project. They also published a non-technical whitepaper as an introduction and two webinar videos: Introduction to Corda and Corda Developers’ Tutorial. There is alos this excellent non-technical 18 second definition of DLT by Richard Gendal Brown, CTO of R3.

R3 also offered its code to the Hyperledger project.

Hyperledger

Hyperledger isn’t a distributed ledger, per se, but contains multiple DLT projects, e.g. Fabric, which is backed by IBM. While you can run Hyperledger Fabric on your own machines, IBM also gives developers an opportunity to play with the technology in their cloud Bluemix.

Unlike Corda, which was built from the ground up for the financial services industry, finance is only one of the industries Hyperledger is targeting. There are, however, a number of projects underway in the financial services that use Hyperledger, as their proof of concept tracker shows.

One of those projects was undertaken by Germany’s central bank Deutsche Bundesbank and the country’s largest exchange operator Deutsche Börse. A November 2016 speech by Carl-Ludwig Thiele, member of the executive board of Deutsche Bundesbank contained mostly questions about the new technology. His speech from January 2017 already presented a prototype to handle simple settlement, payment and corporate actions.

Conclusion

There are a number of interesting projects underway to apply distributed ledger technology to finance.

Still, a lot of questions to be addressed regarding distributed ledger technology, as this position paper by SWIFT and Accenture from last year points out.

The Germany IT industry association Bitkom looks at some of these, e.g. legal ramifications of distributed ledger technology in banking (in German).

It is interesting to see though that regulators and central banks are already actively involved even though distributed ledger technology is still in its infancy in the financial services industry.