Forum

Creating a data mod...
 
Notifications
Clear all

Creating a data model from a flat file - can't type in box

4 Posts
2 Users
0 Reactions
189 Views
(@agwalsh)
Posts: 100
Estimable Member
Topic starter
 

cvxcvx

I have a large flat file, which I got from Kaggle which I want to normalise into a number of tables: gender, neighbourhood, NoShow and the actual appointments table to load into a PowerBI model. What I have in mind is to start with the flat file and then create a number of queries to create the gender, neighbourhood, NoShow  tables using the Reference option e.g. start with the flat file and then use the Reference option to create another query in which I would extract just the neighbourhood details and remove duplicates - thereby giving me a table for neighbourhood. I also want to create a calendar table as well..Am I on track with this? Am attaching just a data sample. 

 
Posted : 12/02/2019 1:04 pm
(@mynda)
Posts: 4761
Member Admin
 

Hi Anne,

I can't see any benefit to creating a neighbourhood table since there is only one column containing this data.

If there were multiple columns e.g. neighbourhood, suburb, postcode that you could move to a dim table and then remove from the fact table all but the column you need for the relationship, then it would be more efficient to create a dim table for this data. However, as it is, there'd be no benefit. In fact it would just add data to your model.

Mynda

 
Posted : 13/02/2019 8:13 am
(@agwalsh)
Posts: 100
Estimable Member
Topic starter
 

So if I wanted to use this flat file as the basis for a PowerBI demo - should I normalise it with reference queries or just go with it as is? I want to use it to demo different outcomes of NoShow by using slicers by gender, neighbourhood and possibly day of the week. (Still flailing around a bit with it - so just want to see what is the most effective way to set up the data..

 
Posted : 14/02/2019 5:30 am
(@mynda)
Posts: 4761
Member Admin
 

Hi Anne,

For demo purposes you could create a neighbourhood hierarchy which would give you a reason to have a dim table. e.g. Nova Palestina and Conquista map into Region 1 and Mata Da Praia and Andorinhas map into Region 2 and so on.

Otherwise just use it as is.

Mynda

 
Posted : 14/02/2019 7:10 am
Share: