Forum

Power query very sl...
 
Notifications
Clear all

Power query very slow

5 Posts
2 Users
0 Reactions
997 Views
(@td99)
Posts: 22
Eminent Member
Topic starter
 

Hello,

 I have two csv files from my heat pump, each with two columns: date in unix time + electrical input, date in unix time +heat output. Both hourly readings for up to last 5 years. Currently only the last 6 months. 

Both electrical input and heat output are cumulative  values.

Thanks to this site I found the unix time conversion for PQ and how to extract the hourly actual values :-). 

Then it's a matter of grouping by day to see the performance.

Problem is that with the current 4700 input rows, this is taking 40 mins to close and load, after waiting ages for the preview. My system is an i7 980 hex-core, GTX 950, 18GB memory and SSD, Office 365. Not the latest but no slouch. Windows shows Excel memory usage going up to above 1 GB. 

Wondering if changing to office 365 64-bit might help? Or are there other solutions to the lack of speed?

I'm sure that doing it via VBA would result in it working much quicker, but PQ is so cool! (and it's new to me...)

code shown below:

let
Source = Csv.Document(File.Contents("C:UsersTrevorDropboxDocuments_AnconalaanToonToonDataElecUsed.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", Int64.Type}}),
#"Divided Column" = Table.TransformColumns(#"Changed Type", {{"Column2", each _ / 1000, type number}}),
#"Merged Queries" = Table.NestedJoin(#"Divided Column", {"Column1"}, Energy, {"Column1"}, "Energy", JoinKind.LeftOuter),
#"Expanded Energy" = Table.ExpandTableColumn(#"Merged Queries", "Energy", {"Column2"}, {"Energy.Column2"}),
#"Renamed Columns" = Table.RenameColumns(#"Expanded Energy",{{"Column2", "KW-Tot"}, {"Energy.Column2", "H-Tot"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Date", each #datetime(1970, 1, 1, 0, 0, 0 ) + #duration(0, 0, 0, [Column1])),
#"Extracted Date" = Table.TransformColumns(#"Added Custom",{{"Date", DateTime.Date, type date}}),
#"Removed Columns" = Table.RemoveColumns(#"Extracted Date",{"Column1"}),
#"Added Index" = Table.AddIndexColumn(#"Removed Columns", "Index", 0, 1),
#"Added Custom1" = Table.AddColumn(#"Added Index", "KW-Hourly", each if [Index] > 0 then (#"Renamed Columns"{[Index]} [#"KW-Tot"]) -(#"Renamed Columns"{[Index]-1} [#"KW-Tot"]) else null),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "H-Hourly", each if [Index] > 0 then (#"Added Index"{[Index]} [#"H-Tot"]) -(#"Added Index"{[Index]-1} [#"H-Tot"]) else null),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"KW-Tot", "H-Tot", "Index"}),
#"Grouped Rows" = Table.Group(#"Removed Columns1", {"Date"}, {{"KW Daily", each List.Sum([#"KW-Hourly"]), type number}, {"H-Daily", each List.Sum([#"H-Hourly"]), type number}})
in
#"Grouped Rows"

 
Posted : 06/11/2019 6:02 am
(@catalinb)
Posts: 1937
Member Admin
 

You are using row context in 2 steps, #"Added Custom1"  and #"Added Custom2".

This operation is highly expensive in resource consumption, you have to consider an alternative solution to get to the same result.

Trying to imitate excel sheet formulas is not the best approach, if you can explain the logic of those 2 columns, maybe we can come up with an alternative.

Sample data will help a lot, if you can upload.

 
Posted : 06/11/2019 3:12 pm
(@td99)
Posts: 22
Eminent Member
Topic starter
 

Hi Catalin,  the two source csv files are an hourly record of heat output and electrican input. But the values of heat and electricity are cumulative (like meter reading) and I used those two added columns to extract the actual values per hour. Basically subtracting value of previous line from current line. I've uploaded a stripped down version of the excel file, see tab "Warmtewinner"

 
Posted : 07/11/2019 8:06 am
(@catalinb)
Posts: 1937
Member Admin
 

Hi Trevor,

You'll love PQ more after this 🙂 , below is the query modified.

What I did to achieve the same results:

After your AddedIndex step, I added another Index column, but this time starting from 1 not from 0 as the previous index column.

The next trick is to Merge the current query with itself, matching Index column with Index.1 column, then expand the new column to get the columns {"KW-Tot", "H-Tot"} renamed as {"KW-Previous", "H-Previous"}. This will basically bring the next row value into the same row, as new columns. The calculated columns will be much more natural, with current row operations: [#"KW-Tot"]-[#"KW-Previous"]. Same for H-Hourly.

Now it takes 1-2 seconds to refresh.

let
Source = Csv.Document(File.Contents("C:UsersTrevorDropboxDownloadsElecUsed.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", Int64.Type}}),
#"Divided Column" = Table.TransformColumns(#"Changed Type", {{"Column2", each _ / 1000, type number}}),
#"Merged Queries" = Table.NestedJoin(#"Divided Column", {"Column1"}, Energy, {"Column1"}, "Energy", JoinKind.LeftOuter),
#"Expanded Energy" = Table.ExpandTableColumn(#"Merged Queries", "Energy", {"Column2"}, {"Energy.Column2"}),
#"Renamed Columns" = Table.RenameColumns(#"Expanded Energy",{{"Column2", "KW-Tot"}, {"Energy.Column2", "H-Tot"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Date", each #datetime(1970, 1, 1, 0, 0, 0 ) + #duration(0, 0, 0, [Column1])),
#"Extracted Date" = Table.TransformColumns(#"Added Custom",{{"Date", DateTime.Date, type date}}),
#"Removed Columns" = Table.RemoveColumns(#"Extracted Date",{"Column1"}),
#"Added Index" = Table.AddIndexColumn(#"Removed Columns", "Index", 0, 1),
#"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 1, 1),
#"Merged Queries1" = Table.NestedJoin(#"Added Index1", {"Index"}, #"Added Index1", {"Index.1"}, "Added Index1", JoinKind.LeftOuter),
#"Expanded Added Index2" = Table.ExpandTableColumn(#"Merged Queries1", "Added Index1", {"KW-Tot", "H-Tot"}, {"KW-Previous", "H-Previous"}),
#"Sorted Rows" = Table.Sort(#"Expanded Added Index2",{{"Index", Order.Ascending}}),
#"Added Custom3" = Table.AddColumn(#"Sorted Rows", "kW-Hourly", each [#"KW-Tot"]-[#"KW-Previous"]),
#"Added Custom4" = Table.AddColumn(#"Added Custom3", "H-Hourly", each [#"H-Tot"]-[#"H-Previous"]),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom4",{"KW-Tot", "H-Tot", "Index"}),
#"Grouped Rows" = Table.Group(#"Removed Columns1", {"Date"}, {{"KW Daily", each List.Sum([#"kW-Hourly"]), type number}, {"H-Daily", each List.Sum([#"H-Hourly"]), type number}})
in
#"Grouped Rows"
 
Posted : 07/11/2019 12:41 pm
(@td99)
Posts: 22
Eminent Member
Topic starter
 

Catalin, thank you so much for this! Perfect and almost instantaneous! SmileSmileSmile

 
Posted : 07/11/2019 3:55 pm
Share: