I will try to describe this the best I can with my limited understand of Power Pivots. And I do not mean to be whiny.
From all the material that I have ever seen, nothing seems to describe (well enough for me) the basic concept of what Power Pivot is used for. I was getting confused about how the data is stored in Excel, how is it reference, etc.
From what I have gathered so far is seems that Power Pivot is used to collect (and copy) data into an Excel file (*.xlsx), but the data is not copied into an Excel Sheet. I guess there is a benefit of not storing the data in an Excel Sheet. Power Pivot maintains a link to where it got the data so that the result set can be refreshed. So technically, I could refresh all my Power Pivot datasets, go offline, and analyze the data and create reports show it off.
It appears that Power Pivot is a built-in limited version of MS Access. Does that sound right? The branding name of "Power Pivot" is more confusing to me. We should be expecting some kind of branding name like "Pivot Cloud" soon. LOL!!! 😉
Have I got the basic concept down?
So what about Power Query. How is Power Query different from Power Pivot?
Hi Troy,
I think you've got a pretty good idea of what Power Pivot does. Another way to look at it is to think of it as a blend between Access; because you can work with big data in multiple tables and create relationships between those tables, and Excel PivotTables; because it's in Excel and you can then create PivotTables from those tables.
The data is stored in the Power Pivot model which is in Excel, but not in a worksheet. You can only have one model per Excel file and yes, after refreshing/getting your data you can then work with the data offline to create measures with DAX (similar to custom calculated columns and fields but more powerful) and PivotTables.
On my Power Pivot course page I describe a bit more about it:
https://www.myonlinetraininghub.com/power-pivot-course
So, Power Pivot is all about working with big data in Excel using PivotTables and writing powerful formulas using DAX (the new formula language for Power Pivot).
Power Query is all about getting and cleaning data so you can use it in Excel formulas, PivotTables, Power Pivot or anywhere else you might use Excel data. I wrote this post last week to try and explain the differences:
https://www.myonlinetraininghub.com/power-query-and-power-pivot-definitive-guide
Let me know if you have any questions.
Mynda
Thanks,
Hi Troy,
If the website data is in a HTML table, as opposed to JavaScript, I would connect from Power Query to the web and get the data and clean it. Otherwise I'd import the Excel file with Power Query, assuming the corrupt error didn't cause a problem. Otherwise import the CSV with Power Query.
If you have a lot of data then load the query direct from Power Query to Power Pivot and analyse it from there. If it's not too big then you might get away with loading it into a spreadsheet and using regular PivotTables to analyse it.
Mynda
Thought I would describe my success and troubles and looking for constructive criticism in that I am doing this in an efficient way.
The report dataset is 1,500 rows. I am using MS Excel 2016 (32-bit).
I have been unable to get Pivot Query to connect to the website directly, but will keep trying. I have been able to download the data and it looks like the following image (attachment: report_example_html.txt). As you can see the web-application is providing a result set using an html table.
By the way, the original file is not beautified like this. I used an editor to beautify and organized the source html. Also, I removed the oringal data and generated this data to protect the innocent. 😉
(Note: Image upload size constraint is 102,400 bytes. Can this be bumped up to maybe 256,000 bytes?)
I have the result set file downloaded to my desktop and it appears that Query Pivot is able to import the html table data. While importing the data, the "Emp ID" gets auto-defined as a number and the leading zeros get truncated off. So "0011434856" becomes "11434856". So within Query Editor, I add a column, change the column type to Text and padded the data with zeros with something like this:
Text.PadStart([EmpID],10,"0")
Now I have two columns like this:
Emp ID2, Emp ID, . . . . . .
"0011434856", "11434856", . . . .
Would there be a more efficient way to do this that I could eliminate this extra step?
After completing this task, I click "Close and Load" and the data set gets loaded into a spreadsheet as a table.
Please remind me how I am to import this into PowerPivot?
The idea is to import into PowerPivot which adds this to the data model for efficient storage of the data set, correct?
For some reason, I am unable to upload this file: (attachment: report_example.html.txt). The content is like this:
<table cellspacing="0" cellpadding="6" id="reportDataGrid" style="color:#333333;border-style:None;border-collapse:collapse;">
<tr style="color:White;background-color:#425573;font-size:10pt;font-weight:bold;">
<td>Emp ID</td>
<td>Last Name</td>
<td>First Name</td>
<td>E-Mail</td>
<td>Expected Start Date</td>
<td>Dept Code</td>
<td>User Type</td>
<td align="center">
Security Training
<br />
(Module 1)
</td>
<td>Completion Date</td>
<td>Last Viewed Date</td>
<td align="center">
Privacy Training
<br />
(Module 2)
</td>
<td>PA Completion Date</td>
<td>PA Last Viewed Date</td>
<td>Account Status</td>
<td>Last Login</td>
</tr>
<tr class="contenttext" style="background-color:White;">
<td>
<a href="javascript:getUserInfo('2001891887')">2001891887</a>
</td>
<td>RILEY</td>
<td>Josh</td>
<td>[email protected]</td>
<td>1/11/2016</td>
<td>XNV2H</td>
<td>FELLOW</td>
<td align="center">Y</td>
<td>6/2/2016</td>
<td>6/2/2016</td>
<td align="center">Y</td>
<td>6/2/2016</td>
<td>6/2/2016</td>
<td>A</td>
<td>6/4/2016</td>
</tr>
<tr class="contenttext" style="background-color:#EAE9E7;">
<td>
<a href="javascript:getUserInfo('0011434856')">0011434856</a>
</td>
<td>MARSH</td>
<td>Nathan</td>
<td>[email protected]</td>
<td>1/12/2003</td>
<td>XNV2</td>
<td>EMPLOYEE</td>
<td align="center">Y</td>
<td>5/5/2016</td>
<td>5/5/2016</td>
<td align="center">Y</td>
<td>5/5/2016</td>
<td>5/5/2016</td>
<td>A</td>
<td>6/5/2016</td>
</tr>
<tr class="contenttext" style="background-color:#EAE9E7;">
<td>
<a href="javascript:getUserInfo('0011434860')">0011434860</a>
</td>
<td>PEARCE</td>
<td>Joel</td>
<td>[email protected]</td>
<td>1/12/2003</td>
<td>XNV2G</td>
<td>CONTRACTOR</td>
<td align="center">N</td>
<td> </td>
<td> </td>
<td align="center">N</td>
<td> </td>
<td> </td>
<td>A</td>
<td>6/5/2016</td>
</tr>
<tr class="contenttext" style="background-color:#EAE9E7;">
<td>
<a href="javascript:getUserInfo('0011434871')">0011434871</a>
</td>
<td>WILLIAMS</td>
<td>Henry</td>
<td>[email protected]</td>
<td>1/12/2003</td>
<td>XNV1</td>
<td>EMPLOYEE</td>
<td align="center">N</td>
<td> </td>
<td> </td>
<td align="center">N</td>
<td> </td>
<td> </td>
<td>A</td>
<td>6/5/2016</td>
</tr>
<tr class="contenttext" style="background-color:#EAE9E7;">
<td>
<a href="javascript:getUserInfo('0011434961')">0011434961</a>
</td>
<td>Cooke</td>
<td>Marlon</td>
<td>[email protected]</td>
<td>1/12/2003</td>
<td>XNV126</td>
<td>CONTRACTOR</td>
<td align="center">N</td>
<td> </td>
<td> </td>
<td align="center">N</td>
<td> </td>
<td> </td>
<td>A</td>
<td>6/5/2016</td>
</tr>
</table>
Hi Troy,
Just in reply to your queries about uploading files, I've increased the max image upload size to 200KB, but really I would hope that snapshots/screenshots don't approach this size. We need to control the amount of space used on our server by uploads and I have to set the limit somewhere.
If you need to reduce the size of an image you can resize it, and you can also compress it using a site like Kraken before uploading. If you have an image that is larger please contact us and we can perhaps arrange another method for you to get it to us.
With regards to uploading html files, I'm afraid that is not possible, as HTML can contain code that could be harmful to the site. Not yours of course, but I have to allow for bad people trying to do bad things 🙁 But as you found, renaming the file to .txt means the system interprets the code at text rather than dealing with it as HTML.
Regards
Phil
Hi Troy,
Instead of adding the Pad.Start step to fix the number formatting, can you just delete the Changed Type step that gets automatically applied, or at least remove the number formatting to the Emp ID column in that step M code?
To load the data into Power Pivot instead of a spreadsheet, simply check the 'create connection only' and 'Add to data model' buttons when you Close & Load To. If you've already loaded the queries you can change the Close & Load settings by right-clicking the query in the Workbook Queries pane > Load To.
Let me know if you get stuck.
Mynda