Hi Mynda, Phil and fellow travellers,
Thank you for your great, free tutorial on web-scraping, Mynda (and thank you, Phil, for your latest webinar on the Advanced Editor, which usefully added to the course content).
I have a problem with messy websites and wonder whether there is some 'try, otherwise' solution.
I've made a custom function that brings and transforms about 50 columns of data from a public website, at the moment with a few hundred rows.
When it works, it brings back 1-2% errors, which I understand and can handle. But there have been a couple of days of catastrophic errors that just crash the whole script. The custom function goes to each web page in turn, applies filters and other transformations, and then looks to collect data. I've found that if the filters find no relevant data on a page, the query breaks and will not continue.
Is there a function or an application of try/otherwise that will let the custom function skip the failed URL and just go on to the next one?
As always, I appreciate any help you feel that you can give and really don't mind if there are more pressing things that need to occupy your time.
I'm a huge fan and looking forward to moving on to Power BI once I've mastered your PQ and PP courses,
All best wishes,
Simon
Hi Simon,
Great to hear you're making progress with the courses!
In lesson 6.11 of the Power Query course I cover error trapping using try/otherwise. Have you seen it?
Mynda
Hi Mynda, thank you for your reply. Yes, I reviewed 6.11 before writing. Sorry to be the dull boy at the back of the class, but I was unsure whether it would work with a function that is going through a full script for each web-page. I'll take another look and do some experiments now that you have given me the confidence that the answer is in there.
Happy summer... golly it's turned very autumnal here, just going out for a walk - many waterproof layers, gloves, etc. Can't wait for the spring now.
Simon
Hi again Mynda,
That's brilliant! I tried:
Source = try Web.Page(Web.Contents("https://www.somewebaddy/"&pagestart&"/somesuffix")) otherwise 0,
I ran it against some clean pages and the dirtiest set of web pages in my sample, and it worked like a dream.
Many thanks,
Simon
Great to hear, Simon!
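For anyone finding this thread later, here is a minimal sketch of the "skip the failed URL and carry on" idea discussed above. It is only an illustration under stated assumptions: the Pages list and the GetPage function are hypothetical stand-ins (GetPage fakes a failure for one page rather than calling Web.Contents), and the fallback is null rather than the 0 Simon used, purely so the failed pages can be dropped with List.RemoveNulls.

```powerquery
let
    // Hypothetical list of page identifiers; stands in for the real URL parts
    Pages = {"1", "2", "3"},

    // Hypothetical stand-in for the custom scraping function.
    // It raises an error for page "2" to imitate a page with no relevant data.
    GetPage = (pagestart as text) as table =>
        if pagestart = "2"
        then error Error.Record("DataSource.Error", "No relevant data on page " & pagestart)
        else #table({"Page", "Value"}, {{pagestart, Number.FromText(pagestart) * 10}}),

    // Wrap each call in try ... otherwise so a failed page yields null
    // instead of stopping the whole query
    Results = List.Transform(Pages, each try GetPage(_) otherwise null),

    // Drop the failed pages and append the tables that did load
    Good = List.RemoveNulls(Results),
    Combined = Table.Combine(Good)
in
    Combined
```

With these stand-in values the result should contain only the rows for pages 1 and 3. In a real query, GetPage would be replaced by your own function, and the same try ... otherwise wrapper could sit either around the function call, as here, or around the Source step inside the function, as in Simon's version.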