In the course example file 4.11 is a text column 'ShipAdress' which is a mix of streetnames and housenumbers. In some cases the numbers are shown as the leftmost part and in some cases as the rightmost part.
How can I create a column out of it containing only the streetnames and a separate column only containing the numbers? For instance to be able afterwards to merge the 2 columns and have all the address numbers at the end as the rightmost part like we normally have in the Netherlands?
Hi Wico,
Welcome to our forum. Looking at the data, there's not much consistency that you can leverage to automate separating of the street numbers from the street name.
I thought maybe you could separate the addresses into countries and then apply specific rules based on each country's address layout but they aren't consistent either e.g. France has two addresses with a comma after the number and one without. Germany has two addresses with a full stop after the street name and one without. Switzerland is similar. Venezuela is completely different again.
If you could group the countries into address formats (numbers at the front and numbers at the end) and separate the data into different tables you could split the addresses with numbers at the front by the space delimiter and then remove any commas/full stops. Likewise for the addresses with numbers at the end. Then you could merge the two tables back together.
I don't know how big your dataset is and how many countries you need to identify address formats for, so this may be an easy job, or not!
I hope that points you in the right direction.
Mynda
Hi Wico,
A while ago, found a very interesting solution, that opens another hidden gate for Power Query.
Seems that we can use the power of JavaScript language within Power Query, therefore we can use its strong web libraries, including Regular expressions.
Here is a function that I use, based on this idea:
Basically, we build a Web page using our text, when PQ loads the webpage, the script will be run to evaluate a regular expression pattern against your text.
You can build very complex patterns to handle any specific exceptions in your data patterns, here is a website that can help.
<textarea>
let
RunScript = Web.Page(
"<script>
document.write(
"""&TextString&""".match(
new RegExp('"&Pattern&"','g')
).join(';')
);
</script>"
),
Matches = try RunScript{0}[Data]{0}[Children]{1}[Children]{0}[Text] otherwise null
in
Matches
</textarea>
Attached is a file with an example based on your data.