I have two queries that I am trying to merge, so created some dummy data to show the example.
I want the Bus Route and Trip ID to match up in both queries, but the 'Transaction' Date/Time to fall between/equal the Arrival Time and Departure Time of the 'Tracking Data'.
I've also created an additional conundrum of a transaction time not falling between an arrival or departure time, where the ideal scenario would be to match to the closest from/to time, but I can live with this not being possible.
Thanks
Hi James, you can use the append queries, sort on time and fill, this will do the trick for you.
In the end of this YouTube they will demonstrate this fun trick.
https://www.youtube.com/watch?v=o9qAov3kSN4
Grtx, Jeroen
Hi Jeroen,
That works well on the sample file, but unfortunately on the real version there's too much data that gets assigned incorrectly when there is missing data in the TrackingData table. For instance, if Trip ID '1' has no data for 5 days of the month for whatever reason, all of the PassengerData is put onto the first 'Bus Stop Name' stop of trip 2 of the first day of the month.
Any further advice is appreciated.
Could you perhaps provide a sample with more realistic scenarios then? Working off overly simplified examples often results in overly simplified solutions. And please include the expected end result.
Hi Riny, that's fair - I've added a slightly more close to real life version of some data from a few years ago.
The full picture is that within the data I can access, it isn't possible to pinpoint the actual bus stop that passengers are boarding, only by stages (which is a group of bus stops), but I do know the time that a ticket was sold. The tracking data though will give me the arrival and departure time of each stop. The bus stop data gives the GPS coordinates of a bus stop.
I've started off the query by merging the tracking and stop data sources, now I want to match the ETM (electronic ticket machine) data to the tracking data. The Route and Trip ID from each of the data sources should match, and then the theory is that a ticket would be sold at some point between the arrival and departure at a bus stop, which enables the 'Boarding Stage Journey Name' and 'Passengers' to be associated with a bus stop. However, I would also want the safety net that if the ETM data cannot be matched, then it would be good if I could be associated with an arrival/departure of a stop that is within say, 10-20 seconds of an arrival or departure of a stop before ignoring it.
I hope this extra detail will help.
file
That helps, but now please give a few examples of some end results. For instance, where would you match, let's say the first 10 entries from the ETM table if you had to do it manually and what would be your reasoning?
Hi Riny,
I've added a worksheet called 'sample', which takes 'Trip ID' number 2 from the TrackingWithStopData Query result, from 21/09/2020. To the right is ETMData against the records that I would want it to match against where the transaction time is <= 'Actual Departure Time' and >= 'Actual Arrival Time'.
Hmm.... not easy. Will get back to you if not someone else has some clever ideas.
See is the attached file contains the correct query outcome. I filtered the query to only load 7A/2 for the 21st of September so that it can be compared to your example. It's difficult to get the matches on the same row but believe it's near to your requirement. The two "Golf Course" items cause trouble though. Haven't figured that on out.
Hi Riny,
Many thanks for looking at this. I think my best bet is to not try to get every record individually, and rather just sum the passengers in Power Pivot instead. It won't give the same flexibility I was hoping to get, but it does mean I get better insights than I can get currently.
All the best,