APPENDIX E
Ridesourcing Demand and Transit Capacity Calculation
Shared Mobility and the Transformation of Public Transit

Overview of data collection

To collect the data, we built a set of scripts in the R and Python programming languages that did the following:

1. For each metro geography, built files with tract-level counts of a variety of Census variables, which we used to weight the random tract selection in the next step.
2. Each hour, queried the Uber API for the estimated wait time and price for each of 1000 theoretical trips in the study cities, and stored the responses for later analysis.

from Uber API

For proprietary reasons, ridesourcing companies are extremely protective of their actual trip data, and the researchers were unable to secure an anonymized or aggregated set of trip data for this phase of the study from either of the two largest ridesourcing companies, Uber or Lyft. However, Uber does provide a way to request information about its services via its application programming interface (API), a portal through which two computers can pass specific information back and forth in a structured way. In the case of the Uber API, a client computer can ask the API for a cost and time estimate for a ride between a specific origin and destination at that moment in time. The Uber smartphone app uses the API to get information, request rides, and interact with the user's account; Uber also provides documentation of, and limited access to, the API to third-party software developers.

Uber granted the researchers access to its API for a limited number of requests per hour (1000 each of time and price). All of the queries we made were to a purely informational portion of the API, which did not generate actual ride requests or spoof calls for service. By systematically querying the API throughout the day and week, feeding it origin/destination pairs from specific points providing coverage of our study cities, we gradually assembled a picture of how ridesourcing availability and demand vary across time and geography.
The response from the Uber API contains several potentially interesting data points. The most useful for inferring supply and demand are an estimated time, in minutes, before an Uber car could reach the origin point, and a price estimate. The price estimate includes a component called the surge multiplier, a factor applied to the base price of a ride at times when demand for rides is high in a specific area. Because surge multipliers are limited in time and in geography, and because they vary along a scale from 1 to more than 6 (a multiplier of 6 means a rider would pay six times the base price), they can tell us something about the relative level of demand at a given point and time.

For each study city, we chose to limit the geographical extent of our queries to the Census tracts constituting the core county of each metro area. With tract counts ranging from 180 (DC) to more than 2300 (Los Angeles), we were unable to query the full extent of our regions at the tract level every hour. Instead, we employed a weighted random sampling method for an initial four-week round of data collection, and for a second four-week round narrowed the view to four core counties that could be fully covered every hour (Austin, San Francisco, Seattle, and Washington, DC).
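The weighted random selection of tracts described above can be sketched as follows. This is a minimal illustration, not the study's actual script: the function name `sample_tracts`, the tract identifiers, and the weights are hypothetical, and the choices to sample with replacement and to weight origins and destinations identically are assumptions the source does not confirm.

```python
import random


def sample_tracts(tract_weights, n_trips, seed=None):
    """Draw (origin, destination) tract pairs for one hour of queries.

    tract_weights maps a tract ID to a non-negative weight, for example a
    Census count for that tract. Tracts are drawn with probability
    proportional to weight, with replacement (an assumption).
    """
    rng = random.Random(seed)
    tracts = list(tract_weights)
    weights = [tract_weights[t] for t in tracts]
    # random.choices samples with replacement, proportional to the weights.
    origins = rng.choices(tracts, weights=weights, k=n_trips)
    destinations = rng.choices(tracts, weights=weights, k=n_trips)
    return list(zip(origins, destinations))


# Hypothetical weights; a zero-weight tract is never selected.
example_weights = {"tract_A": 100, "tract_B": 300, "tract_C": 0}
pairs = sample_tracts(example_weights, n_trips=1000, seed=1)
```

Passing a fixed `seed` makes a given hour's sample reproducible, which can help when re-checking stored responses against the tract pairs that generated them.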
Combined, the two rounds of collection produced some 1.07 million usable observations for the study regions.

Scheduled transit capacity from GTFS

To determine how Uber rides corresponded with transit trips, the researchers compared the Uber data with agencies' General Transit Feed Specification (GTFS) service information. For the transit capacity side of the comparison, we started from the assumption that transit agencies schedule service in accordance with customer demand, and used the GTFS schedule data to build estimates of service capacity at the zip code level across the day and week.

The researchers were assisted in assembling the transit capacity analysis by our partners at Sam Schwartz Engineering, who gathered all relevant transit agencies' GTFS feeds and programmatically transformed them into hourly counts of trips, vehicles and vehicle types, and maximum wait times for each stop in the system (limited, like the ridesourcing data, to the core county of each region). Using standard load factors and agency-specific vehicle sizes to estimate capacity at each stop, we arrived at a measure of seat-stops per hour for each stop; schedule information allowed us to calculate typical headways at each stop. We then assigned each stop to its containing zip code and generated aggregate measures of seat-stops per hour and average headways at the zip code tabulation area (ZCTA) level.

Because of differences in how individual agencies convert their operation schedules into GTFS (WMATA's feed in particular has a number of unusual features), cross-agency comparisons based on these data should be approached with caution, especially for more sensitive statistical analyses. However, in aggregated form, the data do serve to usefully illustrate the fluctuation in scheduled service levels across the day and week. Summary maps of the transit and ridesourcing data are in Appendix F.
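As a rough illustration of the seat-stops and headway aggregation described above, the sketch below works from hourly trip counts per stop. It is not the partners' actual code: the function names, vehicle sizes, stop IDs, and ZCTA code are hypothetical, and the formulas (seat-stops as trips times seats times load factor, headway as 60 minutes divided by trips in the hour, ZCTA headway as a simple mean over stops) are assumptions about one plausible way to compute these measures.

```python
from collections import defaultdict
from statistics import mean


def seat_stops(trips_by_vehicle, seats_by_vehicle, load_factor=1.0):
    """Seat-stops at one stop in one hour: trips x seats x load factor."""
    return sum(
        n_trips * seats_by_vehicle[vehicle] * load_factor
        for vehicle, n_trips in trips_by_vehicle.items()
    )


def headway_minutes(trips_in_hour):
    """Typical headway in minutes, assuming evenly spaced trips."""
    return 60.0 / trips_in_hour if trips_in_hour else None


def aggregate_by_zcta(stop_hours, stop_to_zcta, seats_by_vehicle, load_factor=1.0):
    """Sum seat-stops and average headways over the stops in each ZCTA."""
    totals = defaultdict(float)
    headways = defaultdict(list)
    for stop_id, trips_by_vehicle in stop_hours.items():
        zcta = stop_to_zcta[stop_id]
        totals[zcta] += seat_stops(trips_by_vehicle, seats_by_vehicle, load_factor)
        hw = headway_minutes(sum(trips_by_vehicle.values()))
        if hw is not None:
            headways[zcta].append(hw)
    return {
        zcta: {
            "seat_stops_per_hour": totals[zcta],
            "avg_headway_min": mean(headways[zcta]) if headways[zcta] else None,
        }
        for zcta in totals
    }


# Hypothetical inputs for a single hour.
seats = {"bus": 40, "rail_car": 60}
stop_hours = {"s1": {"bus": 4}, "s2": {"bus": 2, "rail_car": 6}}
stop_to_zcta = {"s1": "20001", "s2": "20001"}
result = aggregate_by_zcta(stop_hours, stop_to_zcta, seats)
```

With these inputs, stop `s1` contributes 160 seat-stops at a 15-minute headway and stop `s2` contributes 440 seat-stops at a 7.5-minute headway, so the ZCTA totals 600 seat-stops per hour with an average headway of 11.25 minutes.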
Validity of surge pricing as a demand indicator

Though Uber readily acknowledges that surge pricing is its system's way of signaling high demand to both drivers and customers, we validated our interpretation of this indicator by comparing our own additional scrape of these data for Brooklyn, New York, to trip data released by the New York City Taxi and Limousine Commission (TLC). While the samples were not concurrent (the TLC data covered the period January-June 2015, while the API data were collected between October and December 2015), they do show contours in their hourly and daily fluctuations that resemble both one another and the surge pricing patterns in the seven study cities, with the highest use on weekend late nights and moderate rush hour peaks on weekdays (the two sources are shown in Figure E-1).

While the surge data showed less range than in other cities and the fit was far from perfect, statistical modeling showed that the surge multiplier, day of week, and hour of the day were fairly strong predictors of the actual rider count. The surge multiplier tended to overestimate weekday demand, while moderating the weekend nights somewhat, but the overall pattern remained. Possible explanations for these differences are differing seasonality of the data, actual changes in trip patterns, or that the surge multiplier is a better predictor of demand in a particular location than for a large area.
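The hourly and daily contours compared above can be summarized with an hour-by-day profile of the surge multiplier. The sketch below shows one simple way to build such a profile; it is illustrative only, the function name and sample observations are hypothetical, and it does not reproduce the statistical modeling used in the study.

```python
from collections import defaultdict
from datetime import datetime


def mean_surge_by_day_hour(observations):
    """Average surge multiplier for each (weekday, hour) cell.

    observations is an iterable of (timestamp, surge_multiplier) pairs.
    Weekdays follow datetime.weekday(): Monday is 0, Sunday is 6.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for timestamp, surge in observations:
        cell = (timestamp.weekday(), timestamp.hour)
        sums[cell][0] += surge
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


# Hypothetical observations: two Saturday late-night readings, one Monday morning.
sample = [
    (datetime(2015, 10, 3, 23, 15), 2.0),
    (datetime(2015, 10, 10, 23, 40), 3.0),
    (datetime(2015, 10, 5, 8, 5), 1.0),
]
profile = mean_surge_by_day_hour(sample)
```

A profile of this shape could be laid alongside rider counts grouped by the same (weekday, hour) cells to compare the two sets of contours.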
Note: Data not concurrent; TLC data covers January–June 2015, while API data was collected October–December 2015.
Figure E-1. TNC rider count data from New York City TLC trip reporting (top) vs. surge multiplier data from Uber API (bottom), for locations covering Brooklyn.