Big Data Analysis in Transportation
Global rates of urban development have increased rapidly. It is estimated that by 2030, about 70% of the world's population will reside in urban areas. Rapid urbanisation, driven by internal migration and population growth and accompanied by a rise in the number of vehicles on the roads, puts intense pressure on public transportation systems and infrastructure. Challenges such as increased pollution, road accidents and traffic congestion have arisen in denser metro areas with crowded roads. Events like the recent COVID-19 pandemic and the accelerating effects of global warming and climate change have hastened calls for decarbonization and greater efficiency in the transportation system. While transportation deficiencies are a global problem faced by all of the world's megacities, no single framework has been adopted to correct them, because no two megacities are the same.
Enter technology. Transportation and urban planning are among the areas that have changed massively due to ICT advancements, through both groundbreaking research and the creation of practical industry solutions. An updated transport structure is essential in every megacity to keep up with its present volume of traffic. Innovations like smart cities, Intelligent Transport Systems, the Internet of Things (IoT), advanced traffic information systems, autonomous, connected, electric and shared vehicles, surveillance systems, wireless network communication, integrated rail systems and navigation systems, amongst others, are examples of the technological inventions that have transformed global transportation systems in recent decades. However, these interventions pose another conundrum, one that is more complex and must be untangled before they can feed into proper transport planning.
A defining characteristic of today's society is the sheer amount of data generated daily by commuters. It ranges from data generated through modern channels, such as transaction histories, social media feeds, ride-sharing mobile applications, customer feedback, tracking and surveillance systems, hardware, firmware and roadware, navigation apps, smart cards and ticketing systems, to traditional modes of data collection such as sensors and customer surveys.
The size of the data poses another problem, as the process of collecting, storing, sorting, extracting and processing enormous data sets is incredibly cumbersome. This is complicated by the fact that the data to be processed and managed is obtained from multiple channels and in different formats. For instance, the travel patterns of commuters across different modes have to be collected and organised into neat traffic flows for optimal transport planning and traffic systems. Traditional data processing and data management methods are often labour-intensive, expensive and, due to the high degree of human involvement, prone to errors. This is especially problematic in the context of developing countries.
Thanks to transportation data analytics, commuters in modern cities now have better access to real-time traffic information such as trip distances, estimated commute times, blocked routes, accidents, congested routes and alternative routes.
However, although the data revolution has made it possible to generate, collect, store and analyze huge data sets, many modern cities still struggle with underdeveloped transportation infrastructure, which is developing at a slower rate than data technology.
What is Big Data?
Big Data often refers to data sets so large and complex that they can no longer be managed by traditional data processing applications and data management tools. Due to the influx of generated data, huge data sets have to be captured, processed, stored and managed efficiently within acceptable time frames.
A single Big Data set can range from a few dozen terabytes to several petabytes or even exabytes. One Petabyte is equal to 1,024 Terabytes, or about 1 million Gigabytes. An Exabyte is equal to 1,024 Petabytes, or over 1 quintillion Bytes. Data sets measured in petabytes and exabytes are enormous and are typically handled only by large organisations.
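The unit arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the 1,024-based (binary) multiples used in the text, and the names `UNITS` and `convert` are made up for the illustration.

```python
# Binary (1,024-based) storage units, matching the figures quoted in the text.
STEP = 1024
UNITS = {
    "KB": STEP,
    "MB": STEP**2,
    "GB": STEP**3,
    "TB": STEP**4,
    "PB": STEP**5,
    "EB": STEP**6,
}

def convert(value, from_unit, to_unit):
    """Convert a storage size between units using 1,024-based multiples."""
    return value * UNITS[from_unit] / UNITS[to_unit]

print(convert(1, "PB", "TB"))  # terabytes in one petabyte
print(convert(1, "PB", "GB"))  # gigabytes in one petabyte (about 1 million)
print(convert(1, "EB", "PB"))  # petabytes in one exabyte
print(UNITS["EB"])             # bytes in one exabyte (over 1 quintillion)
```

With decimal (1,000-based) units the results would be round powers of ten instead; the text's "about 1 million Gigabytes" reflects the binary convention.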
Big Data can also refer to structured and unstructured data generated naturally as a result of activities in transportation usage, including transactional, operational, planning and social data obtained from either traditional or new sources.
Big Data is often defined in terms of certain characteristics that serve as the main criteria for differentiating it from data of ordinary size. Gartner research describes the characteristics of Big transport and mobility Data as the 3Vs:
Volume: The increase in the amount of data as a result of the increased number of channels for obtaining and collecting it. Big Data refers to data sets so large that traditional approaches cannot collect, manage, process and analyse them within a reasonable time. Due to the massive deployment of sensors in vehicles, wearables, cell phones and other devices, sensed transportation data has grown in volume and coverage and has become more granular. The challenge with large volumes of data is determining the relevance of everything that is gathered and stored.
Velocity: The speed at which data comes in and is disseminated from its source to its destinations. Large volumes of data flow in at unprecedented speeds, must be dealt with quickly and must be passed on to road users as real-time updates. Big Data technologies allow data to be analysed while it is being generated, without first storing it in a database. The challenge is to find effective ways to manage data in a timely manner.
Variety: The broad range of data types and formats. The diversity of data sources makes the types and formats in which the data is represented extremely varied. Managing large amounts of different data is the challenge in this context.
Other V-themed characteristics include Variability (frequent changes in data that make it complicated to decipher its exact meaning within context), Value (the implicit potential of Big Data to provide more efficient, safer and cleaner transport and mobility systems) and Veracity (the trustworthiness of the data).
According to Gartner, "Big Data is high volume, high velocity and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimisation."
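The Velocity characteristic mentions analysing data while it is being generated, without first storing it. A minimal sketch of that idea is a running average that is updated one reading at a time and never keeps the readings themselves. The `speed_stream` generator and its values are hypothetical stand-ins for a live sensor feed.

```python
class RunningMean:
    """Keeps an up-to-date average without storing individual readings."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: shift the old mean towards the new value.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def speed_stream():
    """Hypothetical sensor feed yielding vehicle speeds in km/h."""
    yield from [42.0, 38.0, 40.0, 20.0]


tracker = RunningMean()
for speed in speed_stream():
    current_average = tracker.update(speed)
    print(f"reading={speed} km/h, running average={current_average:.1f} km/h")
```

Only two numbers (the count and the current mean) are held in memory, however long the stream runs.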
The process of analyzing Big Data in Intelligent Transport Systems includes several stages:
Data acquisition: the collection of the huge volume of raw data captured from specific data sources and signals, the conversion of the samples into digital values that can be manipulated, and the introduction of that digital data into the processing flow of a given data system.
Data processing: the cleansing and sorting of data obtained from the various sources, the removal of redundancies and errors within the data sets, and the storage of the data for further processing and aggregation.
Data aggregation: the conversion of data sets from an unstructured or semi-structured form to a structured form. Data is gathered, organised and expressed in a summary form that can lead to the discovery of patterns and trends.
Data delivery: analysed data is organised, presented and transmitted to end users.
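The four stages above can be sketched as a tiny pipeline. Everything here is an assumption made for illustration: the comma-separated record format (road, time, speed), the sample values and the function names do not come from any particular Intelligent Transport System.

```python
from collections import defaultdict

def acquire():
    """Acquisition: hypothetical raw readings in 'road,time,speed' form."""
    return [
        "A1,08:00,52",
        "A1,08:05,48",
        "B7,08:00,30",
        "B7,08:00,30",   # exact duplicate of the previous reading
        "B7,08:05,bad",  # corrupted speed value
        "A1,08:10,50",
        "",              # empty record
    ]

def process(raw_rows):
    """Processing: drop malformed rows and exact duplicates, parse the rest."""
    seen, clean = set(), []
    for row in raw_rows:
        parts = row.split(",")
        if len(parts) != 3 or not parts[2].isdigit() or row in seen:
            continue
        seen.add(row)
        clean.append((parts[0], parts[1], int(parts[2])))
    return clean

def aggregate(clean_rows):
    """Aggregation: summarise the cleaned rows as average speed per road."""
    speeds = defaultdict(list)
    for road, _time, speed in clean_rows:
        speeds[road].append(speed)
    return {road: sum(values) / len(values) for road, values in speeds.items()}

def deliver(summary):
    """Delivery: format the summary as messages for end users."""
    return [f"{road}: average {avg:.1f} km/h" for road, avg in sorted(summary.items())]

for message in deliver(aggregate(process(acquire()))):
    print(message)
```

Keeping each stage as a separate function mirrors the description in the text and makes it easy to swap in a real data source or delivery channel later.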
Uses of Big Data in Transportation Systems
Big Data technologies help governments and private and public transportation companies to provide high-quality services, optimize operations and cut unnecessary costs, and to do so with a high degree of accuracy.
Specific instances where big data analytics are being utilized in the transportation system include:
1. Transportation Forecasting:
Before the advent of data management and analysis, a commuter about to travel by road within the city would typically predict travel time and traffic conditions by relying on nothing more than environmental conditions, mass media bulletins and hunches. Big data technology has now made it possible to obtain accurate predictions about commute time and other necessary traffic information within minutes.
Historical data gathered from mobile network operators (call detail records, shared smartphone location data), vehicles (navigation apps or in-car navigation displays) and public transportation usage (ticketing systems, e-payment smart cards) can be fused and analysed with statistical tools and simulation techniques to predict traffic flow and produce accurate forecasts. In some cases amalgamation is not even necessary, as one of these data sets alone can provide the information needed.
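As a rough illustration of fusing historical data and forecasting from it, the sketch below pools trip records from two hypothetical sources and predicts travel time as the historical average for a given hour of day. Real forecasting systems use far richer statistical and simulation models; the source names, record format and values here are all assumptions.

```python
from collections import defaultdict

# Hypothetical historical trips as (hour_of_day, travel_minutes).
nav_app_trips = [(8, 34.0), (8, 38.0), (14, 21.0)]
smart_card_trips = [(8, 36.0), (14, 19.0), (18, 40.0)]

def build_hourly_profile(*sources):
    """Fuse all sources and compute the average travel time per hour."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of minutes, count]
    for source in sources:
        for hour, minutes in source:
            totals[hour][0] += minutes
            totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in totals.items()}

def forecast(profile, hour):
    """Return the expected travel time for an hour, or None without history."""
    return profile.get(hour)

profile = build_hourly_profile(nav_app_trips, smart_card_trips)
print(forecast(profile, 8))   # average of the three 8 a.m. trips
print(forecast(profile, 3))   # no history for 3 a.m.
```

Passing a single source to `build_hourly_profile` also works, which corresponds to the case where one data set alone is enough.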
2. Traffic Congestion Management and Commute Optimization:
Traffic congestion is one of the major problems of cities and a source of worry for urban planners. It is associated with reduced productivity, slower economic development, higher stress levels and increased air pollution through heightened greenhouse gas emissions. No city has proven to be completely free of congestion, in spite of the many ambitious ideas proposed by city managers and road users alike.
With Big Data analytics, however, traffic flow can be improved. Big Data technologies can be combined with IoT technology to collect information about traffic flow and send it for analysis using machine learning algorithms. Real-time traffic information can then be transmitted digitally to passengers' phones, and alternate routes can be suggested to redistribute traffic and offset congestion.
Big Data can also be used to locate causes of congestion such as inefficient parking layouts, to improve traffic light signal timing and to advance multimodal transportation options and routes. This allows for a more even distribution of traffic flow along different routes. One megacity that has successfully tackled the traffic congestion problem is Singapore, which, with the help of Big Data analytics in transport planning, has designed a highly efficient transportation route layout and integrated its citywide bus network with the rail system to promote multimodal journeys for commuters.
Optimized routing, in turn, leads to better travel times, lower stress levels in commuters, decreased fuel consumption and a smaller carbon footprint, thus reducing pollution.
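The idea of suggesting alternate routes from real-time traffic information can be sketched with Dijkstra's shortest-path algorithm, where each road segment is weighted by its current travel time. The road network, place names and travel times below are invented for the example, and this is not a description of how any particular city or navigation app does it.

```python
import heapq

def fastest_route(graph, start, goal):
    """Dijkstra's algorithm over travel times; returns (minutes, path) or None."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, minutes in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + minutes, neighbour, path + [neighbour]))
    return None

# Hypothetical network: travel time in minutes for each road segment.
roads = {
    "Home": {"Main St": 5.0, "Ring Rd": 9.0},
    "Main St": {"Office": 6.0},
    "Ring Rd": {"Office": 7.0},
}

free_flow = fastest_route(roads, "Home", "Office")
print(free_flow)

# A real-time update reports congestion on the Main St segment.
roads["Main St"]["Office"] = 20.0
congested = fastest_route(roads, "Home", "Office")
print(congested)
```

Re-running the search after each traffic update is the simplest design; it is adequate for a small graph like this one.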
3. Road Infrastructure and Vehicle Infrastructure Management:
Given that road repair and development is one of the most frustrating and frequently delayed activities in a transportation system, one of the innovative solutions offered by Big Data technologies is a way of gathering information about transport infrastructure. Using a mobile app, residents and commuters who have spotted deficiencies in road surfaces can report them in the form of photos and videos. Residents can also make a note of jolts and potholes, and the app can use the smartphone's accelerometer, together with the phone's location, to record where these faults are. As a result, such problems can be corrected at an early stage, saving costs that would otherwise have been incurred had the repairs been made later.
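A very simple way to picture the accelerometer idea is a threshold test: a reading whose vertical acceleration deviates sharply from gravity is treated as a jolt, and the location recorded with it is flagged. The threshold, coordinates and readings below are assumptions chosen for the sketch; a real app would need calibration and filtering.

```python
GRAVITY = 9.81        # m/s^2, resting vertical acceleration
JOLT_THRESHOLD = 4.0  # m/s^2, assumed deviation that counts as a jolt

def detect_jolts(samples, threshold=JOLT_THRESHOLD):
    """Return the (lat, lon) of every sample whose vertical acceleration
    deviates from gravity by more than the threshold."""
    return [
        (lat, lon)
        for lat, lon, vertical_accel in samples
        if abs(vertical_accel - GRAVITY) > threshold
    ]

# Hypothetical samples: (latitude, longitude, vertical acceleration in m/s^2).
samples = [
    (1.0001, 2.0001, 9.8),
    (1.0002, 2.0002, 15.2),  # sharp upward jolt
    (1.0003, 2.0003, 9.6),
    (1.0004, 2.0004, 3.1),   # sudden drop, e.g. a wheel entering a pothole
]

suspected_faults = detect_jolts(samples)
print(suspected_faults)
```

Reports from many phones at the same location would make a flagged fault more credible than a single reading.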
Big Data also supports vehicle wellness and maintenance through wear analysis, using data collected from sensors installed in the vehicles. These sensors provide real-time information about the vehicle's performance, travel speeds, transit time and engine idling periods. Sensors also monitor the health of the vehicle's engine. Errors and faults that might otherwise have compromised the safety of the user can be predicted, and timely preparations for maintenance can be made.
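One simple way to flag a vehicle for maintenance from sensor data is to watch a moving average of a reading, such as engine temperature, and raise a flag when it stays above a limit. This is a sketch of that one technique under assumed values; the limit, window size and readings are hypothetical, and real predictive maintenance models are considerably more involved.

```python
from collections import deque

def needs_service(readings, limit, window=3):
    """Return True if the moving average over `window` consecutive readings
    ever exceeds `limit`; a single brief spike is smoothed out."""
    recent = deque(maxlen=window)
    for value in readings:
        recent.append(value)
        if len(recent) == window and sum(recent) / window > limit:
            return True
    return False

# Hypothetical engine temperatures in degrees Celsius.
steady_rise = [90, 92, 91, 104, 106, 108]
single_spike = [90, 115, 91, 92]

print(needs_service(steady_rise, limit=100))   # sustained high readings
print(needs_service(single_spike, limit=100))  # one-off spike only
```

Using a moving average rather than individual readings is a design choice that trades a slightly later warning for fewer false alarms.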
4. Logistics and Supply Chain Management:
Big data technologies provide real-time route optimization for freight companies, speeding up deliveries. Satellite navigation technology allows freight trucks, airplanes and ships to be tracked. This data is collected, processed and analysed instantly so that workers can make quick routing choices by monitoring the location of the freight vehicle and the delivery location, the speed of the vehicle, break times and driver capabilities. Identifying the best routes facilitates quicker delivery times. Where the consumer changes their delivery address, Big Data technology allows the driver to find an optimal route to the new destination with ease. Big Data also helps to solve transport logistics issues by reducing idle mileage for freight transport and by identifying additional windows along routes for loading partially filled trucks. Deliveries can be made in a reliable, efficient and transparent manner, which improves trust and the relationship between service provider and customer.
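Idle mileage, mentioned above, can be expressed as the share of a truck's total distance that is driven without a load. The sketch below computes that share from a list of trip legs; the leg format and figures are assumptions for illustration only.

```python
def idle_mileage_share(legs):
    """Fraction of total distance driven empty.

    Each leg is (distance_km, load_tonnes); a load of 0 means the truck
    was travelling empty on that leg.
    """
    total = sum(distance for distance, _load in legs)
    if total == 0:
        return 0.0
    empty = sum(distance for distance, load in legs if load == 0)
    return empty / total

# Hypothetical day of driving for one truck.
legs = [(120, 8.0), (40, 0), (60, 3.5), (80, 0)]

share = idle_mileage_share(legs)
print(f"{share:.0%} of the distance was driven empty")
```

Tracking this figure per truck over time is one way to see whether routing changes are actually reducing empty running.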
Big Data analytics is not without its own challenges and developments.
Data privacy is a major concern for most people. Although individual commuter and road user data obtained from mobile and smartphone sources can be anonymized and protected, many people do not trust authorities or app suppliers. In addition, a data lake containing information on an entire population's identities, vehicles and daily movements is a prime target for hackers. Perceptions of privacy are highly likely to influence the value of big data. Regulations have been passed to protect data privacy, including geographic location data associated with an individually identifiable device. However, individuals are more willing to share their location data if they are younger or if they perceive a clear benefit in doing so.
Big Data is also set to affect the skill sets required to work in the transport industry in future. As operations become more complicated, improvements in service and efficiency will depend more on systems and data, which will need to work together seamlessly. Systems will eventually be developed to master these processes better than their human counterparts, and Artificial Intelligence and robotics have already started making strides in this direction. There will be less dependence on operators' skills and knowledge, and the skill sets needed will shift towards data specialists who manage performance. The key is to make sure that these skill sets are available in the right quantity and at a sufficient level.