In modern transportation, pavement is one of the most important civil infrastructures for the movement of vehicles and pedestrians. Pavement service quality and service life are of great importance to civil engineers because they directly affect the service provided to road users. Therefore, monitoring the health status of pavements and conducting necessary maintenance are essential for public transportation safety. Many types of pavement damage can be detected and analyzed by monitoring structural dynamic responses and evaluating road surface conditions. Structural responses can be monitored through different sensor technologies. In the past, road surface conditions were usually evaluated manually, by identifying pavement distress during field inspections. With the rapid development of transportation infrastructure, it has become increasingly difficult to manually monitor and analyze the service status of all roads owing to the sheer extent of the network. In recent years, advanced technologies have been widely used to monitor structural dynamic responses and evaluate road surface conditions. These include intrusive sensing techniques for monitoring pavement structure conditions, image processing techniques for evaluating road surface conditions, and machine learning methods for analyzing or predicting the performance of pavement materials and structures. These technologies can partly replace manual inspection and thus help accelerate decision-making and improve the efficiency of pavement maintenance.
For pavement sensing applications, this state-of-the-art review mainly summarizes the development of intrusive sensors, that is, sensors embedded in the pavement structure to monitor its dynamic mechanical response, as well as Internet of Things (IoT) technologies in pavement monitoring. For image processing techniques, this study reviews typical processing algorithms that can effectively identify the type of pavement distress. For machine learning methods, this review introduces fundamental theories and relevant applications in pavement engineering. Generally, these advanced methods have the following advantages: ① monitoring of the pavement dynamic response over a relatively long period; ② automatic or semi-automatic detection and identification of typical pavement distresses; and ③ savings in time and labor. Their possible disadvantages include: ① compared with traditional methods, they may require specially trained and skilled pavement engineers; ② analysis based on these three methods may require large amounts of monitoring data; and ③ many of the underlying theories are still under development and thus may not be as mature as the traditional approaches.
During the past decades, the application of these three technologies has remarkably advanced pavement monitoring and analysis, which has helped improve pavement service quality and performance. To help civil engineers better understand these technologies, this review summarizes the recent state of the art in intrusive sensing techniques, image processing techniques, and machine learning methods for pavement monitoring and analysis. In addition, suggestions for possible developments in pavement monitoring and analysis using these approaches are provided.
2. Intrusive sensing and IoT technologies in pavement monitoring
Currently, advanced sensing technology for pavement monitoring mainly consists of non-intrusive and intrusive methods. The non-intrusive methods include visual inspection, pneumatic tubes, cameras, light barriers, and radar systems among others, which are very convenient due to their non-destructive nature and easy implementation. However, they are easily affected by weather conditions. The intrusive methods, i.e. sensors embedded in the pavement structure, can monitor the dynamic response of pavement under repeated vehicle loads and various environmental factors. The dynamic response of pavement can be analyzed to acquire the information on traffic and structural status, which is important for traffic management and road maintenance. This study mainly summarizes the development of typical intrusive methods, as shown in Fig. 1.
Fig. 1. Typical intrusive sensing technologies for pavement monitoring.
2.1. Structural monitoring
Pavement structural performance is crucial for service quality and service life. Researchers have built many test lanes and buried sensors in the pavement structure to monitor real-time performance under traffic loading and environmental conditions, in order to optimize the design of pavement structures and materials and prolong the service life of roads.
Rollings and Pittman developed an analytical model of stress-based pavement performance using strain gauges embedded in the pavement structure. Their results showed that temperature and water had significant effects on pavement performance. Sebaaly et al. obtained lateral and longitudinal strain information of pavement under various working conditions using embedded stress-strain sensors, and established relationships between the modulus of the pavement structure and the stress/strain relation. In the United States, Xue and Weaver studied the mechanical response of a test road in Ohio under a moving load. During the test, the mechanical indexes of different pavement structures were measured, and the changes in structural forces with temperature were considered. Al-Qadi et al. evaluated the strain response of pavement under moving vehicle loads in a test road, investigating the vertical compressive strain of the asphalt pavement under different temperatures, vehicle speeds, and tire pressures. Gonçalves et al. installed strain gauges on top of the roadbed of two different pavement structures to monitor the stress response under accelerated loading tests. At the National Center for Asphalt Technology, Timm and Priest installed temperature, humidity, stress, and strain sensors in 18 test sections to measure the dynamic response of asphalt pavement under different vehicle loads and environmental conditions. In Oregon, Scholz used stress, temperature, and displacement sensors, among others, for long-term monitoring of the bending strain at the bottom of the surface layer under different axle loads and climate conditions. Hornyak et al. installed a large number of sensors in test roads, compared the performance of three different strain sensors, optimized the depth and position of the sensors, and collected sensor data over a long period. Xue et al. [9,10] used asphalt strain sensors and pressure cells to monitor the stress and strain response of the pavement under moving vehicle loads. They further analyzed the road service condition and traffic information based on the monitored data.
2.2. Traffic monitoring
The pavement dynamic response under moving vehicle loads can also be used to obtain traffic information, including vehicle speed, vehicle type, and vehicle weight, among other characteristics. The weigh-in-motion (WIM) system is one of the most popular technologies for traffic monitoring and can be divided into two types. The first type is the high-speed WIM (HS-WIM) system, which is used for traffic data collection and traffic volume control; its most commonly used sensors are loop detectors, piezoelectric sensors, and fiber sensors. The second type is the low-speed WIM (LS-WIM) system, which is mainly installed at toll stations and is used to help law enforcement determine overweight penalties. Generally, the sensors used in WIM systems include stress-strain sensors, piezoelectric sensors, and fiber-optic sensors.
Using stress-strain sensors, Zhang et al. obtained the vehicle axle spacing and the number of axles by measuring the dynamic strain of pavement under vehicle load. Xue et al. measured the stress-strain signals of pavement under moving vehicle loads using strain gauges and pressure cells. The vehicle weight, axle spacing, traffic volume, and other information were back-calculated using a Gaussian model in an ABAQUS simulation.
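As a minimal sketch of how axle count, speed, and axle spacing could be derived from dynamic strain signals, the example below assumes two strain sensors placed a known distance apart along the lane and uses simple threshold-based peak picking. The sensor gap, threshold, sampling rate, and function names are illustrative assumptions, not the procedure used in the cited studies.

```python
def find_peaks(signal, threshold):
    """Return indices of local maxima that exceed the threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if (signal[i] > threshold
                and signal[i] >= signal[i - 1]
                and signal[i] > signal[i + 1]):
            peaks.append(i)
    return peaks


def axle_info(upstream, downstream, sensor_gap_m, sample_rate_hz, threshold):
    """Estimate axle count, speed (m/s), and axle spacings (m)."""
    up_peaks = find_peaks(upstream, threshold)
    down_peaks = find_peaks(downstream, threshold)
    if not up_peaks or not down_peaks:
        raise ValueError("no axle peaks found")
    # Time for the first axle to travel between the two sensors.
    travel_time_s = (down_peaks[0] - up_peaks[0]) / sample_rate_hz
    speed = sensor_gap_m / travel_time_s
    # Spacing = speed x time between consecutive axle peaks on one sensor.
    spacings = [
        speed * (b - a) / sample_rate_hz
        for a, b in zip(up_peaks, up_peaks[1:])
    ]
    return len(up_peaks), speed, spacings


# Synthetic two-axle vehicle sampled at 1000 Hz (values in microstrain).
upstream = [0.0] * 1000
downstream = [0.0] * 1000
for idx in (100, 300):      # axle peaks at the upstream sensor
    upstream[idx] = 50.0
for idx in (200, 400):      # the same axles 0.1 s later, downstream
    downstream[idx] = 50.0

n_axles, speed, spacings = axle_info(upstream, downstream, 2.0, 1000.0, 10.0)
```

With a 2 m sensor gap and a 0.1 s lag, the sketch yields a speed of about 20 m/s and one axle spacing of about 4 m. Real strain signals would need filtering and a more robust peak detector.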
Piezoelectric sensors are widely used in WIM systems because of their high sensitivity, small size, and high rigidity. The main mechanism of piezoelectric sensors is the conversion of mechanical energy to electrical energy. Within a certain range of force, the generated electric charge is almost linearly related to the pressure on the piezoelectric material. Piezoelectric sensor materials include lead zirconate titanate (PZT) piezoelectric ceramics and polyvinylidene fluoride (PVDF). Mazurek et al. fabricated piezoelectric sensors using PVDF materials and conducted dynamic weighing experiments. Their results showed that piezoelectric sensors perform well in dynamic weighing. Zhang et al. used cement-based piezoelectric sensors to monitor traffic flow information and established a mathematical model relating the voltage output of the sensor to the traffic flow.
Fiber optic sensors can also be used in WIM systems. When a vehicle load is applied to a fiber optic sensor, the sensor deforms and the transmitted light intensity changes, so axle load information can be obtained from the measured light intensity. Malla et al. evaluated the optical properties of the fiber based on the relationship between the bend radius and the intensity of the output optical signal. Yuan et al. tested a Michelson interferometer they had developed, using dynamic compression load tests with different sizes and loading rates. Batenko et al. discussed the possibility of applying fiber-optic sensors to WIM and used measurement error analysis to improve the weighing accuracy. Zhang et al. developed a WIM prototype system based on fiber Bragg grating (FBG) technology and conducted relevant field tests. Zhao et al. embedded distributed fiber optic sensors into a circular silicone rubber package unit to form a compression sensing unit. Dong et al. installed FBG sensors in airport asphalt pavement to monitor the pavement dynamic response under aircraft load. During the tests, the load offset position, speed, dynamic response duration, and other information were obtained. In summary, the advantages of fiber optic sensors are their simple structure, low electromagnetic interference, wide monitoring range, simple installation, and easy maintenance. However, compared with conventional bending-plate and piezoelectric sensors, fiber optic sensors need more complicated techniques and more expensive instruments to measure the intensity and phase of optical signals.
Generally, intrusive sensing systems can be used for pavement structure health and traffic information monitoring. The sensors used by this technology include FBG sensors, stress-strain sensors, pressure sensors, piezoelectric sensors, displacement sensors, temperature sensors, and humidity sensors. These embedded sensors usually transmit monitoring data to acquisition equipment through cables. Hence, there are still some disadvantages in these sensing systems, such as road structure damage during sensor installation, excessive amount of field data, real-time data processing difficulty, high energy consumption, high cost of data acquisition equipment, and complex system installation procedures.
2.3. IoT in pavement monitoring
The IoT is a new type of information network that uses sensors, electronic tags, and computer networks to interconnect things. It is also a platform that provides real-time information about things and enables automatic tracking and control, which can be used in pavement sensing systems. Studies applying IoT to pavement monitoring have so far focused on the following areas.
2.3.1. Micro-electro-mechanical system
A micro-electro-mechanical system (MEMS) is a micro-system that integrates micro-sensors, micro-actuators, micro-mechanical structures, a micro-power supply, and high-performance electronic integrated devices. The system size is only several millimeters or even smaller.
Some researchers have applied MEMS sensors to pavement structure and material monitoring. Alavi et al. developed a self-powered intelligent piezoelectric sensor and tested a new small spherical packaging system for damage monitoring of asphalt concrete. Ong et al. developed an embedded wireless MEMS sensor for real-time monitoring of water content in civil engineering materials. Lian developed the Pi sensor platform to measure local pressure, strain, moisture, temperature, and acceleration in the X, Y, and Z directions.
Generally, many MEMS sensors have been designed for stress, strain, and displacement monitoring, and are still in the experimental stage. The short-term effects of high temperature, humidity, and a corrosive environment in the construction process need to be considered, as well as the long-term effects of freeze–thaw cycles and repeated vehicle loads.
2.3.2. Wireless sensor networks
Wireless sensor networks (WSNs) have been widely applied in various fields, including data aggregation, signal analysis, event location, time synchronization, discrete monitoring, and cost control, among others, as shown in Fig. 2. WSNs can be conveniently used for pavement monitoring. Bennett et al. evaluated the performance of asphalt pavements using strain and temperature sensors. The measured data were sent to a laptop located ~4 m from the monitoring point using radio frequency (RF) communication. Xue et al. [9,10] installed horizontal and vertical strain gauges, load cells, thermocouples, and humidity sensors in a road segment. All the embedded sensors were connected by cable to a V-Link wireless node on the roadside. Haoui et al. used the Sensys Networks VDS240 vehicle detection system to monitor individual vehicle lengths, vehicle speeds, and traffic flows. Pei et al. used the Mica2 Motes WSN to monitor the temperature and humidity of the pavement in order to reflect the status of traffic safety.
Fig. 2. Traffic monitoring system based on wireless sensors. LoRa: long range; VPN: virtual private network.
As shown above, there are many advantages of using WSNs for pavement monitoring. However, the severe service environment of roads poses many challenges for the applications of WSN, including wireless communication in noisy environments, difficult data transmission and processing, software and hardware development, and energy supply among others.
The pavement structure condition and traffic information can be obtained by monitoring and analyzing the dynamic response of pavement under vehicle loads. The pavement structure conditions include stress, strain, displacement, deflection, and vibration, which are crucial for early warning and timely maintenance. The traffic information includes vehicle volume, weight, speed, and type, which is important for improving the driving efficiency and optimizing the management of road networks. Traditional intrusive sensing systems include stress-strain sensors, optical fiber sensors, and piezoelectric sensors. These sensors need to be equipped with adapters and data acquisition equipment, which results in high energy consumption, low integration, and high cost. To overcome the shortcomings of traditional intrusive pavement monitoring, IoT systems have been applied to pavement monitoring using MEMS and WSN technologies. To sum up, considering the progress and limitations of current research, the following studies need to be conducted in the future:
(1) Pavement structure is affected by repeated vehicle loads and severe environmental factors during its service life. To achieve long-term and stable monitoring, it is necessary to improve the performance of intrusive sensors and optimize the packaging of the sensors to meet the requirements of low power consumption, low cost, high precision, high integration, compression resistance, and waterproofing.
(2) In the actual pavement monitoring, vehicle types, speed, and wheel-load distribution vary considerably. The temperature and humidity also change frequently. The health condition of the road structure and pavement roughness deteriorate with the increase of road service life. Effective data processing algorithms and accurate models should be developed to eliminate all the negative effects caused by the above factors.
(3) The energy consumption for intrusive sensors in pavement monitoring, using conventional power supply, is high. To achieve large-scale monitoring, long-term stable communication, and low-cost energy supply, a new system architecture should be designed in future pavement intrusive sensing systems.
(4) Real-time pavement monitoring can be significantly developed based on the latest 5G communication technology, compared with the traditional 2.5G/general packet radio service (GPRS)/3G communication technology. However, a high-power supply may be needed for the 5G communication instruments.
(5) Current installation of intrusive sensors requires the destruction and reconstruction of the pavement structure. In the future, prefabrication and 3D printing technologies can be used for the design, manufacturing, and installation of intrusive sensors during the construction or maintenance of the pavement structure in order to achieve more efficient monitoring.
3. Image processing techniques in pavement monitoring
Pavement distress occurs during the pavement service life. Fast and accurate monitoring and detection of pavement distress are essential for public transportation safety. Cracking is one of the most common pavement distresses. Typical pavement crack types include longitudinal cracks, transverse cracks, diagonal cracks, alligator cracks, and block or map cracks.
Compared with traditional manual detection of pavement cracks, image processing techniques can provide faster and more accurate results. As cameras become increasingly powerful, high-resolution images of pavement can be obtained, so image processing techniques can now be widely used in the analysis and identification of pavement distresses. Fig. 3 shows the typical steps of image processing methods for pavement crack detection summarized by Zakeri et al.: ① the crack images are captured using a camera; ② the images are pre-processed by removing the noise; ③ the contrast of the denoised images is enhanced; ④ the enhanced image is segmented to fully extract the crack information; ⑤ image post-processing is performed; and ⑥ crack identification is performed on the images.
Fig. 3. Steps of typical image processing for pavement crack detection.
3.1. Image pre-processing
Normally, pavement images are taken by pavement detection vehicles traveling across the whole road network. In actual pavement images, dirty spots, water, pavement texture, and shadows appear together with pavement distress and introduce noise. Different illumination and external conditions may also affect the quality of the pavement cracks captured in the photos. Thus, in the image pre-processing stage, image filtering is widely adopted to remove the noise in the image while retaining the useful characteristics of the target area.
Image filtering methods [36,37] can be divided into spatial domain and frequency domain filtering methods. Spatial domain filtering methods are well suited to batch processing of images. Before spatial domain filtering, many researchers converted the original images to grayscale. The major spatial filtering methods include the mean filtering method, median filtering method, and morphology filtering method.
3.1.1. Mean filtering method
According to Wang and Li, the mean filtering method performs well in smoothing Gaussian noise, and it is fast because of its simple processing steps. However, it blurs the target area while smoothing the noise, resulting in the loss of some edge information. The expression for mean filtering is shown in Eq. (1):
$$g(x, y) = \frac{1}{D} \sum_{(m, n) \in S_{xy}} f(m, n) \tag{1}$$
where $g(x, y)$ is the output image, $f(m, n)$ is the input image, $D$ is the number of pixels covered by the filter, $S_{xy}$ is the neighborhood of the pixel to be processed, and $m \times n$ is the image size.
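The mean filter described above can be sketched in a few lines. The example below is a plain-Python illustration that assumes a square window and clips the window at the image border; the function name, window size, and test image are hypothetical, and a production pipeline would normally rely on an array or image library.

```python
def mean_filter(image, size=3):
    """Replace each pixel by the average of its size x size neighborhood.

    Border pixels use only the neighbors that fall inside the image,
    so the pixel count D shrinks near the edges.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    output = [[0.0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            neighborhood = [
                image[m][n]
                for m in range(max(0, x - half), min(rows, x + half + 1))
                for n in range(max(0, y - half), min(cols, y + half + 1))
            ]
            output[x][y] = sum(neighborhood) / len(neighborhood)
    return output


# A flat 5 x 5 patch (gray level 100) with one bright noise pixel.
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 190
smoothed = mean_filter(noisy)
```

The noise spike is reduced from 190 to 110 but is also spread over its neighbors, which illustrates the blurring effect noted above.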
3.1.2. Median filtering method
Median filtering is a nonlinear statistical filter, in contrast to linear averaging filters. Since the grayscale value of a crack is generally lower than that of its neighborhood, the method can readily identify cracks. At the same time, a larger template window gives stronger noise reduction. Ma et al. used a window with multiple directions to obtain the median values of the grayscale image, which not only removed the noise but also preserved crack edge characteristics. The expression for median filtering is shown in Eq. (2):
$$g(x, y) = \operatorname*{median}_{(m, n) \in S_{xy}} \{ f(m, n) \} \tag{2}$$
where $g(x, y)$ is the output image, $f(m, n)$ is the input image, $S_{xy}$ is the neighborhood of the pixel to be processed, and $m \times n$ is the image size.
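For comparison with the mean filter, the following sketch applies a median over the same kind of square neighborhood. It is an illustrative plain-Python version with a hypothetical function name and test image, not the multi-directional window used by Ma et al.

```python
from statistics import median


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighborhood."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    output = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            neighborhood = [
                image[m][n]
                for m in range(max(0, x - half), min(rows, x + half + 1))
                for n in range(max(0, y - half), min(cols, y + half + 1))
            ]
            output[x][y] = median(neighborhood)
    return output


# The same flat patch, now with a salt-noise pixel at full brightness.
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255
filtered = median_filter(noisy)
```

Here the outlier is removed completely and the surrounding pixels stay at 100, whereas a mean filter would have smeared it into the neighborhood.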
3.1.3. Morphology filtering method
Wang and Liu et al. conducted crack detection using the morphology filtering method. This method, which gave better results for salt-and-pepper noise, includes two basic operations: opening and closing. In opening, erosion is performed first, followed by dilation; in closing, dilation is performed first, followed by erosion. Erosion eliminates small bright-spot noise, while dilation enhances the crack detail in the image [43,44].
The algorithm for opening can smoothen the edges of the crack while preserving the brightness of the original image. It can also remove the details and eliminate the sharp noise in the image. The algorithm for closing can connect the gap between the cracks and fill small holes in the cracks.
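The opening and closing operations can be illustrated with a small sketch. It assumes a flat square structuring element, a binary mask in which crack pixels are 1, and windows clipped at the image border; all names and test masks are illustrative assumptions.

```python
def _window(image, x, y, half):
    """Collect the pixel values in the clipped square window around (x, y)."""
    rows, cols = len(image), len(image[0])
    return [
        image[m][n]
        for m in range(max(0, x - half), min(rows, x + half + 1))
        for n in range(max(0, y - half), min(cols, y + half + 1))
    ]


def erode(image, size=3):
    """Erosion: each pixel takes the minimum value in its window."""
    half = size // 2
    return [[min(_window(image, x, y, half)) for y in range(len(image[0]))]
            for x in range(len(image))]


def dilate(image, size=3):
    """Dilation: each pixel takes the maximum value in its window."""
    half = size // 2
    return [[max(_window(image, x, y, half)) for y in range(len(image[0]))]
            for x in range(len(image))]


def opening(image, size=3):
    """Erosion followed by dilation; removes small bright specks."""
    return dilate(erode(image, size), size)


def closing(image, size=3):
    """Dilation followed by erosion; bridges small gaps."""
    return erode(dilate(image, size), size)


# Opening: a 3 x 3 blob survives, an isolated speck at (5, 5) is removed.
blob = [[0] * 7 for _ in range(7)]
for r in range(1, 4):
    for c in range(1, 4):
        blob[r][c] = 1
blob[5][5] = 1
opened = opening(blob)

# Closing: a one-pixel gap in a horizontal crack segment is filled.
line = [[0] * 9 for _ in range(5)]
for c in (2, 3, 5, 6):
    line[2][c] = 1
closed = closing(line)
```

The same min/max formulation also applies to grayscale images, which is one reason it is used here instead of set-based definitions.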
3.1.4. Other methods
In addition to the above three methods, many other methods have been proposed. Wang used the K-neighbor method, whose effect is less significant than that of the median and mean filtering methods. Han K and Han HF and Luo adopted a filtering method based on region features and dodging methods. Wang sharpened the image to increase the clarity of the edges and reduce the noise. Li et al. used an improved Otsu method based on image transformation and applied it to pavement crack detection. Talab et al. used the Sobel operator for filtering. Gao et al. adopted a Gaussian convolution template, and Qiu adopted an improved Sobel method based on the gradient value. Zhu used the gradient inverse weighted method to remove the noise and improve the accuracy.
3.2. Image enhancement
After the filtering process, most of the sharp noise in the image is removed, but the whole image becomes blurred. At this stage, the edges of shapes in the image are less clear because their grayscale values are closer to that of the background. To extract edge information, images need to be enhanced using methods such as grayscale transformation and histogram equalization.
3.2.1. Grayscale transformation
The main function of grayscale transformation is to compress or extend the grayscale range of the original image so that the contrast between the target area (pavement crack) and the background area (pavement matrix) can be adjusted.
Wang and Di both used the grayscale transformation. A linear function can extend the grayscale range of the entire image to a larger range. However, it enhances not only the crack information in the image but also the noise. The piecewise linear transformation is shown in Eq. (3) as
$$g(x, y) = \frac{d - c}{b - a} \left[ f(x, y) - a \right] + c \tag{3}$$
where $g(x, y)$ is the output image; $f(x, y)$ is the input image; $a$ and $b$ are the gray level upper and lower limits of the original image, respectively; and $c$ and $d$ are the gray level upper and lower limits of the processed image, respectively.
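A linear grayscale stretch of this kind can be sketched as follows. The code assumes a single linear segment that maps gray level a to c and b to d and clips the result to the target range; the function name and sample values are illustrative.

```python
def linear_stretch(image, a, b, c, d):
    """Map gray levels linearly so that a -> c and b -> d, clipped to the target range."""
    lo, hi = min(c, d), max(c, d)
    return [
        [min(hi, max(lo, c + (d - c) * (p - a) / (b - a))) for p in row]
        for row in image
    ]


# A low-contrast patch occupying only gray levels 100 to 150.
low_contrast = [[100, 110], [120, 150]]
stretched = linear_stretch(low_contrast, a=100, b=150, c=0, d=255)
```

After stretching, the original 50-level range spans the full 0 to 255 range, so a difference of 10 gray levels becomes a difference of 51. Any noise inside the original range is amplified by the same factor, as the text notes.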
Nonlinear grayscale transformations include the gamma transformation and the logarithmic transformation. The expressions for logarithmic transformation and gamma transformation are shown in Eqs. (4) and (5), respectively:
where $g(x, y)$ is the output image, $f(x, y)$ is the input image, $v$ is the base of the logarithmic transformation, $q$ is a constant, and $\gamma$ is a positive constant for the gamma transformation. Note also that the vertical and horizontal projections of the grayscale values can be used to distinguish transverse cracks, longitudinal cracks, and map cracks.
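The two nonlinear transformations can be illustrated with common textbook forms. The sketch below assumes g = q · log_v(1 + f) for the logarithmic transformation and g = q · f^γ for the gamma transformation, with the input normalized to [0, 1]; these exact forms and the parameter names are assumptions and may differ from the expressions referred to above.

```python
import math


def log_transform(image, q=1.0, v=math.e):
    """Assumed form: g = q * log_v(1 + f), with f normalized to [0, 1]."""
    return [[q * math.log(1.0 + p, v) for p in row] for row in image]


def gamma_transform(image, q=1.0, gamma=1.0):
    """Assumed form: g = q * f ** gamma, with f normalized to [0, 1]."""
    return [[q * p ** gamma for p in row] for row in image]


normalized = [[0.0, 0.25], [0.5, 1.0]]
brightened = gamma_transform(normalized, gamma=0.5)  # gamma < 1 lifts dark regions
darkened = gamma_transform(normalized, gamma=2.0)    # gamma > 1 suppresses them
logged = log_transform(normalized, v=2.0)            # base 2 keeps 1.0 at 1.0
```

Under these assumptions, a gamma below 1 expands the dark end of the range, where cracks usually lie, while a gamma above 1 compresses it.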
3.2.2. Histogram equalization
Histogram equalization is used to extend the grayscale range of the grayscale image histogram so that the image can be displayed in more detail. At the same time, histogram equalization can display the grayscale values of the crack area and of the background area, which can be useful during image segmentation. Wang, Di, Zhang, and Zhu used this method for images that were either very bright or very dark. The expressions of the histogram are shown in Eqs. (6) and (7):
$$p(r_k) = \frac{n(r_k)}{MN}, \quad k = 0, 1, \dots, L - 1 \tag{6}$$
$$s_k = T(r_k) = (L - 1) \sum_{j=0}^{k} p(r_j) \tag{7}$$
where $r_k$ is the grayscale value, $n(r_k)$ is the number of pixels with that grayscale value, $MN$ is the total number of image pixels, $p(r_k)$ is the probability of that grayscale value appearing, $L$ is the number of grayscale values, $s_k$ is the output grayscale value, and $T(r_k)$ is the transformation function.
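Histogram equalization can be sketched directly from these definitions. The example assumes the standard cumulative-distribution mapping s_k = (L − 1) Σ p(r_j), rounded to the nearest integer gray level; the function name and the small test image are illustrative.

```python
def equalize_histogram(image, levels=256):
    """Equalize an integer grayscale image with the cumulative-histogram mapping."""
    total = len(image) * len(image[0])     # MN, the number of pixels
    counts = [0] * levels                  # n(r_k) for each gray level
    for row in image:
        for pixel in row:
            counts[pixel] += 1
    mapping = []
    cumulative = 0
    for k in range(levels):
        cumulative += counts[k]            # MN times the running sum of p(r_j)
        mapping.append(round((levels - 1) * cumulative / total))
    return [[mapping[pixel] for pixel in row] for row in image]


# A very dark image that only uses gray levels 10, 20, and 30.
dark = [[10, 10, 20, 20], [20, 30, 30, 30]]
equalized = equalize_histogram(dark)
```

The three closely spaced dark levels are spread to 64, 159, and 255, which makes the crack and background levels easier to separate in the later segmentation step.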
In addition to the above-mentioned methods, researchers have proposed other methods for image enhancement. Wen proposed an improved gray correction algorithm for the pre-processing of pavement crack images. Gang et al. proposed a finite ridgelet transform (FRIT)-based image enhancement algorithm for faint pavement cracks. Li et al. used a mathematical morphology method to refine the image target, remove redundant information, and preserve the shape of the crack.
3.3. Image segmentation
After the above two steps of image pre-processing and image enhancement, a pavement crack image with low noise and high contrast can be obtained. To facilitate the identification of the crack, the edge information of the crack needs to be extracted and the image needs to be segmented. There are many methods for image segmentation, which include the morphological detection method, the threshold segmentation method, and the edge detection method among others.
3.3.1. Edge detection method
An edge detector can clearly outline the edge information of the crack. It recognizes the edge information according to the gray level change of the crack edge using a differential function.
There are many kinds of edge operators, as shown in Fig. 4. Different pavement crack detection tasks like edge search [40,51,58,59] and edge detection [51,58,59] require different operators.
Fig. 4. Some operators for edge detection.
The expression for the Roberts operator is shown in Eq. (8):
The expression for the Prewitt operator is shown in Eq. (9):
The Laplacian operator is a second-order partial derivative operator that can detect both horizontal and vertical cracks [40,54]. The expression of the Laplacian operator is shown in Eq. (10):
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \tag{10}$$
where $f$ is the original image, $x$ is the pixel coordinate in the horizontal direction of the image, and $y$ is the pixel coordinate in the vertical direction of the image.
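To make the three operators concrete, the sketch below applies commonly used discrete kernels for the Roberts, Prewitt, and Laplacian operators to a synthetic crack image. The kernel layouts are standard textbook forms assumed here (sign and orientation conventions vary between references), and the helper names are hypothetical.

```python
# Common textbook kernels; sign conventions differ between references.
ROBERTS = ([[1, 0], [0, -1]], [[0, 1], [-1, 0]])
PREWITT = (
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
)
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def correlate(image, kernel):
    """Slide the kernel over the image without padding (valid mode)."""
    kr, kc = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[i][j] * image[x + i][y + j]
                for i in range(kr) for j in range(kc))
            for y in range(len(image[0]) - kc + 1)
        ]
        for x in range(len(image) - kr + 1)
    ]


def gradient_magnitude(image, kernels):
    """Approximate the gradient magnitude as |G1| + |G2| for a kernel pair."""
    g1, g2 = (correlate(image, k) for k in kernels)
    return [[abs(a) + abs(b) for a, b in zip(r1, r2)] for r1, r2 in zip(g1, g2)]


# Bright pavement (200) with a dark vertical crack (50) in column 2.
image = [[200, 200, 50, 200, 200] for _ in range(5)]
prewitt_edges = gradient_magnitude(image, PREWITT)
roberts_edges = gradient_magnitude(image, ROBERTS)
laplacian_response = correlate(image, LAPLACIAN)
```

The first-order operators respond on both sides of the crack, whereas the Laplacian gives a positive response on the crack itself flanked by negative responses, which reflects its second-derivative nature.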
3.3.2. Morphological method
The morphological method can also be used for crack edge detection. Zhang et al. suggested that the morphological method may easily lose crack edge information. Li et al. considered that, compared with other edge detection methods that use pixel value changes in the image to extract the crack edge, the morphological method can extract the crack morphology feature. Xu and Gao obtained crack edge information based on image enhancement and mathematical morphology.
3.3.3. Threshold segmentation method
The threshold segmentation method divides the image into two parts based on a calculated threshold value. Generally, the part with values below the threshold is the crack area, and the part above the threshold is the background pavement matrix area. Threshold segmentation methods can be divided into global and local threshold methods. Ma et al. and Talab et al. used the Otsu threshold segmentation method, which is a global threshold method. Normally, before or after using this method, the image can be processed by grayscale stretching to reduce noise. Ma et al. applied a closing operation with a cross-shaped structuring element to suture the crack after Otsu threshold segmentation. Wang improved the cement pavement crack detection algorithm based on image transformation. Liu adopted the local threshold segmentation method, which is effective for pavement crack images with shadows. Xu et al. also combined adaptive morphological filtering with the Otsu algorithm to achieve a dual-threshold objective.
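As an illustration of global thresholding, the following sketch computes an Otsu threshold by maximizing the between-class variance of the gray-level histogram and then marks pixels at or below the threshold as crack. It is a minimal plain-Python version with illustrative names and data, not the improved or dual-threshold variants cited above.

```python
def otsu_threshold(image, levels=256):
    """Return the gray level t that maximizes the between-class variance,
    with pixels <= t in the dark class and pixels > t in the bright class."""
    counts = [0] * levels
    for row in image:
        for pixel in row:
            counts[pixel] += 1
    total = sum(counts)
    total_sum = sum(k * c for k, c in enumerate(counts))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class
    sum0 = 0    # sum of gray levels in the dark class
    for t in range(levels - 1):
        w0 += counts[t]
        sum0 += t * counts[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def segment(image, threshold):
    """Mark pixels at or below the threshold as crack (1), others as background (0)."""
    return [[1 if pixel <= threshold else 0 for pixel in row] for row in image]


# Bright pavement matrix with a dark crack running through it.
image = [[200, 210, 40, 205], [198, 45, 50, 202], [201, 199, 42, 207]]
t = otsu_threshold(image)
mask = segment(image, t)
```

Because a single threshold is applied to the whole image, shadows that darken part of the pavement can be misclassified as crack, which is the case the local threshold methods are meant to handle.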
3.4. Image postprocessing
Sometimes, after the images have been processed following the above three steps, the crack edge information is still hard to extract. In this case, some researchers have performed image postprocessing operations, such as morphological image processing, maximum connected domain denoising, or edge connection.
3.4.1. Morphological image processing
In mathematical morphology methods, images are processed using operations including dilation, erosion, opening, and closing. Wang et al. used the dilation and erosion operations to obtain crack images with clear edges. Wang used the morphological operation to remove the noise and identify the crack more clearly. Ma et al. used a closing operation to process the images; this process showed a small negative effect on the crack edges.
3.4.2. Image denoising
Image processing algorithms are sometimes limited by noise, which leads to incomplete crack edge detection and blurred crack shapes; many researchers have therefore used the maximum connected domain method to remove the noise. Liu and Ma et al. used the maximum connected domain method to determine the location of the crack based on its connectivity.
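The idea behind maximum connected domain denoising can be sketched as keeping only the largest connected group of crack pixels in a binary mask. The example below assumes 8-connectivity and a breadth-first search; the function name and test mask are illustrative, and the cited studies may use different connectivity rules or size criteria.

```python
from collections import deque


def largest_connected_component(mask):
    """Keep only the largest 8-connected group of foreground (1) pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for sx in range(rows):
        for sy in range(cols):
            if mask[sx][sy] != 1 or seen[sx][sy]:
                continue
            # Breadth-first search over one connected component.
            component, queue = [], deque([(sx, sy)])
            seen[sx][sy] = True
            while queue:
                x, y = queue.popleft()
                component.append((x, y))
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < rows and 0 <= ny < cols
                                and mask[nx][ny] == 1 and not seen[nx][ny]):
                            seen[nx][ny] = True
                            queue.append((nx, ny))
            if len(component) > len(best):
                best = component
    cleaned = [[0] * cols for _ in range(rows)]
    for x, y in best:
        cleaned[x][y] = 1
    return cleaned


# A diagonal crack plus two isolated noise pixels in the left corners.
mask = [
    [1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 1, 0],
]
cleaned = largest_connected_component(mask)
```

Keeping a single component is suitable when one dominant crack is expected per image; images with several separate cracks would instead need a minimum-size criterion.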
3.4.3. Edge detection and connection
In addition, some researchers used edge connection to stitch the cracks together for a clearer shape. Zhang proposed an improved algorithm based on the wavelet transformation for image edge detection. He also suggested a new Canny-based algorithm for the connection of discontinuous edge points on the crack image.
With the development of computer technology, researchers are continuously improving the traditional methods for crack detection and proposing innovative image processing algorithms, especially for low-quality pavement images. Fig. 5 shows a typical asphalt pavement crack image under a low illumination condition. To obtain better image processing results, researchers have proposed many innovative algorithms, but these make the computation process very complex and thus hard to apply to batch processing of pavement images. In addition, since the characteristics of field images of pavements differ, many of the current image processing algorithms cannot automatically adapt to all types of pavement images. Therefore, further research is needed to improve the adaptability of the algorithms so that they can handle images captured under a wide range of conditions.
Fig. 5. One typical asphalt pavement crack image under low illumination condition.
4. Machine learning methods in pavement analysis
Machine learning is a family of algorithms and models, built on computer technologies, that solve problems by learning patterns from data rather than by following explicitly programmed conditions. Using machine learning methods, pavement structure conditions and traffic information can be effectively calculated, identified, classified, and analyzed. Machine learning methods commonly used in pavement engineering include support vector machines (SVMs), artificial neural networks (ANNs), and deep learning methods such as convolutional neural networks (CNNs).
4.1. Support vector machines
The SVM was first proposed by Cortes and Vapnik. It is essentially a generalized linear classifier that performs binary classification of data using supervised learning. Cortes and Vapnik first applied this method to handwritten digit recognition.
In general, the SVM algorithm constructs a decision boundary from the input data $\{(x_k, y_k)\}_{k=1}^{N}$ and divides the data into two categories, where $x_k \in \mathbb{R}^n$ are the input data in real vector space and $y_k \in \{-1, +1\}$ are the data labels. The decision boundary is the maximum-margin hyperplane fitted to the learning samples. SVM uses a kernel mechanism. When the kernel is linear, it behaves much like logistic regression; when the kernel is nonlinear, SVM can still perform well even if the data cannot be linearly separated in the original feature space.
The training stage of SVM can be reduced to the optimization of a loss function. Eqs. (11) and (12) can be combined to solve the minimum value of the loss function :
where $J_P$ is a function of $w$ and $e_k$; $y_k$ are the data labels; $e_k$ is a relaxation variable; $w$ and $b$ are the normal vector and intercept of the hyperplane, with $\mathbb{R}$ denoting the real numbers; and $\varphi(\cdot)$ is the mapping function for the nonlinearly separable problem.
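As a hedged illustration only, one widely used formulation that matches the symbols listed above is the least-squares SVM. Whether it coincides exactly with Eqs. (11) and (12) is an assumption, and the regularization constant $\gamma$ and the sample count $N$ are introduced here purely for illustration:

$$\min_{w,\,b,\,e} \; J_P(w, e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{k=1}^{N} e_k^{2}$$

$$\text{subject to} \quad y_k \left[ w^{T} \varphi(x_k) + b \right] = 1 - e_k, \quad k = 1, \dots, N$$

The first term maximizes the margin, while the second penalizes samples that violate it; a larger $\gamma$ tolerates fewer violations.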
Many studies have used SVM for pavement performance prediction and distress detection. Hoang et al. used a multiclass support vector machine learning model based on the artificial bee colony (ABC) optimization algorithm to classify pavement cracks. In their study, the non-local mean value, differentiable filter, and other techniques were also used to analyze the crack characteristics, which significantly improved the prediction performance. Schlotjes et al. collected a large amount of road data and expert failure charts, and used SVM to predict the structural failure probability of the road surface. Pan et al. used four different kernel functions to classify and predict pavement distresses such as potholes and cracks. Fujita et al. used machine learning for crack detection in asphalt pavement surface images.
4.2. Artificial neural network
An ANN is a nonlinear feature processing and prediction network with strong self-learning capability. Its basic structure is divided into an input layer, hidden layers, and an output layer. The hidden layers contain a certain number of node units called neurons. Each neuron is connected to every node unit in the previous layer, as shown in Fig. 6. Each neuron applies a linear transformation followed by a nonlinear transformation to the output of the previous layer. The output layer differs from the hidden layers in that its nonlinear activation function is replaced by softmax or another logistic-type function, which predicts the class probabilities for a classification task.
Fig. 6. Schematic view of ANN algorithm.
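For an input $x$, the linear and nonlinear transformations of a single layer of neurons can be expressed by Eqs. (13) and (14) [73,74]:

$$z = wx + b \tag{13}$$

$$a = g(z) \tag{14}$$

where $z$ is the linear output and $a$ is the activated output of the layer, $w$ is the weight matrix of the hidden layer, $b$ is the bias of the hidden layer, and $g(\cdot)$ is the activation function.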
Common activation functions are the sigmoid function, the tanh function, and the rectified linear unit (ReLU) function. The activation function must be nonlinear; otherwise, regardless of the number of hidden layers, the output of the neural network would only be a linear combination of the input values.
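The softmax function is the gradient-log-normalizer of a discrete probability distribution over a finite number of terms, as shown in Eq. (15):

$$\mathrm{softmax}(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \quad j = 1, \dots, K \tag{15}$$

where $K$ is the number of output classes.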
To solve the problem of the zero derivative of the ReLU function in the negative domain, the Leaky ReLU function has been proposed, as shown in Eq. (16):

$$f(x) = \begin{cases} x, & x > 0 \\ ax, & x \le 0 \end{cases} \tag{16}$$

where $a$ is a very small number.
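The activation functions discussed above can be written in a few lines. The sketch below is illustrative only: it uses the Python standard library, the Leaky ReLU slope of 0.01 is an assumed example value for the "very small number," and the input scores are made up.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    # slope plays the role of the very small number; 0.01 is an assumed value.
    return x if x > 0 else slope * x

def softmax(scores):
    # Subtracting the maximum does not change the result but avoids overflow.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # three hypothetical class scores
```

Unlike ReLU, `leaky_relu` returns a small negative value for negative inputs, so its derivative there is `slope` rather than zero.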
In addition to the above forward propagation process, the most important part of an ANN is the backward propagation process. The difference between the predicted value and the true value of the output is represented by a loss function. Backward propagation computes the gradients of this function with respect to the network parameters, so that an optimization algorithm such as gradient descent can search for its minimum. One widely used cross-entropy loss function is shown in Eq. (17):

$$J = -\sum_{i} y_i \ln \hat{y}_i \tag{17}$$

where $\hat{y}_i$ is the actual output value and $y_i$ is the desired output value.
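The standard gradient descent algorithm and parameter updating rule are shown in Eqs. (18) and (19):

$$\mathrm{d}w = \frac{\partial J}{\partial w} \tag{18}$$

$$w' = w - \alpha \, \mathrm{d}w \tag{19}$$

where $J$ is the cost function; $w'$ is the weight after updating; and $\alpha$ is the learning rate, namely, the step size of the gradient descent in each iteration.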
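The interplay between a cross-entropy loss and the gradient descent update can be sketched with a single sigmoid neuron. This is a minimal example under stated assumptions: the binary form of the cross-entropy is used, the four training points and the learning rate of 0.5 are made up, and the helper names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(w, b, xs, ys):
    """Mean binary cross-entropy of a single sigmoid neuron."""
    total = 0.0
    for x, y in zip(xs, ys):
        a = sigmoid(w * x + b)
        total -= y * math.log(a) + (1 - y) * math.log(1 - a)
    return total / len(xs)

def gradient_step(w, b, xs, ys, alpha):
    """One update w' = w - alpha * dJ/dw, and likewise for the bias b."""
    n = len(xs)
    dw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
    db = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
    return w - alpha * dw, b - alpha * db

# Assumed toy data: negative inputs belong to class 0, positive inputs to class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = 0.0, 0.0
losses = [cross_entropy(w, b, xs, ys)]
for _ in range(200):
    w, b = gradient_step(w, b, xs, ys, alpha=0.5)
    losses.append(cross_entropy(w, b, xs, ys))
```

With a sufficiently small learning rate the recorded loss falls from its initial value of ln 2 as the weight grows in the direction that separates the two classes.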
Pavement engineers have also widely used ANN for distress detection and performance evaluation. Similar to Hoang et al. [68,69], Banharnsakun  trained an ANN to classify the transverse cracks, longitudinal cracks, and potholes in damaged images using the ABC algorithm, and the results were compared with those of SVM. Comparisons showed that ABC-ANN was better than SVM-ABC. Elbagalati et al.  proposed an ANN pattern recognition model used to assist the decision-making process of the pavement management system (PMS). Pan et al.  used an ANN for the fast and accurate judgment of pavement cracks and potholes.
However, the ANN has a shortcoming in the field of image recognition: its computational cost is too high. Owing to the considerable amount of information in the images and the full connectivity of the neurons, the number of parameters grows with the product of the input size and the layer width and quickly becomes very large, which greatly increases the training time of the network.
4.3. Convolution neural network
Traditional machine learning methods, including SVM and ANN, have been widely used for various purposes in pavement monitoring and analysis. Recently, with the rapid development of computer technologies, deep learning methods have been used in pavement distress monitoring and detection. A CNN is a typical deep neural network that uses convolution for computation. Compared with the ANN, which can only use fully connected layers, the CNN has natural advantages in computational efficiency. The parameter sharing of the convolution kernel and the local connection between layers enable it to complete complex feature learning tasks at a lower computational cost. For the same layer, the number of weights is far lower than in an ANN. Unlike an ANN, the hidden part of a CNN is generally composed of several functional layer types, such as convolution layers, pooling layers, and fully connected layers.
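A rough parameter count illustrates the saving that comes from weight sharing. The sketch below assumes the 64 × 64 × 3 input size shown in Fig. 7; the fully connected layer of 100 units and the sixteen 3 × 3 filters are illustrative choices, not values taken from any cited model.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus biases of a convolution layer (shared across all positions)."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters

n_inputs = 64 * 64 * 3                   # flattened 64 x 64 RGB image
fc_count = dense_params(n_inputs, 100)   # 1,228,900 parameters
conv_count = conv_params(3, 3, 3, 16)    # 448 parameters
```

The convolution layer needs roughly three orders of magnitude fewer parameters here because its kernel size, not the image size, determines the count.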
The function of the convolution layer is to convolve the input data. The function of the pooling layer is to select and filter the information extracted by the convolution layer, which reduces the size of the model, speeds up the calculation, and improves the robustness of the extracted features. Its hyper-parameters are the filter size, stride, and padding. Generally, max pooling or mean pooling is used. Max pooling takes the maximum value in the pooled region as the new feature output, whereas mean pooling outputs the mean value of the pooled region.
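Max pooling and mean pooling can be sketched on a small feature map as follows. This is an illustrative example only: it assumes a single-channel map stored as a list of rows, no padding, and a 2 × 2 filter with stride 2; the function name `pool2d` is hypothetical.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Apply max or mean pooling to a 2D feature map without padding."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                pooled_row.append(max(window))
            else:
                pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled

fmap = [
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [0, 2, 4, 4],
    [3, 1, 0, 8],
]
max_pooled = pool2d(fmap, mode="max")    # [[6, 2], [3, 8]]
mean_pooled = pool2d(fmap, mode="mean")  # [[3.75, 1.25], [1.5, 4.0]]
```

Both variants halve each spatial dimension in this setting, which is where the reduction in model size and computation comes from.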
Generally, in a deep CNN structure, a pooling layer is placed after several consecutive convolution layers, and a number of fully connected layers are placed at the end of the whole network. For example, Gopalakrishnan et al. used VGG net with transfer learning to identify pavement cracks. However, the CrackNet proposed by Zhang et al. did not use a pooling layer, in order to achieve pixel-level crack recognition.
The fully connected layer is equivalent to the hidden layer of the traditional neural network. When the feature map is passed into the fully connected layer, its three-dimensional structure is lost: it is flattened into a vector and passed to the next layer through the activation function. Fig. 7 shows a CNN structure designed for the classification of pavement with and without cracks.
Fig. 7. A CNN structure for pavement crack images recognition. 64 × 64 × 3 represents height, width, and channels of feature maps; size, stride, and valid are hyperparameters of kernels; conv: convolution; FC: fully connected layers.
4.4. Machine learning related theories
The dataset selection affects the performance of the machine learning algorithm. In supervised learning, the dataset is divided into three parts: the training set, the development set, and the test set . Firstly, the training algorithm is applied on the training set; then, the optimal model is determined on the development set; and finally, the performance of the network model is evaluated on the test set. Generally, a deeper and wider neural network needs a significantly larger dataset for training. Thus, for applications of deep learning methods in pavement monitoring, sufficient samples  need to be collected and a large dataset needs to be prepared before the training process.
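The three-way split described above can be sketched as follows. The 80/10/10 proportions, the fixed random seed, and the function name `split_dataset` are assumptions chosen for illustration; suitable proportions depend on the size of the dataset.

```python
import random

def split_dataset(samples, train_frac=0.8, dev_frac=0.1, seed=0):
    """Shuffle the samples and split them into training, development, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test

# 1000 hypothetical image indices stand in for a pavement image dataset.
train, dev, test = split_dataset(range(1000))
```

Shuffling before splitting helps the three subsets follow the same distribution, and every sample ends up in exactly one subset.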
If the variance of the neural network is too large, that is, the model over-fits the data, there are two main remedies: one is to increase the amount of data, and the other is to use a regularization method. Generally, regularization pushes the network toward a simpler structure that learns fewer overly complex features. Commonly used regularization methods are L2 regularization (weight decay) and dropout.
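In L2 regularization, the cost function is defined as

$$J_{\mathrm{L2}} = J + \frac{\lambda}{2 n_{\mathrm{sample}}} \lVert w \rVert_2^2$$

where $J$ is the unregularized cost function; $\lambda$ is the regularization hyper-parameter, which is tuned on the development (validation) set; and $n_{\mathrm{sample}}$ is the number of samples. During backward propagation, the update rule for the weight $w$ with learning rate $\alpha$ becomes

$$w' = \left(1 - \frac{\alpha \lambda}{n_{\mathrm{sample}}}\right) w - \alpha \frac{\partial J}{\partial w}$$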
In the dropout method, a threshold $p \in (0, 1)$ is set for each hidden layer as the probability of retaining each neuron. In this way, some of the neurons in each layer are randomly deleted during training, resulting in a neural network with fewer nodes and a smaller scale.
In pavement distress detection, both regularization methods can be used. Fei et al.  used CrackNet-V in the pixel-level classification of asphalt pavement cracks, which employed L2 regularization to prevent over-fitting. Cha et al.  used a dropout function to regularize their model in concrete pavement detection.
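Both regularization methods can be sketched in a few lines. The example below is a simplified illustration, not the implementation used in the cited models: the L2 update assumes the penalty λ/(2·n_sample)·‖w‖², the dropout uses the common "inverted" variant that rescales the retained activations by 1/keep_prob, and all numerical values are made up.

```python
import random

def l2_update(w, grad, alpha, lam, n_sample):
    """Gradient step with L2 penalty: w' = w - alpha * (grad + lam / n_sample * w)."""
    return [wi - alpha * (gi + lam / n_sample * wi) for wi, gi in zip(w, grad)]

def dropout(activations, keep_prob, rng):
    """Inverted dropout: keep each neuron with probability keep_prob, then rescale."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

# With a zero data gradient the weights simply shrink, hence "weight decay".
w_new = l2_update([1.0, -2.0], grad=[0.0, 0.0], alpha=0.1, lam=1.0, n_sample=10)

# Roughly 80% of the activations are retained; the others are set to zero.
dropped = dropout([1.0] * 1000, keep_prob=0.8, rng=random.Random(0))
```

Dropout is applied only during training; at inference time all neurons are kept, and the rescaling above keeps the expected activation unchanged between the two phases.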
To avoid excessive differences in the characteristics of the input data, these data are commonly normalized. Batch normalization (BN) normalizes the outputs of the intermediate layers of a deep network and is generally applied to the linear output $z$ of a hidden layer rather than to the output of the activation function:

$$\mu = \frac{1}{m} \sum_{i=1}^{m} z_i, \qquad \sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (z_i - \mu)^2$$

$$z_{\mathrm{norm},i} = \frac{z_i - \mu}{\sqrt{\sigma^2 + \varepsilon}}, \qquad \tilde{z}_i = \gamma z_{\mathrm{norm},i} + \beta$$

where $m$ is the mini-batch size; $\varepsilon$ is a very small positive number preventing the denominator from equaling 0; $\gamma$ determines the distribution variance of $\tilde{z}$; and $\beta$ determines the mean value of the feature distribution. BN is applied not only to the input layer but also to the deep hidden layers. There are two main reasons to employ batch normalization: one is to accelerate the training of the network, and the other is to add noise during training, which has a slight regularizing effect. Currently, many CNN-based pavement distress detection methods use BN.
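Batch normalization of one feature over a mini-batch can be sketched as follows. This is a minimal training-time illustration under stated assumptions: the scale `gamma` and shift `beta` are fixed here although they are learned in practice, the running statistics used at inference are omitted, and the four input values are made up.

```python
import math

def batch_norm(z, gamma=1.0, beta=0.0, eps=1e-8):
    """Normalize a mini-batch of linear outputs, then rescale and shift them."""
    m = len(z)
    mean = sum(z) / m
    var = sum((v - mean) ** 2 for v in z) / m
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in z]

z = [2.0, 4.0, 6.0, 8.0]                       # assumed linear outputs of one neuron
normalized = batch_norm(z)                      # mean close to 0, variance close to 1
shifted = batch_norm(z, gamma=2.0, beta=5.0)    # mean close to 5, variance close to 4
```

The second call shows how the scale sets the spread and the shift sets the mean of the normalized feature.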
During the back propagation of the network, choosing the most suitable optimizer is a challenge. One of the most commonly used methods is mini-batch gradient descent. In each iteration, the network learns on a random subset of the training set. The mini-batch size is a hyper-parameter of the network: the larger the value, the greater the amount of computation required per iteration. When the size is equal to the size of the training set, the method is called batch gradient descent (BGD); when the size is equal to 1, it is called stochastic gradient descent (SGD).
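The relation between mini-batch gradient descent, BGD, and SGD comes down to how the training set is partitioned in each epoch. The generator below is an illustrative sketch; the function name `minibatches`, the fixed seed, and the ten-sample dataset are assumptions.

```python
import random

def minibatches(samples, batch_size, seed=0):
    """Yield shuffled mini-batches that together cover the training set once."""
    order = list(samples)
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]

data = list(range(10))                                  # ten hypothetical samples
mini = list(minibatches(data, batch_size=4))            # batches of size 4, 4, and 2
bgd = list(minibatches(data, batch_size=len(data)))     # one batch: batch gradient descent
sgd = list(minibatches(data, batch_size=1))             # ten batches: stochastic gradient descent
```

One parameter update is performed per yielded batch, so a smaller batch size means more, but cheaper, updates per epoch.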
In addition, there are some other optimizers that can also accelerate the training process of the network, such as the Momentum algorithm  used by Zhang et al.  and Fei et al. , and adaptive moment estimation (Adam) algorithm proposed by Kingma et al.  and used by Dorafshan et al.  and Krizhevsky et al. .
4.5. Applications of deep learning methods in pavement distress detection and condition assessment
This section summarizes the previous research using machine learning methods, especially deep learning methods, for pavement distress detection and condition assessment.
4.5.1. Classification task
One of the most important classification tasks in pavement distress detection is to distinguish the pavement images that have cracking areas from the pavement images that do not have cracking areas, as well as to distinguish the crack-area from the noncrack area in the same pavement image. Most of the traditional methods to identify and detect 2D pavement distress images are based on image processing techniques, such as the Sobel algorithm  and the Canny algorithm . However, many of them can only reach the level of semi-automatic detection. The introduction of CNN can help to automatically solve this problem.
Some researchers have used CNNs for target classification; that is, given an input image, the CNN automatically judges whether it belongs to a predetermined category, for example, whether the input pavement image contains cracks. Cha et al. proposed a CNN model that can automatically identify cracks in images of damaged cement pavement affected by variations in exposure and shadow. Hoang et al. [68,69] proposed the CNN-crack detection model (CDM), which combined a classifier with a sliding window method to recognize large asphalt pavement crack images. Combined with principal component analysis (PCA), Wang and Hu trained a CNN model on images with different input sizes to identify longitudinal, transverse, and alligator cracks in pavement distress images. Zhang et al. proposed a CNN model named CrackNet for pixel-level crack recognition, which can detect pavement damage with very high accuracy. Unlike SegNet, which relies on downsampling and upsampling, CrackNet has no pooling layer, so the size of the image remains unchanged as it passes between layers. Zhang et al. then improved the model to a second generation named CrackNet II, which removed the feature generator and optimized the structure with a 1 × 1 convolution layer. Fei et al. proposed a new CNN model named CrackNet-V, which inherited the pooling-free design of CrackNet. Sha et al. evaluated pavement distress using a convolutional neural network. In addition, pavement texture can be studied using CNN models [101,102].
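The sliding window strategy mentioned above, in which a patch classifier is applied across a large image, can be sketched as follows. This is a hedged illustration rather than the cited CNN-CDM: the `classify` argument stands in for a trained CNN, and `dark_pixel_rule` is a hypothetical threshold rule used only to make the example runnable.

```python
def sliding_windows(height, width, window, stride):
    """Yield (top, left) corners of square windows that fit inside the image."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left

def detect_crack_windows(image, window, stride, classify):
    """Return the corners of all windows that the classifier flags as cracked."""
    hits = []
    for top, left in sliding_windows(len(image), len(image[0]), window, stride):
        patch = [row[left:left + window] for row in image[top:top + window]]
        if classify(patch):
            hits.append((top, left))
    return hits

def dark_pixel_rule(patch, threshold=50):
    # Hypothetical stand-in for a trained CNN: flag patches containing a dark pixel.
    return min(min(row) for row in patch) < threshold

# Assumed 4 x 4 grayscale image with two dark (crack-like) pixels.
image = [
    [200, 200, 200, 200],
    [200,  10, 200, 200],
    [200, 200, 200, 200],
    [200, 200, 200,  20],
]
hits = detect_crack_windows(image, window=2, stride=2, classify=dark_pixel_rule)
```

The flagged corners give a coarse localization of the cracks; a smaller stride yields overlapping windows and a finer map at a higher computational cost.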
Reflective cracking is another serious type of damage in pavements with a semi-rigid base. If the corresponding maintenance is conducted before a reflective crack finally forms, the crack can be prevented. However, such underground damage cannot be easily discovered using traditional pavement surface images. To solve this problem, pavement engineers have used ground penetrating radar (GPR) to detect underground damage. Using CNN models, different types of underground damage can be classified.
4.5.2. Object detection task
Deep learning methods can conveniently recognize and locate different objects in an image. For pavement engineers, quickly locating and recognizing different distresses can help them conduct better maintenance. Cao et al.  used a CNN to detect different objects on airport cement pavement. Screws and stones were located by the CNN, the affine transformation of the image was carried out by a spatial transformer network (STN), and the object detection of the airport pavement image was carried out by using a model based on VGG-13 . Cha et al.  used Faster R-CNN  to automatically detect cement concrete cracks, steel corrosion (medium and high), bolt corrosion, and steel delamination of bridge facilities.
4.5.3. Performance prediction and condition evaluation
Machine learning methods can be used to predict the mechanical performance of pavement materials when laboratory tests or field tests are unavailable. Majidifard et al. proposed two innovative machine learning methods, named gene expression programming (GEP) and hybrid artificial neural network/simulated annealing (ANN/SA), to predict the fracture energy of asphalt mixture specimens. Their models were able to determine the fracture energy of the asphalt mixture, which in turn was used for the optimization of the material mix. Gong et al. developed two deep neural networks to improve the accuracy of pavement rutting prediction. The results showed that the two neural networks performed better than multiple linear regression models.
Based on the analysis of the severity of pavement distress, pavement condition evaluation can be conducted. Majidifard et al. conducted a pavement condition evaluation process using their own pavement image dataset (PID), which was built from 7237 Google street-view images. The You Only Look Once (YOLO) deep learning framework and the U-net model were both used to quantify the severity of pavement distress.
Using machine learning methods, the pavement surface distress and structure status can be effectively identified, classified, and analyzed. In the early stages, most researchers used SVM and ANN as classifiers for pavement defects, where the accuracy was able to meet the engineering requirements at that time. With the development of computer technology, deep learning methods like CNN have achieved better results for pavement distress detection and performance evaluation because of their local connection and weight sharing. The diverse functions of machine learning methods can help civil engineers solve various problems of pavement monitoring such as identifying the types of pavement cracks and marking the location of pavement damage. However, the following issues still need to be considered in future studies:
(1) More field/laboratory tests on the performances and conditions of pavement need to be conducted to obtain a much larger dataset.
(2) The adaptability of machine learning methods must be improved for pavement images captured by different equipment and under different conditions.
(3) At this stage, many of the studies are focused on the identification of pavement cracks. Future studies using machine learning methods may be extended to a variety of different pavement distresses.
5. Conclusions
Pavement is one of the most important civil infrastructures. To ensure the functionality and safety of pavement, it is necessary to monitor the pavement status and conduct timely maintenance. Nowadays, civil engineers collect the pavement dynamic response through a variety of intrusive sensing technologies, and analyze the surface conditions based on pavement images through image processing techniques and machine learning methods. This review summarizes the state-of-the-art of the intrusive sensing techniques, image processing techniques, and machine learning methods for pavement monitoring in recent years and suggests future developments of pavement monitoring and analysis using these approaches. The main conclusions are the following:
(1) Pavement structure is affected by the repeated vehicle loads and severe environmental factors during its service life. To achieve long-term and stable monitoring, it is necessary to improve the performance of intrusive sensors and optimize their packaging for meeting the requirements of low power consumption, low cost, high precision, high integration, compression resistance, and waterproofing.
(2) Since the characteristics of pavement field images vary extensively, many of the current image processing algorithms are unable to automatically adapt to every type of pavement image. Therefore, further research is needed to improve the adaptability of the algorithms so that they can handle images captured under a wide range of conditions.
(3) More field/laboratory tests on the performances and conditions of pavement need to be conducted to obtain a much larger dataset. In addition, more types of pavement distresses need to be detected and identified using machine learning methods.
Acknowledgments
This work was supported by the National Key R&D Program of China (2017YFF0205600), the International Research Cooperation Seed Fund of Beijing University of Technology (2018A08), Science and Technology Project of Beijing Municipal Commission of Transport (2018-kjc-01-213), and the Construction of Service Capability of Scientific and Technological Innovation-Municipal Level of Fundamental Research Funds (Scientific Research Categories) of Beijing City (PXM2019_014204_500032).
Compliance with ethics guidelines
Yue Hou, Qiuhan Li, Chen Zhang, Guoyang Lu, Zhoujing Ye, Yihan Chen, Linbing Wang, and Dandan Cao declare that they have no conflict of interest or financial conflicts to disclose.